# **DesignConEast 2005**

Track 6: Board and System-Level Design (6-TA4)

# Performance Model for Inter-chip Busses Considering Bandwidth and Cost

**Authors:**

Brock J. LaMeres, **University of Colorado**

Sunil P. Khatri, **Texas A&M University**

### **Problem Statement**

- Performance in VLSI Systems is Limited by Noise from the Package
- An Analytical Model for System Performance is needed for:
  - 1) CAD/CAE
  - 2) Quick Hand Calculations

# Agenda

- 1) Problem Motivation
- 2) Analytical Model Development
- 3) Simulation Results
- 4) Example Use Model

• Transistor Technology is Faster than Package Technology



• Today's Packages Have Inductive Parasitics



**QFP** – Wire Bond (~4.5 nH)



**BGA** – Wire Bond (~3.7 nH)

• Inductive Interconnect Causes Noise When Signals Switch:



**Simultaneous Switching Noise (SSN)** 

### 1) Supply Bounce

• Switching current through inductive packaging induces voltage:

$$V_{bnc} = L \cdot \left(\frac{di}{dt}\right)$$

L = Inductance of pwr/gnd pin that current is being switched through.

• Multiple Signals Switching Increase the Problem:

$$V_{bnc} = L \cdot \sum_{i=1}^{n} \left( \frac{di_i}{dt} \right)$$

n = # of drivers sharing the power/gnd pin (L).
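As a rough numeric illustration of the two supply-bounce expressions, the sketch below evaluates the bounce for identical drivers sharing one return pin. The 4.5 nH value is the QFP wire-bond figure quoted earlier in the deck; the per-driver di/dt (25 mA switched in 1 ns) and the function name `supply_bounce` are assumptions chosen only for illustration.

```python
def supply_bounce(l_pin, di_dt_per_driver, n_drivers=1):
    """V_bnc = L * sum(di/dt) for n identical drivers sharing one pwr/gnd pin."""
    return l_pin * n_drivers * di_dt_per_driver


L_QFP = 4.5e-9          # H, QFP wire-bond inductance quoted in the deck
DI_DT = 25e-3 / 1e-9    # A/s, ASSUMED: 25 mA switched in 1 ns

for n in (1, 2, 4, 8):
    v_bnc = supply_bounce(L_QFP, DI_DT, n)
    print(f"n = {n}: V_bnc = {v_bnc * 1e3:.1f} mV")
```

Because the drivers are taken to be identical, the sum collapses to a multiplication by n, which is why the bounce grows linearly with the number of drivers sharing the pin.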

### 2) Pin-to-Pin Coupling

Switching Signals Couple Voltage onto Neighbors:

• Multiple Signals Switching Increase the Problem:

$$V_{couple} = \sum_{1}^{k} M_{1k} \cdot \left(\frac{di_{k}}{dt}\right)$$

• Package Inductance Creates Simultaneous Switching Noise



• SSN in Package Limits di/dt

 $\frac{di}{dt} \propto SSN$ 

Aggressive Package Design will Reduce Inductance



- But is Expensive
  - 95% of VLSI design-starts are wire-bond

- Modern Design Practice
  - 1) Acceptable SSN Limits are Defined.
  - 2) Fastest (di/dt) is selected that doesn't violate limits.
- Limitations of Approach
  - SPICE is used to evaluate SSN.
  - This takes too much time.
  - The entire range of variables cannot be evaluated quickly (package, # of pwr/gnd, bus width, etc.).

- We need an *Analytical Model* to Evaluate Off-Chip Bus Performance
  - 1) Package Parasitics
  - 2) Package Cost
  - 3) Bus Width
  - 4) # of Power/Grounds
- This can be used to find Optimal Bus Configuration

### "Desired Performance for the Least Cost"

• Test Circuit Topology

- 0.1 µm CMOS Tx/Rx
- V<sub>DD</sub> = +1.5 V, V<sub>t</sub> = 0.35 V
- 25 mA Drive Strength
- Series Terminated

• Failure Modes

**Power Supply Droop** 

**Signal Coupling** 

**Ground Bounce** 

**Power Supply Droop = Ground Bounce** 

Bus Parameters



**W**<sub>BUS</sub>: # of Signals Per Bus Segment of Interest




N<sub>G</sub>: # of Grounds Per Bus Segment of Interest




**Repetitive Pattern of Signal, Power, and Ground Pins** 

**SPG**: (# of Signals) : (# of PWRs) : (# of GNDs)

**SPR: SPG Ratio** 

**Example:**

- **W**<sub>BUS</sub>: 4
- N<sub>G</sub>: 1
- **SPG**: 4:1:1
- **SPR**: 4

• Bus Performance Description

#### **Slewrate**



$$slewrate = \left(\frac{dv}{dt}\right) = \left(\frac{di}{dt}\right) \cdot Z_{load}$$


#### **Risetime**



$$t_{rise} = \frac{(0.8) \cdot V_{DD}}{slewrate}$$


#### **Minimum Unit Interval**
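The figure for this slide is not reproduced here. Working backward from the Maximum Datarate expression later in the deck, which carries a factor of (1.5)·(0.8), the unit interval appears to be taken as 1.5 risetimes. Under that reading (an inference, not a statement from the slide):

$$UI_{\min} = (1.5) \cdot t_{rise} \qquad DR_{\max} = \frac{1}{UI_{\min}}$$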




### **Bus Throughput**



$$TP_{\max} = W_{BUS} \cdot DR_{\max}$$
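The chain from switching current to throughput can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: `V_DD` comes from the test circuit, while the 50 Ω load, the di/dt of 25 mA per ns, and the 1.5 risetimes per unit interval (inferred from the (1.5) factor in the Maximum Datarate formula) are assumptions.

```python
V_DD = 1.5        # V, from the test circuit
Z_LOAD = 50.0     # ohms, ASSUMED load impedance
UI_FACTOR = 1.5   # ASSUMED: unit interval = 1.5 risetimes


def slewrate(di_dt, z_load=Z_LOAD):
    """dv/dt = (di/dt) * Z_load, in V/s."""
    return di_dt * z_load


def risetime(slew, v_dd=V_DD):
    """Risetime in seconds: 0.8 * V_DD / slewrate."""
    return 0.8 * v_dd / slew


def datarate(t_rise, ui_factor=UI_FACTOR):
    """Datarate in b/s: one bit per unit interval."""
    return 1.0 / (ui_factor * t_rise)


def throughput(w_bus, dr):
    """Throughput in b/s: bus width times per-pin datarate."""
    return w_bus * dr


di_dt = 25e-3 / 1e-9   # A/s, ASSUMED: 25 mA switched in 1 ns
t_r = risetime(slewrate(di_dt))
dr = datarate(t_r)
print(f"t_rise = {t_r * 1e9:.2f} ns, DR = {dr / 1e6:.1f} Mb/s, "
      f"TP (8 bits) = {throughput(8, dr) / 1e9:.2f} Gb/s")
```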

• Bus Performance Limits



**L**<sub>11</sub> : Self Inductance of Ground Path

$$V_{bnc_{self}} = L_{11} \cdot \sum_{1}^{W_{bus}} \left( \frac{di_1}{dt} \right)$$




**M**<sub>1k</sub>: Mutual Inductance Between Pins

$$V_{bnc_{couple}} = \sum_{k=2}^{W_{bus}} M_{1k} \cdot \left(\frac{di_{k}}{dt}\right)$$


### **Maximum Acceptable Ground Bounce**



$$V_{bnc-MAX} = p \cdot V_{DD}$$

$$(p_{typical} = 5\%)$$
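For the test circuit's 1.5 V supply and the typical p of 5%, the limit works out as follows (plain arithmetic on values already given in the deck):

$$V_{bnc-MAX} = (0.05) \cdot (1.5\ \text{V}) = 75\ \text{mV}$$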

Model Development

#### **Maximum Ground Bounce**

$$V_{gnd-bnc} = p \cdot V_{DD} = \left(\frac{W_{bus} \cdot L_{11}}{N_{g}}\right) \left(\frac{di}{dt}\right) + \sum_{k=2}^{W_{bus}} \left( M_{1k} \frac{di}{dt} \right)$$

The first term is the self contribution; the second term is the coupling contribution.


### **Maximum Slewrate**

$$\left(\frac{dv}{dt}\right)_{\text{max}} = \frac{p \cdot V_{DD} \cdot Z_{load}}{\left(\frac{W_{bus} \cdot L_{11}}{N_g}\right) + \sum_{k=2}^{W_{bus}} M_{1k}}$$

- pull out (di/dt)
- convert to (dv/dt)


#### **Minimum Risetime**

$$t_{rise-min} = \frac{\left(0.8\right) \cdot \left[\left(\frac{W_{bus} \cdot L_{11}}{N_g}\right) + \sum_{k=2}^{W_{bus}} \left(M_{1k}\right)\right]}{p \cdot Z_{load}}$$

- convert slewrate to risetime


#### **Maximum Datarate**

$$DR_{\text{max}} = \frac{p \cdot Z_{load}}{(1.5) \cdot (0.8) \cdot \left[ \left( \frac{W_{bus} \cdot L_{11}}{N_g} \right) + \sum_{k=2}^{W_{bus}} M_{1k} \right]}$$

- convert Risetime to Datarate

### **Maximum Throughput**

$$TP_{\max} = W_{BUS} \cdot DR_{\max}$$
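The closed-form expressions above lend themselves to a quick calculation. The sketch below is one possible transcription, not the authors' code: p = 5% and the QFP wire-bond L<sub>11</sub> come from the deck, while the 50 Ω load and the conversion M<sub>1k</sub> = K<sub>1k</sub>·L<sub>11</sub> (equal self inductance on every pin) are assumptions. It is not expected to reproduce the SPICE-based tables later in the deck.

```python
def max_datarate(p, z_load, w_bus, n_g, l11, m_1k):
    """Maximum datarate (b/s) from the ground-bounce limit.

    m_1k lists the mutual inductances M_12, M_13, ... in henries; only the
    first w_bus - 1 entries are used, matching the sum from k = 2 to W_bus.
    """
    if len(m_1k) < w_bus - 1:
        raise ValueError("need at least w_bus - 1 mutual inductance values")
    effective_l = w_bus * l11 / n_g + sum(m_1k[: w_bus - 1])
    return p * z_load / (1.5 * 0.8 * effective_l)


def max_throughput(p, z_load, w_bus, n_g, l11, m_1k):
    """Maximum throughput (b/s): W_bus times the maximum datarate."""
    return w_bus * max_datarate(p, z_load, w_bus, n_g, l11, m_1k)


# p and L11 (QFP wire-bond) are from the deck.
# Z_LOAD = 50 ohms and M_1k = K_1k * L11 are ASSUMPTIONS.
P, Z_LOAD, L11 = 0.05, 50.0, 4.550e-9
M_1K = [k * L11 for k in (0.744, 0.477, 0.352)]

dr = max_datarate(P, Z_LOAD, w_bus=4, n_g=1, l11=L11, m_1k=M_1K)
tp = max_throughput(P, Z_LOAD, w_bus=4, n_g=1, l11=L11, m_1k=M_1K)
print(f"DR_max = {dr / 1e6:.1f} Mb/s, TP_max = {tp / 1e6:.1f} Mb/s")
```

Sweeping `w_bus` and `n_g` in a loop is the kind of quick exploration that a SPICE run per configuration would make slow.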

### • SPICE Simulations were Performed on Three Packages



**QFP – Wire Bond**



**BGA – Wire Bond**



**BGA – Flip-Chip**

| Package | $L_{11}$ (nH) | $K_{12}$ | $K_{13}$ | $K_{14}$ | $K_{15}$ | $K_{16}$ |
|---------|---------------|----------|----------|----------|----------|----------|
| QFP-wb  | 4.550         | 0.744    | 0.477    | 0.352    | 0.283    | 0.263    |
| BGA-wb  | 3.766         | 0.537    | 0.169    | 0.123    | 0.097    | 0.078    |
| BGA-fc  | 1.244         | 0.630    | 0.287    | 0.230    | 0.200    | 0.175    |
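The table lists coupling coefficients K<sub>1k</sub>, whereas the model equations use mutual inductances M<sub>1k</sub>. The standard relation is M = K·√(L<sub>1</sub>·L<sub>k</sub>). The sketch below applies it under the assumption that every pin has the same self inductance L<sub>11</sub> (the deck gives only L<sub>11</sub>), so that M<sub>1k</sub> = K<sub>1k</sub>·L<sub>11</sub>. The dictionary layout and function name are illustrative.

```python
import math

# Values transcribed from the package table (L11 in henries).
PACKAGES = {
    "QFP-wb": {"l11": 4.550e-9, "k": [0.744, 0.477, 0.352, 0.283, 0.263]},
    "BGA-wb": {"l11": 3.766e-9, "k": [0.537, 0.169, 0.123, 0.097, 0.078]},
    "BGA-fc": {"l11": 1.244e-9, "k": [0.630, 0.287, 0.230, 0.200, 0.175]},
}


def mutual_inductances(l11, k_values, l_kk=None):
    """M_1k = K_1k * sqrt(L_11 * L_kk); L_kk defaults to L_11 (ASSUMPTION)."""
    l_kk = l11 if l_kk is None else l_kk
    return [k * math.sqrt(l11 * l_kk) for k in k_values]


for name, pkg in PACKAGES.items():
    m = mutual_inductances(pkg["l11"], pkg["k"])
    print(name, [f"{x * 1e9:.3f} nH" for x in m])
```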

| Package | Cost Per-Pin |
|---------|--------------|
| QFP-wb  | \$0.22       |
| BGA-wb  | \$0.34       |
| BGA-fc  | \$0.63       |

### QFP Wire-Bond Package Simulations



- Throughput reaches an asymptotic limit as channels are added

### BGA Wire-Bond Package Simulations



- Level 1: BGA Increases Performance Over QFP

### BGA Flip-Chip Package Simulations



- Level 2: Flip-Chip Increases Performance Over Wire-Bond

Cost Must Also Be Considered in Analysis

#### **Bandwidth Per Cost**

$$BPC = \left(\frac{TP}{Cost_{bus} \cdot 10^{6}}\right) \quad \text{Units} = \text{(Mb/\$)}$$

• This Metric Represents "Cost Effectiveness of the Bus"
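A direct transcription of the metric is shown below. The throughput and cost in the usage line are made-up round numbers chosen to make the arithmetic easy to follow, not results from the deck.

```python
def bandwidth_per_cost(tp_bps, cost_dollars):
    """BPC in Mb/$: throughput (b/s) divided by bus cost ($) and by 1e6."""
    return tp_bps / (cost_dollars * 1e6)


# HYPOTHETICAL inputs: a 1 Gb/s bus whose pins cost $2.00 in total.
print(f"BPC = {bandwidth_per_cost(1.0e9, 2.00):.0f} Mb/$")
```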

### • Cost per Bus Configuration

Cost (\$) by number of channels:

| Bus Configuration | 1    | 2    | 4    | 8     | 16    |
|-------------------|------|------|------|-------|-------|
| QFP-WB 8:1:1      | 0.66 | 0.88 | 1.32 | 2.20  | 4.40  |
| QFP-WB 4:1:1      | 0.66 | 0.88 | 1.32 | 2.64  | 5.28  |
| QFP-WB 2:1:1      | 0.66 | 0.88 | 1.76 | 3.52  | 7.04  |
| BGA-WB 8:1:1      | 1.02 | 1.36 | 2.04 | 3.40  | 6.80  |
| BGA-WB 4:1:1      | 1.02 | 1.36 | 2.04 | 4.08  | 8.16  |
| BGA-WB 2:1:1      | 1.02 | 1.36 | 2.72 | 5.44  | 10.88 |
| BGA-FC 8:1:1      | 1.89 | 2.52 | 3.78 | 6.30  | 12.60 |
| BGA-FC 4:1:1      | 1.89 | 2.52 | 3.78 | 7.56  | 15.12 |
| BGA-FC 2:1:1      | 1.89 | 2.52 | 5.04 | 10.08 | 20.16 |
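The entries above appear consistent with a simple pin count: each signal pin, plus one power and one ground pin for every group of up to S signals in an S:1:1 pattern, multiplied by the package's cost per pin. That rule is inferred from the table rather than stated on the slide, so the sketch below should be read as an assumption. The function names are illustrative.

```python
import math

# Cost per pin in dollars, from the package cost table.
COST_PER_PIN = {"QFP-WB": 0.22, "BGA-WB": 0.34, "BGA-FC": 0.63}


def bus_pin_count(n_signals, signals_per_group):
    """Pins for an S:1:1 pattern: one PWR and one GND per group of up to S signals."""
    groups = math.ceil(n_signals / signals_per_group)
    return n_signals + 2 * groups


def bus_cost(package, n_signals, signals_per_group):
    """Bus cost in dollars, rounded to cents."""
    pins = bus_pin_count(n_signals, signals_per_group)
    return round(pins * COST_PER_PIN[package], 2)


for channels in (1, 2, 4, 8, 16):
    print(channels, bus_cost("BGA-FC", channels, 2))
```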

• Performance Increases with Cost (Package, SPG)

#### • Bandwidth Per Cost Results

Bandwidth per cost (Mb/\$) by number of channels:

| Bus Configuration | 1    | 2    | 4    | 8    | 16  |
|-------------------|------|------|------|------|-----|
| QFP-WB 8:1:1      | 612  | 722  | 505  | 309  | 152 |
| QFP-WB 4:1:1      | 1188 | 1122 | 1036 | 532  | 289 |
| QFP-WB 2:1:1      | 2245 | 2165 | 1515 | 758  | 379 |
| BGA-WB 8:1:1      | 503  | 594  | 402  | 234  | 112 |
| BGA-WB 4:1:1      | 1188 | 1032 | 747  | 390  | 304 |
| BGA-WB 2:1:1      | 2179 | 1961 | 1153 | 577  | 327 |
| BGA-FC 8:1:1      | 1764 | 1323 | 1085 | 847  | 385 |
| BGA-FC 4:1:1      | 2016 | 2116 | 2016 | 1411 | 743 |
| BGA-FC 2:1:1      | 2822 | 3527 | 2785 | 1924 | 920 |

**Faster Narrower Busses = More Cost Effective** 



### **On-Chip**

- 8 bit Data Bus
- 300 Mb/s

### **Package**

- Need:

$$(8)(300\ \text{Mb/s}) = 2400\ \text{Mb/s}$$

**Need:** 2400 Mb/s



**QFP – Wire Bond**

- 4 bits wide, SPG = 2:1:1

**BGA – Wire Bond**

- 1 bit wide, SPG = 2:1:1
- 16 bits wide, SPG = 4:1:1

**BGA – Flip-Chip**

- 1 bit wide, SPG = 2:1:1
- 1 bit wide, SPG = 4:1:1
- 1 bit wide, SPG = 8:1:1

### Cost of Each Bus Configuration

Cost (\$) by number of channels:

| Bus Configuration | 1    | 2    | 4    | 8     | 16    |
|-------------------|------|------|------|-------|-------|
| QFP-WB 8:1:1      | 0.66 | 0.88 | 1.32 | 2.20  | 4.40  |
| QFP-WB 4:1:1      | 0.66 | 0.88 | 1.32 | 2.64  | 5.28  |
| QFP-WB 2:1:1      | 0.66 | 0.88 | 1.76 | 3.52  | 7.04  |
| BGA-WB 8:1:1      | 1.02 | 1.36 | 2.04 | 3.40  | 6.80  |
| BGA-WB 4:1:1      | 1.02 | 1.36 | 2.04 | 4.08  | 8.16  |
| BGA-WB 2:1:1      | 1.02 | 1.36 | 2.72 | 5.44  | 10.88 |
| BGA-FC 8:1:1      | 1.89 | 2.52 | 3.78 | 6.30  | 12.60 |
| BGA-FC 4:1:1      | 1.89 | 2.52 | 3.78 | 7.56  | 15.12 |
| BGA-FC 2:1:1      | 1.89 | 2.52 | 5.04 | 10.08 | 20.16 |

#### **Most Cost Effective:**

- BGA-WB
  - $W_{bus} = 1$
  - SPG = 2:1:1
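The selection step amounts to taking the lowest-cost entry among the candidate configurations. The sketch below lists the six configurations given under the 2400 Mb/s need, with costs read from the cost table; the tuple layout and function name are illustrative choices.

```python
# (package, bus width in bits, SPG, cost in $) for each candidate configuration.
CANDIDATES = [
    ("QFP-WB", 4, "2:1:1", 1.76),
    ("BGA-WB", 1, "2:1:1", 1.02),
    ("BGA-WB", 16, "4:1:1", 8.16),
    ("BGA-FC", 1, "2:1:1", 1.89),
    ("BGA-FC", 1, "4:1:1", 1.89),
    ("BGA-FC", 1, "8:1:1", 1.89),
]


def cheapest_configuration(candidates):
    """Return the candidate with the lowest cost (first one wins on ties)."""
    return min(candidates, key=lambda c: c[3])


package, width, spg, cost = cheapest_configuration(CANDIDATES)
print(f"{package}, W_bus = {width}, SPG = {spg}, cost = ${cost:.2f}")
```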

• Bandwidth-per-Cost of Each Bus Configuration





**Higher BPC = More Headroom** 

### Summary

- 1) Package Noise Limits VLSI System Performance
- 2) An Analytical Model was Presented to Predict Bus Performance
- 3) Datarate Approaches an Asymptotic Limit as Channels are Added
- 4) Throughput Can be Achieved Using Different Bus Configurations

# Questions?