
BD03: Digital Physical Design

Version 1.0

STUDENT
HANDOUT

June 16, 2008

Legal Notices
Copyright Notice
© 1990-2008 Cadence Design Systems, Inc. All rights reserved.

When printed on paper, this presentation qualifies as a STUDENT HANDOUT.
This course and the material in it is owned by Cadence Design Systems, Inc. (Cadence), 2655 Seely Avenue, San Jose, CA
95134, USA. Unless you have received express written approval directly from Cadence, you are not allowed to copy, scan,
replicate, disclose, distribute, or publish this document, or any part of it.

Confidentiality Notice

No part of this publication may be reproduced in whole or in part by any means (including photocopying or storage in an
information storage/retrieval system) or transmitted in any form or by any means without prior written permission from
Cadence Design Systems, Inc. (Cadence).

Information in this document is subject to change without notice and does not represent a commitment on the part of Cadence.
The information contained herein is the proprietary and confidential information of Cadence or its licensors, and is supplied
subject to, and may be used only by Cadence’s customer in accordance with, a written agreement between Cadence and its
customer. Except as may be explicitly set forth in such agreement, Cadence does not make, and expressly disclaims, any
representations or warranties as to the completeness, accuracy or usefulness of the information contained in this document.
Cadence does not warrant that use of such information will not infringe any third party rights, nor does Cadence assume any
liability for damages or costs of any kind that may result from use of such information.
RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure by the Government is subject to restrictions as set forth in
subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.
UNPUBLISHED This document contains unpublished confidential information and is not to be disclosed or used except as
authorized by written contract with Cadence. Rights reserved under the copyright laws of the United States.

6/16/08 BD03: Digital Physical Design 2


Copyrights and Trademarks
Cadence Trademarks
Trademarks and service marks of Cadence Design Systems, Inc. (Cadence) contained in this document are attributed to
Cadence with the appropriate symbol. For queries regarding Cadence’s trademarks, contact the corporate legal department at
the address above or call 800.862.4522 from the US or +1.408.943.1234 internationally.
Allegro® HDL-ICE® Silicon Ensemble®
Accelerating Mixed Signal Design® Incisive® Silicon Express™
Assura® InstallScape™ SKILL®
BuildGates® IP Gallery™ SoC Encounter™
Cadence® (brand and logo) NanoRoute® SourceLink® online customer support
CeltIC® NC-Verilog® Specman®
Conformal® NeoCell® Spectre®
Connections® NeoCircuit® Speed Bridge®
Diva® OpenBook® online documentation library UltraSim®
Dracula® OrCAD® Verifault-XL®
ElectronStorm® Palladium® Verification Advisor®
Encounter® Pearl® Verilog®
EU CAD® PowerSuite® Virtuoso®
Fire & Ice® PSpice® VoltageStorm®
First Encounter® SignalStorm® Xtreme®
HDL-ICE® Silicon Design Chain™
Other Trademarks
Open SystemC, Open SystemC Initiative, OSCI, SystemC, and SystemC Initiative are trademarks or registered
trademarks of Open SystemC Initiative, Inc. in the United States and other countries and are used with permission. All
other trademarks are the property of their respective holders.


Phase 1: Curriculum Map


General (all disciplines)
‹ BG01 Semiconductor Business Processes
‹ BG02 IC Process and Devices
‹ BG03 IC Packaging
‹ BG04 Test and DFT

Digital Discipline
‹ BD01 Digital IC Architecture
‹ BD02 Digital IC Design
‹ BD03 Digital Physical Design
‹ Tool Training
‹ Lab Training

Analog Discipline
‹ BA01 Analog IC Design
‹ BA02 Mixed-Signal CMOS IC Design, or BA03 RF IC Design
‹ BA04 Mixed-Signal Physical Implementation
‹ Lab Training


Course Objectives
After taking this course, you will be able to

‹ Draw a complete flowchart of the digital design implementation flow and explain the steps in detail
‹ Describe how cell libraries and timing libraries are used for physical design and timing analysis
‹ Explain the steps involved in synthesis, floorplanning, placement, clock tree synthesis, routing, extraction, delay calculation, static timing analysis, and design optimization
‹ Contrast power consumption and power grid analysis, and apply power saving design techniques
‹ Explain the issues involved with signal integrity
‹ Describe how engineering change orders (ECOs) are performed, and how chips are physically verified and taped out

Course Policies
It is important that you attend class. Your participation is essential.

‹ Three or more absences will damage your grade.
‹ Be on time.
‹ Be respectful of each other.
‹ Conduct only one conversation at a time.
‹ Turn off cell phones, pagers, and laptops.
‹ Get involved.
‰ Come prepared to discuss the day’s assignment.
‰ Volunteer.
‰ Ask questions.
‰ Share relevant ideas and observations.
‰ Offer your own experiences.


Assignments and Grades
Assignments
‹ Assignments are discussed in class.
‹ Assignments are due on the date indicated in the syllabus.
‹ Keep a copy of all assignments you hand in.

Grades
A. Outstanding achievement, exceeding course requirements
B. Praiseworthy performance, meeting course requirements and criteria
C. Average, satisfactory performance
D. Below average, marginal performance

Assignments and Grades (continued)


Assignment Percentage Due Date

Homework 1: 15% 7/30/08
Describe the issues, changes in design flow, and considerations that design teams must take into account when designing for a deep submicron process (90 nm or less).

Homework 2: 15% 8/6/08
Create a clock tree constraint file for automatic CTS based on a specification.

Homework 3: 20% 8/13/08
Part I. Given several scenarios, calculate static and dynamic power.
Part II. Given several IR-drop heat maps, discuss the potential problems and solutions.
Part III. Given a block diagram and several scenarios, discuss which possible low-power design methods can be used to reduce overall power.

Formal Study Group Presentation 10% 8/18/08 – 8/21/08

Final Exam 40% 8/22/08


Course Calendar
Week 1, July 21: Introduction to Digital Physical Implementation
- Inputs
- Steps in Flow
Assignment: Flowchart, activity in class.

Week 1, July 22: Introduction and Overview of Layout Technology
- Layout Layers
- Introduction to Physical Verification, DRC/LVS, DRC
- Cell Libraries, LEF Syntax
Assignment: LEF terms, activity in class.
Homework Assignment 1 (due 7/30/08): Describe the issues, changes in design flow, and considerations that design teams must take into account when designing for a deep submicron process (90 nm or less).

Week 1, July 23: Timing Libraries and Constraint Files
- Concepts
- Libraries
- Constraint Files
Assignment: Create timing constraints, activity in class.

Week 2, July 28: Synthesis
- Logical Synthesis Optimization Steps
- Physical Synthesis Overview
Assignment: Review log file and optimization steps, activity in class.

Course Calendar (continued)


Week 2, July 29: Floorplanning and Placement
- Floorplanning Fundamentals
- Placement Fundamentals
Assignment: Examples of floorplans, activity in class.

Week 2, July 30: Clock Tree Synthesis
- Clock Trees and Clock Tree Synthesis
- Clock Tree Specification
- CTS Reports
- Low-Power Clocking Techniques
Homework Assignment 2 (due 8/6/08): Create a clock tree constraint file for automatic CTS based on a specification.

Week 3, Aug 4: Routing
- Fundamentals
- Special Types of Routing
Assignment: Review routing log files, activity in class.

Week 3, Aug 5: Power Consumption and Power Grid Analysis
- Power Consumption
- Power Grid Analysis
- Low-Power Design Techniques
Homework Assignment 3 (due 8/13/08):
Part I. Given several scenarios, calculate static and dynamic power.
Part II. Given several IR-drop heat maps, discuss the potential problems and solutions.
Part III. Given a block diagram and several scenarios, discuss which possible low-power design methods can be used to reduce overall power.


Course Calendar (continued)
Week 3, Aug 6: Extraction and Delay Calculation
- Extraction Models and SPEF Format
- Delay Calculation Fundamentals and SDF Format
Assignment: Flowchart with SPEF/SDF, activity in class.

Week 4, Aug 11: Static Timing Analysis and Signal Integrity Analysis
- Timing Constraints and Analysis
- Design Rule Verification
- Signal Integrity Fundamentals Analysis
Assignment: Timing and SI report analysis, activities in class.

Week 4, Aug 12: Design Optimization
- Fundamentals
- Types
Assignment: Review optimization cases, activity in class.

Week 4, Aug 13: Engineering Change Orders, Design Verification, and Tapeout
- ECO Types and Fundamentals
- Physical Verification Overview
- Tapeout Requirements
Assignment: ECO scenarios and tapeout requirements, activities in class.

Course Calendar (continued)


Week 5, Aug 18: Formal Study Group Presentation
Week 5, Aug 19: Formal Study Group Presentation
Week 5, Aug 20: Formal Study Group Presentation
Week 5, Aug 21: Formal Study Group Presentation
Week 5, Aug 22: Final Exam

Recommended Text
‹ Hennessy, John L. and Patterson, David A. Computer Architecture: A Quantitative Approach, Fourth Edition. San Francisco, CA: Morgan Kaufmann. 2007.
‰ ISBN-10: 0123704901
‰ ISBN-13: 978-0123704900

Instructor Information
‹ Instructor name:

‹ Phone:

‹ E-mail:

‹ Office location:

‹ Office hours:

Introduction to Digital Physical Implementation
Module 1

June 16, 2008

The Life of a CMOS Inverter

Specification
“Device that outputs the inverse of its input with minimum size and power”

RTL
    module invx1(a,z);
      input a;
      output z;
      assign z=!a;
    endmodule

Gates
[Figure: inverter symbol invx1 with input a and output z]

Transistor and Layout
[Figure: transistor schematic (VDD, GND, a, z) and layout (VDD, VSS, a, z)]

Design Implementation Flow
Much like the simple CMOS inverter, the general process of digital design implementation is the transformation of a design into various representations, eventually into physical hardware devices, just on a much BIGGER scale.

[Figure: SPEC, RTL, Gates, Layout]

Module Objectives
After completing this module, you will be able to
‹ Draw a complete flowchart of the digital design implementation flow

Learning Activity
In this activity, you will
‹ Complete a flowchart of the digital design implementation flow
‹ Include the design flow steps
‹ Include the necessary inputs and outputs

15 minutes for activity

[Figure: blank flowchart to complete, from RTL through design flow steps to GDSII]

Topics in This Module


‹ Overall design flow

‹ Basic implementation flow
‹ Example flow

Overall Design Flow
A design flow can be divided into three phases:

‹ System
‹ Logical
‹ Physical

In each phase, two main processes need to be performed:
‹ Implementation
‹ Verification

Overall Design Flow


[Figure: overall design flow, with verification and implementation columns across three phases.
SYSTEM: Specification, Designer, Microarchitecture, Designer, RTL; verified by System Simulation.
LOGICAL: RTL, Logic Synthesis, Gates (synthesized netlist); verified by Logic Simulation, Formal Verification, and Gate Level Simulation.
PHYSICAL: Place/Route, GDSII (placed/routed design), Layout; verified by Timing Signoff and Physical Verification.]


Implementation Flow
[Figure: implementation flow.
SYSTEM: Specification, Designer, Microarchitecture, Designer, RTL.
LOGICAL: Logic Synthesis, Gates (synthesized netlist).
PHYSICAL: Place/Route, GDSII (placed/routed design), Layout.]

Implementation Flow
Front-end chip design definition: Processes in the overall chip design flow that involve system and logical design and verification

Back-end chip design definition: Processes in the overall chip design flow that involve physical design and verification

[Figure: implementation flow divided into FRONT-END (Specification, Microarchitecture, RTL, Logic Synthesis) and BACK-END (Place/Route, placed/routed design, Layout, GDSII)]


Topics in This Module
‹ Overall design flow

‹ Basic physical implementation flow
‹ Example flow

Back End or Physical Design


The terms “physical design,” “back end,” and “place/route” encompass many process steps, such as
‹ Floorplanning
‹ Placement
‹ Clock Tree Synthesis (CTS)
‹ Route
‹ Extraction
‹ Delay Calculation
‹ Static Timing Analysis (STA)
‹ Signal Integrity
‹ Design Optimization
‹ Physical Synthesis
‹ Design Verification
‹ Mask Prep

[Figure: Gates (synthesized netlist), Place/Route, GDSII (placed/routed gates)]

Back-End Implementation Flow
[Figure: back-end implementation flow. The Place/Route step expands into Floorplanning; Placement; Physical Synthesis and Scan Reorder; Design Optimization Pre-CTS; CTS; Design Optimization Post-CTS; Route; Design Optimization Post-Route; detail routed design (GDSII); Design Verification; and Mask Prep. Static Timing Analysis, Delay Calculation, Signal Integrity, and Extraction are shown alongside these steps.]

Topics in This Module


‹ Overall design flow

‹ Basic implementation flow
‹ Example flow

Flow Example
Let’s take a simple example through the implementation flow.

We will cover each step and highlight the following:
‹ Definition and step in the overall flow
‹ Inputs and outputs
‹ Formats
‹ Example per step

What Is a Specification?
Ideas begin with a specification, which can be a textual, graphical, or sometimes a software representation.

‹ Definition: A specification is an explicit set of requirements to be satisfied by a material, product, or service.

‹ Example: The specification for the latest chip called for a 250-MHz core clock with a serial interface, able to process 1 Mb of data per second at less than 10 W total power.

[Figure: implementation flow diagram]


What Is a Microarchitecture?
As the step between the specification and RTL, the microarchitecture defines how the block will be implemented.

‹ Definition: The microarchitecture implements the specification and defines specific mechanisms and structures for achieving that implementation.

‹ Example: For Block A, the designer created a microarchitecture and partitioned the block into several smaller modules.

[Figure: implementation flow diagram]

Specification and Microarchitecture: Input and Output, Format


Specification
‹ Input: Requirements from Marketing, CEO (Chief Executive Officer), CTO (Chief Technology Officer), etc.
‹ Output: Document or model in text/graphics or software (C++, SystemC, SystemVerilog, etc.) format

Microarchitecture
‹ Input: Specification + requirements from designer
‹ Output: Typically a document in text/graphics, could be software as well

[Figure: Specification, Designer, Microarchitecture]


Example: Specification
Let’s assume we have a specification, microarchitecture, and RTL.

We are designing a chip called “EX” with
‹ Three main partitions “A,” “B,” and “C”
‹ Memories in each partition
‹ Perimeter I/O
‹ 250-MHz clock
‹ 10 W total power
‹ Die size not to exceed 10 mm × 10 mm due to custom package requirements

[Figure: EX block diagram with inputs din and clk, blocks A, B, and C, and output dout]

Example: Microarchitecture
For Block C
‹ 32-bit data bus interface to Block A
‹ 16-bit control interface from Block B
‹ Use 64 Mb of SRAM
‹ Duplicate datapath elements in a parallel implementation
‹ Limit of five clock cycles from data input to processed data output

[Figure: EX block diagram with a 32-bit connection between blocks A and C and a 16-bit connection from block B to block C]

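As a worked example of the arithmetic behind these requirements, the sketch below converts the 250-MHz clock from the EX specification into a clock period and applies the five-cycle limit from the Block C microarchitecture to get a latency budget. The function and variable names are illustrative only and are not part of the course material.

```python
# Values taken from the EX example: 250-MHz clock, five-cycle latency limit.
CLOCK_MHZ = 250
MAX_LATENCY_CYCLES = 5


def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1_000.0 / freq_mhz


period_ns = clock_period_ns(CLOCK_MHZ)
latency_budget_ns = MAX_LATENCY_CYCLES * period_ns

print(f"clock period   = {period_ns} ns")
print(f"latency budget = {latency_budget_ns} ns")
```

Under these numbers each clock cycle is 4 ns, so the data must travel from input to processed output within 20 ns.
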
What Is Logic Synthesis?
‹ Definition: The process of parsing, translating, optimizing, and mapping RTL code into a specified standard cell library

‹ Example: To determine the feasibility of the design, we need to synthesize the RTL code into gates and measure timing, power, and area.

[Figure: implementation flow diagram]

Logic Synthesis: Input and Output, Format


‹ Input
‰ RTL in the Verilog® language or other HDL
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Timing Libraries in Liberty (.lib) format

‹ Output
‰ Gate Level Netlist in the Verilog language or other HDL

[Figure: RTL, SDC, and Library feed Logic Synthesis, which produces the synthesized netlist (gates)]

Example: Logic Synthesis
We use the RTL for blocks A, B, and C to produce the following netlists:

    block_a.vg
    block_b.vg
    block_c.vg

At the top level EX, the modules are instantiated:

    // top.vg
    module ex (…);
      block_a u0 (…);
      block_b u1 (…);
      block_c u2 (…);
    endmodule

[Figure: RTL, Logic Synthesis, synthesized gates (block_a.vg, block_b.vg, block_c.vg, top.vg)]

What Is Floorplanning?
‹ Definition: Process of deriving the die size, allocating space for soft blocks, planning power, and placing macros.

‹ Example: The three blocks of the chip were floorplanned to minimize the distance between the I/Os of the blocks and their interfaces to the chip. This reduces the routing between the blocks and, thus, improves the timing and routability of the design.

[Figure: implementation flow diagram]

Floorplanning: Input and Output, Format
‹ Input
‰ Gate Level Netlist in the Verilog language or other HDL
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Floorplan constraints and script in TCL

‹ Output
‰ Floorplanned design in the Verilog language (logical connectivity data) or other HDL + DEF (physical data)

[Figure: synthesized netlist (gates), SDC, TCL, logical library, and physical library feed Floorplanning, which produces the floorplanned design (gates + DEF)]

Example: Floorplanning
With a top-level netlist, we can begin to floorplan the chip.
‹ Set die size to 10 mm × 10 mm
‹ Assign the din, clk, and dout I/Os to the perimeter
‹ Create hard blocks for A, B, and C
‹ Size the blocks A, B, and C
‹ Perform power planning
‹ Perform macro placement
‹ Check for early routing congestion
‹ Check for early block utilization

[Figure: EX floorplan on a 10 mm × 10 mm die, with din, clk, and dout on the perimeter and blocks A, B, and C]


Example: Floorplanning (continued)
For each block, we can also perform some early checks.
‹ Assign pins
‹ Place RAMs and macros
‹ Check power plan
‹ Check for early routing congestion
‹ Check for early block utilization

It is important to make sure the floorplan is routable and meets the utilization requirements with a given RAM and macro placement, pin assignment, etc.

[Figure: block C floorplan with pins from_a, from_b, clk, and dout and macros RAM A0 and RAM A1]
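An early block utilization check of the kind listed above can be sketched as a simple area ratio. The block dimensions, macro area, standard cell area, and the 80% target below are made-up numbers for illustration; the course example does not give these values, and a real tool reports utilization from the actual library cell areas.

```python
def utilization(std_cell_area_mm2: float, macro_area_mm2: float,
                block_area_mm2: float) -> float:
    """Fraction of the block area occupied by standard cells and macros."""
    if block_area_mm2 <= 0:
        raise ValueError("block area must be positive")
    return (std_cell_area_mm2 + macro_area_mm2) / block_area_mm2


# Hypothetical numbers for block C (not from the course example).
BLOCK_WIDTH_MM, BLOCK_HEIGHT_MM = 4.0, 3.0
MACRO_AREA_MM2 = 4.0        # assumed total area of RAM A0 and RAM A1
STD_CELL_AREA_MM2 = 5.0     # assumed total standard cell area
TARGET_UTILIZATION = 0.80   # assumed upper limit for this block

block_area = BLOCK_WIDTH_MM * BLOCK_HEIGHT_MM
util = utilization(STD_CELL_AREA_MM2, MACRO_AREA_MM2, block_area)
fits = util <= TARGET_UTILIZATION

print(f"utilization = {util:.0%}, within target: {fits}")
```

If the ratio comes out above the target, the block is a candidate for resizing before placement is attempted.
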

What Is Placement?
‹ Definition: Process of placing the standard cells in a floorplanned design.

‹ Example: After the chip was floorplanned, we performed placement and discovered the floorplan was too small to fit all of the cells and macros in the design.

‹ Question: How can we avoid this problem?

[Figure: implementation flow diagram]

What Is Physical Synthesis?
‹ Definition: The combination of logical synthesis and placement.

‹ Example: To meet timing, we ran physical synthesis, which, in addition to upsizing and downsizing components, also performed logic restructuring.

[Figure: implementation flow diagram]

Placement and Physical Synthesis: Input and Output, Format


‹ Input
‰ Floorplanned design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Placement constraints and script in TCL

‹ Output
‰ Placed design in the Verilog language or other HDL + DEF

[Figure: floorplanned design (gates + DEF), SDC, TCL, logical library, and physical library feed Placement, which produces the placed design (gates + DEF)]


Example: Placement
‹ Top-Down Placement: If the design is small enough (<300K instances), we can run standard cell placement “top down” at the EX level and place everything at once.

‹ Bottom-Up Placement: Or we can place the standard cells for each of the blocks separately.

[Figure: top-down placement of EX (blocks A, B, and C) next to bottom-up placement of block C (RAM A0, RAM A1, and cells u10 through u17)]

What Is Scan Reorder?


‹ Definition: Process of reconnecting the scan chains in a design to optimize for routing, timing, etc.

‹ Example: Since logic synthesis arbitrarily connects the scan chain, we need to perform scan reorder after placement so that the scan chain routing will be optimal.

What is a scan chain?
A scan chain is the connection of the flip-flops in a design, such that test patterns can be scanned in and results scanned out during automated testing.

[Figure: implementation flow diagram]
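To make the scan chain definition concrete, here is a minimal behavioral sketch that models a three-flop chain as a shift register. It is a simplification: there is no capture cycle, no clocking detail, and the function names are invented for this illustration.

```python
def shift(chain, scan_in_bit):
    """Clock the chain once: return (new_chain, bit that appears on scan-out)."""
    scan_out_bit = chain[-1]
    return [scan_in_bit] + chain[:-1], scan_out_bit


def shift_pattern(chain, pattern):
    """Shift a whole pattern in, collecting the bits that leave through scan-out."""
    observed = []
    for bit in pattern:
        chain, out_bit = shift(chain, bit)
        observed.append(out_bit)
    return chain, observed


chain = [0, 0, 0]                                   # DFF1, DFF2, DFF3 start cleared
loaded, _ = shift_pattern(chain, [1, 1, 0])         # scan a test pattern in
_, observed = shift_pattern(loaded, [0, 0, 0])      # scan the contents back out

print("flop contents after scan-in:", loaded)
print("bits observed on scan-out:  ", observed)
```

The first bit scanned in ends up in the last flop of the chain, and the bits come back out in the same order they went in.
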


Scan Reorder: Input and Output, Format
‹ Input
‰ Placed design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Scan chain information in SCANDEF format

‹ Output
‰ Scan chain reordered design in the Verilog language or other HDL + DEF

[Figure: placed design (gates + DEF), SDC, SCANDEF, logical library, and physical library feed Scan Reorder, which produces the scan chain reordered design (gates + DEF)]

Example: Scan Reorder


Scan chains that were stitched in the logical netlist need to be reordered now that placement is done.
‹ Logical netlist was stitched numerically.
‹ Physical netlist is reordered based on placement.

[Figure: logical netlist (SI, DFF1, DFF2, DFF3, SO); physical netlist before reorder; physical netlist after reorder (SI, DFF1, DFF3, DFF2, SO)]

Example: Scan Reorder (continued)
The reordered scan chain requires far fewer routing resources in the example design.

[Figure: block C before and after scan reorder, showing dff1, dff2, dff3, RAM A0, and RAM A1]
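The effect of reordering can be sketched with a greedy nearest-neighbor pass over the flip-flop locations. This is only an illustrative heuristic, not the algorithm any particular tool uses, and the coordinates below are invented so that dff3 sits between dff1 and dff2 as in the example.

```python
def manhattan(p, q):
    """Rectilinear distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def chain_length(order, positions, scan_in, scan_out):
    """Total SI -> flops -> SO wire length for a given stitching order."""
    points = [scan_in] + [positions[name] for name in order] + [scan_out]
    return sum(manhattan(a, b) for a, b in zip(points, points[1:]))


def reorder_nearest_neighbor(positions, scan_in):
    """Greedy reorder: always hop to the closest flop not yet in the chain."""
    remaining = dict(positions)
    order, current = [], scan_in
    while remaining:
        name = min(remaining, key=lambda n: (manhattan(current, remaining[n]), n))
        order.append(name)
        current = remaining.pop(name)
    return order


# Hypothetical placement (arbitrary units), not taken from the course figure.
SI, SO = (0, 10), (40, 0)
POSITIONS = {"dff1": (10, 10), "dff2": (30, 0), "dff3": (20, 10)}

logical_order = ["dff1", "dff2", "dff3"]            # stitched numerically
physical_order = reorder_nearest_neighbor(POSITIONS, SI)

before = chain_length(logical_order, POSITIONS, SI, SO)
after = chain_length(physical_order, POSITIONS, SI, SO)
print(f"before reorder: {logical_order}, length {before}")
print(f"after reorder:  {physical_order}, length {after}")
```

With these assumed coordinates the numerically stitched chain doubles back on itself, while the reordered chain visits the flops in placement order and uses less wire.
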

What Is Design Optimization?


‹ Definition: Process of using automated algorithms to improve the quality of a digital design

‹ Example: After initial placement, we run a pass of pre-CTS design optimization to fix timing violations that may show up now that the design is placed and we have delays based on estimated interconnect.

[Figure: implementation flow diagram]

Pre-CTS Design Optimization: Input and Output, Format
‹ Input
‰ Scan chain reordered design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format (ideal clocks)
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Commands in TCL

‹ Output
‰ Optimized placed design in the Verilog language or other HDL + DEF

[Figure: scan chain reordered design (gates + DEF), SDC, TCL, logical library, and physical library feed Design Optimization Pre-CTS, which produces the optimized placed design (gates + DEF)]

Example: Pre-CTS Design Optimization


Because logical synthesis uses wire load models (estimates of net delay), the design choices it makes can sometimes lead to sub-optimal results in placement.

Pre-CTS design optimization can clean up some of these issues by
‹ Upsizing or downsizing cells
‹ Buffering nets
‹ Re-synthesizing paths to improve timing, etc.

[Figure: block C placement with RAM A0, RAM A1, and cells u10 through u17; u11 and u16 are upsized]


Example: Pre-CTS Design Optimization (continued)
Cell u11 was driving several cells, and one of them, u20, was far away. In order to drive the long net and meet timing, u11 was upsized. Cell u16 was upsized for the same reason.

[Figure: block C placement showing u20 placed far from u11; u11 and u16 are upsized]
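Why upsizing helps on a long net can be illustrated with a first-order lumped RC estimate, where the 50% delay of a driver charging a capacitive load is roughly 0.69 × R × C. This is a textbook approximation, not what a delay calculator actually computes, and the drive resistances and capacitances below are assumed values chosen only to show the trend.

```python
def rc_delay_ns(drive_res_kohm: float, wire_cap_pf: float, pin_cap_pf: float) -> float:
    """Lumped RC estimate of 50% delay: 0.69 * R * C (kOhm * pF gives ns)."""
    return 0.69 * drive_res_kohm * (wire_cap_pf + pin_cap_pf)


# Hypothetical values for the long net driven by u11 (not from the course example).
WIRE_CAP_PF = 0.20                          # assumed capacitance of the long wire
PIN_CAP_PF = 0.05                           # assumed total input pin capacitance
DRIVE_RES_KOHM = {"X1": 4.0, "X4": 1.0}     # assumed driver output resistances

delays = {
    size: rc_delay_ns(res, WIRE_CAP_PF, PIN_CAP_PF)
    for size, res in DRIVE_RES_KOHM.items()
}
for size, delay in delays.items():
    print(f"{size} driver: {delay:.4f} ns")
```

In this model the larger driver has lower output resistance and so charges the same net faster. The sketch ignores the cost side: a larger cell presents more input capacitance to whatever drives it and consumes more area and power.
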

What Is Clock Tree Synthesis?


‹ Definition: Process of inserting buffers in the clock path, with the goal of minimizing clock skew and latency to optimize timing

‹ Example: We ran clock tree synthesis on the example block and saw a large clock skew due to bad clock constraints. We ended up re-running clock tree synthesis with better constraints to get an optimal result.

[Figure: implementation flow diagram]

Clock Tree Synthesis: Input and Output, Format
‹ Input
‰ Optimized design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Clock constraints and commands in TCL

‹ Output
‰ Post-CTS design with clock trees inserted in the Verilog language or other HDL + DEF

[Figure: optimized placed design (gates + DEF), SDC, TCL, logical library, and physical library feed CTS, which produces the placed design with clock trees inserted (gates + DEF)]

Example: Clock Tree Synthesis


Up to now, the clocks in the design have been treated as ideal (no clock skew, no clock latency, ideal transition time, etc.). In CTS, we add buffers for the real clock tree in order to minimize
‹ Clock skew in the design
‹ Clock latency in the design

[Figure: block C placement with clock buffers c0, c1, c2, and c3 added]
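The two quantities CTS tries to minimize can be computed from the clock arrival time at each flip-flop. The sketch below uses the common definitions of worst-case latency (largest source-to-sink delay) and global skew (largest minus smallest arrival); the arrival times are invented for illustration and do not come from the example design.

```python
def clock_latency_ps(arrival_ps: dict) -> int:
    """Worst-case latency: the largest clock arrival time over all sinks."""
    return max(arrival_ps.values())


def clock_skew_ps(arrival_ps: dict) -> int:
    """Global skew: spread between the latest and earliest clock arrival."""
    return max(arrival_ps.values()) - min(arrival_ps.values())


# Hypothetical clock arrival times at each flip-flop clock pin, in picoseconds.
ARRIVAL_PS = {"dff1": 400, "dff2": 550, "dff3": 450}

latency = clock_latency_ps(ARRIVAL_PS)
skew = clock_skew_ps(ARRIVAL_PS)
print(f"latency = {latency} ps, skew = {skew} ps")
```

With ideal clocks both values are zero, which is why timing results shift once the real clock tree is in place.
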


Example: Clock Tree Synthesis (continued)
Buffers are added to the clock tree in our example design.

[Figure: netlist before CTS; netlist after CTS with clock buffers c0, c1, c2, and c3 driving DFF1 and DFF2; placement after CTS]

Example: Design Optimization, Post-CTS


Another round of design optimization takes place, since it is possible that CTS could have disturbed the timing of some of the paths in the design.

Post-CTS optimization can include
‹ Buffering
‹ Upsizing or downsizing cells
‹ Modifications to the clock tree itself

[Figure: block C placement with cells u10 through u17 and clock buffers c0, c1, c2, and c3]


What Is Route?
‹ Definition: Process of connecting the pins of the standard cells, macros, and I/Os of a digital design to specific metal layers in the process technology to match the schematic

‹ Example: We ran a preliminary route on the example block and saw that routing congestion was an issue. To fix it, we re-ran placement with a placement density screen to force a lower utilization in that area and allow for more routing resources.

[Figure: implementation flow diagram]

Route: Input and Output, Format


‹ Input
‰ Placed design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Route constraints and commands in TCL
‹ Output
‰ Routed design in the Verilog language or other HDL + DEF
[Diagram: the placed design with clocks inserted (Gates + DEF), SDC, TCL, logical library, and physical library feed Route, which outputs the routed design (Gates + DEF).]

6/16/08 BD03: Digital Physical Design 60


Example: Route
When the design has been fully placed with all of the clock tree buffers, it is time to
perform routing. Routing connects all of the I/Os, standard cells, RAMs, and macros
to their specific routing layers according to the synthesized netlist.
The router will try to minimize
‹ Route congestion
‹ Timing impact on critical paths
[Figure: routed example block with RAM A1, RAM A0, cells u10 to u17, and clock buffers c0 to c3. Caption: c0, c1, c2, c3 clock buffers are routed.]

6/16/08 BD03: Digital Physical Design 61

Example: Design Optimization, Post-Route


Another round of design optimization takes place, since the timing is more realistic
now that there are actual wires and not just estimates of wires.
Post-route optimization can include
‹ Buffering
‹ Upsizing or downsizing cells
‹ More advanced and aggressive modifications
[Figure: routed example block with RAM A1, RAM A0, cells u10 to u17, and clock buffers c0 to c3. Caption: c0, c1, c2, c3 clock buffers are routed; u10 and u14 are upsized.]

6/16/08 BD03: Digital Physical Design 62


Discussion Questions
‹ What iterations take place when a design goes from logic synthesis
through floorplanning, placement, CTS, and route?
‹ What precautions would you take if you were to take your design from
RTL to placement?

6/16/08 BD03: Digital Physical Design 63

What Is Extraction?
‹ Definition: Process of calculating the parasitic resistance and capacitance
of the interconnect of the physical design
‹ Example: Extraction can be performed at various parts of the design with varying
accuracy. The most accurate results are achieved when extraction is performed on a
fully routed design, because all of the nets are of known metal type and length. There
are no estimates for nets at this point.
[Figure: design flow diagram, from Specification through Mask Prep.]

6/16/08 BD03: Digital Physical Design 64


Extraction: Input and Output, Format
‹ Input
‰ Routed design in the Verilog language or other HDL + DEF or GDSII
‰ LVS verified netlist
‰ Physical Libraries in LEF format
‰ Extraction constraints and commands in TCL
‹ Output
‰ Standard Parasitic Exchange Format (SPEF) file containing all of the RC
information for the routed nets in the design
[Diagram: the routed design (DEF or GDSII), TCL, and the physical library feed Extraction, which outputs the SPEF parasitic file.]

6/16/08 BD03: Digital Physical Design 65

Example: Extraction
When the design has been routed, we can perform a detailed extraction of the
resistance and capacitance of the routed nets in the design.
This RC data will give us a more accurate report of the timing and power of the
design.
[Figure: routed example block annotated "Resistance and capacitance for each net is 'extracted' and saved in a SPEF file." Caption: c0, c1, c2, c3 clock buffers are routed.]
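Once the metal layer and length of a net are known, a first-order estimate of its resistance and capacitance follows from per-layer constants. The sketch below is a simplified illustration, not how a production extractor works (it ignores coupling to neighbors, vias, and layer-stack effects); the function name and all numeric values are assumptions.

```python
def wire_rc(length_um, width_um, sheet_res_ohm_per_sq, cap_ff_per_um):
    """First-order lumped R (ohm) and C (fF) of one straight wire segment."""
    squares = length_um / width_um                   # number of squares along the wire
    resistance_ohm = sheet_res_ohm_per_sq * squares  # R = Rsheet * (L / W)
    capacitance_ff = cap_ff_per_um * length_um       # C = (C per unit length) * L
    return resistance_ohm, capacitance_ff


# Hypothetical layer constants for a 100 um long, 0.5 um wide wire.
r_ohm, c_ff = wire_rc(length_um=100.0, width_um=0.5,
                      sheet_res_ohm_per_sq=0.25, cap_ff_per_um=0.25)
print(r_ohm, c_ff)  # 50.0 25.0
```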

6/16/08 BD03: Digital Physical Design 66


What Is Delay Calculation?
‹ Definition: Process of computing the delay of interconnect and standard
cells in a digital design
‹ Example: In the example design, delay calculation was performed after CTS and also
after final route. Using the delay information, we were able to find several timing
violations in the design.
[Figure: design flow diagram, from Specification through Mask Prep.]

6/16/08 BD03: Digital Physical Design 67

Delay Calculation: Input and Output, Format


‹ Input
‰ Routed design in the Verilog language or other HDL + DEF
‰ Parasitic extraction file (SPEF)
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Constraints and commands in TCL
‹ Output
‰ Standard Delay Format (SDF) file containing all of the delay
information in the design
[Diagram: the routed design (Gates + DEF), TCL, SPEF, logical library, and physical library feed Delay Calculation, which outputs the SDF delay file.]

6/16/08 BD03: Digital Physical Design 68


Example: Delay Calculation
We can perform delay calculation for all the cells and nets in the design and
generate an SDF file. This can be done in a
‹ Separate delay calculator
‹ STA tool
The reason for generating an SDF file is to have consistency for all timing
calculations throughout the flow. Once it is generated, all tools can access the
same SDF file.
[Figure: routed example block annotated "Delay for each cell and net in the design is calculated." Caption: c0, c1, c2, c3 clock buffers are routed.]
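To give a feel for how extracted RC data turns into a net delay, the sketch below uses the Elmore delay, a common first-order model for an RC ladder. This is an illustrative assumption: the handout does not say which delay model a given delay calculator uses, and the segment values are made up.

```python
def elmore_delay(segments):
    """Elmore delay at the far end of an RC ladder.

    segments: list of (resistance, capacitance) pairs ordered from driver to sink.
    With resistance in kilo-ohms and capacitance in pF, the result is in ns.
    """
    delay = 0.0
    upstream_r = 0.0
    for r, c in segments:
        upstream_r += r          # total resistance between the driver and this node
        delay += upstream_r * c  # each capacitor is charged through all upstream R
    return delay


# Hypothetical two-segment net: (1 kOhm, 0.5 pF) followed by (2 kOhm, 0.25 pF).
print(elmore_delay([(1.0, 0.5), (2.0, 0.25)]))  # 1.25
```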

6/16/08 BD03: Digital Physical Design 69

What Is Signal Integrity?


‹ Definition: Unintended effects on digital signals caused by interconnect
parasitic resistance or capacitance that causes noise and/or changes delays
‹ Example: In our example design, we saw signal integrity (SI) effects such as
noise-on-delay and glitches, due to long nets that were running in parallel.
[Figure: design flow diagram, from Specification through Mask Prep.]
What is noise-on-delay?
Crosstalk-induced delay, or incremental delay due to coupling capacitance.
What is a glitch?
A glitch is a bump or change in value caused by a changing signal affecting a
neighboring wire.

6/16/08 BD03: Digital Physical Design 70


Signal Integrity: Input and Output, Format
‹ Input
‰ Routed Design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Constraints and commands in TCL
‰ Parasitic extraction file (SPEF)
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Physical Libraries in LEF format
‰ Power rail IR-drop data
‰ Tool-specific SI libraries
‹ Output
‰ Incremental Standard Delay Format (SDF) file containing all of the delay
information in the design related to noise-on-delay
‰ Reports for glitch nets
‰ List of problem nets that need to be re-routed
[Diagram: SPEF, the routed design (Gates + DEF), SDC, TCL, logical library, physical library, and tool-specific library feed Signal Integrity, which outputs the incremental SDF delay file.]

6/16/08 BD03: Digital Physical Design 71

Example: Signal Integrity


We can run checks and produce data or reports to help us identify timing and reliability issues
due to SI. For submicron designs, closely coupled nets can produce
‹ Crosstalk-induced delay
‹ Noise
Power rail IR drop can cause
‹ Weakened drivers
‹ Increased delays
‹ Lower noise margins
[Figure: routed example block annotated "Incremental delay due to coupling capacitance is stored in an SDF file." Caption: c0, c1, c2, c3 clock buffers are routed.]
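One common textbook way to reason about crosstalk-induced delay is to scale the coupling capacitance by a Miller factor that depends on how the neighboring (aggressor) net switches. The sketch below uses that simplification with a single lumped RC; it is an assumption for illustration, not the method of any particular SI tool, and the values are hypothetical.

```python
# Miller factor applied to the coupling capacitance, by aggressor activity.
MILLER_FACTOR = {
    "same": 0.0,      # aggressor switches in the same direction as the victim
    "quiet": 1.0,     # aggressor holds its value
    "opposite": 2.0,  # aggressor switches in the opposite direction
}


def incremental_delay_ns(r_kohm, c_ground_pf, c_couple_pf, aggressor):
    """Change in lumped RC delay relative to a quiet aggressor (kOhm * pF = ns)."""
    nominal = r_kohm * (c_ground_pf + c_couple_pf)
    actual = r_kohm * (c_ground_pf + MILLER_FACTOR[aggressor] * c_couple_pf)
    return actual - nominal


# Hypothetical victim net: 2 kOhm driver, 0.5 pF to ground, 0.25 pF coupling.
for activity in ("same", "quiet", "opposite"):
    print(activity, incremental_delay_ns(2.0, 0.5, 0.25, activity))
# same -0.5
# quiet 0.0
# opposite 0.5
```

A positive result slows the victim (a setup concern) and a negative result speeds it up (a hold concern).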


6/16/08 BD03: Digital Physical Design 72
What Is Static Timing Analysis?
STA is the preferred method for timing signoff since the majority of ASIC
vendors and foundries have adopted it.
‹ Definition: Process of computing the timing of logically related paths for a
digital design without regard to large scale functional behavior
‹ Example: To determine the timing of the design, we ran static timing analysis after
detail route, and saw several paths violating their setup time requirements.
[Figure: design flow diagram, from Specification through Mask Prep.]

6/16/08 BD03: Digital Physical Design 73

Static Timing Analysis: Input and Output, Format


‹ Input
‰ Routed Design in the Verilog language or other HDL (Note: STA can be run
on a design at any stage of the back-end flow)
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical Timing Libraries in Liberty (.lib) format
‰ Constraints and commands in TCL
‰ SPEF, SDF, and incremental SDF
‹ Output
‰ Timing reports, including noise-on-delay effects
[Diagram: SPEF, SDF, incremental SDF, the routed design (Gates), SDC, TCL, and the logical library feed Static Timing Analysis, which outputs reports.]

6/16/08 BD03: Digital Physical Design 74


Example: Static Timing Analysis
At the end of the physical implementation phase, we will need to run signoff STA to
make sure that all of the paths in our design meet timing.

STA can be used
‹ During the implementation phase to check on timing, etc.
‹ For signoff just before tapeout to ensure all paths meet timing
[Figure: routed example block annotated "Full chip timing can now be run with routing and SI effects included." Caption: c0, c1, c2, c3 clock buffers are routed.]
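The core check STA makes on each register-to-register path is a setup slack calculation: the data must arrive before the capturing clock edge minus the setup time. The sketch below shows that arithmetic for a single path; the parameter names and all numbers (in ps) are assumptions for illustration.

```python
def setup_slack_ps(period, launch_latency, capture_latency,
                   clk_to_q, data_delay, setup):
    """Setup slack of one flip-flop-to-flip-flop path; negative means a violation."""
    arrival = launch_latency + clk_to_q + data_delay  # when data reaches the capture pin
    required = period + capture_latency - setup       # latest allowed arrival time
    return required - arrival


# Hypothetical path with a 2000 ps clock period.
met = setup_slack_ps(period=2000, launch_latency=300, capture_latency=350,
                     clk_to_q=150, data_delay=1700, setup=100)
violated = setup_slack_ps(period=2000, launch_latency=300, capture_latency=350,
                          clk_to_q=150, data_delay=1900, setup=100)
print(met, violated)  # 100 -100
```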

6/16/08 BD03: Digital Physical Design 75

What Is Design/Physical Verification?


Definition: Layout versus schematic (LVS), design rule check (DRC), and power (IR
drop and EM) checks are signoff checks run to ensure the integrity, functionality,
and manufacturability of the chip.
‹ LVS is a comparison of the transistor-level SPICE netlist vs. GDSII to
ensure the connectivity of the design.
‹ DRC is a detailed check of the physical design against the process
technology rules.
‹ IR drop is a detailed check of the chip’s power plan to ensure that the
supply voltages do not drop below accepted levels.
‹ EM is a detailed check to ensure that the current density in all parts of the
design does not exceed accepted levels.
[Figure: design flow diagram, from Specification through Mask Prep.]
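The IR drop and EM checks can be illustrated on a single power rail fed from one end, with cells tapping current along it. The sketch below is a toy model under stated assumptions: real power grid analysis solves a full resistive mesh, and the function names, the per-width EM limit, and all values are hypothetical.

```python
def rail_voltages_mv(vdd_mv, segment_res_ohm, tap_current_ma):
    """Voltage (mV) at each tap of a rail fed from one end (ohm * mA = mV)."""
    voltages = []
    v = vdd_mv
    for i, r in enumerate(segment_res_ohm):
        downstream_ma = sum(tap_current_ma[i:])  # current flowing through segment i
        v -= r * downstream_ma                   # IR drop across segment i
        voltages.append(v)
    return voltages


def em_violations(segment_current_ma, width_um, limit_ma_per_um):
    """Indices of segments whose current per unit width exceeds the limit."""
    return [i for i, cur in enumerate(segment_current_ma)
            if cur / width_um > limit_ma_per_um]


# Hypothetical rail: 1200 mV supply, two 1-ohm segments, taps drawing 10 and 20 mA.
print(rail_voltages_mv(1200, [1, 1], [10, 20]))  # [1170, 1150]
# The first segment carries 30 mA and the second 20 mA on a 10 um wide rail.
print(em_violations([30, 20], width_um=10, limit_ma_per_um=2.5))  # [0]
```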
6/16/08 BD03: Digital Physical Design 76
What Is Mask Prep?
Process of creating the mask set from the GDSII database to allow chip
manufacturing
[Figure: design flow diagram, from Specification through Mask Prep.]

6/16/08 BD03: Digital Physical Design 77

DRC: Input and Output, Format


‹ Input
‰ GDSII
‰ Rule deck
‹ Output
‰ DRC reports
[Diagram: GDSII and the rule deck feed DRC, which outputs reports.]

6/16/08 BD03: Digital Physical Design 78
LVS: Input and Output, Format
‹ Input
‰ Gate Level Netlist in the Verilog language
‰ GDSII
‰ Rule deck
‰ SPICE libraries
‹ Output
‰ LVS reports
[Diagram: the gate-level netlist, GDSII, rule deck, and SPICE libraries feed LVS, which outputs reports.]

6/16/08 BD03: Digital Physical Design 79

Power Grid Analysis, IR Drop, and EM: Input and Output, Format
‹ Input
‰ Gate Level Netlist in the Verilog language + DEF
‰ Power characterized libraries in tool-specific format
‰ Timing libraries in Liberty (.lib) format
‰ Timing constraints in SDC format
‰ Extraction data in SPEF format
‰ Timing windows file (TWF)
‰ Value-change-dump file (optional)
‹ Output
‰ IR drop reports
‰ EM reports
[Diagram: VCD, the netlist (Gates + DEF), SDC, TWF, SPEF, logical libraries, and power libraries feed Power Grid Analysis, which outputs reports.]

6/16/08 BD03: Digital Physical Design 80


Mask Prep: Input and Output, Format
‹ Input
‰ GDSII
‰ Technology Specific Files
‹ Output
‰ Optimized GDSII
[Diagram: GDSII and the technology files feed Mask Prep, which outputs the optimized GDSII.]

6/16/08 BD03: Digital Physical Design 81

Example: Physical Verification and Mask Prep


Physical verification involves power, LVS, and DRC checks to ensure the integrity of
the design.
When the design passes all of the PV checks, a GDSII is produced and mask prep
can begin. Mask prep involves complex processes such as modifications for
lithography (the process of creating the masks used to create the layers of an
integrated circuit).
[Figure: routed example block annotated "Make sure power, LVS, DRC checks pass" and "Perform mask prep." Caption: c0, c1, c2, c3 clock buffers are routed.]

6/16/08 BD03: Digital Physical Design 82


Discussion Questions
‹ What are the main process steps in the physical design of a chip?
‹ Which process steps can be done at multiple stages of the flow?
‹ If you were to lead the design of a chip, how would you organize your
resources to handle the various tasks?

6/16/08 BD03: Digital Physical Design 83

Summary
We have introduced all of the steps in the physical implementation flow:
‹ Specification, microarchitecture, RTL, logic synthesis
‹ Floorplanning, placement, clock tree synthesis, route
‹ Extraction, delay calculation, static timing analysis, signal integrity
‹ Design optimization, physical synthesis
‹ Design verification, mask prep
Each step in the process, in and of itself, is very detailed, so we will spend
the rest of the course learning more about each step.

6/16/08 BD03: Digital Physical Design 84
Testing Your Understanding
True or false
1. In creating a floorplan, we can gather information to see if our design
is routable.
2. If a design does not meet timing after synthesis, it is possible that it
can meet timing during placement.
3. When routing a design, it is best to avoid having long parallel routes.
4. Accurate SDC constraints are important to meet timing during static
timing analysis.
5. Errors in physical verification are simple to fix.

6/16/08 BD03: Digital Physical Design 85

Learning Activity
In this activity, you will
‹ Complete a flowchart of the digital design implementation flow
‹ Include the design flow steps
‹ Include the necessary inputs and outputs
‹ Fill in the missing or wrong sections of the flowchart
10 minutes for debriefing
[Figure: partially completed flowchart from RTL to GDSII, with design flow steps marked "?" to be filled in.]

6/16/08 BD03: Digital Physical Design 86
Terms and Definitions
Floorplanning: Process of deriving the die size, allocating space for soft blocks, planning power, and macro placement
Placement: Process of placing the standard cells in a floorplanned design
Clock Tree Synthesis: Process of inserting buffers in the clock tree of a digital design
Route: Process of connecting the pins of the standard cells, macros, and I/Os of a digital design to specific metal layers in the process technology to match the schematic
Extraction: Process of calculating the parasitic resistance and capacitance of the interconnect of the physical design
Delay Calculation: Process of computing the delay of interconnect and standard cells in a digital design
Static Timing Analysis: Process of computing the timing of logically related paths for a digital design without regard to large scale functional behavior
Signal Integrity: Unintended effects on digital signals caused by interconnect parasitic resistance or capacitance that causes noise and/or changes delays
Design Optimization: Process of using automated algorithms to improve the quality of a digital design
Physical Synthesis: Process of combining logic synthesis and placement to improve the accuracy of the physical implementation of a digital design
Design Verification: Process of physically verifying the design rules and backend checks of a design
Mask Prep: Process of creating the mask set from the GDSII database to allow chip manufacturing

6/16/08 BD03: Digital Physical Design 87

Terms and Definitions (continued)


LEF: Library Exchange Format, physical library (metal and via routing rules)
DEF: Design Exchange Format, physical (floorplanning, placement, routing) and logical (connectivity) representation
Liberty: Format for logical libraries, includes timing, area, and power information
SDC: Synopsys Design Constraints, includes clocks and timing constraints
Clock Skew: Delay difference between clock paths in a design
Clock Latency: Delay from clock source to destination in a design
SPEF: Standard Parasitic Exchange Format, standard format for representing capacitance and resistance for each net
SDF: Standard Delay Format, standard format for representing interconnect and cell delays
LVS: Layout vs. schematic, connectivity checking
DRC: Design Rule Check, physical rule checking
IR Drop: Voltage drop, measure of power plan integrity
EM: Electromigration, term used to describe failures in wires due to high current
TWF: Timing Windows File, file used in signal integrity analysis to determine the overlap of signals
VCD: Value Change Dump, file used to provide toggle information to power analysis
GDSII: Graphic Data System, standard format for IC layout data exchange
Rule Deck: Technology-specific information used by physical verification
SPICE Deck: Format to represent circuits, cells, and macros in detail

6/16/08 BD03: Digital Physical Design 88


Introduction and Overview of Layout
Technology
Module 2

June 16, 2008

Urban Planning
When civil engineers plan the layout of an urban settlement, they need to consider
‹ Total population and population density of the settlement.
‹ Locations of parks, apartments, shopping centers, etc.
‹ Spacing of each building, tree, and street.

6/16/08 BD03: Digital Physical Design 90
Integrated Circuit Layout
In a similar fashion, but on a micron scale, layout engineers must also decide where
to place parts of the circuit under design and follow spacing rules.

6/16/08 BD03: Digital Physical Design 91

Module Objectives
After completing this module, you will be able to
‹ Describe the fabrication of field effect transistors (FETs) and layout
technologies, and relate layout data to library exchange format (LEF) syntax
‹ Read a design rule manual (DRM) and interpret design rule check
(DRC) and layout versus schematic (LVS) errors
‹ Describe how cell libraries are used

6/16/08 BD03: Digital Physical Design 92
Topics in This Module
‹ Introduction to layout describing layers, FETs, and logic gate layouts
‹ DRC and reading a DRM
‹ Layout versus schematic checking
‹ LEF library format and syntax
‹ Review

6/16/08 BD03: Digital Physical Design 93

Introduction to Layout
‹ After a design has been synthesized, it is time to start laying out the
design.
‹ Usually, place and route tools more or less automate the process
using ready-made transistor and gate libraries provided by the
foundry.
‹ Sometimes, when performance and design density are of primary
importance, the designer must lay out the design manually.
‹ This approach, called “custom design,” leads to high production costs
and a long time to market.
‹ There are usually three reasons to justify a custom design:
‰ The block can be re-used many times, such as a library cell.
‰ The product can be sold in a large volume, such as microprocessors.
‰ Cost is not the primary concern, such as chips used in space.

6/16/08 BD03: Digital Physical Design 94


Laying Out a Transistor
‹ Since the transistor is the basic building block of circuits, we must
understand how a transistor is laid out on a chip.
‹ Below are the symbol schematic and the 3D diagram for an n-channel
MOS transistor (NMOS).
‹ Recall that the NMOS has four terminals:
‰ The gate is made out of polysilicon.
‰ The drain and source are made out of heavily doped n+ diffusion layers
(also called active area).
‰ The bulk is made out of p-type substrate.
[Figure: symbol schematic (Gate, Drain, Source, Bulk) and 3D diagram (poly gate, n+ drain, n+ source, p-substrate bulk) of an NMOS transistor.]

6/16/08 BD03: Digital Physical Design 95

Laying Out a Transistor (continued)


‹ In a layout tool, you will simply draw the top-down view of the 3D diagram you
saw in the previous slide with a few additions.
‰ Due to lithographic error margins, the polysilicon gate must be extended over the
diffusion layer according to design rules (discussed later).
‰ Metal contacts must be added for the drain and source diffusion layers.
[Figure: top-down layout of an n-channel transistor showing the gate, source, drain, and Metal 1 contacts.]

6/16/08 BD03: Digital Physical Design 96


Laying Out a Transistor (continued)
‹ A p-channel transistor is similar to the n-channel transistor except for a few
differences:
‰ The drain and source of a p-channel MOS (PMOS) are made out of p+ diffusion
layer, and the substrate is composed of n-type material.
‰ For consistency in this module, the wafer is created with a p-type process. This
means the default substrate is p-type material. Therefore, since the PMOS requires
n-type material, a well composed of n-type material must be built around the
diffusion layer.
‰ In layout tools, this well is also called the select region.
[Figure: layout of a p-channel transistor showing the gate, source, and drain inside an n-well.]

6/16/08 BD03: Digital Physical Design 97

Laying Out an Inverter


‹ Now that we have seen how a transistor is laid out, it is time to lay out
the simplest logic gate: an inverter.
‹ Features to consider:
‰ The inverter needs power and ground metal strips (rails). Recall that the
source of the PMOS is connected to the power rail, whereas the source of
the NMOS is connected to the ground rail.
‰ The poly can be extended to connect both gates together.
‰ Recall that for an inverter with equal drive strength for rising and falling
transitions, the PMOS is roughly twice the width of the NMOS.
‰ Since the substrate needs to be biased, we will also need to add substrate
contacts to both transistors (n-tap for n-type substrate contact, p-tap for
p-type substrate contact).
‰ In digital circuits, the substrate of a transistor is usually biased to the same
voltage as the source of the transistor to avoid body effect.

6/16/08 BD03: Digital Physical Design 98


Laying Out an Inverter (continued)
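[Figure: inverter schematic and layout. The PMOS sits in an n-well with an n-tap next to the VDD rail, the NMOS has a p-tap next to the Ground rail, and both are labeled with source, gate, and drain, plus In and Out.]
Note: In digital circuit schematics, the bulk node is usually not drawn because it
is assumed that the bulk is connected to the source.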
6/16/08 BD03: Digital Physical Design 99

Stick Diagram
‹ Just as a writer drafts an essay before polishing it, layout engineers
can plan their layout on paper before diving into the tools.
‹ A commonly preferred method for scratch work is a stick diagram.
‹ A stick diagram is a way to visualize a layout without drawing the actual
dimensions.
‹ Each object (that is, poly, metal strip, diffusion) is represented by a
dimensionless “stick.”
‹ Below is a stick diagram for the inverter we just drew. Can you identify what
each stick represents?
[Figure: stick diagram of an inverter.]

6/16/08 BD03: Digital Physical Design 100
NAND Gate Layout
‹ Let’s draw a stick diagram for a slightly more complicated, two-input NAND
gate.
‹ Looking at the schematic, we see that there are two PMOS in parallel and two
NMOS in series.
‹ Because of this, we do not need to draw separate diffusion layers for two
devices that share sources and drains.
[Figure: schematic and stick diagram of a two-input NAND gate with inputs A and B.]
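The parallel PMOS and series NMOS arrangement can be checked with a simple switch-level model, treating each transistor as an on/off switch. The sketch below is only an illustration of the logic; the function name is an assumption.

```python
from itertools import product


def nand2(a, b):
    """Switch-level model of a two-input CMOS NAND gate (inputs are 0 or 1)."""
    pull_up = (a == 0) or (b == 0)     # two PMOS in parallel: either low input ties Out to VDD
    pull_down = (a == 1) and (b == 1)  # two NMOS in series: both inputs high tie Out to ground
    assert pull_up != pull_down        # exactly one network conducts for any input
    return 1 if pull_up else 0


for a, b in product((0, 1), repeat=2):
    print(a, b, nand2(a, b))
# 0 0 1
# 0 1 1
# 1 0 1
# 1 1 0
```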

6/16/08 BD03: Digital Physical Design 101

NAND Gate Layout (continued)


‹ Now let’s translate that stick diagram into an actual layout with dimensions.
‹ We will need to add the n-well, contacts, and substrate taps.
[Figure: layout of the two-input NAND gate with inputs A and B, an n-tap at the VDD rail, and a p-tap at the Ground rail.]

6/16/08 BD03: Digital Physical Design 102


General Layout Tips
Here are some fundamental guidelines to follow when laying out a design:
‹ Always try to create a continuous diffusion layer or well.
‹ If you must separate the wells (select areas), remember to place
substrate taps in each separate well.
‹ All poly strips should run in one direction (usually vertical).
‹ Keep metal jogs to a minimum, and use absolutely no diagonal wires,
since diagonal wires can cause problems with design rules
downstream.
‹ Power and ground rails should be extra wide to allow a large amount
of current to flow into your device.
‹ Placing additional contacts never hurts. They will give you more options
in terms of where to place the metal wire during routing.
‹ Plan your layout on paper first; the stick diagram is your friend!

6/16/08 BD03: Digital Physical Design 103

Technology Layers
‹ In most layout tools today, you will have access to a layers palette
like the one shown on the right from Cadence Virtuoso Layout Editor.
‹ We have already encountered some of the layers on the previous
slides.
‹ The table on the next slide will summarize commonly used layers.
[Figure: layers palette from Cadence Virtuoso Layout Editor.]
6/16/08 BD03: Digital Physical Design 104
Technology Layers (continued)

Layer Name as Displayed in Palette | Description
Metal(1-8) | Metal layer used to connect together pins on a layout
Poly | Polysilicon material used for transistor gates
nactive | n+ diffusion layer for NMOS
pactive | p+ diffusion layer for PMOS
nselect | n well for PMOS
pselect | In a p-process, this represents an abstract boundary.
cc | Contact cut. This layer, in conjunction with metal layers, is used to create vias and contacts.

6/16/08 BD03: Digital Physical Design 105

Class Exercise
Draw the stick diagram and layout for a two-input NOR gate.

[Figure: two-input NOR gate with inputs A and B.]

6/16/08 BD03: Digital Physical Design 106
Topics in This Module
‹ Introduction to layout describing layers, FETs, and logic gate layouts
‹ DRC and reading a DRM
‹ Layout versus schematic checking
‹ LEF library format and syntax
‹ Review

6/16/08 BD03: Digital Physical Design 107

Design Rules
‹ Today’s semiconductor manufacturing processes are extremely
complex. It is simply not possible to expect every layout engineer to
understand the intricacies of the fabrication process.
‹ Layout engineers want tighter, smaller designs.
‹ Process engineers want a reproducible and high-yield process.
‹ Design rules act as an interface and a compromise between layout
and process engineers.
‹ By understanding design rules, layout engineers can make their
design as compact as possible while ensuring that their design will
have a high yield.

6/16/08 BD03: Digital Physical Design 108
Design Rule Manual
‹ Usually, the layout engineer will have access to a document called the
design rule manual (DRM), which explains all the design rules that
need to be followed.
‹ The information is also annotated into layout tools that automatically
check the design for violations as the design is being laid out.
‹ We will take a brief look at a sample DRM.
‹ To make the rules more readable, the rules in this manual are divided
into sections based on the different layers.

6/16/08 BD03: Digital Physical Design 109

Design Rule Manual: N-Well Rules


N-Well Rules
Rule | Rule Description | Drawn
1A | Minimum width | 2.2 μm
1B | Minimum spacing in x and y, both N-wells biased at the same potential | 1.6 μm
1C | Minimum spacing, either or both N-wells not biased or biased to different potentials | 3.0 μm
1D | Minimum enclosure of p-active region | 1.5 μm
1E | Minimum spacing in x and y to an external n-active region | 2.0 μm
1F | Minimum spacing in x and y to an external p-active region | 1.5 μm
[Figure: N-well, P-active, and N-active regions annotated with rules 1A to 1F.]
Note: These dimensions are not drawn to scale.

6/16/08 BD03: Digital Physical Design 110


Design Rule Manual: N/P-Active Rules
N-Active Rules
Rule | Rule Description | Drawn
2A | Minimum width | 0.6 μm
2B | Minimum spacing over field | 0.8 μm
2C | Minimum spacing to p-active | 1.0 μm

P-Active Rules
Rule | Rule Description | Drawn
3A | Minimum width | 0.6 μm
3B | Minimum spacing over field | 0.8 μm
3C | Minimum spacing to n-active | 1.0 μm
[Figure: N-active and P-active regions annotated with rules 2A, 2B, and 2C.]
Note: These dimensions are not drawn to scale.

The rules for 3A, 3B, 3C are graphically identical to 2A, 2B, 2C except they are for p-active regions.
Here the numbers for n-active and p-active are the same; that might not be so for every technology.

6/16/08 BD03: Digital Physical Design 111

Design Rule Manual: Contact1 Rules

Contact 1 (Metal 1 Contacts) Rules
Rule | Rule Description | Drawn
4A | Required size (square) | 0.8 μm²
4B | Minimum spacing | 0.6 μm
4C | Minimum poly contact 1 spacing to any active region | 0.4 μm
4D | Minimum active region contact 1 to poly | 0.6 μm
4E | Minimum enclosure by any active region | 0.2 μm
4F | Minimum enclosure by poly | 0.2 μm
[Figure: contacts on poly and active regions annotated with rules 4A to 4F.]
Note: These dimensions are not drawn to scale.

The rules for contacts belonging to other layers are similar; just the numbers are different.

6/16/08 BD03: Digital Physical Design 112


Design Rule Manual: Metal1 Rules
Metal 1 Rules
Rule | Rule Description | Drawn
5A | Minimum width | 0.6 μm
5B | Minimum spacing | 0.8 μm
5C | Minimum overlap of contact 1 | 0.2 μm
[Figure: Metal 1 shapes annotated with rules 5A, 5B, and 5C.]
Note: These dimensions are not drawn to scale.
The rules for contacts belonging to other layers are similar; just the numbers are different.

6/16/08 BD03: Digital Physical Design 113
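To see how width and spacing rules such as 5A and 5B translate into an automated check, the sketch below tests axis-aligned Metal 1 rectangles against the sample values from this manual. It is a simplified illustration, not how a production DRC tool works: the rectangle format, function names, and coordinates are assumptions, and dimensions are in nanometers so the arithmetic stays in integers.

```python
MIN_WIDTH_NM = 600    # rule 5A: minimum width 0.6 um
MIN_SPACING_NM = 800  # rule 5B: minimum spacing 0.8 um


def width(rect):
    """Narrowest dimension of a rectangle given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return min(x2 - x1, y2 - y1)


def gap_squared(a, b):
    """Squared distance between two rectangles (0 if they touch or overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return dx * dx + dy * dy


def check_metal1(rects):
    """Return a list of (rule, index...) tuples for each violation found."""
    errors = [("5A", i) for i, r in enumerate(rects) if width(r) < MIN_WIDTH_NM]
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            d2 = gap_squared(rects[i], rects[j])
            # Touching or overlapping shapes are treated as one connected shape.
            if 0 < d2 < MIN_SPACING_NM ** 2:
                errors.append(("5B", i, j))
    return errors


# Hypothetical horizontal wires: wire 1 is too narrow and too close to wire 0.
wires = [(0, 0, 5000, 600), (0, 1200, 5000, 1700), (0, 3000, 5000, 3600)]
print(check_metal1(wires))  # [('5A', 1), ('5B', 0, 1)]
```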

Discussion Questions
‹ If we had design rule violations on METAL2 and METAL3 after detail
route, which sections of the DRM should we refer to?
‹ Why is there a minimum spacing rule for specific metal layers?
‹ Why is there a minimum width rule for specific metal layers?
‹ Are the rules different per metal layer?

6/16/08 BD03: Digital Physical Design 114
Topics in This Module
‹ Introduction to layout describing layers, FETs, and logic gate layouts
‹ DRC and reading a DRM
‹ Layout versus schematic checking
‹ LEF library format and syntax
‹ Review

6/16/08 BD03: Digital Physical Design 115

Layout vs. Schematic


‹ After your layout is completed, how do you verify that it is functionally
correct?
‹ A function called Layout versus Schematic (LVS) is found in most
tools, which checks your layout against a schematic netlist.
‹ The tool first extracts a netlist from the layout by using some basic
rules:
‰ A transistor is detected when poly overlaps active regions.
‰ All poly, diffusion, and metal layers are conductive and are assumed to
route signals.

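The first extraction rule, that a transistor exists wherever poly overlaps an active region, can be sketched with simple rectangle geometry. The code below is a toy illustration under stated assumptions: shapes are axis-aligned rectangles (x1, y1, x2, y2), the function names and coordinates are hypothetical, and the layer names follow the nactive and pactive entries of the layers table.

```python
def overlaps(a, b):
    """True if two rectangles (x1, y1, x2, y2) share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def find_transistors(poly, nactive, pactive):
    """Return (device type, poly index, active index) for every poly/active overlap."""
    devices = []
    for p_idx, p in enumerate(poly):
        for a_idx, a in enumerate(nactive):
            if overlaps(p, a):
                devices.append(("NMOS", p_idx, a_idx))
        for a_idx, a in enumerate(pactive):
            if overlaps(p, a):
                devices.append(("PMOS", p_idx, a_idx))
    return devices


# Hypothetical inverter: one vertical poly strip crossing both active regions.
poly = [(4, 0, 5, 20)]
nactive = [(0, 2, 10, 6)]
pactive = [(0, 12, 10, 18)]
print(find_transistors(poly, nactive, pactive))
# [('NMOS', 0, 0), ('PMOS', 0, 0)]
```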
6/16/08 BD03: Digital Physical Design 116
Layout vs. Schematic (continued)

[Figure: a layout and its extracted transistor-level schematic, with devices I1 and I3 (2/1) and I2 and I4 (1/1), nets Net1, Net2, Net3, VDD, and GND, and inputs A and B, alongside a gate-level view labeled IN1 and O1.]

6/16/08 BD03: Digital Physical Design 117

General LVS Tips


‹ Although different LVS tools have different user interfaces,
commands, and options, they all share the same principle of
comparing a netlist against a layout.

‹ These are some general tips when performing LVS:
‰ As with Verilog® design, use a bottom-up approach when
performing LVS. If LVS does not pass, the error can then be narrowed
down to the interconnects, because the smaller blocks are already LVS
clean.
‰ Label your layout. All pins and wires should be labeled exactly as they
appear in the netlist. This gives the tool a good chance of correctly identifying a
mismatch.
‰ If the device count between layout and netlist is the same, do not perform
any netlist reduction. If the count is different, check whether the layout is
correct before performing netlist reduction, because this process attempts
to simplify logic and can potentially collapse nets.


General LVS Tips (continued)
‹ The first goal of LVS should be to get a connectivity-clean LVS.
‰ The electrical connections linking different devices in the front and back
end should be equivalent.
‰ The two netlists should be topologically equivalent, meaning they have the
same types of devices connected in the same way.
‰ Set your constraints to check only the above factors.

‹ The second goal of LVS is to make sure that device parameters and
capacitance values are correct. This can be done only if the netlist
annotates such information.

‹ Check the reports.
‰ LVS reports usually consist of matching and non-matching nets and
devices.
‰ Most tools have a cross-probing feature that highlights the equivalent
object on both the layout and the schematic of the netlist when one is selected.
This is your best debugging friend!
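
As a rough illustration of a connectivity-first comparison, the sketch below compares device counts and then per-net connections. It assumes a hypothetical netlist representation (instance name mapped to a device type and a pin-to-net dictionary) and relies on matching net labels, as recommended in the tips above. Real LVS tools use graph matching and are far more capable.

```python
from collections import Counter

def device_counts(netlist):
    """Count devices by type, for example {'pmos': 1, 'nmos': 1}."""
    return Counter(dev_type for dev_type, _pins in netlist.values())

def connectivity(netlist):
    """Map each net name to the sorted (device type, pin) pairs attached to it."""
    nets = {}
    for dev_type, pins in netlist.values():
        for pin, net in pins.items():
            nets.setdefault(net, []).append((dev_type, pin))
    return {net: sorted(conns) for net, conns in nets.items()}

def compare(layout, schematic):
    """Return the device-count match flag and the nets whose connections differ."""
    counts_match = device_counts(layout) == device_counts(schematic)
    lay, sch = connectivity(layout), connectivity(schematic)
    bad_nets = sorted(n for n in set(lay) | set(sch) if lay.get(n) != sch.get(n))
    return counts_match, bad_nets

# Schematic inverter, and a layout in which the NMOS gate is wired to O1 by mistake.
schematic = {
    "I1": ("pmos", {"g": "IN1", "d": "O1", "s": "VDD"}),
    "I2": ("nmos", {"g": "IN1", "d": "O1", "s": "GND"}),
}
layout = {
    "M1": ("pmos", {"g": "IN1", "d": "O1", "s": "VDD"}),
    "M2": ("nmos", {"g": "O1", "d": "O1", "s": "GND"}),
}
counts_match, bad_nets = compare(layout, schematic)
```

The device counts match, but nets IN1 and O1 are flagged, which is where cross-probing would start.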

Physical Libraries
‹ After a standard cell is laid out, the information is encapsulated into a
Library Exchange Format (LEF).

‹ The LEF provides a means to exchange layout information between
layout and routing tools in the IC flow (such as the Cadence®
Virtuoso® tools and the SoC Encounter® RTL-to-GDSII system).

‹ The LEF contains only information on layout of metal layers inside a
standard cell.

‹ This information includes the locations of I/O pins and also internal
metal routing so that the router knows where to route and where not to
route.

General Rules about LEF Files


‹ A LEF file is limited to 2048 characters per line.

‹ The unit of distance is microns.

‹ The precision of the unit of distance is controlled by the UNITS
statement.

‹ LEF statements end with a semicolon. A space must separate the last
character in the statement and the semicolon.

‹ LEF information is usually divided into two files, a technology LEF file and a
cell LEF file.

‹ LEF statements can be defined in any order, but data must be defined
before it is used. The following table is the typical format for LEF files.

Typical LEF Format
Statements for a tech LEF file:
[VERSION statement]
[BUSBITCHARS statement]
[DIVIDERCHAR statement]
[UNITS statement]
[MANUFACTURINGGRID statement]
[USEMINSPACING statement]
[CLEARANCEMEASURE statement ;]
[PROPERTYDEFINITIONS statement]
[LAYER(Nonrouting) statement
| LAYER(Routing) statement] ...
[SPACING statement ]
[MAXVIASTACK statement]
[VIA statement] ...
[VIARULE statement] ...
[VIARULE GENERATE statement] ...
[NONDEFAULTRULE statement] ...
[SITE statement] ...
[BEGINEXT statement] ...
[END LIBRARY]

Statements for a standard cell LEF file:
[VERSION statement]
[BUSBITCHARS statement]
[DIVIDERCHAR statement]
[VIA statement] ...
[SITE statement]
[MACRO statement
  [PIN statement] ...
  [OBS statement ...] ] ...
[BEGINEXT statement] ...
[END LIBRARY]

Technology LEF File


‹ A technology LEF file contains information about a certain technology
process (for example, UMC 130 nm and IBM 65 nm).

‹ The bulk of the technology LEF file describes the metal and via layers,
and their process rules (such as width, spacing, extension, minimum
area, and antenna area).

‹ Metal layers are used to connect standard cells and macros, whereas
vias are used to connect different metal layers.
‹ A via is a rectangular object that connects two routing layers together.
The via is usually composed of three layers: a cut layer sandwiched by
two routing layers.

LAYER Statement
‹ Every layer in the technology is described with the LAYER statement.

‹ There are four types of layers: CUT, Routing, Implant, and
Masterslice.

‹ In this class, we will cover only the CUT and Routing layers, which are
responsible for creating metal routes and the vias.
‹ Implant and Masterslice layers are beyond the scope of this module
and will not be discussed here.

Routing LAYER
‹ Routing layers are responsible for creating metal routes between
cells.

‹ For each layer, there are many attributes to set. Below is a
sample LEF file describing the attributes for metal layer 1.

‹ The important attributes will be described in detail on the next
slides.

LAYER ME1
  TYPE ROUTING ;
  WIDTH 0.160 ;
  AREA 0.1024 ;
  SPACING 0.160 ;
  SPACING 0.26 RANGE 1.765 100000.0 ;
  PITCH 0.400 ;
  OFFSET 0.200 ;
  DIRECTION HORIZONTAL ;
  THICKNESS 0.320 ;
  HEIGHT 0.46 ;
  MINENCLOSEDAREA 0.3072 ;
  MINIMUMCUT 2 WIDTH 1.40 ;
  MAXWIDTH 25.00 ;
  CAPACITANCE CPERSQDIST 1.1012E-04 ;
  RESISTANCE RPERSQ 0.09100000 ;
  EDGECAPACITANCE 9.362E-05 ;
  MINIMUMDENSITY 20 ;
  MAXIMUMDENSITY 80 ;
  DENSITYCHECKWINDOW 200 200 ;
  DENSITYCHECKSTEP 100 ;
  FILLACTIVESPACING 0.8 ;
  ANTENNACUMAREARATIO 396 ;
  ANTENNACUMDIFFAREARATIO PWL ( ( 0 396 )
    ( 0.102 396 ) ( 0.103 999999999 ) ( 1 999999999 ) ) ;
END ME1


Routing LAYER Attributes
Attribute   Description
Width       Minimum width of the routing wires
Area        Minimum area for a polygon of metal
Thickness   Minimum thickness of wire
Spacing     Minimum spacing between wires. You may specify different
            minimum spacing values for various ranges of widths.
            Example: SPACING 0.26 RANGE 1.765 100000.0 ;
            means the minimum spacing is 0.26 microns for wires with widths
            from 1.765 to 100000.0 microns.

[Figure: 3D view and top-down view of routing wires, labeling Thickness, Width, and Spacing.]
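
To make the width-dependent spacing concrete, here is a small sketch that picks the required spacing for a wire from the two SPACING statements of the sample ME1 layer. The tuple representation and the name `min_spacing` are hypothetical, and treating the RANGE bounds as inclusive is an assumption.

```python
# Each rule is (spacing, min_width, max_width); None means "applies to all widths".
ME1_SPACING_RULES = [
    (0.160, None, None),       # SPACING 0.160 ;
    (0.26, 1.765, 100000.0),   # SPACING 0.26 RANGE 1.765 100000.0 ;
]

def min_spacing(wire_width, rules=ME1_SPACING_RULES):
    """Return the largest spacing among the rules whose width range contains wire_width."""
    required = 0.0
    for spacing, lo, hi in rules:
        in_range = lo is None or lo <= wire_width <= hi
        if in_range:
            required = max(required, spacing)
    return required

narrow = min_spacing(0.16)   # minimum-width wire: default spacing applies
wide = min_spacing(2.0)      # wide wire: the RANGE rule applies
```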

Routing LAYER Attributes (continued)


‹ These attributes are used to calculate wire delays, cross talk, and other
physical verification parameters.

Attribute        Description
Capacitance      The wire-to-ground capacitance per square unit
Resistance       The resistance per square of the metal
EdgeCapacitance  The capacitance from the sidewall to the ground of the metal

[Figure: capacitance calculations for a wire, labeling Resistance, Capacitance, and EdgeCapacitance.]

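
A rough sketch of how these three attributes combine into a wire's resistance and capacitance, using the values from the sample ME1 layer. The units in the comments (ohms per square, pF per square micron, pF per micron of edge) and the choice to count only the two long sidewalls are assumptions made for illustration; extraction tools use more detailed models.

```python
RPERSQ = 0.091           # RESISTANCE RPERSQ (assumed ohms per square)
CPERSQDIST = 1.1012e-04  # CAPACITANCE CPERSQDIST (assumed pF per square micron)
EDGECAP = 9.362e-05      # EDGECAPACITANCE (assumed pF per micron of edge)

def wire_resistance(length, width):
    """Resistance = resistance per square times the number of squares."""
    squares = length / width
    return RPERSQ * squares

def wire_capacitance(length, width):
    """Area (wire-to-ground) term plus sidewall term for the two long edges."""
    area_cap = CPERSQDIST * length * width
    edge_cap = EDGECAP * 2 * length  # assumption: the two short ends are ignored
    return area_cap + edge_cap

# A 100-micron wire drawn at the minimum ME1 width of 0.16 microns.
r = wire_resistance(100.0, 0.16)
c = wire_capacitance(100.0, 0.16)
```

Under these assumptions the wire has 625 squares, so `r` is about 56.9 and the sidewall term dominates `c`.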
Routing LAYER Attributes (continued)
‹ Most place and route tools have
routing tracks. All metal routes
must be placed squarely on these
tracks.

Attribute   Description
Offset      The distance of the first routing track from the edge of the chip
Pitch       The distance between successive routing tracks
Direction   Each metal layer has a preferred direction in which the auto router
            routes. It is either vertical or horizontal. Diagonal tracks are
            usually not preferred.

[Figure: routing tracks, showing the Offset from the edge of the chip and the Pitch between tracks.]
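
The relationship between offset, pitch, and track positions can be sketched as follows, using the OFFSET 0.200 and PITCH 0.400 values from the sample ME1 layer. The function names are hypothetical, and measuring from coordinate 0 at the chip edge is an assumption.

```python
def track_positions(offset, pitch, extent):
    """Coordinates of the routing tracks between 0 and extent (microns)."""
    tracks = []
    n = 0
    while offset + n * pitch <= extent + 1e-9:
        tracks.append(round(offset + n * pitch, 6))
        n += 1
    return tracks

def snap_to_track(coord, offset, pitch):
    """Move a coordinate onto the nearest routing track."""
    n = max(0, round((coord - offset) / pitch))
    return round(offset + n * pitch, 6)

tracks = track_positions(0.200, 0.400, 2.0)
snapped = snap_to_track(1.13, 0.200, 0.400)
```

With these values the tracks fall at 0.2, 0.6, 1.0, 1.4, and 1.8 microns, and a wire requested at 1.13 snaps to the track at 1.0.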

Vias
‹ Vias are contacts that connect
together different metal layers.

‹ Vias usually have three layers: two
routing layers and a CUT layer in
between.

[Figure: a via, labeling the CUT layer.]

Sample Via Definition

(The LAYER statements for metal 1 and 2 are defined on previous slides.)

LAYER VI1
  TYPE CUT ;
  SPACING 0.20 ;
END VI1

VIA VI1_H DEFAULT
  RESISTANCE 4.0000e+00 ;
  LAYER ME1 ;
    RECT -0.16 -0.1 0.16 0.1 ;
  LAYER VI1 ;
    RECT -0.1 -0.1 0.1 0.1 ;
  LAYER ME2 ;
    RECT -0.16 -0.1 0.16 0.1 ;
END VI1_H

This via, which connects metal 1 and 2, has three
layers (two routing layers and a cut layer), with the
sizes defined by the RECT statements.


Standard Cell LEF File
A standard cell LEF file contains the metal pin layout information for macros.

MACRO INVX10MTL
  CLASS CORE ;
  FOREIGN INVX10MTL 0.000 0.000 ;
  ORIGIN 0.000 0.000 ;
  SIZE 3.200 BY 2.800 ;
  SYMMETRY X Y ;
  SITE SAMPLEFSNSITE ;
  PIN Y
    DIRECTION OUTPUT ;
    PORT
      LAYER ME1 ;
        RECT 2.815 0.605 3.100 2.305 ;
        RECT 1.980 1.040 2.815 1.760 ;
        RECT 1.700 0.605 1.980 2.305 ;
        RECT 1.220 0.740 1.700 2.020 ;
    END
    ANTENNADIFFAREA 1.687 ;
  END Y
  PIN A
    DIRECTION INPUT ;
    PORT
      LAYER ME1 ;
        RECT 0.160 1.140 1.040 1.500 ;
    END
    ANTENNAGATEAREA 0.888 ;
  END A
  PIN VSS
    DIRECTION INOUT ;
    USE GROUND ;
    SHAPE ABUTMENT ;
    PORT
      LAYER ME1 ;
        RECT 2.540 -0.180 3.200 0.180 ;
        RECT 2.260 -0.180 2.540 0.680 ;
        RECT 1.460 -0.180 2.260 0.180 ;
        RECT 1.180 -0.180 1.460 0.580 ;
    END
  END VSS
  PIN VDD
    DIRECTION INOUT ;
    USE POWER ;
    SHAPE ABUTMENT ;
    PORT
      LAYER ME1 ;
        RECT 2.540 2.620 3.200 2.980 ;
        RECT 2.260 2.070 2.540 2.980 ;
        RECT 1.460 2.620 2.260 2.980 ;
        RECT 1.180 2.180 1.460 2.980 ;
    END
  END VDD
END INVX10MTL

MACRO General Attributes


‹ A MACRO in a LEF file can refer
to any instantiated macro,
standard cell, or I/O pad.

‹ We will focus on standard cells.

Attribute   Description
Class       The type of MACRO. For standard cells, the value is CORE.
Foreign     Specifies the name of the macro when seen in a tool. It specifies how
            the position and orientation would be translated when read into a
            layout tool.
Origin      Specifies the origin of the macro relative to a DEF COMPONENT
            placement point. Usually leave this as 0 0 to avoid confusion.
Size        Dimensions of the MACRO

MACRO Symmetry
‹ A chip is divided into core rows in which standard cells are placed.

‹ The rows are usually placed in a flipped and abutted pattern, with alternating north (N)
and flipped south (FS) orientations.

‹ Standard cells are placed in the rows, in N or FS orientation, such that they share VDD
rails and VSS rails.

‹ Cells in the N row have the N orientation, whereas those in the FS row have the FS
orientation.

[Figure: an N row and an FS row, flipped and abutted so that they share a VDD or VSS rail.]

MACRO Symmetry (continued)


‹ Cells can also be flipped about their y-axis.

‹ Cells on the N row that are flipped about their y-axis have the FN orientation,
whereas those on the FS row that are flipped about their y-axis have the S orientation.

[Figure: an N row holding N and FN cells and an FS row holding FS and S cells, separated by a VDD or VSS rail.]

‹ The SYMMETRY statement (for example, SYMMETRY X ;) tells the placer which
orientations are allowed when placing cells in the rows.

‹ Possible values include
‰ X: N and FS orientations are allowed
‰ Y: N and FN orientations are allowed
‰ X Y: All four orientations (N, FS, FN, and S) are allowed
‰ R90: Do not use this value for standard cells
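
The mapping from SYMMETRY values to placement orientations can be written as a small lookup. This sketch covers only the X and Y values discussed here; the function name `allowed_orientations` is hypothetical.

```python
def allowed_orientations(symmetry):
    """symmetry is the set of SYMMETRY keywords on the macro, e.g. {"X", "Y"}."""
    allowed = {"N"}                 # the unflipped orientation is always legal
    if "X" in symmetry:
        allowed.add("FS")           # flipped about the x-axis
    if "Y" in symmetry:
        allowed.add("FN")           # flipped about the y-axis
    if "X" in symmetry and "Y" in symmetry:
        allowed.add("S")            # flipped about both axes
    return allowed

x_only = allowed_orientations({"X"})
both = allowed_orientations({"X", "Y"})
```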


MACRO Pins
‹ Now that we have seen some of the
attributes of the standard cell LEF
itself, it is time to look at the most
important components of a standard
cell: its pins.

‹ The pin DIRECTION specifies the
direction of the pin. Values can be
INPUT, OUTPUT, OUTPUT TRISTATE, INOUT, or FEEDTHRU.
‹ The pin SHAPE specifies how the pin
is connected. Values can be
ABUTMENT, RING, or FEEDTHRU
(used only for pins with special
connection requirements, such as
power/ground).
‹ The pin USE specifies how the pin is
used. Values can be
ANALOG, CLOCK, GROUND,
POWER, or SIGNAL.

Code for Vdd Pin

PIN VDD
  DIRECTION INOUT ;
  USE POWER ;
  SHAPE ABUTMENT ;
  PORT
    LAYER ME1 ;
      RECT 2.540 2.620 3.200 2.980 ;
      RECT 2.260 2.070 2.540 2.980 ;
      RECT 1.460 2.620 2.260 2.980 ;
      RECT 1.180 2.180 1.460 2.980 ;
  END
END VDD

MACRO Pin Shape


The SHAPE statement specifies a pin with special connection requirements due to
its shape. The values are

‹ ABUTMENT: Pins that stretch across the cell, joining the same pin on adjacent
cells without routing. (Power rails are a good example.)

‹ RING: Pin on a large macro that forms a ring around the macro, allowing
connection to any point on the ring (used for power on big macros such as
RAMs).

‹ FEEDTHRU: Pin with an irregular shape with a jog within the cell.

[Figure: examples of Abutment, Ring, and Feedthrough pins.]

MACRO Pin Port Block
‹ The PORT statement begins a section that specifies the location of the metal and via
geometries of the pin relative to the standard cell origin.

‹ There can be more than one PORT block. All ports are electrically connected for that pin.

‹ The LAYER statement specifies the layer of the metal or via geometry in the port. There
can be more than one LAYER or VIA statement in each PORT.

‹ The RECT statements give the dimensions of the port. (The first two numbers are the x and
y coordinates of one corner, whereas the second two numbers are the x and y
coordinates of the corner diagonally across from the first one. The convention is lower
left, upper right for the two sets of coordinates.)

PIN A
  DIRECTION INPUT ;
  USE SIGNAL ;
  PORT
    LAYER ME1 ;
      RECT 0.000 0.000 1.000 1.000 ;
      RECT 1.000 0.000 2.000 2.000 ;
  END
END A
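
As a small illustration of the RECT convention, the sketch below normalizes each rectangle to lower-left and upper-right corners and computes the bounding box and metal area of the two rectangles in the PIN A example. The helper names are hypothetical, and the area calculation assumes the rectangles do not overlap.

```python
def normalize(rect):
    """Reorder corners so the result is (lower-left x, y, upper-right x, y)."""
    x1, y1, x2, y2 = rect
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))

def bounding_box(rects):
    """Smallest rectangle that encloses every RECT of the port."""
    rects = [normalize(r) for r in rects]
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def total_area(rects):
    """Sum of rectangle areas; assumes the rectangles do not overlap."""
    return sum((r[2] - r[0]) * (r[3] - r[1]) for r in map(normalize, rects))

# The two RECT statements of the PIN A example, in microns.
port = [(0.0, 0.0, 1.0, 1.0), (1.0, 0.0, 2.0, 2.0)]
bbox = bounding_box(port)
area = total_area(port)
```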

Summary
‹ Layout is the process of placing physical instances of a netlist onto a
chip.
‰ This process is primarily used for full custom designs or library cells.
‰ The layout for a transistor consists of a polysilicon gate, a diffusion layer,
and a substrate layer.
‰ Metal contacts, vias, and substrate taps are needed as interconnects for
your transistors.
‰ It is a good idea to lay your design out on paper using a stick diagram
before diving into a tool.
‰ Stick diagrams ensure a continuous diffusion layer and consistent vertical
poly strips.

‹ Design rules allow layout engineers to produce a high-yield design
without understanding the intricacies of the fabrication process.
‰ The process engineer provides the layout engineer with a document called
the design rule manual, which contains all the pertinent design rules.
‰ The manual contains minimum spacing requirements for all layers on a
layout.

Summary (continued)
‹ LVS is a tool to check the functional correctness of a layout by
comparing the layout against the netlist for which it was designed.

‹ The information from a completed layout is annotated into a LEF
library file.
‰ The LEF file is used in automatic place and route tools, giving the tools
information about the routing layers for a certain technology process.
‰ The technology LEF file contains design rules of all metal layers, whereas
the standard cell LEF file contains the locations of all internal pins and
routing inside the standard cells.

Testing Your Understanding
True or false

1. Laying out an entire chip manually is an easy process and is done
routinely in the industry.

2. The diffusion layer of an NMOS is made out of heavily doped p-type
material.

3. Stick diagrams contain no information about the dimensions of your
layout.

4. Design rules must be strictly followed in order for the design to have a
high yield.

5. The technology LEF file contains only the standard cell information
about a certain technology process.

Learning Activity
In this activity, you will

‹ Match the following LEF file terms with the corresponding diagram in
the handout.

‹ Present your results to the class.

15 minutes for activity
10 minutes for debriefing

Timing Libraries and Constraint Files

Module 3

June 16, 2008

One Verilog Source, Many Design Possibilities

[Figure: Verilog, a timing library, and constraints feed logic synthesis, which can produce Design 1, Design 2, Design 3, or Design 4.]


One Verilog Source, Many Design Possibilities (continued)
‹ Using the same Verilog® design files, different variations of the same
design can be made.
‰ Design1 is the smallest and Design4 is the biggest, but all four designs
perform the same logical function.
‹ How were the designs made different?
‰ The timing library and constraints made all the difference.

‹ Timing library
‰ This guides which technology to target, for example, 130 nm or 65 nm.

‹ Constraints
‰ These define the rules on which the design is based.
‰ If the rules are written well, the result is a better and smaller design.
‰ If the rules are written poorly, even with the best technology, the result is
a worse and bigger design.

One Verilog Source, Many Design Possibilities (continued)


‹ Let’s look at the process in detail to understand what we will learn
today.
‰ The designers write behavioral Verilog code, which has the
same functionality as the digital circuit that is to be manufactured.
‰ A synthesis tool is used to convert this behavioral code into
structural code implementing the same functionality.
 Structural Verilog consists of instantiated gates.

‹ But from where does the synthesis tool get these gates?

Module Objectives
After completing this module, you will be able to

‹ Identify the syntax of a timing library and describe how the numbers in
the library are used for timing analysis

‹ Create a constraint file based on timing specifications

Topics in This Module


‹ Technology libraries

‹ Constraints
‰ General-purpose and object-access constraints
‰ Timing constraints
‰ Environmental constraints

What Are Timing Libraries?
‹ Every foundry has a list of gates with which it can build designs.

‹ A list of such gates and cells is stored in a file generically called a
library.
‰ Cells are library representations of gates. You use a cell from a library to
create a gate in your design.

‹ One such file, which is used by a synthesis tool, is called a synthesis
technology library.
‰ For our presentation, we will refer to it as the library.
‰ Different views (logical, physical, etc.) of the gates and cells are stored in
different files. Together, these views are called a technology library.
‰ Other library files also exist and contain information needed by the
back-end tools (not discussed in this section).

What Are Timing Libraries? (continued)


‹ The most common format used to
write these libraries is the Liberty
format, which uses a .lib
extension.

‹ A library file comprises not only
a list of gates but also
‰ Their functional/logical definitions
‰ Power, energy, and timing
characteristics
‰ Their physical characteristics such
as area and footprint

‹ If this library is given as an input to
a synthesis tool along with the
behavioral (RTL) code, the tool converts
the code into an appropriate structural
design.
‹ Synthesis can replace cells with
other cells of the same footprint
without affecting logic function.

[Figure: RTL, SDC, and the library feed logic synthesis, which produces a synthesized gates netlist.]


What Is in a Library File?
A library file consists of two sections, header and body.

Library File
  Header
    General attributes
    Documentation attributes
    Unit attributes
    Operating conditions
    Threshold and default definitions
    Templates
    More attributes
    Voltage information
    Wire load definitions
  Body
    Cell name
    Physical description
    Pin information
    Power characteristics
    Timing characteristics

General, Documentation, and Unit Attributes


‹ General attributes
‰ delay_model: Delay model used, lookup or calculated
‰ Other attributes not discussed

‹ Documentation attributes
‰ revision: Revision number
‰ date: Date created
‰ comment: Any comments

‹ Unit attributes
‰ time_unit: nano, pico, etc.
‰ voltage_unit: milli, micro, etc.
‰ current_unit: milli, micro, etc.
‰ pulling_resistance_unit: Ω, etc.
‰ leakage_power_unit: watt, etc.
‰ capacitive_load_unit: pico, femto, farad, etc.

Operating Conditions
Operating conditions are the conditions
under which the chip will operate,
including process, temperature, and
voltage.
‹ nom_process: 1, 2, etc.

‹ nom_temperature: 100, 120, etc.

‹ nom_voltage: 1, 0.9, etc.

‹ operating_conditions
‰ process: 1, 2, etc.
‰ temperature: 100, 120, etc.
‰ voltage: 1, 0.9, etc.
‰ tree_type: balanced, etc.

Threshold and Default Definitions


‹ Threshold definitions
‰ slew_lower_threshold_pct_rise: 10, 30, etc.
‰ slew_lower_threshold_pct_fall: 90, 70, etc.
‰ These indicate the points between which the slew should be
calculated

[Figure: a signal transition marked "30% here" and "70% here", the points between which slew is measured.]

‹ Default definitions
‰ Contains attributes such as default_fanout_load,
default_max_transition, etc.
‰ There are many more attributes that are not discussed


Templates
‹ Templates: Different types of
templates present are
‰ Power/energy template
‰ Timing template, etc.
‹ A template shows how these
characteristics would be
described in the library.
‹ Let’s look at an example to
understand better:

lu_table_template(delay_template_7x7) {
  variable_1 : input_net_transition;
  variable_2 : total_output_net_capacitance;
  index_1 ("1000, 1001, 1002, 1003, 1004, 1005, 1006");
  index_2 ("1000, 1001, 1002, 1003, 1004, 1005, 1006");
}

Example Template
Shown below is an example of a timing template. (Templates for other
characteristics are similar and will not be discussed in this module.)

lu_table_template(delay_template_7x7) {
  variable_1 : input_net_transition;          (ex: 1, 2, 3, 4, 5, 6, 7)
  variable_2 : total_output_net_capacitance;  (ex: 10, 20, 40, 80, 160, 320, 640)
  index_1 ("100, 101, 102, 103, 104, 105, 106");
  index_2 ("100, 101, 102, 103, 104, 105, 106");
}

‹ lu indicates that it is a lookup and not calculated.

‹ 7x7 indicates the size of the lookup to be 7 rows and 7 columns.

‹ variable_1 indicates the factor for row indices.

‹ variable_2 indicates the factor for column indices.
‰ Example: With the example index values, the delay for an input_net_transition
of 2 ps and a total_output_net_capacitance of 80 pF is at row 2 and column 4.
The delay value itself is read from the values table that a cell defines
with this template.


Voltage Information and Wire Load Definitions
‹ There are more attributes defined
in a library file, such as the pad
attributes (I/O pads), which are not
discussed here.
‹ Voltage information
‰ Minimum, maximum, and other
complementary MOS (CMOS)
characteristics of the input and
output voltages are described in
this section.
‹ Wire load definitions
‰ In a digital circuit, not only the
gates but also the wires have delays
associated with them.
‰ Wire delays may be small compared to
gate delay, but considering the
amount of wiring in the latest
chips, they can account for as
much as 50% of the delay.

Example: Wire Load


‹ A wire load or wire load model (WLM) is an estimate of the net delays
in a netlist.

‹ Many WLM choices are available in a timing library, and they are
chosen based on the size of a design.
‹ Example:

wire_load("wire_load name") {
  resistance : 8.0e-8;
  capacitance : 1.2e-4;
  area : 0.7;
  slope : 66.667;
  fanout_length (200.0);
}

‹ A custom wire load model (CWLM) is a user-generated model that can
be used to more accurately estimate the net delays.
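
The following sketch shows one common way a wire load model is turned into an estimated net resistance and capacitance: look up a wire length for the net's fanout, extend it with the slope beyond the last table entry, then multiply by the per-unit-length values. The fanout-to-length table here is hypothetical (the slide shows only the length 200.0, which is assumed to belong to a fanout of 1), and the function names are illustrative.

```python
# Per-unit-length values from the wire_load example.
RESISTANCE = 8.0e-8
CAPACITANCE = 1.2e-4
SLOPE = 66.667

# Hypothetical fanout -> length table; the fanout of 1 is an assumption.
FANOUT_LENGTH = {1: 200.0}

def wire_length(fanout):
    """Estimated wire length for a net that drives `fanout` pins."""
    if fanout in FANOUT_LENGTH:
        return FANOUT_LENGTH[fanout]
    largest = max(FANOUT_LENGTH)
    # Beyond the last table entry, extend the length linearly using the slope.
    return FANOUT_LENGTH[largest] + SLOPE * (fanout - largest)

def wire_rc(fanout):
    """Return (resistance, capacitance) of the estimated wire."""
    length = wire_length(fanout)
    return RESISTANCE * length, CAPACITANCE * length

r1, c1 = wire_rc(1)
length3 = wire_length(3)
```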


Cell Name and Physical Description
‹ Cell name: Indicates the name of
the cell
‹ Physical description
‰ cell_footprint: general name, for example,
and2, or2, etc.
‰ area: area of the cell, 20.8, 30.4,
etc.
‹ Example:

cell (ADDX1) {
  cell_footprint : add;
  area : 80.000;

Pin Information
‹ Direction: Input/Output/Inout/Internal
‹ Capacitance: Capacitance that is seen
at this pin.
‹ Output pins:
‰ Function: Value based on the inputs
‰ Example:
 Function: (in1 in2) → AND gate
 Function: (in1 | in2) → OR gate
‹ Example:

pin(CI) {
  direction : input;
  capacitance : 0.004189;
}
pin(S) {
  direction : output;
  capacitance : 0.0;
  function : "(A ^ B ^ CI)";

‹ Other characteristics of the pin include
power and timing characteristics
‹ Note: Power is not discussed in this
module.

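
To see what a function attribute describes, the sketch below evaluates the sum output of the ADDX1 example, function : "(A ^ B ^ CI)", for every input combination. It assumes that ^ denotes XOR, as it does in Python; the name `sum_bit` is hypothetical.

```python
from itertools import product

def sum_bit(a, b, ci):
    """Python equivalent of function : "(A ^ B ^ CI)", with ^ read as XOR."""
    return a ^ b ^ ci

# Truth table keyed by (A, B, CI).
truth_table = {bits: sum_bit(*bits) for bits in product((0, 1), repeat=3)}
```

The output is 1 whenever an odd number of inputs is 1, which is the sum bit of a full adder.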
Timing Characteristics
‹ The timing information is
displayed for each output pin in
relation to each input pin, in the
form of a lookup table.
‰ This applies when the delay model is lookup rather than calculated.
‹ There are multiple lookup tables,
one for each type of delay.
‰ Rise delay
‰ Rise transition
‰ Fall delay, etc.
‹ It is displayed exactly as we saw
in the timing template earlier, but in a
little more detail.

Timing Characteristics Example


‹ The two variables and the full 7x7 lookup table are displayed.

‹ Example:

pin(Y) {
  direction : output;
  capacitance : 0.0;
  function : "(A B)";
  timing() {
    related_pin : "A";
    cell_rise(delay_template_7x7) {
      index_1 ("0.04, 0.07, 0.1, 0.2, 0.5, 1.0, 2");
      index_2 ("0.006, 0.030, 0.078, 0.174, 0.366, 0.749, 1.523");
      values ( \
        "0.07, 0.09, 0.13, 0.20, 0.35, 0.64, 1.23", \
        "0.08, 0.10, 0.13, 0.21, 0.35, 0.65, 1.24", \
        "0.09, 0.11, 0.15, 0.22, 0.37, 0.66, 1.25", \
        "0.11, 0.13, 0.17, 0.25, 0.39, 0.68, 1.28", \
        "0.14, 0.17, 0.20, 0.28, 0.42, 0.72, 1.31", \
        "0.18, 0.21, 0.25, 0.33, 0.47, 0.76, 1.35", \
        "0.23, 0.26, 0.31, 0.39, 0.54, 0.83, 1.42");
    }
  }
}


Timing Characteristics Example (continued)
‹ index_1 represents input net transition, and index_2 represents total
output net capacitance.

‹ Depending on the values of these indexes, the corresponding delay
values are looked up.

‹ Question: In the previous example, if the delay template was given as
follows, what would the cell_rise be if the input_net_transition was
0.07 and the total_output_net_capacitance was 0.030?

lu_table_template(delay_template_7x7) {
  variable_1 : input_net_transition;
  variable_2 : total_output_net_capacitance;
  index_1 ("1000, 1001, 1002, 1003, 1004, 1005, 1006");
  index_2 ("1000, 1001, 1002, 1003, 1004, 1005, 1006");
}
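
A lookup of this kind can be sketched in a few lines. The code below uses the cell_rise table from the pin(Y) example, returns the table entry when both inputs fall on index values, and interpolates linearly in both directions otherwise. Clamping inputs that fall outside the index range, rather than extrapolating, is an assumption of this sketch, and the function names are hypothetical.

```python
from bisect import bisect_right

INDEX_1 = [0.04, 0.07, 0.1, 0.2, 0.5, 1.0, 2.0]              # input_net_transition
INDEX_2 = [0.006, 0.030, 0.078, 0.174, 0.366, 0.749, 1.523]  # total_output_net_capacitance
VALUES = [
    [0.07, 0.09, 0.13, 0.20, 0.35, 0.64, 1.23],
    [0.08, 0.10, 0.13, 0.21, 0.35, 0.65, 1.24],
    [0.09, 0.11, 0.15, 0.22, 0.37, 0.66, 1.25],
    [0.11, 0.13, 0.17, 0.25, 0.39, 0.68, 1.28],
    [0.14, 0.17, 0.20, 0.28, 0.42, 0.72, 1.31],
    [0.18, 0.21, 0.25, 0.33, 0.47, 0.76, 1.35],
    [0.23, 0.26, 0.31, 0.39, 0.54, 0.83, 1.42],
]

def bracket(axis, x):
    """Return (i, t): x lies a fraction t of the way from axis[i] to axis[i + 1]."""
    if x <= axis[0]:
        return 0, 0.0               # clamp below the table (assumption)
    if x >= axis[-1]:
        return len(axis) - 2, 1.0   # clamp above the table (assumption)
    i = bisect_right(axis, x) - 1
    return i, (x - axis[i]) / (axis[i + 1] - axis[i])

def cell_rise(transition, capacitance):
    """Bilinear interpolation: rows follow index_1, columns follow index_2."""
    i, t = bracket(INDEX_1, transition)
    j, u = bracket(INDEX_2, capacitance)
    low = VALUES[i][j] * (1 - u) + VALUES[i][j + 1] * u
    high = VALUES[i + 1][j] * (1 - u) + VALUES[i + 1][j + 1] * u
    return low * (1 - t) + high * t

on_grid = cell_rise(0.2, 0.366)    # row 4, column 5 of the table
between = cell_rise(0.04, 0.018)   # halfway between columns 1 and 2 of row 1
```

The first call lands exactly on a table entry; the second falls halfway between two columns and returns the average of the two neighboring entries.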

Library File: Summary


‹ A library file is a file that contains basic information about cell
functionality, timing, power, etc., for a given technology node.

‹ One such file (the Liberty file), which is used by a synthesis tool, is
called a synthesis technology library.
‹ The library file is divided into two main parts:
‰ Header: Contains all the attributes and terminology used in the library
‰ Body: Contains characteristics of each cell that a foundry has for a specific
technology

‹ Synthesis tools use these files to generate structural Verilog files
equivalent to the behavioral (RTL) Verilog files given as inputs.

‹ Next, we will see what else is given as inputs to a synthesis tool.


Discussion Question
Given the following circuit and .lib file examples, what is the path delay?
[Figure: a circuit containing two DFFX1 flip-flops and one BUFX1 buffer.]

cell (DFFX1) {
  cell_footprint : dffx1;
  area : 50.0;
  pin(D) {
    direction : input;
    timing() {
      related_pin : "CK";
      timing_type : setup_rising;
      rise_constraint(setup_template_3x3) {
        index_1 ("0.05, 1.4, 4.5");
        index_2 ("0.05, 1.4, 3.3");
        values ( \
          "0.156250, 0.070312, 0.113281", \
          "0.246094, 0.140625, 0.175781", \
          "0.203125, 0.093750, 0.128906");
      }
  pin(Q) {
    direction : output;
    timing() {
      related_pin : "CK";
      timing_type : rising_edge;
      timing_sense : non_unate;
      cell_rise(delay_template_7x7) {
        index_1 ("0.05, 0.15, 0.6, 1.4, 2.3, 3.3, 4.5");
        index_2 ("0.00035, 0.021, 0.0385, 0.084, 0.147, 0.231, 0.3115");
        values ( \
          "0.291957, 0.437181, 0.550916, 0.843878, 1.248819, 1.788431, 2.305442", \
          "0.316264, 0.461499, 0.575227, 0.868187, 1.273127, 1.812741, 2.329752", \
          "0.388358, 0.533648, 0.647351, 0.940318, 1.345271, 1.884899, 2.401920", \
          "0.439033, 0.584292, 0.697982, 0.990937, 1.395897, 1.935540, 2.452571", \
          "0.462183, 0.607445, 0.721146, 1.014067, 1.419031, 1.958683, 2.475723", \
          "0.468653, 0.613990, 0.727660, 1.020554, 1.425521, 1.965184, 2.482228", \
          "0.460997, 0.606314, 0.719968, 1.012831, 1.417787, 1.957454, 2.474507");
      }

cell (BUFX1) {
  cell_footprint : buf;
  area : 13.0;
  pin(A) {
    direction : input;
  }
  pin(Y) {
    direction : output;
    function : "A";
    internal_power() {
    timing() {
      related_pin : "A";
      timing_sense : positive_unate;
      cell_rise(delay_template_7x7) {
        index_1 ("0.05, 0.15, 0.6, 1.4, 2.3, 3.3, 4.5");
        index_2 ("0.00035, 0.021, 0.0385, 0.084, 0.147, 0.231, 0.3115");
        values ( \
          "0.094400, 0.235579, 0.351869, 0.653282, 1.070188, 1.625921, 2.158454", \
          "0.116567, 0.257243, 0.373654, 0.675230, 1.092220, 1.647999, 2.180553", \
          "0.156644, 0.301067, 0.417546, 0.719020, 1.136089, 1.691941, 2.224538", \
          "0.165784, 0.318068, 0.434036, 0.735633, 1.152743, 1.708488, 2.241054", \
          "0.149625, 0.311618, 0.428220, 0.729893, 1.147035, 1.702969, 2.235440", \
          "0.117344, 0.289370, 0.407324, 0.710811, 1.128181, 1.684128, 2.216830", \
          "0.067751, 0.250660, 0.370401, 0.676924, 1.096166, 1.652295, 2.184962");
      }

lu_table_template(delay_template_7x7) {
  variable_1 : input_net_transition;
  variable_2 : total_output_net_capacitance;
  index_1 ("1000, 1001, 1002, 1003, 1004, 1005, 1006");
  index_2 ("1000, 1001, 1002, 1003, 1004, 1005, 1006");
}

lu_table_template(setup_template_3x3) {
  variable_1 : constrained_pin_transition;
  variable_2 : related_pin_transition;
  index_1 ("1000, 1001, 1002");
  index_2 ("1000, 1001, 1002");
}

Topics in This Module


‹ Technology libraries
‹ Constraints
‰ General-purpose and object-access constraints
‰ Timing constraints
‰ Environmental constraints

Constraints
‹ The rules that designers write to guide the tool are referred to as constraints.
‹ Constraints are essential to meet design goals in terms of area, timing,
and power to obtain the best possible implementation of a circuit.
‹ Constraints allow designers to control various aspects of synthesis.
‹ Synthesis algorithms and heuristics are tuned to automatically find the
best solution; however, sometimes they initially fail to reach it.

Defining Constraints
‹ Every EDA tool has its own commands to define constraints for a
design.
‹ However, there is a common format, which is supported by almost all
the EDA tools, to define the constraints.
‹ This format is called Synopsys Design Constraints (SDC) format.
‹ The constraints are defined using special SDC commands.
‹ The file is saved with an .sdc extension.

SDC Format
‹ The SDC commands are divided into these broad categories:
‰ General-purpose commands
‰ Object-access commands
‰ Timing commands
‰ Environmental commands
‹ There are more categories in SDC format, but we will be discussing
only these for this course.

General-Purpose Commands
‹ Following are general-purpose commands:
‰ expr: Used to evaluate simple expressions
 Syntax: expr arg1 arg2 arg3 … argn
 Example: expr 0.1 + 0.2 + 0.1
 The result is the sum of the three numbers, that is, 0.4.
‰ set: Used to define variables
 Syntax: set variable_name value
 Example: set design design1
 The variable $design now contains the value design1. A list of values
can also be defined as shown below (list items are separated by spaces, not commas):
 set list {item1 item2 item3 item4 … itemn}
‹ There are other general-purpose commands that are not discussed
here.
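The set and expr commands are plain Tcl, so they can be tried in any Tcl shell. The following is a minimal sketch; the variable names and the 10 ns value are illustrative assumptions, not part of the course material.

```tcl
# Plain Tcl; variable names and values are illustrative only.
set design design1                 ;# scalar variable, read back as $design
set clk_period 10.0                ;# hypothetical clock period in ns
set half_period [expr {$clk_period / 2.0}]

# A Tcl list is space-separated; commas would become part of the item names.
set blocks {item1 item2 item3}

puts "design      = $design"
puts "half period = $half_period"
puts "block count = [llength $blocks]"
foreach b $blocks {
    puts "  block: $b"
}
```

Bracing the expr argument, as in `expr {...}`, is the usual Tcl idiom because it avoids double substitution of the operands.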

Object-Access Commands
‹ These commands are used to get the location of an object in the
design.
‹ The object can be a cell, a block, a port, a pin, or anything else in the
design.
‹ The commands are as listed below, most of which are self-explanatory.
‰ all_clocks: Returns a list of all clocks
‰ all_inputs: Returns a list of all inputs within a clock domain
 Syntax: all_inputs -clock <clock_name>
‰ all_outputs: Returns a list of all outputs within a clock domain
 Syntax: all_outputs -clock <clock_name>
‰ get_cells: Searches for a cell with a particular naming pattern and returns
its location if found
 Syntax: get_cells pattern


Object-Access Commands (continued)
‰ get_clocks: Searches for a clock with a particular naming pattern and
returns its location if found
 Syntax: get_clocks pattern
‰ get_nets: Searches for a net (equivalent to a wire in the Verilog language)
with a particular naming pattern and returns its location if found
 Syntax: get_nets pattern
‰ get_pins: Searches for pins (ports of cells) with a particular naming
pattern and returns their location if found
 Syntax: get_pins pattern
‰ get_ports: Searches for a port (inputs/outputs to the design) with a
particular naming pattern and returns its location if found
 Syntax: get_ports pattern
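Object-access commands are rarely used on their own; their result is normally stored in a variable or passed directly to a constraint command. The sketch below assumes a design with a port clk, a port rst_n, a clock named core_clock, and instances under a hierarchy called u_core. All of these names are hypothetical.

```tcl
# Assumed names: ports clk and rst_n, clock core_clock, hierarchy u_core.
set clk_port  [get_ports clk]
set core_regs [get_cells u_core/*_reg*]        ;# pattern with wildcards
set data_ins  [all_inputs -clock core_clock]   ;# inputs in one clock domain

# Typical use: nest the query inside a constraint command
# (set_false_path is covered later in this module).
set_false_path -from [get_ports rst_n]
```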

Timing Constraints
‹ To model a clock in a design, use the create_clock and the
create_generated_clock SDC commands.
‹ Let’s look at a few definitions before we start modeling a clock.
‰ Clock period is defined as the time difference between two consecutive
rising edges (or two consecutive falling edges) of the clock.
‰ Duty cycle is defined as the ratio between the pulse duration (t) and the
period (T) of a rectangular waveform.
[Figure: clock waveform marking the rising and falling edges, the pulse duration t, and the clock period T; duty cycle = t/T]

create_clock
‹ The create_clock command is used to model a clock waveform.
‰ Syntax:
create_clock -period <period_value in nanoseconds> \
-name <clock name> -waveform <edge list> source_objects
‰ Example:
create_clock -name core_clock -period 10 \
-waveform {4 10} [get_ports clock]
‹ The clock waveform that would be modeled is as below:
[Figure: core_clock waveform with rising edges at 4, 14, 24, and 34 ns and falling edges at 10, 20, 30, and 40 ns]
Pulse duration t = 6 ns
Clock period T = 10 ns
Duty cycle = t/T = 6/10 = 60%

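The arithmetic behind the example can be checked with a few lines of plain Tcl. This is only a sketch: duty_cycle_percent is a hypothetical helper written for this illustration, not an SDC command, and it assumes a two-edge waveform given as {rise fall} in nanoseconds.

```tcl
# Hypothetical helper (not part of SDC): duty cycle in percent from the
# clock period and a two-edge waveform list {rise fall}, both in ns.
proc duty_cycle_percent {period waveform} {
    set rise [lindex $waveform 0]
    set fall [lindex $waveform 1]
    return [expr {100.0 * ($fall - $rise) / $period}]
}

# Values from the create_clock example: period 10 ns, waveform {4 10}.
puts [duty_cycle_percent 10 {4 10}]   ;# prints 60.0
```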
create_generated_clock
‹ This command models a clock that is derived from an existing clock,
for example by a PLL or a dedicated clocking block.
‹ Example:
‰ core_clock is the base clock defined in the design below.
‰ clock2 is derived by multiplying the core_clock by 2.
[Figure: an external clock port feeds PLL multiply-by-2 logic; callouts mark the points at which core_clock and clock2 are defined]

create_generated_clock (continued)
‹ If the definition of the core_clock changes, it is automatically reflected
in the generated clock model.
‹ Some of the arguments that can be passed to it are
‰ create_generated_clock
-name <clock_name>
-source <master_pin>
-divide_by <factor>
-multiply_by <factor>
-duty_cycle <percent>
-invert
-master_clock <clock>
source_objects
‹ Example: create_generated_clock -name clock2 \
-source [get_pins core_clock] -multiply_by 2 \
-duty_cycle 50 -master_clock core_clock \
[get_ports clock2]


create_generated_clock (continued)
‹ The clock waveform that would be modeled by the command in the
previous slide is as shown below.
[Figure: core_clock and clock2 waveforms]

set_clock_transition
‹ The create_clock command assumes an ideal clock with zero rise and fall times.
‹ To model some realistic values of rise and fall time, the set_clock_transition command
is used.
‹ Some of the arguments to this command are
‰ set_clock_transition -rise -fall <transition> <clock_list>
‰ Example: set_clock_transition -rise 0.1 \
[get_clocks clock_core]
set_clock_transition -fall 0.1 \
[get_clocks clock_core]
‰ The above two commands together model the clock clock_core with rise and fall times of 0.1 ns.
[Figure: an ideal clock edge compared with the same clock with a 0.1 ns clock transition on each edge]


set_clock_uncertainty
‹ A real clock is never perfect, and this must be accounted for
in the clock model that is defined.
‹ In SDC, this is achieved by the set_clock_uncertainty command.
‹ Some of the arguments that can be given to it are
‰ set_clock_uncertainty -from <from_clock>
-to <to_clock>
-setup
-hold
<uncertainty>
<object_list>
‰ Example:
set_clock_uncertainty 1.0 [get_ports clock2]
‰ An uncertainty of 1.0 ns is set for clock2, as shown in the next slide.

set_clock_uncertainty (continued)

[Figure: clock2 waveform with an uncertainty window around each clock edge]
‹ The uncertainty value means that the clock edge can arrive up to 1 ns before
or after the ideal clock edge.
‹ Example:
set_clock_uncertainty -from core_clock -to clock2 3.0
‰ This means that if there is a logical path that goes from the core_clock
clock domain to the clock2 clock domain, the uncertainty for such paths is
3 ns.
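The -setup and -hold arguments listed earlier let the two checks carry different margins. The fragment below is a sketch that assumes the clocks core_clock and clock2 are already defined; the numeric values are illustrative, not recommendations.

```tcl
# Assumes core_clock and clock2 already exist; values (in ns) are illustrative.
set_clock_uncertainty -setup 0.5 [get_clocks core_clock]
set_clock_uncertainty -hold  0.1 [get_clocks core_clock]

# Inter-clock uncertainty applies only to paths crossing between the two domains.
set_clock_uncertainty -setup 1.0 -from [get_clocks core_clock] -to [get_clocks clock2]
```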


set_disable_timing
‹ This command marks a path to not be timed; that is, this path will be
regarded as virtually nonexistent. Use this command as a last resort,
when certain technology-specific cells or macros require it.
‹ Care should be taken in using this command, because it renders not only
the said path untimed, but also any other path that passes
through it.

set_disable_timing (continued)
‹ When the path from in2 to out2 as shown by the red arc is disabled, all the other paths going through
this arc are also disabled.
‹ That is, by stating this one arc, two paths are disabled:
1. D1->Q1->in2->out2->D3->Q3 and
2. D2->in1->out1->in2->out2->D3->Q3
‹ But if the intention was only to disable one path, say number 1 above, it should have been stated in a
different way or using a different command, which we will see later.
‹ Various arguments that can be used with this command are
‰ set_disable_timing -from <from_pin_name> -to <to_pin_name> <cell_pin_list>
‰ Example: set_disable_timing -from in2 -to out2
[Figure: three flip-flops (D1/Q1, D2, D3/Q3) connected through gates with pins in1/out1, in2/out2, and sel; the in2-to-out2 arc is highlighted in red]


set_false_path
‹ This command is used to remove a particular path or set of paths from
being timed; that is, it marks them as false.
‰ False path: A path that has no functional purpose, or a path that does not
need to be timing constrained (for example, a path between two clock
domains).
‹ When a path is set as a false path, the synthesis tool only maps it to
technology-specific gates.
‰ The tool does not optimize or improve the timing of this path even if it does
not meet timing.
‹ Reasons for false path:
‰ Path is never exercised during circuit operation
‰ Path is only possible in special operation mode (test mode, etc.)
‹ This command is different from set_disable_timing in the sense that
‰ Only the paths specified are set as false.
‰ Any other paths passing through a false path, but not sharing the same
exact start-end pair, will not be affected.

set_false_path (continued)
‹ Some of the arguments that can be passed to this command are
‰ set_false_path -from <from_list> -to <to_list> -through <through_list>
‹ The following command sets the path from F1 to F3 as false:
‰ set_false_path -from [get_cells F1] -to [get_cells F3]
‹ But if the intention is to set all the paths that pass through the red arc as false, this is
how it can be done:
set_false_path -through [get_pins OR1/in2]
‰ This will set both the paths from F1 and F2 to F3 as false.
[Figure: flip-flops F1, F2, and F3 connected through cells M1, OR1, and AND1; the red arc runs from in2 to out2]
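The clock-domain case mentioned in the false-path definition can be sketched as follows. The clock name async_clk is a hypothetical second clock that is assumed to be unrelated to core_clock; whether such paths may really be left untimed depends on the design having proper synchronizers.

```tcl
# Assumes clocks core_clock and async_clk exist; async_clk is hypothetical.
# A false path is directional, so both crossings are listed explicitly.
set_false_path -from [get_clocks core_clock] -to [get_clocks async_clk]
set_false_path -from [get_clocks async_clk] -to [get_clocks core_clock]
```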


I/O Delays
‹ The I/O delays consist of input delay and output delay.
‹ Input delay is the time it would take for the data to arrive at the input
port of the design.
‹ Output delay is the margin that the data going out of the output port
should have.
‰ It can be viewed as the input delay for the input port of another design that
is connected to this output port.
‰ The figure on the following slide illustrates this better.

I/O Delays (continued)


‹ The clouds in the figure represent combinational logic.
‹ The corresponding commands in the SDC format to model these
delays are
‰ Input delay: set_input_delay
‰ Output delay: set_output_delay
[Figure: input delay and output delay shown against three clock periods, with data arrival timing at the input and data required timing at the output]


set_input_delay
‹ Some of the arguments to set_input_delay are
‰ set_input_delay
-clock <clock_name>
-max
-min
-add_delay
<delay_value>
<port_pin_list>
‰ If an input delay has already been specified for a pin, then the -add_delay
argument keeps the existing delay and adds the new delay as an additional
constraint instead of overwriting it.
‹ Example: set_input_delay -max 3.0 [get_ports in1] \
-clock [get_clocks core_clock]
‰ This command assumes an input delay of 3.0 ns for the data coming in at
the input port in1.

set_output_delay
‹ Some of the arguments to set_output_delay are
‰ set_output_delay
-clock <clock_name>
-max
-min
-add_delay
<delay_value>
<port_pin_list>
‰ If an output delay has already been specified for a pin, then the
-add_delay argument keeps the existing delay and adds the new delay as an
additional constraint instead of overwriting it.
‹ Example: set_output_delay -max 3.0 [get_ports out1] \
-clock [get_clocks core_clock]
‰ This command assumes an output delay of 3.0 ns for the data going out of
the output port out1.
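A fuller I/O budget usually sets both -max and -min values. The sketch below assumes a 10 ns clock named core_clock, a second clock named clock2, and ports in1 and out1; the delay values are illustrative.

```tcl
# Assumes clocks core_clock (10 ns) and clock2, and ports in1/out1; values in ns.
set_input_delay  -clock [get_clocks core_clock] -max 3.0 [get_ports in1]
set_input_delay  -clock [get_clocks core_clock] -min 0.5 [get_ports in1]
set_output_delay -clock [get_clocks core_clock] -max 3.0 [get_ports out1]

# in1 is also driven by logic clocked by clock2: -add_delay keeps the
# core_clock constraint instead of replacing it.
set_input_delay -clock [get_clocks clock2] -max 2.0 -add_delay [get_ports in1]
```

With a 10 ns period and a 3.0 ns maximum input delay, roughly 7 ns remain for the logic between in1 and the first register, before setup time and clock uncertainty are subtracted.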


Logic Outside the Design
‹ To get an accurate model, the virtual flops/cells that may exist outside your
design, and the input transition from them, must be modeled as well.
[Figure: input delay and output delay shown against three clock periods, with data arrival timing at the input and data required timing at the output]
‹ The same is true on the output side, where the output load that is
being driven must be modeled.
‹ These modeling techniques fall under environmental modeling commands and
will be covered after this section.

set_max_delay
‹ max_delay is the maximum delay that a combinational path from the input port
to the output port in the design should meet.
[Figure: combinational block "Combinational 1" between ports In and Out, with the max delay spanning In to Out]


set_max_delay (continued)
‹ Some of the arguments to the set_max_delay command are
‰ set_max_delay
-from <from_list>
-to <to_list>
-through <through_list>
<delay_value>
‹ Example: set_max_delay 5.0 -from [get_ports IN] \
-through [get_cells combinational1] \
-to [get_ports OUT]
‰ This command sets the maximum allowable delay through the
combinational path shown in the previous figure to 5.0 ns.

set_multicycle_path
‹ The figure below helps illustrate a multicycle path.
[Figure: two flip-flops in series (D1/Q1 feeding a second flop) above a clock waveform; data is launched from D1 and captured at the first marked edge, not captured at the next edge, and captured again at the edge after that]
‹ The data in this example is captured every other clock cycle.
‹ Specify in the SDC file that this particular path has a time period of two
times that of the clock period.
‹ This can be achieved by using the set_multicycle_path command.


set_multicycle_path (continued)
‹ Some of the arguments to this command are
‰ set_multicycle_path -start
-end
-from <from_list>
-to <to_list>
-through <through_list>
<path_multiplier>
‰ Example: set_multicycle_path -from [get_pins D1] \
-to [get_pins Q1] 2
‰ This command sets the time period for this particular path as twice the
clock period of that clock domain.
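SDC also accepts -setup and -hold on this command, which the argument list above does not show. A common pattern, sketched here under assumed names, relaxes the setup check to two cycles and then moves the hold check back by one cycle, because the hold check otherwise follows the shifted setup edge. The register names launch_reg and capture_reg and the pin names CK and D are hypothetical and depend on the design and library.

```tcl
# Assumed instance names launch_reg/capture_reg; pin names CK and D vary by library.
set_multicycle_path 2 -setup -from [get_pins launch_reg/CK] -to [get_pins capture_reg/D]
set_multicycle_path 1 -hold  -from [get_pins launch_reg/CK] -to [get_pins capture_reg/D]
```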

set_driving_cell
‹ Remember the virtual logic outside the design when modeling I/O
delays.
‹ The set_driving_cell command specifies the cell type that is driving a
particular input port.
[Figure: virtual logic driving the input port, with input delay and output delay shown against three clock periods, data arrival timing, and data required timing]

set_driving_cell (continued)
‹ Some of the arguments to this command are
‰ set_driving_cell
-lib_cell <lib_cell_name>
-library <lib_name>
-pin <pin_name>
-clock <clock_name>
<port_list>
‰ Example: set_driving_cell -lib_cell AND2X \
-library xyz_130nm -pin Y \
-clock [get_clocks clk] [get_ports input1]
‰ This command indicates that the output pin Y of an AND2X gate from the
library xyz_130nm is connected to the input1 port in the clk clock domain.

set_input_transition
‹ This command models the transition of the waveform at the input port.
‹ Rise time: The time it takes for the waveform to rise from 5% to 95% of
its final value.
‹ Fall time: The time it takes for the waveform to fall from 95% to 5% of
its base value.
[Figure: input transition waveform marking the rise time and the fall time]

set_input_transition (continued)
‹ Some of the arguments to this command are
‰ set_input_transition
-rise
-fall
-clock <clock_name>
<transition>
<port_list>
‹ Example: set_input_transition -rise 0.1 \
-clock [get_clocks clk] \
[get_ports input1]
set_input_transition -fall 0.2 \
-clock [get_clocks clk] \
[get_ports input1]
‰ These commands model the waveform of input1 with a rise time of 0.1 ns
and a fall time of 0.2 ns.


set_load
‹ This command is used to model a load that an output port may see.
‹ At the output, one is not concerned about the cell that may be driven
or the transition that it may receive.
‹ View this as the input port modeling that another block, connected to
this output port, would do.
‹ Some of the arguments to this command are
‰ set_load -min
-max
<value>
<objects>
‰ Example: set_load -max 20 [get_ports output1]
‰ This command sets a maximum load of 20 fF on the output1 port.
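The three environmental commands seen so far are typically used together to describe the boundary of a block. The fragment below is a sketch: the library cell BUFX2, its output pin Y, and the port names are assumptions, and the units of the transition and load values follow whatever the technology library declares.

```tcl
# Assumed names: library cell BUFX2 with output pin Y; ports input1, input2, output1.
# Either a driving cell or an explicit transition is given per input port.
set_driving_cell -lib_cell BUFX2 -pin Y [get_ports input1]
set_input_transition 0.2 [get_ports input2]

# Capacitive load seen by the output port (units come from the library).
set_load 20 [get_ports output1]
```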

set_case_analysis
‹ In a large design, not all of the logic may be active at the
same time.
‹ Logic blocks may be activated based on the value of certain inputs.
‰ This is done sometimes to save power.
‰ Some designs themselves are configurable to perform different tasks
depending on certain input values.
[Figure: a two-bit enable EN[1:0] selects one of four blocks: Block1 when EN = 00, Block2 when EN = 01, Block3 when EN = 10, and Block4 when EN = 11]


set_case_analysis (continued)
‹ The previous design is an example where different blocks are enabled
by an external enable signal.
‹ To get accurate timing and power numbers of the entire design, it
should be timed with one block enabled at a time, because that is
essentially how the design would actually behave.
‹ This can be achieved by setting the enable pin to a constant value for
timing with the use of the set_case_analysis command.
‹ The arguments that can be passed to it are
‰ set_case_analysis
<value (0 or 1)>
<port_or_pin_list>
‰ Example: set_case_analysis 0 [get_ports {EN[0]}]
set_case_analysis 0 [get_ports {EN[1]}]
‰ These commands set the value of the EN port to binary 00 during timing.
The bus bit names are enclosed in braces so that Tcl does not treat [0] and [1] as command substitutions.

set_max_fanout
‹ Fanout indicates the number of cells being
driven by one cell.
‹ If this number is very big, the size of the
driving cell is increased by the synthesis tool to
be able to drive this large load.
‹ A bigger cell means more area, and
sometimes it is desirable to restrict the size
of cells.
‹ This can be achieved by specifying the
maximum number of loads a cell can drive,
and this is exactly what this command
models.
‹ The arguments to this command are
‰ set_max_fanout <value>
object_list
‹ Example: set_max_fanout 16 TOP_LEVEL
‰ This command sets a limit of 16 on the
number of loads to all the cells in the design
TOP_LEVEL.
[Figure: a cell driving a fanout of 16]


Summary
‹ To make a good design, the technology choice and the constraints are
as important as the design itself.
‹ Constraints guide the synthesis tool and tell it how to handle different
aspects of the design.
‹ The libraries provide the synthesis tool with the building blocks for the
design itself.

Testing Your Understanding


1. Is it possible to give more than one generic technology library as input
to a synthesis tool? What would the outcome be?
A. No, it is not possible.
B. Yes, it is possible and will result in a better design.
C. Yes, it is possible, but will result in a worse design.
2. Write the SDC command to model a clock that has a frequency of 100
MHz. The port name is clk.
3. Write the SDC command to model rise and fall times of 100 ps for the
above-mentioned clock.

Testing Your Understanding (continued)
4. Which of the following constitutes uncertainty?
A. Clock skew
B. Clock jitter
C. Wire load assumptions
D. Margin
E. All of the above

Learning Activity
In this activity, you will
‹ Interpret the specifications for a given design
‹ Create an SDC file based on the specifications given
‹ Present your results to the class
20 minutes for activity
10 minutes for debriefing

Synthesis

Module 4

June 16, 2008

Which Method Would You Use …


… to design the logic circuitry for a one-million gate design?

[Figure: chart of design time in months, on a scale from 0.1 to 12, for paper and pencil, schematic capture, and logic synthesis]
Logic synthesis has dramatically reduced the ASIC design cycle. You will
learn why in this module.


Module Objectives
In this module, you will be able to
‹ Explain the optimization stages of the synthesis flow
‹ Interpret the results in a timing report

Discussion Questions
‹ What is logic synthesis?
‹ What are the inputs and outputs to and from logic synthesis?

Topics in This Module
‹ Logic synthesis
‰ Introduction
‰ Reading HDL source files
‰ Elaborating design
‰ Technology-independent (generic) mapping
‰ Technology transformation
‰ Technology-dependent optimizations
‰ Scan chain insertion
‰ Timing report analysis
‰ Running logic synthesis
‹ Physical synthesis
‰ Fundamentals
‰ Basic operation and flow

What Is Logic Synthesis?


‹ Definition: The process of parsing, translating, optimizing, and mapping RTL
code into a specified standard cell library
‹ Example: To determine the feasibility of the design, we need to synthesize the RTL
code into gates, and measure timing, power, and area.
[Figure: design flow from specification and microarchitecture through RTL design, logic synthesis (RTL to a synthesized netlist of gates), floorplanning, placement, physical synthesis, clock tree synthesis (CTS), routing, and the pre-CTS, post-CTS, and post-route design optimization steps, to design verification and GDSII mask preparation]

Logic Synthesis: Input and Output, Format
‹ Input
‰ RTL in the Verilog® language or other HDL
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Timing libraries in Liberty (.lib) format
‹ Output
‰ Gate-level netlist in the Verilog language or other HDL
[Figure: RTL, SDC, and library inputs feeding logic synthesis, which produces synthesized gates]

Logic Synthesis Goals


‹ Minimize area
‰ In terms of cell count and cell size
‹ Minimize power
‰ In terms of switching activity in individual gates, deactivated circuit blocks
‰ In terms of leakage power
‹ Maximize performance
‰ In terms of maximum clock frequency of synchronous systems, throughput for asynchronous
systems
‹ Quickly produce accurate functional models
‰ Gate-level model is functionally equivalent to RTL model
‰ Gate-level model is produced in less time than is required by an experienced logic designer to
create the same model
‹ Produce predictable and accurate results
‰ Timing, area, and power consumption calculations should correspond with actual values
measured on physical device once manufactured.


Logic Synthesis Phases and Commands
Synthesis Phase                     | Description                                   | Command
Read RTL source files               | Parse source code, check syntax               | read_hdl
Elaboration                         | Build data structures and registers           | elaborate
Technology-independent mapping      | Optimize data structures                      | synthesize -to_generic
Technology transformation (mapping) | Map to specific technology gates              | synthesize -to_map
Technology-dependent optimization   | Use optimized gates in the technology library | retime (optional)
Scan chain insertion                | Build the scan chain                          | synthesize -to_map -incremental
Timing report analysis              | Create timing reports                         | report_timing
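Taken in order, the commands in the table form the skeleton of a synthesis run. The script below is only a sketch that strings together the commands exactly as the table names them; it assumes the file name my_design.v and the top module my_design used in the log examples of this module, and it omits the steps that load the technology library and the SDC constraints, which the table does not cover.

```tcl
# Sketch of a synthesis run using only the commands named in the table.
# Loading the library and the SDC file is assumed to have been done already.
read_hdl -v2001 my_design.v        ;# parse and lint the RTL
elaborate my_design                ;# build data structures, infer registers
synthesize -to_generic             ;# technology-independent optimization
synthesize -to_map                 ;# map to technology gates
synthesize -to_map -incremental    ;# incremental pass, as listed for scan chain insertion
report_timing                      ;# create the timing report
```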



Reading RTL Source Files
‹ Reading the RTL source files performs two functions:
‰ Source files undergo a lint check (syntax and structure check).
‰ If the source files pass the lint check, they are loaded into memory for use
in the next phase.
‹ Example
rc:/> read_hdl -v2001 my_design.v

Log Entries for Reading RTL


rc:/> read_hdl -v2001 my_design.v
Reading Verilog file 'my_design.v'
assign #1875 write_clk_int = ~clk;
|
Warning : Ignoring delay specifier. [VLOGPT-35]
: in file '/my_design.v' on line 373, column 14.
: A delay specifier, either in an assignment or as a separate statement, is
not synthesizable.
assign #1875 postamble_clk_int = ~clk;

Callouts:
‰ The read_hdl command loads my_design.v.
‰ The linting process has detected a problem in my_design.v. Details of the problem
are listed. In this case, my_design.v includes a Verilog construct that is not
synthesizable (Verilog # construct).



Elaborating Design
‹ Builds data structures and infers registers in the design
‹ Function expansion (e.g., functions are in-line expanded)
‹ Constant propagation
‰ Detect operands driven by constant values and pre-compute the output
‰ Original code
a = 0;
b = a + 1;
c = 2 * a;
‰ Optimized code
b = 1;
c = 0;


Elaborating Design (continued)
‹ Loop unrolling
‰ for loops are replaced by as many instances of the loop body as the loop
would have iterated. This allows for greatest possible optimizations later.
‰ Original loop
for (a=2; a >= 0; a = a -1)
z[a] = x[a] + y[2-a];
‰ Unrolled loop
z[2] = x[2] + y[0];
z[1] = x[1] + y[1];
z[0] = x[0] + y[2];
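The index arithmetic of the unrolled loop can be checked mechanically. The plain Tcl sketch below is an illustration only, not part of any synthesis tool: it walks the same loop bounds and prints one statement per iteration, with the y index computed as 2 - a.

```tcl
# Illustration only: print the statements produced by unrolling
#   for (a = 2; a >= 0; a = a - 1) z[a] = x[a] + y[2-a];
for {set a 2} {$a >= 0} {incr a -1} {
    set yi [expr {2 - $a}]
    puts "z\[$a\] = x\[$a\] + y\[$yi\];"
}
```

The last iteration (a = 0) pairs x[0] with y[2].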

Elaborating Design (continued)


‹ Dead code removal
‰ Dead code consists of operations that cannot be reached, or whose result
is never referenced elsewhere. Such operations are detected and
removed.
‰ Original code
a = x
b = a + 1
c = 2 * a
‰ Optimized code
b = x + 1
c = 2 * x
‰ Dead code elimination removed
a = x


Log Entries for Elaboration

rc:/> elaborate my_design
Elaborating block my_design from file 'my_design.v'.
Warning : Removing unused register. [CDFG-508]
: Removing unused register 'doing_wr_r' in module 'my_design' in
file 'my_design.v' on line 155.
Info : Unused module input port. [CDFG-500]
: Input port 'p_clk' is not used in module 'my_design' in file
'my_design.v' on line 90.
Done elaborating 'my_design'.

Callouts:
‰ The elaborate command builds my_design.
‰ The "Elaborating block" line marks the beginning of the elaboration section.
‰ The "Done elaborating" line marks the end of the elaboration section.



Technology-Independent Mapping
‹ A design is technology-independent when the formula (function,
system) has no connection with the building blocks in the
implementation.
‹ Technology-independent mapping and optimization techniques:
‰ Carry save arithmetic optimization
‰ Logic pruning
‰ Resource sharing
‰ Speculation
‰ Implementation selection
‰ Arithmetic optimization
‰ Common sub-expression sharing
‰ Logic speculation

Carry-Save Arithmetic Operations


‹ Carry-save arithmetic (CSA) operations are functionally equivalent to their
carry-propagate counterparts.
‹ The carry logic for the intermediate sums is saved until the very end, thus
saving area and possibly improving timing.
[Figure: two adder trees summing inputs a through f into z, one built from carry-propagate adders and one built from carry-save adders]

Pruning Logic Driving Unused Pins
‹ By default, logic that drives unused (unloaded) hierarchical pins is optimized
away.
‹ In the example below, instances in red are deleted because they do not
transitively drive an output port.
[Figure: circuit with inputs in1 through in4, two flip-flops, and output out1, with the deleted instances shown in red]

Resource Sharing
‹ A resource is any computational element, such as an add, shift, or “if/then” operation.
‹ Each type of operator in the RTL description requires a unique resource type.
‰ For instance, the + operator requires an adder and the > operator requires a comparator.
‹ Maximum number of resources required for each operator type is the number of times an
operator is used in the RTL description.
‹ Resources can be reduced, thus saving area, using the following techniques:
‰ Some operators can be mapped to a common resource type. For instance, + and - operators
can be mapped to an add-subtract unit.
‰ Operators in different clock cycles can share the same resource. This is determined by
analyzing if there are any data flow or control flow conflicts (discussed later).
Given the following HDL description:
if (select)
sum <= A + B;
else
sum <= C + D;
One possible implementation:
[Figure: two adders compute A + B and C + D, and a MUX controlled by select chooses sum]
Another, more efficient implementation:
[Figure: two MUXes controlled by select choose between A and C and between B and D, and a single adder produces sum]


Sharing and Speculation
‹ The sharing and un-sharing (speculation) of resources trades off area versus
timing during logic synthesis.
if (Q = '0')
x = a + b;
else
y = c + d;
[Figure: two implementations of the code above. With resource sharing, Q controls two MUXes (A or C, B or D) that feed a single adder. With speculation, two adders compute A + B and C + D in parallel and Q controls a MUX that selects between the results X and Y.]

Implementation Selection: ChipWare


‹ Some advanced synthesis tools come with libraries of re-usable designs.
‰ Cadence Encounter® RTL Compiler (RC) has such a library known as ChipWare.
‹ ChipWare (CW) library includes
‰ Common combinational and sequential components
‰ Arithmetic components (adders, subtractors, multipliers)
‰ Memory components (flip flops, FIFOs)
‹ Logic synthesis searches for operators in RTL files it reads and automatically
maps those operators to CW components, if available.
‹ CW components often have multiple architectural implementations that allow
logic synthesis to pick one according to design need.
[Figure: the HDL operator in "Z <= X + Y" in an RTL file maps to the operator definition add_op in the ChipWare library, which maps to the components ADD_SUB, ADD, and ALU, which in turn have ripple, CLA, and proprietary implementations]


Implementation Selection: Architecture Tradeoff
‹ Different implementations of the ChipWare components have different area
and timing characteristics.
‹ Design constraints determine the appropriate ChipWare component.
[Figure: the + HDL operator in "Z <= A*B + C" can be implemented, from fastest to smallest, as Brent-Kung, Carry Look-Forward, Carry Look-Ahead, or Ripple Carry]

Arithmetic Optimization
SUM <= A + B + C + D
[Figure: three adder arrangements that produce SUM from A, B, C, and D.
Initial Order: the adders are chained in the order written.
Optimized For Speed (all inputs have equal delay): the adders are rebalanced.
Optimized For Speed (input A is late arriving): the late input A is added last.]

Note: Operators cannot be rearranged if the initial order is overridden by the use of
parentheses in HDL.
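The effect of reordering can be sketched numerically. The Python below is illustrative only and is not how RTL Compiler implements the optimization: it assumes every adder has the same delay (ADDER_DELAY is an assumed value), and the function names are hypothetical. It compares the initial left-to-right chain with a greedy order that always adds the two earliest-arriving operands first.

```python
import heapq

ADDER_DELAY = 1.0  # assumed delay of one adder, arbitrary units


def chain_arrival(arrivals):
    """Arrival time of SUM when operands are added left to right in a chain."""
    t = arrivals[0]
    for a in arrivals[1:]:
        t = max(t, a) + ADDER_DELAY
    return t


def earliest_first_arrival(arrivals):
    """Arrival time of SUM when the two earliest operands are always added first."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        first = heapq.heappop(heap)
        second = heapq.heappop(heap)
        heapq.heappush(heap, max(first, second) + ADDER_DELAY)
    return heap[0]


if __name__ == "__main__":
    equal = [0, 0, 0, 0]   # A, B, C, D all arrive at time 0
    late_a = [2, 0, 0, 0]  # A arrives late, at time 2
    print(chain_arrival(equal), earliest_first_arrival(equal))    # 3.0 2.0
    print(chain_arrival(late_a), earliest_first_arrival(late_a))  # 5.0 3.0
```

With equal arrival times the greedy order yields a balanced tree (two adder levels instead of three). With A late, it adds B, C, and D first and A last, matching the third arrangement on the slide.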


Common Sub-Expression Sharing
Consider the assignments:

SUM1 <= A + B + C
SUM2 <= A + B + D
SUM3 <= B + A + E

The "A+B" sub-expression could be shared, thus saving two adders in the process.
The order within the sub-expression is not important, but its position must be the
same.

[Figure: Sharing of Sub-Expressions. Without sharing, SUM1, SUM2, and SUM3 are built from
(A, B, C), (A, B, D), and (B, A, E) with six adders. With sharing, one adder computes
A + B and three more add C, D, and E, for four adders.]
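The adder count above can be reproduced with a short sketch. This is a simplified model, not the tool's algorithm: it assumes each sum is evaluated left to right, treats only the leading pair as order-insensitive, and uses a hypothetical function name.

```python
def count_adders(assignments):
    """Return (adders_without_sharing, adders_with_sharing)."""
    without_sharing = sum(len(operands) - 1 for operands in assignments.values())

    built = set()  # sub-expressions that already have an adder
    for operands in assignments.values():
        # The leading pair is order-insensitive (A + B equals B + A), so sort it.
        expr = tuple(sorted(operands[:2]))
        built.add(expr)
        # Each further operand is added to the running result, in position order.
        for operand in operands[2:]:
            expr = (expr, operand)
            built.add(expr)
    return without_sharing, len(built)


ASSIGNMENTS = {
    "SUM1": ["A", "B", "C"],
    "SUM2": ["A", "B", "D"],
    "SUM3": ["B", "A", "E"],
}

if __name__ == "__main__":
    print(count_adders(ASSIGNMENTS))  # (6, 4)
```

Moving B to a different position, as in A + C + B, changes the leading pair to (A, C), so in this model A + B is no longer found and nothing is shared.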

Commands for Technology-Independent Mapping


‹ In this stage, logic synthesis performs technology-independent
optimizations, including
‰ Constant propagation
‰ Resource sharing
‰ Logic speculation
‰ Multiplexer optimization
‰ Carry-save arithmetic optimization
‹ You can run this stage separately by using the following command:
synthesize -to_generic -effort <effort_level>
Log Entries for Technology Independent Mapping
rc:/> synthesize -to_generic
Deleting 2 sequential instances. They do not transitively
drive any primary outputs:
vpb/vpo/luma_sel_a1_reg[0], vpb/vpo/luma_sel_reg[0] (floating root)

Info : An implementation was inferred. [CWD-19]
: The implementation
'/hdl_libraries/GB/components/increment/implementations/very_fast' was
inferred through the binding 'b1' for the call to synthetic operator
'INCREMENT_CI_OP'.
Optimizing muxes in design 'my_design'

Synthesis succeeded.

Callouts:
‰ rc:/> synthesize -to_generic: starts the technology-independent optimization process
‰ Deleting 2 sequential instances: logic pruning
‰ An implementation was inferred: implementation selection
‰ Optimizing muxes: mux optimization
‰ Synthesis succeeded: end of technology-independent optimization

Topics in This Module


‹ Logic synthesis
‰ Introduction
‰ Reading HDL source files
‰ Elaborating design
‰ Technology-independent (generic) mapping
‰ Technology transformation
‰ Technology-dependent optimizations
‰ Scan chain insertion
‰ Timing report analysis
‰ Running logic synthesis
‹ Physical synthesis
‰ Fundamentals
‰ Basic operation and flow


Technology Transformation (Mapping)
‹ Technology transformation or "technology mapping" is the phase of
logic synthesis when gates are selected from a technology library to
implement the circuit.
‹ Technology mapping is normally done after technology-independent
optimization.
‹ Why technology mapping?
‰ A straight implementation may not be good. For example, implementing
F = abcdef as a single six-input AND gate causes a long delay.
‰ Gates in the library are pre-designed; they are usually optimized in terms
of area, delay, power, etc.
    ‰ Fastest gates along the critical path, area-efficient gates (combination)
    off the critical path.
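As a rough illustration of why the mapper has choices for F = abcdef, the sketch below counts gates and logic levels when a wide AND is decomposed into a balanced tree of smaller AND gates. It is a simplified model with a hypothetical function name. It does not model real cell delays, which come from the technology library.

```python
def and_tree(num_inputs, max_fanin):
    """Return (gate_count, levels) for a balanced tree of AND gates.

    max_fanin is the widest AND gate assumed to be available in the library.
    """
    if num_inputs < 1 or max_fanin < 2:
        raise ValueError("need at least one input and a fan-in of two or more")
    gates = 0
    levels = 0
    signals = num_inputs
    while signals > 1:
        full, rest = divmod(signals, max_fanin)
        # A leftover group of two or more signals needs its own gate;
        # a single leftover signal passes through to the next level.
        gates += full + (1 if rest >= 2 else 0)
        signals = full + (1 if rest >= 1 else 0)
        levels += 1
    return gates, levels


if __name__ == "__main__":
    for fanin in (2, 3, 6):
        print(fanin, and_tree(6, fanin))
```

For six inputs this gives five gates in three levels with two-input gates, three gates in two levels with three-input gates, and a single gate with a six-input cell. Which option is fastest or smallest depends on the library's actual cell characteristics.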

Technology Mapping Stages


‹ Target setting
‰ Target timing goals (clock period) for each class and group of timing paths
are derived from the fastest arrival time.
‹ Global mapping
‰ Optimizes for area, timing, power, and maps the design while aiming for
the target clock frequency
‹ Remapping
‰ Evaluates every cell in the design and resizes as needed to improve area
and power consumption
‹ Incremental optimization
‰ Runs Design Rule Checks (DRCs), timing and area cleanup, and critical
region resynthesis (CRR) for timing optimization


Synthesis Stages: Target Setting
In this first phase of global mapping, logic synthesis performs tentative structuring
and computes the estimated arrival and required times for all the endpoints based
on the effort level you set. The result of this stage is the target for each cost group.
synthesize -to_mapped -effort <effort_level>

rc:/> synthesize -to_generic
Mapping my_design to gates.
Mapping 'my_design'...
Preparing the circuit
Structuring (delay-based) logic partition in alu_32...
Performing redundancy-removal...
Performing bdd-opto...
Performing redundancy-removal...
Done structuring (delay-based) logic partition in alu_32

Callouts:
‰ rc:/> synthesize: starts the technology-dependent optimization process
‰ Mapping 'my_design'...: technology mapping
‰ Structuring (delay-based) logic partition: target setting
‰ Done structuring (delay-based) logic partition: end of target setting

Synthesis Stages: Global Mapping


In this second phase of global mapping, RC restructures paths and computes
delays based on the targets and the effort level you set. The goal of this phase is to
meet the target timing.
synthesize -to_mapped -effort <effort_level>

Optimizing component cb_seq...
Restructuring (delay-based) cb_part_4...
Optimizing component cb_part_4...
Done restructuring (delay-based) cb_part_4
Restructuring (delay-based) cb_oseq_3...
Done restructuring (delay-based) cb_oseq_3
Optimizing component cb_oseq_3...
Restructuring (delay-based) cb_part...
Done restructuring (delay-based) cb_part
Optimizing component cb_part...

[Callout: Indicates the beginning of global mapping]


Synthesis Stages: Remapping
Several optimization routines are used during this stage of synthesis,
mainly to reduce the area of the design.

Global mapping status
=====================
                         Group
                         Total
                 Total   Worst
Operation         Area  Slacks  Worst Path
-------------------------------------------------------------------------------
global_map      721782    -308  VIT_ACS10/NEW_reg[5]/CP --> VIT_ACS26/NEW_reg[1]/D
fine_map        514143    -372  VIT_ACS10/NEW_reg[6]/CP --> VIT_ACS8/NEW_reg[2]/D
area_map        512565    -344  VIT_ACS23/NEW_reg[4]/CP --> VIT_ACS31/NEW_reg[7]/D
area_map        498515    -345  VIT_ACS1/NEW_reg[5]/CP --> CS4/SELECT_REG_reg/D
Done mapping dtmf_chip

[Callout: Indicates the beginning of remapping]

Synthesis Stages: Incremental Synthesis


Incremental synthesis iterates on paths with a mix of strategies to improve timing,
area, etc.

Incremental optimization status
===============================
                         Group
                         Total     Total  - - - - DRC Totals - - -
                 Total   Worst       Neg    Max    Max     Max
Operation         Area  Slacks     Slack  Trans    Cap  Fanout
-------------------------------------------------------------------
init_delay      498515    -345   -124671    414     18     229
    Path: VIT_ACS1/NEW_reg[5]/CP -->
          VIT_ACS4/THREE_SELECT_REG_reg/D
incr_delay      502638    -301   -114125    129     69     276
    Path: VIT_ACS16/NEW_reg[2]/CP --> VIT_ACS2/NEW_reg[6]/D
incr_delay      511982    -267   -100144      0     19     614
    Path: VIT_ACS19/NEW_reg[6]/CP -->
          VIT_ACS13/NEW_reg[1]/D
incr_delay      515304    -221    -91064      0     34     614
    Path: VIT_ACS30/NEW_reg[2]/CP -->
          VIT_ACS10/NEW_reg[0]/D

[Callout: Indicates the beginning of incremental synthesis]


Synthesis Stages: Incremental Synthesis (continued)
This report shows the localized algorithms (tricks) used in incremental
optimization, the corresponding number of attempts made by the synthesis
engine, and the number of times that the routine has been run to improve
the design goals.

Trick        Calls     Accepts   Attempts    Time
-------------------------------------------------------
crr_rsyn       389  (      215 /      300 )  79917
crr_glob        25  (      198 /      215 )   5324
crit_upsz     4746  (     2047 /     2117 )  31691
fopt           358  (        0 /        0 )     23
crit_dnsz      428  (       23 /       25 )   4970
dup            347  (        1 /        1 )    250
fopt          1076  (      261 /      336 )  22515
setup_dn       398  (       11 /       14 )    324
exp             25  (       23 /       64 )   3214

init_drc      522875  -235  -10660  0  31  537
    Path: VIT_ACS6/NEW_reg[6]/CP -->
          VIT_ACS11/NEW_reg[0]/D

Callouts:
‰ Run time for each of these tricks must be small.
‰ DRC fixing is done at the end of each pass.



Technology-Dependent Optimization
‹ A design is technology-dependent if the formula (function, circuit,
system) is implemented by one or more logic gates in a pre-designed
set of gates (called technology library or cell library).
‹ Advantage: Gates in the cell library have a highly optimized,
pre-defined path to silicon, so that the area and delay parameters are
known and accurate.

Technology-Dependent Optimization Types


‹ Boundary optimization
‹ Register re-timing
Controlling Boundary Optimization
Examines input and output pin characteristics of a sub-design to try to optimize a
mapped netlist
‹ Removes any gate that drives output ports that are not connected outside a
design
‹ Considers swapping or merging of input ports to minimize logic.
‹ Propagates constant values across hierarchical boundaries and eliminates
unnecessary logic.

[Figure: a constant input a=0 crosses a hierarchical boundary into clocked logic with
blocks L1 and L2. With a being 0, the blocks L1 and L2 are equivalent and are therefore
optimized.]

Retiming
Retiming optimizes the register locations in the design to improve the results without
changing the combinational logic or latency through the chip or block. Use the
following attributes to control retiming on the design and sub-designs:
‰ Retime for Delay: retime -min_delay
‰ Retime for Area: retime -min_area

[Figure: retiming examples with a required clock of 5 ns.
Reposition flops: stage delays of 6 ns and 4 ns (WNS: -1 ns) become 5 ns and 5 ns
(WNS: 0 ns).
Combine flops: shown for the area case.]
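The numbers in the retiming figure can be checked with a small calculation. The sketch below is a simplified model with a hypothetical function name: it ignores setup time, clock-to-Q delay, and clock skew, and treats each register-to-register stage as a single delay.

```python
def worst_slack_ns(stage_delays_ns, clock_period_ns):
    """Worst slack over all register-to-register stages (negative is a violation)."""
    return min(clock_period_ns - delay for delay in stage_delays_ns)


REQUIRED_CLOCK_NS = 5
before = [6, 4]  # stage delays before retiming
after = [5, 5]   # same total logic, with the middle register repositioned

if __name__ == "__main__":
    print(worst_slack_ns(before, REQUIRED_CLOCK_NS))  # -1
    print(worst_slack_ns(after, REQUIRED_CLOCK_NS))   # 0
```

The total combinational delay (10 ns) and the number of register stages are unchanged; only the position of the middle register moves, which is why latency is preserved.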

Test Synthesis
‹ Manufacturing defects in ASICs are detected using automated test
equipment (ATE), which sends special bit patterns known as test
vectors into the inputs of the ASIC and compares the output to
expected values. Any difference could mean the ASIC is not
functioning properly.
‹ Improve testability by making every register in the design look like a
"virtual I/O."
‰ Allows every flip-flop to be independently controlled and observed.
‰ Allows every flip-flop to act like a combinational logic input.
‰ Allows every flip-flop to act like a combinational logic output.
Test Synthesis (continued)
‹ Test synthesis is the modification of a chip design to make both the
chip and the PC board system containing it more testable.
‹ Coupled with this testability is the automatic test-pattern generation
(ATPG) of test vectors. Design for test (DFT) lets you modify a design
to make a circuit more testable. Test synthesis tools can assist in both
places.
‹ The use of test synthesis for DFT techniques and ATPG reduces from
months to days the time to generate manufacturing test vectors.
‹ Use the DFT features of RTL Compiler to improve your ability to
control and observe internal signal nodes. After RTL and logic
synthesis, test synthesis can perform full or partial internal-scan cell
insertion and boundary scan. An ASIC vendor often implements
special cells in the ASIC library to handle these tasks.

Test Synthesis (continued)


‹ The use of internal scan cells enables ATPG tools to easily generate
nearly 100% fault coverage on the combinatorial logic.
‹ Internal scan replaces latches and flip-flops with their scan-equivalent
latches and flip-flops. Each scan cell has a scan-data input (SDI), a
scan-data output (SDO), and a test-enable (TE) input. The tool
connects groups of these cells in chains of equal or similar length.

[Figure: combinational logic to be tested, between data_in and data_out, with the
registers driven by clock.]
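"Chains of equal or similar length" can be illustrated with a short sketch. This is only one possible balancing rule, with a hypothetical function name; a real tool also takes clock domains, placement, and user constraints into account.

```python
def balanced_chain_lengths(num_scan_cells, num_chains):
    """Split scan cells across chains so that lengths differ by at most one."""
    if num_chains < 1:
        raise ValueError("at least one scan chain is required")
    base, extra = divmod(num_scan_cells, num_chains)
    # The first `extra` chains take one additional cell each.
    return [base + 1 if i < extra else base for i in range(num_chains)]


if __name__ == "__main__":
    print(balanced_chain_lengths(10, 3))  # [4, 3, 3]
    print(balanced_chain_lengths(12, 4))  # [3, 3, 3, 3]
```

Balanced chains matter because all chains shift in parallel, so the longest chain sets the number of clock pulses needed for each pattern.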


Muxed Scan
‹ To convert a traditional flip-flop to a muxed-scan flip-flop, simply add a
multiplexer on the data input to the flop.
‹ shift_enable signal selects normal functional data input or a new scan
data input to the flip-flop.
‹ Scan inputs are chained to the output of other flip-flops.
‹ Same clocks are used for both scan and functional operations.

[Figure: Scan-DFF. A 2:1 multiplexer controlled by SE selects between the functional data
input and SI ahead of a DFF with outputs Q and QB.]

Muxed-Scan Hookup
[Figure: four scan flip-flops, each with pins D, SI, SE, CK, Q, and QB.
Add scan chains: the chain enters at SI, links the flip-flops through their SI pins, and
leaves at SO.
Connect shift_enable: SE is connected to the SE pin of every flip-flop.]


Muxed-Scan Shift Cycle
Sequence: SE to active state, pulse clock “n” times to scan in/out data

[Figure: the four-flop scan chain from SI to SO during the shift cycle.]

Muxed-Scan Capture Cycle


Sequence: SE to inactive state, pulse the clock once to capture data in the registers

[Figure: the four-flop scan chain from SI to SO during the capture cycle.]
Muxed-Scan Shift Cycle
Sequence: SE to active state, pulse clock “n” times to scan in/out data

[Figure: the four-flop scan chain from SI to SO during the shift cycle.]
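The shift, capture, shift sequence on the last three slides can be sketched as a behavioral model. The Python below is a simplified illustration, not a description of any tool: the function names are hypothetical, the chain is a plain list of flip-flop values, and the combinational logic under test is passed in as an ordinary function.

```python
def clock_pulse(flops, d_inputs, scan_in, shift_enable):
    """Apply one clock edge to a muxed-scan chain.

    Returns (new_flop_values, scan_out), where scan_out is the value on SO
    (the Q of the last flop) before the edge.
    """
    scan_out = flops[-1]
    if shift_enable:
        # SE active: every flop loads its SI pin, so data moves one position.
        new_flops = [scan_in] + flops[:-1]
    else:
        # SE inactive: every flop loads its functional D input.
        new_flops = list(d_inputs)
    return new_flops, scan_out


def scan_test(pattern, logic):
    """Shift a pattern in, capture the logic response once, shift it out."""
    n = len(pattern)
    flops = [0] * n
    # Shift cycle: n pulses. Feed the last bit first so pattern[0] ends in flop 0.
    for bit in reversed(pattern):
        flops, _ = clock_pulse(flops, None, bit, shift_enable=True)
    # Capture cycle: one pulse with SE inactive.
    flops, _ = clock_pulse(flops, logic(flops), 0, shift_enable=False)
    # Shift cycle: n pulses to unload the captured response on SO.
    response = []
    for _ in range(n):
        flops, so = clock_pulse(flops, None, 0, shift_enable=True)
        response.append(so)
    return response[::-1]  # reorder so that index 0 is flop 0


def invert_all(bits):
    """Stand-in for the combinational logic under test (an assumption)."""
    return [1 - b for b in bits]


if __name__ == "__main__":
    print(scan_test([1, 0, 1, 1], invert_all))  # [0, 1, 0, 0]
```

In practice the unload of one pattern overlaps with the load of the next; the sketch shifts in zeros during unload to keep the example short.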

RTL Top-down Design-for-Testability Flow


[Figure: flowchart, summarized as a list of steps.]
‹ Read Target Libraries
‹ Read HDL files
‹ Elaborate Design
‹ Set Timing and Design Constraints
‹ Apply Optimization Directives
‹ Setup for DFT Rule Checker
‰ Shift enable
‰ Test mode
‰ Prevent scan mapping of flops
‰ Internal clocks as test clocks
‰ DFT controllable constraints
‰ Abstract scan segments
‹ Run DFT Rule Checker and Report Registers
‹ Fix DFT Violations
‹ Add Testability Logic
‰ Test-point insertion
‰ Shadow logic insertion
‹ Synthesize Design and Map to Scan
‹ Set up DFT Configuration Constraints and Preview Scan Chains
‰ Scan chains
‰ Number of scan chains
‰ Length of scan chains
‰ Control data lockup elements
‹ Connect Scan Chains
‹ Run Incremental Optimization
‹ Analyze Design
‹ Meet constraints?
‰ No: Modify constraints or modify optimization directives, then repeat
‰ Yes: Write out Netlist, SDC, ScanDEF, ATPG, Abstraction Model



Reading Timing Reports


============================================================
Generated by: Encounter(r) RTL Compiler v07.10-p004_1
Generated on: Jul 23 2007 03:16:40 AM
Module: dtmf_chip
Technology libraries: slow_normal 1.0 slow_hvt 1.1 tpz973gtc 230 ram_128x16A 0.0
ram_256x16A 0.0 rom_512x16A 0.0 pllclk 4.3
Operating conditions: slow (balanced_tree)
Wireload mode: enclosed
============================================================
Pin Type Fanout Load Slew Delay Arrival
(fF) (ps) (ps) (ps)
----------------------------------------------------------------------------------
(clock m_clk) launch 0 R
latency +4000 4000 R
DTMF_INST
TDSP_CORE_INST
DATA_BUS_MACH_INST
data_out_reg[0]/clk 0 4000 R
data_out_reg[0]/q (u) unmapped_d_flop 19 155.1 0 +258 4258 R
DATA_BUS_MACH_INST/data_out[0]
TDSP_CORE_GLUE_INST/data_out[0]
TDSP_CORE_GLUE_INST/port_data_in[0]
PORT_BUS_MACH_INST/data_in[0]
PORT_BUS_MACH_INST/pad_data_out[0]
TDSP_CORE_INST/port_pad_data_out[0]
DTMF_INST/port_pad_data_out[0]
IOPADS_INST/tdsp_portO[0]
Ptdspop00/I +0 4258
Ptdspop00/PAD PDO04CDG 1 6719.0 2038 +1648 5906 R
IOPADS_INST/tdsp_port_out[0]
port_pad_data_out[0] out port +0 5906 R
(ou_del_1) ext delay +500 6406 R
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(clock refclk) capture 6000 R
uncertainty -250 5750 R
----------------------------------------------------------------------------------
Timing slack : -656ps (TIMING VIOLATION)
Start-point : DTMF_INST/TDSP_CORE_INST/DATA_BUS_MACH_INST/data_out_reg[0]/clk
End-point : port_pad_data_out[0]

Callouts:
‰ Header includes library and module information.
‰ Body includes arrival time calculation.
‰ Footer includes timing slack calculation.


Reading Timing Reports: Header
============================================================
Generated by: Encounter(r) RTL Compiler v07.10-p004_1
Generated on: Jul 23 2007 03:16:40 AM
Module: dtmf_chip
Technology libraries: slow_normal 1.0 slow_hvt 1.1 tpz973gtc 230 ram_128x16A 0.0
ram_256x16A 0.0 rom_512x16A 0.0 pllclk 4.3
Operating conditions: slow (balanced_tree)
Wireload mode: enclosed
============================================================

In the header, the following information is given:
‹ Tool-specific information
‹ Timestamp
‹ Module information
‹ Technology libraries
‹ Operating conditions
‹ Wireload mode

Reading Timing Reports: Body


Pin Type Fanout Load Slew Delay Arrival
(fF) (ps) (ps) (ps)
----------------------------------------------------------------------------------
(clock m_clk) launch 0 R
latency +4000 4000 R
DTMF_INST
TDSP_CORE_INST
DATA_BUS_MACH_INST
data_out_reg[0]/clk 0 4000 R
data_out_reg[0]/q (u) unmapped_d_flop 19 155.1 0 +258 4258 R
DATA_BUS_MACH_INST/data_out[0]
TDSP_CORE_GLUE_INST/data_out[0]
TDSP_CORE_GLUE_INST/port_data_in[0]
PORT_BUS_MACH_INST/data_in[0]
PORT_BUS_MACH_INST/pad_data_out[0]
TDSP_CORE_INST/port_pad_data_out[0]
DTMF_INST/port_pad_data_out[0]
IOPADS_INST/tdsp_portO[0]
Ptdspop00/I +0 4258
Ptdspop00/PAD PDO04CDG 1 6719.0 2038 +1648 5906 R
IOPADS_INST/tdsp_port_out[0]
port_pad_data_out[0] out port +0 5906 R
(ou_del_1) ext delay +500 6406 R
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(clock refclk) capture 6000 R
uncertainty -250 5750 R

The body of the timing report shows the arrival time calculation and includes
‹ Instance pins the timing path goes through
‹ Fanout for each pin output
‹ Load and slew for each pin output
‹ Incremental delay for each cell
‹ Cumulative delay (or arrival time) for each cell


Reading Timing Reports: Footer
----------------------------------------------------------------------------------
Timing slack : -656ps (TIMING VIOLATION)
Start-point : DTMF_INST/TDSP_CORE_INST/DATA_BUS_MACH_INST/data_out_reg[0]/clk
End-point : port_pad_data_out[0]

The footer section of the timing report shows the final calculation and includes
‹ Timing slack
‹ Start point
‹ End point
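The slack in the footer follows from the numbers in the report body. The sketch below redoes that arithmetic in Python with hypothetical function names; the values are copied from the example report, and the formula (required time minus arrival time) is the standard setup-slack calculation rather than anything specific to RTL Compiler.

```python
def arrival_time_ps(launch_edge_ps, increments_ps):
    """Arrival time at the end point: launch edge plus every delay on the path."""
    return launch_edge_ps + sum(increments_ps)


def slack_ps(capture_edge_ps, uncertainty_ps, arrival_ps):
    """Setup slack: required time (capture edge minus uncertainty) minus arrival."""
    required_ps = capture_edge_ps - uncertainty_ps
    return required_ps - arrival_ps


# Delay column of the example report, in picoseconds:
# clock latency, flop clk-to-q, pad input pin, pad cell, out port, external delay.
INCREMENTS_PS = [4000, 258, 0, 1648, 0, 500]

if __name__ == "__main__":
    arrival = arrival_time_ps(0, INCREMENTS_PS)
    print(arrival)                        # 6406
    print(slack_ps(6000, 250, arrival))   # -656
```

The required time is 6000 - 250 = 5750 ps and the arrival time is 6406 ps, so the path misses timing by 656 ps, as reported.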

Discussion Questions
Given the timing report in the previous example:
‹ What type of path is being checked?
‰ input->reg, reg->reg, reg->output, or input->output?
‹ What logic gates are involved in the timing path?
‹ How many levels of logical hierarchy does the path go through?
‹ Which clocks are launching and capturing the data?
‹ What is the clock period of the design?
‹ What is the output delay of the path?
‹ What is the clock uncertainty?
‹ What is the arrival time?
‹ What is the required time?
‹ Why does the path violate timing, and how can it be fixed?


Other Synthesis Reports
report area          Prints an exhaustive hierarchical area report
report datapath      Prints a datapath resources report
report design_rules  Prints design rule violations
report gates         Reports libcells used, total area, and instance count
                     summary
report hierarchy     Prints a hierarchy report
report instance      Prints an instance report
report memory        Prints a memory usage report
report messages      Prints a summary of error messages that have been
                     issued
report power         Prints a power report
report qor           Prints a quality of results report
report timing        Prints a timing report
report summary       Prints an area, timing, and design rules report



Starting RTL Compiler
‹ The first step is to ensure that RTL Compiler has been properly installed on a
computer server to which you have access. Check with your system
administrator to ensure that RC is installed and working properly.
‹ To invoke RC from the UNIX prompt, type the following:
unix% rc
‹ A message appears similar to the following, as well as the rc_shell prompt.
‹ Now you can enter commands directly at this prompt and synthesize a design.

Checking out license 'RTL_Compiler_Ultra'... (0 seconds elapsed)
License RTL_Compiler_Ultra checkout failed
Checking out license 'RTL_Compiler_Verification'... (0 seconds
elapsed)

Cadence Encounter(r) RTL Compiler
Version v06.20-s019_1 (64-bit), built Mar 8 2007

rc:/>

Viewing the Log File


‹ Log files list every command issued to logic synthesis, as well as
the tool's response to such commands.
‹ A log file is a text document that can be viewed using any text editor
such as vi or emacs.

rc.log

Checking out license 'RTL_Compiler_Ultra'... (0 seconds elapsed)
License RTL_Compiler_Ultra checkout failed
Checking out license 'RTL_Compiler_Verification'... (1 seconds elapsed)

Cadence Encounter(r) RTL Compiler
Version v06.20-s019_1 (32-bit), built Mar 8 2007

========================================================================
Welcome to Encounter (TM) Encounter(r) RTL Compiler
========================================================================
rc:/> source config/libraries_virage.tcl

Callouts:
‰ License lines: RC license checkout info
‰ Version lines: RC version
‰ Banner: Welcome message
‰ Any line that begins with rc:/> is a command issued to RC. All other lines are RC's
response.



Physical Synthesis Fundamentals


‹ Physical synthesis is the integration of logic synthesis and
placement.
‹ Logical synthesis and placement optimizations can be run
concurrently in a single executable.
‹ Physical information, usually reserved for physical design
(place/route) tools, is used by the physical synthesis tool to optimize
the design.
‰ Physical library
‰ Floorplan information
‹ Timing is more accurate, because the wiring estimations are based
on real placement, but run times are typically higher because of the
additional placement steps done.

[Figure: flow diagram with the labels RTL, Timing Library, Logic Synthesis, Netlist,
Floorplan, Physical Library, Placement, and Physical Synthesis.]
RC Physical Methodology
‹ Introduced by Cadence in RTL Compiler 7.1
‹ Incorporates physical/process data into interconnect delay calculations
‹ Consists of two RTL Compiler features:
‰ QoS Prediction
‰ Physical Layout Estimation (PLE)

Physical Layout Estimation


‹ Physical layout estimation (PLE) is a method and model to calculate
interconnect delay as an alternative to wire load models.
‹ PLE uses a proprietary algorithm that takes design and vendor process data
into account.
‹ Other significant differences between WLMs and PLE are summarized below:

[Table: WLM versus PLE comparison.]
Synthesis Flow with PLE Enabled
Inputs          Steps                        Commands
.lib            Read tech lib                set_attr library
LEF, cap table  Read physical lib info       set_attr lef_library
                                             set_attr cap_table_file
RTL             Read Verilog source files    read_hdl
                Elaborate design             elaborate
DEF, .sdc       Apply constraints            read_sdc
                                             set_attr def_file
                Define DFT controls and DRC
                Map (PLE driven)

[Figure legend: steps are marked as belonging to the Basic Flow or the PLE Flow.]

Quality-of-Silicon Prediction
Quality of silicon (QoS) prediction
‹ Targets the long nets PLE cannot estimate: the last 10-20% of nets.
‹ Invokes the SoC Encounter® Silicon Virtual Prototyping (SVP) feature
from within the RC session.
‹ Uses SVP to perform trial place and route, so loading on long nets can
be estimated properly.
‹ Works in concert with PLE, and maximum predictability is achieved
when both features are enabled.
Synthesis Flow with QoS Prediction Enabled
Inputs               Steps                          Commands
DEF, LEF, cap table  Read physical lib info         set_attr lef_library
                                                    set_attr cap_table_file
                                                    set_attr def_file
                     Map (PLE driven)               synthesize -to_map
                     QoS prediction                 predict_qos
                     (silicon virtual prototyping)
                     Incremental optimization       synthesize -to_map -incr
                     Generate reports               report_timing,
                                                    area, power, qor, etc.

[Figure legend: steps are marked as belonging to the Basic Flow or the QoS Flow.]

Summary
We have discussed the following topics in this module:
‹ Major phases of logic synthesis
‰ Technology-independent (generic) mapping
‰ Technology transformation
‰ Technology-dependent optimizations
‰ Scan chain insertion
‹ Fundamental concepts of physical synthesis
‰ Integration of logic synthesis and placement
‰ Usage in the flow with PLE and QoS prediction
Testing Your Understanding
True or false
1. Technology-independent optimization takes place before technology-
dependent optimization.
2. Boundary optimization takes place during technology mapping.
3. Two-level minimization generates smaller designs than multilevel
minimizations.
4. A Boolean network is generated immediately after technology
mapping.
5. Physical synthesis is the integration of floorplanning and placement.

Learning Activity
In this activity, you will
‹ Study a log file after synthesis, including a timing report
‹ Explain the optimization stages of the synthesis flow
‹ Present your results to the class

20 minutes for activity
10 minutes for debriefing
Floorplanning and Placement

Module 5

June 16, 2008

An Apartment Building vs. a Chip


In many ways, an apartment building and a chip are alike.
How?
‹ Built in layers from the ground up
‰ Silicon

How? (continued)
‹ Electrical wiring
‹ Made up of building blocks
‰ Bricks in the case of apartments
‰ Silicon atoms, dopants, and metals in the case of microchips
How? (continued)
‹ Built using a floorplan
‰ "Rooms" have explicit functions

Module Objectives
In this module, you will be able to
‹ Articulate the steps in floorplanning and power planning
‹ Articulate the steps in timing-driven placement and re-order scan
Discussion Questions
Recall the flowchart diagram of the design flow steps required to take an
idea to product (chip).
‹ In which part of the flow does floorplanning occur?
‹ In which part of the flow does placement occur?

[Figure: Design Flow diagram with blank Step and Input/Output boxes to fill in.]

Topics in This Module


‹ Floorplanning
‹ Power planning
‹ Placement
Floorplanning
‹ Definition
‹ Implementation flow overview
‹ DEF file
‹ How to floorplan
‹ Floorplanning inputs and outputs
‹ Module constraint types
‹ Pin placement

What Is Floorplanning?
‹ Floorplanning is the process of deriving the die size, allocating space for soft
blocks, planning power, and macro placement.
‹ Example:

[Figure: example floorplan arranging blocks A through G on the die.]


Implementation Flow Overview
[Figure: implementation flow. RTL -> Logic Synthesis -> Gates -> Floorplanning ->
Power Planning -> Placement -> Clock Tree Synthesis -> Route -> GDSII. Also labeled on
the diagram: Place and Route, Timing Closure, Static Timing Analysis, and Test.]

Floorplanning: Inputs and Outputs


‹ Inputs
‰ Gate-level netlist from output of logic synthesis
‰ Constraints (SDC) are needed so that timing with STA can be
accurate and measured against the specifications of the design
‰ Timing library (.lib) contains the timing information for each
discrete logic gate or macro
‰ Physical library (LEF) contains information about the shape and
connectivity of the technology library cells
‹ Outputs
‰ Floorplan of the design, which is saved in the form of a DEF file

[Figure: Gates, Constraints, Tech Lib, and Phys Lib feed Floorplanning, which produces the
Floorplan.]
What Is Design Exchange Format (DEF)?
‹ Definition: A specification for representing logical connectivity
and physical layout of an integrated circuit in ASCII format
‹ Example: A DEF file is used to describe all the physical aspects
of a design, including die size, connectivity, and physical location
of cells and macros on the chip. It contains floorplanning information
such as standard cell rows, groups, placement and routing
blockages, placement constraints, and power domain boundaries. It
also contains the physical representation for pins, signal
routing, and power routing, including rings and stripes.

[Figure: Gates, Constraints, Tech Lib, and Phys Lib feed Floorplanning, which produces the
DEF.]

DEF File Example


# Header information
VERSION 5.6 ;
DIVIDERCHAR "/" ;
BUSBITCHARS "[]" ;
DESIGN DSP ;
UNITS DISTANCE MICRONS 2000 ;

PROPERTYDEFINITIONS
COMPONENTPIN designRuleWidth REAL ;
DESIGN FE_CORE_BOX_LL_X REAL 2.8 ;
DESIGN FE_CORE_BOX_UR_X REAL 1997.2 ;
DESIGN FE_CORE_BOX_LL_Y REAL 2.8 ;
DESIGN FE_CORE_BOX_UR_Y REAL 3997.2 ;
END PROPERTYDEFINITIONS

# Area and rows
DIEAREA ( 0 0 ) ( 4000000 8000000 ) ;

ROW CORE_ROW_0 UMC13FSNSITE 5600 5600 FS DO 4986 BY 1 STEP 800 0 ;
ROW CORE_ROW_1 UMC13FSNSITE 5600 11200 N DO 4986 BY 1 STEP 800 0 ;
ROW CORE_ROW_2 UMC13FSNSITE 5600 16800 FS DO 4986 BY 1 STEP 800 0 ;
ROW CORE_ROW_3 UMC13FSNSITE 5600 22400 N DO 4986 BY 1 STEP 800 0 ;
ROW CORE_ROW_1423 UMC13FSNSITE 5600 7974400 N DO 4986 BY 1 STEP 800 0 ;
ROW CORE_ROW_1424 UMC13FSNSITE 5600 7980000 FS DO 4986 BY 1 STEP 800 0 ;
ROW CORE_ROW_1425 UMC13FSNSITE 5600 7985600 N DO 4986 BY 1 STEP 800 0 ;

# Routing tracks and GCell information
TRACKS Y 1200 DO 5000 STEP 1600 LAYER ME8 ;
TRACKS X 1200 DO 2500 STEP 1600 LAYER ME8 ;
TRACKS X 500 DO 5000 STEP 800 LAYER ME7 ;
TRACKS Y 1200 DO 5000 STEP 1600 LAYER ME7 ;
TRACKS Y 400 DO 10000 STEP 800 LAYER ME6 ;
TRACKS X 500 DO 5000 STEP 800 LAYER ME6 ;
TRACKS X 400 DO 5000 STEP 800 LAYER ME5 ;
TRACKS Y 400 DO 10000 STEP 800 LAYER ME5 ;
TRACKS Y 400 DO 10000 STEP 800 LAYER ME4 ;
TRACKS X 400 DO 5000 STEP 800 LAYER ME4 ;
TRACKS X 400 DO 5000 STEP 800 LAYER ME3 ;
TRACKS Y 400 DO 10000 STEP 800 LAYER ME3 ;
TRACKS Y 400 DO 10000 STEP 800 LAYER ME2 ;
TRACKS X 400 DO 5000 STEP 800 LAYER ME2 ;
TRACKS X 400 DO 5000 STEP 800 LAYER ME1 ;
TRACKS Y 400 DO 10000 STEP 800 LAYER ME1 ;

GCELLGRID X 3992400 DO 2 STEP 7600 ;
GCELLGRID X 400 DO 500 STEP 8000 ;
GCELLGRID X 0 DO 2 STEP 400 ;
GCELLGRID Y 7992400 DO 2 STEP 7600 ;
GCELLGRID Y 400 DO 1000 STEP 8000 ;
GCELLGRID Y 0 DO 2 STEP 400 ;

# Pins
PINS 765 ;
- ADC0[0] + NET ADC0[0] + DIRECTION INPUT + USE SIGNAL + LAYER ME3 ( -1000 0 ) ( 1000 600 ) + FIXED ( 0 7524700 ) E ;
- ADC0[10] + NET ADC0[10] + DIRECTION INPUT + USE SIGNAL + LAYER ME3 ( -1000 0 ) ( 1000 600 ) + FIXED ( 0 7564700 ) E ;
- ADC0[11] + NET ADC0[11] + DIRECTION INPUT + USE SIGNAL + LAYER ME3 ( -1000 0 ) ( 1000 600 ) + FIXED ( 0 7568700 ) E ;
- ADC0[1] + NET ADC0[1] + DIRECTION INPUT + USE SIGNAL + LAYER ME3 ( -1000 0 ) ( 1000 600 ) + FIXED ( 0 7528700 ) E ;
- TST_SEL + NET TST_SEL + DIRECTION INPUT + USE SIGNAL + LAYER ME4 ( -300 0 ) ( 300 2000 ) + FIXED ( 3501690 8000000 ) S ;
END PINS

# Special nets
SPECIALNETS 2 ;
- DVSS ( * VSS ) + USE GROUND ;
- DVDD ( * VDD ) + USE POWER ;
END SPECIALNETS

END DESIGN
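Because DEF is plain ASCII, its header can be inspected with a few lines of script. The sketch below is only an illustration, not part of any tool: the function name `die_size_microns` is hypothetical, and it assumes the rectangular two-point DIEAREA form used in the example above. It converts the die corners from database units to microns using the UNITS statement.

```python
import re

def die_size_microns(def_text):
    """Return (width, height) of the die in microns from DEF text."""
    units = re.search(r"UNITS\s+DISTANCE\s+MICRONS\s+(\d+)\s*;", def_text)
    area = re.search(
        r"DIEAREA\s*\(\s*(-?\d+)\s+(-?\d+)\s*\)\s*\(\s*(-?\d+)\s+(-?\d+)\s*\)\s*;",
        def_text,
    )
    if units is None or area is None:
        raise ValueError("UNITS or DIEAREA statement not found")
    dbu_per_micron = int(units.group(1))  # database units per micron
    llx, lly, urx, ury = (int(v) for v in area.groups())
    return (urx - llx) / dbu_per_micron, (ury - lly) / dbu_per_micron

# The two statements below are copied from the DEF example.
SAMPLE = """
UNITS DISTANCE MICRONS 2000 ;
DIEAREA ( 0 0 ) ( 4000000 8000000 ) ;
"""
print(die_size_microns(SAMPLE))
```

With 2000 database units per micron, the example die is 2000 by 4000 microns, which agrees with the FE_CORE_BOX properties in the header.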


DEF File Syntax
‹ [VERSION statement]
‹ [DIVIDERCHAR statement]
‹ [BUSBITCHARS statement]
‹ DESIGN statement
‹ [TECHNOLOGY statement]
‹ [UNITS statement]
‹ [HISTORY statement]
‹ [PROPERTYDEFINITIONS section]
‹ [DIEAREA statement]
‹ [ROWS statement]
‹ [TRACKS statement]
‹ [GCELLGRID statement]
‹ [VIAS statement]
‹ [STYLES statement]
‹ [NONDEFAULTRULES statement]
‹ [REGIONS statement]
‹ [COMPONENTS section]
‹ [PINS section]
‹ [PINPROPERTIES section]
‹ [BLOCKAGE section]
‹ [SLOTS section]
‹ [FILLS section]
‹ [SPECIALNETS section]
‹ [NETS section]
‹ [SCANCHAINS section]
‹ [GROUPS section]
‹ [BEGINEXT section]
‹ [END DESIGN statement]


How to Floorplan
‹ When the design is imported into the tool, a default die size is
calculated and displayed, and each module is assigned a physical
representation using a default placement density of 70% and aspect
ratio of 1.
‰ Each unit represents a particular module in the design.
‰ Floorplanning allocates position and area to each unit.

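As a rough illustration of how a default module outline could follow from the 70% density and aspect ratio of 1 mentioned above, the sketch below divides the total cell area by the density and splits the result into width and height. The function name is hypothetical, and defining aspect ratio as height divided by width is an assumption; the tool's own convention may differ.

```python
import math

def module_outline(cell_area, density=0.70, aspect_ratio=1.0):
    """Return (width, height) of a module box.

    Assumption: aspect_ratio = height / width.
    """
    box_area = cell_area / density          # area needed at this density
    width = math.sqrt(box_area / aspect_ratio)
    height = aspect_ratio * width
    return width, height

# Hypothetical module with 7000 square microns of standard cells.
w, h = module_outline(cell_area=7000.0)
print(round(w, 2), round(h, 2))
```

Lowering the density or changing the aspect ratio gives the alternative shapes that are tried when puzzle fitting modules into the die.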
How to Floorplan (continued)
‹ Position the modules and blocks in the die area. In general, position
the modules and blocks such that the area of the bounding rectangle
is minimum or meets the die size requirement. Try different
orientations, aspect ratios, and placement densities of the modules to
puzzle fit them into the die area.
‰ The bounding rectangle represents the die area.

How to Floorplan (continued)


‹ Identify modules that should be placed close together.
‰ Tool shows flightlines (lines showing number of connections) between the
modules. The more flightlines between two modules, the closer these
modules should be placed within the design.
‰ Flightlines indicate how much communication occurs between two
modules.
‹ The diagram below shows how to floorplan optimally. The numbers
over the flightlines indicate the number of nets between corresponding
modules.
[Diagram: modules A, B, C, D, and E shown in two arrangements, with
flightline net counts of 121, 34, 57, 152, and 104]

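One way to compare two candidate floorplans is to weight each flightline by its net count and sum the distances. The sketch below is a minimal illustration of that idea, not the tool's method. The net counts reuse the numbers from the diagram, but which module pair carries which count, and all coordinates, are assumptions made up for the example.

```python
def weighted_flightline_length(centers, nets):
    """Sum of net count times Manhattan distance between module centers."""
    total = 0.0
    for (a, b), count in nets.items():
        (ax, ay), (bx, by) = centers[a], centers[b]
        total += count * (abs(ax - bx) + abs(ay - by))
    return total

# Hypothetical assignment of net counts to module pairs.
nets = {("A", "B"): 121, ("A", "C"): 152, ("C", "E"): 104,
        ("B", "D"): 34, ("D", "E"): 57}

# Two hypothetical floorplans: module name -> (x, y) center.
scattered = {"A": (0, 0), "B": (4, 4), "C": (4, 0), "D": (0, 4), "E": (2, 2)}
clustered = {"A": (1, 1), "B": (1, 2), "C": (2, 1), "D": (1, 3), "E": (3, 1)}

print(weighted_flightline_length(scattered, nets))
print(weighted_flightline_length(clustered, nets))
```

The arrangement that keeps heavily connected modules adjacent yields the smaller total, which is the intuition behind placing modules with many flightlines close together.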
How to Floorplan (continued)
‹ Example: The design below shows the flightlines between one of the
modules and its macro on the right side of the die area, as well as with
other modules that communicate with it.

Module Constraint Types


The size of the design and of each module is initially calculated by the tool
during design import, and each module is assigned one of the following constraints.

Type        Definition
None        Contents of module are placed without any constraint.
Guide       Module is placed in core design area. It guides placement
            of the module’s cells in the vicinity of the guide’s location.
Fence       Fence is a hard constraint in core design area. Design for
            the module is self-contained within the rigid outline of a
            fence.
Region      Same as a fence, except that instances from other modules
            can be placed within its physical outline.
Soft Guide  Similar to guide, except that there are no fixed locations.
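The difference between the hard and soft constraint types in the table can be captured in a small predicate. This is a sketch under the definitions given above only; the function and argument names are hypothetical, and guides are modeled as never forbidding a location because they only steer placement.

```python
def placement_allowed(constraint, inside_outline, belongs_to_module):
    """Return True if an instance location is legal for one module constraint.

    constraint: "none", "guide", "soft_guide", "fence", or "region".
    inside_outline: True if the instance sits inside the module's outline.
    belongs_to_module: True if the instance is part of the constrained module.
    """
    if constraint in ("none", "guide", "soft_guide"):
        return True  # soft or absent constraints never forbid a location
    if constraint == "fence":
        # Module cells stay inside; cells of other modules stay outside.
        return inside_outline == belongs_to_module
    if constraint == "region":
        # Module cells stay inside; cells of other modules may go anywhere.
        return inside_outline or not belongs_to_module
    raise ValueError(f"unknown constraint: {constraint}")

# A cell from another module inside the outline: illegal for a fence,
# legal for a region.
print(placement_allowed("fence", True, False))
print(placement_allowed("region", True, False))
```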



Pin Placement
There are two ways to handle pin placement: a bottom-up or a top-down
approach.
‹ Bottom up
‰ Pins are initially placed along with the cells in a block to optimize their
placement with respect to that block.
‰ The top-level floorplan is finished, and pin placement is re-optimized
considering both top-level goals and block timing.
‹ Top down
‰ The pins are initially placed in the top-level floorplan to optimize their
placement on a global level.
‰ Then, their location is fixed within a block, and the block-level cells are
placed.
‰ Finally, the pin placement is re-optimized considering both top-level
goals and block timing.
Use bottom up if the top-level design is incomplete so progress can be
made at the block level. Use top down if the top-level design is nearly
complete so that you can account for the inter-block connections.

Pin Placement Goals
‹ Identifying critical paths and making placement tradeoffs to optimize
the critical paths
‹ Wire length reduction
‰ Achieving timing by reducing the amount of block-to-block or IO-to-block
interconnect
‰ Achieving via-free direct routes
‰ Achieving accurate pin matching between hierarchical boundaries
‹ Optimizing pin placement with respect to routing congestion
‰ Pin spacing variation in congested areas

Topics in This Module


‹ Floorplanning
‹ Power planning
‹ Placement

Power Planning
‹ Definition
‹ Goals
‹ Need for power planning
‹ Basics of power planning
‹ Early planning for power
‹ Types of power routing
‹ Steps involved in power routing
‹ Multiple supply voltages

What Is Power Planning?


‹ Definition: The task of creating the global power plan for a design. The plan is
typically created as VDD/VSS rings and stripes.

‹ Example: [Figure: example power plan]

What Are Voltage (IR) Drop and Electromigration?
‹ Voltage (IR) drop is the voltage drop across a chip’s power network
caused by current and resistance associated with the power network.
‹ Electromigration (EM) is the mechanical failure of metal wires because
of metal atoms migrating over a long period of time due to high current
densities, causing open circuits, short circuits, or unacceptable
increases in resistance.
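A back-of-the-envelope estimate helps make both effects concrete. The sketch below applies Ohm's law to a single uniform power stripe and reports the current per unit of wire width, a quantity that EM limits are commonly expressed in. All numbers (sheet resistance, stripe size, current) are hypothetical, and real analysis is done on the full power grid by a dedicated tool.

```python
def stripe_resistance(sheet_res_ohm_sq, length_um, width_um):
    """Resistance of a uniform wire: sheet resistance times number of squares."""
    return sheet_res_ohm_sq * (length_um / width_um)

def ir_drop(current_a, resistance_ohm):
    """Ohm's law: voltage drop across the wire."""
    return current_a * resistance_ohm

def current_density(current_a, width_um):
    """Current per unit of wire width, in A/um."""
    return current_a / width_um

# Hypothetical stripe: 0.05 ohm/square, 1000 um long, 10 um wide, 10 mA.
r = stripe_resistance(0.05, 1000.0, 10.0)
print(r, ir_drop(0.01, r), current_density(0.01, 10.0))
```

Widening the stripe lowers both the resistance and the current density, which is why wire sizing addresses IR drop and EM together.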

Power Planning Goals


‹ To design a global power distribution network that supplies the
appropriate power and ground nets to all the instances of the design
‹ To size the power wires and choose the metal layers necessary to
deliver the required power to different parts of the chip without causing
failure

Need for Power Planning
Power-related issues can
‹ Affect chip timing due to excessive rail voltage drop (“IR-drop”) and
ground bounce
‹ Lead to complete device failure due to electromigration effects
[Figure: EM failures as seen through a scanning electron microscope (SEM)]

The effects of IR-drop and other power-related issues can be limited by
‹ Good power-grid design
‹ Sufficient VDD and VSS pads

Basics of Power Planning


Ensure adequate power and ground connections by including the following
basic elements in the power network.
‹ Power pads that supply power to the chip
‹ Power rings around the periphery of the die that carry power to the
standard cells and macros
‰ Rings are put on higher level routing layers, leaving the lower
layers for signal routing
‹ Power rails and trunks that cross the entire die or sections of the die


Early Planning for Power
‹ Simulation of major power dissipation components
‹ Quantification of chip power
‰ Total chip power
‰ Maximum power density
‰ Total chip power fluctuations
‹ Power grid analysis
‹ Allocation and coordination of chip resources
‰ Wiring tracks for power grid
‰ Low Vt devices
‰ Dynamic circuits
‰ Clock gating
‰ Placement and quantity of decoupling capacitors

Types of Power Planning


‹ Trunks and rings
‰ Used for upper level routing
‰ Rings are placed around blocks to assure even power distribution within
the block
‹ Uniform grid
‰ Usually used inside lower level partitions

Trunks and Rings Methodology
‹ Each block has its own ring structure
‹ Each block has a trunk that connects the top level to the block
‹ Rings can be shared between abutted blocks
‹ Requires fewer routing resources
‹ Changes in design may require changes to power structure
[Diagram: blocks 1 through 5, each with a ring and a trunk, inside a
boundary of alternating G and V pads]

Uniform Chip Grid Methodology


‹ Robust and redundant power network
‹ Seen in microprocessors and high-end large ASICs
‹ Primary distribution through upper metal layers
‹ Grids of different blocks need to align with each other
[Diagram: blocks covered by a uniform grid inside a boundary of
alternating G and V pads]


Power Planning
‹ Power stripes
‰ Specified and created by the chip designer, typically using a place/route tool
‰ Distribute power vertically within a ring
‰ Typical power routing routes horizontally in metal 1 (including standard cell row
power rails) and vertically in metal 2
[Diagram labels: Power Ring, Power Stripe, Row of cells, Metal 1, Metal 2]

Power Planning (continued)


‹ Power mesh
‰ Meshes are created to cover large areas of a chip
‰ Created by layers of power straps going in alternate vertical and horizontal
directions
‰ Distributes power across a chip so that IR drop and electromigration
targets are met
‹ Example: [Figure: power mesh]

Steps Involved in Power Routing
1. Create core power rings
2. Connect core power pads to the core power rings
3. Add power rings around the macros
4. Add power rails to the power plan for standard cell area
5. Modify power rails for macro power rings, routing blockages, and other restrictions
6. Add vertical and horizontal stripes to reduce IR drop at power rails of cells and macros
7. Connect power rails to cell power pins, extend them to the power rings, and connect with vias
8. Tap power pins of macros to core rings or power stripes

Connecting Power Rings Around Core


Followpins are used to
‹ Route power/ground along the standard cell rows
‰ Follows the pins of each cell and stitches them together

Connect ring is used to
‹ Connect these routes to power rings (and vertical stripes)
‹ Connect dangling power routes to stripes/rings
‹ Connect power rings to I/O power pads

Power Consumption
‹ Power on a chip is consumed when it is active (dynamic power) as
well as inactive (leakage power).
‹ Leakage power
‰ Power consumed when cells are not switching
‰ Main sources of leakage power are sub-threshold leakage currents, which
reduce linearly with supply voltage
‹ Dynamic power
‰ It is the power associated with switching of nets and cells
‰ It is calculated as Power = f × C × V²
‹ How can the power consumption on a chip be reduced?
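Because supply voltage enters the dynamic power formula squared, lowering it has a large effect. The sketch below evaluates the formula as given on the slide (without an activity factor) for two hypothetical supply voltages; the frequency and capacitance values are made up for illustration.

```python
def dynamic_power(freq_hz, cap_f, vdd):
    """Dynamic power from the slide's formula: P = f * C * V^2."""
    return freq_hz * cap_f * vdd ** 2

# Hypothetical block: 500 MHz, 1 nF of switched capacitance.
p_high = dynamic_power(500e6, 1e-9, 1.2)
p_low = dynamic_power(500e6, 1e-9, 0.8)
print(p_high, p_low, p_low / p_high)
```

Dropping the supply from 1.2 V to 0.8 V cuts dynamic power to (0.8/1.2)², or about 44%, of its original value at the same frequency and capacitance.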

Multiple Supply Voltages


‹ Using multiple supply voltages is one method of reducing a chip’s
power consumption.
‹ It aims at minimizing the supply voltage level wherever possible.
Instead of the chip operating from a single uniform supply voltage, a
range of supply voltages is assigned to different areas of the chip.
‹ It also assigns separate power nets to different blocks, and steps the
power-net voltages down wherever the chip and block performance
allow.

Discussion Question
Assuming the following chip diagram, what considerations should be taken into
account when designing a power plan?
[Diagram: Block1 at 1.0V, Block2 at 0.8V, Block3 at 1.2V]

Topics in This Module


‹ Floorplanning
‹ Power planning
‹ Placement

Placement
‹ Definition
‹ Placement goals
‹ Standard cell placement
‹ Timing driven placement
‹ ECO placement
‹ Incremental placement
‹ Boundary scan
‹ Scan chain re-order

What Is Placement?
‹ Definition: Process of placing the standard cells in a floorplanned design
‹ Example: The diagram shows a die area with no cells (left), and the cells
placed within the die (right).

Placement Goals
‹ Goals of the placement step are to
‰ Guarantee that the router can complete the routing step
‰ Minimize all the critical net delays by placing cells close to each other,
thus reducing interconnect lengths
‰ Minimize the die size as much as possible
‰ Reduce routing congestion, if any
‹ Good placement is essential for meeting timing goals
‹ Bad placement can lead to sub-optimal routes and cause paths to fail
timing

Standard Cell Placement


‹ The core area of the die is defined by specifying the distance between edge
of the layout and core.
‹ Standard cells are placed in rows that are drawn within the core area.
‹ Placement should be legalized, meaning standard cells are placed correctly
on the placement grid, not overlapping, and power pins of standard cells are
aligned correctly.
‹ Placement should be routable and meet timing requirements.
[Diagram: a cell between VDD and GND rails in a standard cell row;
standard cell rows in the core area]
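Two of the legalization conditions listed above, being on the placement grid and not overlapping, are easy to check in a script. The sketch below is an illustrative check only, not the placer's algorithm. It assumes integer database units, a grid measured from the origin, and cells that are exactly one row tall; power pin alignment (row orientation) is not checked. The site width of 800 and row height of 5600 are taken from the earlier DEF example.

```python
def is_legal(cells, site_width, row_height):
    """cells: list of (x, y, width) lower-left corners in database units."""
    for x, y, width in cells:
        if x % site_width or y % row_height or width % site_width:
            return False  # off the placement grid
    by_row = {}
    for x, y, width in cells:
        by_row.setdefault(y, []).append((x, x + width))
    for spans in by_row.values():
        spans.sort()
        for (_, right), (left, _) in zip(spans, spans[1:]):
            if left < right:
                return False  # two cells in the same row overlap
    return True

# Hypothetical cells in the first row of the DEF example.
print(is_legal([(5600, 5600, 1600), (7200, 5600, 800)], 800, 5600))
print(is_legal([(5600, 5600, 1600), (6400, 5600, 800)], 800, 5600))
```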


Standard Cell Rows

Regular Orientation, Gap in Between Rows
[Diagram: two rows of cells with regular orientation, each with VDD on top
and GND on the bottom, separated by a gap]

Regular + Flipped Orientation, Shared Rows
[Diagram: a cell with regular orientation above a cell with flipped
orientation; the two rows share the GND rail between them]

Cell Row Placement


There are three ways to arrange cell rows:
‹ Sometimes a technology allows rows to be flipped and abutted so the pairs
can share power and ground rails. This is the most common approach.
‹ Second configuration is to flip every other cell row but leave a gap between
every two cell rows mainly for routing purposes. Creates larger power rails
and densely packed cell structure.
‹ Last configuration is to leave a gap between every cell row and not flip the
rows. Useful when only two or three metal layers are available for routing.

The command to run placement is

placeDesign
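The flipped-and-abutted arrangement is visible in the ROW statements of the earlier DEF example, where orientations alternate between FS and N and each row starts one row height above the previous one. The sketch below regenerates such statements. The function name is hypothetical, and the default values are simply copied from that example rather than from any technology rule.

```python
def make_rows(count, site="UMC13FSNSITE", x0=5600, y0=5600,
              row_height=5600, sites_per_row=4986, site_width=800):
    """Build DEF ROW statements for abutted rows with alternating orientation."""
    rows = []
    for i in range(count):
        orient = "FS" if i % 2 == 0 else "N"   # even rows flipped, odd rows not
        y = y0 + i * row_height                # abutted: no gap between rows
        rows.append(
            f"ROW CORE_ROW_{i} {site} {x0} {y} {orient} "
            f"DO {sites_per_row} BY 1 STEP {site_width} 0 ;"
        )
    return rows

for line in make_rows(4):
    print(line)
```

Leaving a gap between rows, as in the other two configurations, would amount to using a vertical step larger than the row height.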


Timing-Driven Placement
‹ Placement of standard cells takes into account the timing constraints.
‰ Placer balances importance of meeting setup-type timing constraints with
routability.
‰ Placer identifies critical nets and performs placement to meet the
constraints. It pays less attention to meeting timing constraints on
non-critical nets, but more attention to enhancing routability.
‹ Why do we need this?
‰ Growing interconnect versus gate delay ratios
‰ Higher levels of on-die functional integration make global interconnects
even longer
‰ Increased chip operating frequencies that make timing closure tougher
‰ Increased number of macros and standard cells for modern designs

Timing-Driven Placement (continued)


Timing-driven placement algorithms can be divided into two categories:
‹ Path-based
‰ Tries to minimize the longest path delay
‰ Complexity is high since it maintains an accurate timing view during
optimization
‹ Net-based
‰ First transforms timing constraints into either length constraints or weights
on individual nets
‰ This information is fed into a weighted wirelength minimization-based
placement engine, which obtains a new placement with better timing
‰ Complexity is lower compared to path-based algorithms

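The net-based approach can be sketched in two steps: turn slack into a per-net weight, then score a placement by weighted wirelength. The weighting function below is an invented example, not the formula used by any particular placer, and half-perimeter wirelength (HPWL) is used as a common stand-in for net length.

```python
def net_weight(slack_ns, clock_period_ns, max_extra=4.0):
    """Map slack to a weight >= 1; more negative slack gives a larger weight."""
    if slack_ns >= 0:
        return 1.0
    criticality = min(1.0, -slack_ns / clock_period_ns)
    return 1.0 + max_extra * criticality

def hpwl(pins):
    """Half-perimeter wirelength of a net's pin bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def weighted_wirelength(nets):
    """nets: list of (weight, [(x, y), ...]) pairs."""
    return sum(w * hpwl(pins) for w, pins in nets)

# Hypothetical nets: one critical (slack -1 ns at a 4 ns period), one not.
critical = (net_weight(-1.0, 4.0), [(0, 0), (3, 4)])
relaxed = (net_weight(0.5, 4.0), [(1, 1), (2, 1)])
print(weighted_wirelength([critical, relaxed]))
```

A placer minimizing this cost is pushed to shorten the critical net first, since each unit of its length costs twice as much as a unit on the non-critical net.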
Engineering Change Order Placement
‹ Engineering change order (ECO) placement is used to place unplaced
cells in a partially or fully placed design.
‰ In a partially placed design, unplaced cells are placed in timing-driven
mode followed by legalization (overlap removal).
‰ In a fully placed design, only the legalization step takes place.
‹ Make sure that ECO logic changes do not exceed 10% of the previously
imported design.
‹ When ECO placement is run, it places only the cells that are unplaced.
It cannot move the cells that are fixed and makes only minor
modifications to cells already placed.
‹ The command to run ECO placement is

ecoPlace

Incremental Placement
‹ Incremental placement works on an already placed design to improve
overall quality and timing.
‹ To use incremental placement, the following command should be run
before placing the design

placeDesign -incremental

‹ The above command performs a two-pass placement flow.
‰ Regular placement
‰ Incremental placement
‹ In addition to having placement information about all placed cells, it
maintains information about space available for adding new cells.


What Is Boundary-Scan Architecture?
Boundary-scan architecture
‹ Is a method that enables the chip tester to test connectivity of the
I/O pins on the fabricated chip
‹ Provides a means to test interconnects between integrated
circuits on a board without using physical test probes
‹ Is synonymous with Joint Test Action Group (JTAG)

JTAG is the name commonly used for IEEE Standard 1149.1, entitled
Standard Test Access Port and Boundary-Scan Architecture.

Boundary Scan
‹ Boundary scan adds one or more memory elements, called
boundary-scan cells, to each I/O pin of the device, which can
selectively override the functionality of that pin.
‹ The collection of boundary scan cells is configured into a
parallel-in, parallel-out shift register.
‹ Test sequence is passed into the shift register, and the data coming
out is compared.
‹ Boundary scan cells do not contribute to the functionality of
the internal core logic.
‹ Test access port (TAP) controller is a state machine whose
transitions are controlled by a TMS signal.


Boundary Scan (continued)
‹ The JTAG interface, collectively known as the test access port (TAP),
uses the following signals to support operation of boundary scan.
‰ Test data is shifted around the shift register in serial mode from
input pin Test Data In (TDI).
‰ Test data is shifted out at output pin Test Data Out (TDO).
‰ Test Clock (TCK) synchronizes the internal state machine
operation.
‰ Test Reset (TRST) is an optional input pin to reset the TAP
controller’s state machine.
‰ Test Mode Select (TMS) determines the next state.

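The serial movement of test data from TDI to TDO can be pictured with a toy shift register. The sketch below models only the shifting itself, one position per TCK cycle; it does not model the TAP controller state machine, TMS, or the capture and update stages, and the function name is hypothetical.

```python
def shift_through(chain_length, tdi_bits):
    """Shift bits serially through a chain; return (tdo_bits, final_register)."""
    register = [0] * chain_length          # index 0 is the cell nearest TDI
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(register[-1])      # bit leaving the chain toward TDO
        register = [bit] + register[:-1]   # shift one position per TCK cycle
    return tdo_bits, register

# Shift the pattern 1, 0, 1 into a 3-cell chain, then flush with zeros.
tdo, final = shift_through(3, [1, 0, 1, 0, 0, 0])
print(tdo, final)
```

The pattern reappears at TDO after as many clocks as there are cells in the chain, which is what allows the data coming out to be compared with the sequence that was shifted in.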

Boundary Scan (continued)


‹ Chain integrity testing
‰ Basic form of testing by JTAG (tests that the JTAG devices meant to
be in the chain exist).
‰ Each JTAG-compliant device contains an ID code.
‰ By issuing a correct sequence of JTAG commands, the ID codes of all the
devices can be read out.
‰ The ID codes read out from the JTAG chain are compared with the expected
ID codes of the devices. If they match, the JTAG chain is correctly
connected and the devices are in place.
‹ Benefits of JTAG
‰ Shorter test times
‰ Higher test coverage
‰ Increased diagnostic capability
‰ Lower capital equipment cost


What Is a Scan Chain?
‹ Scan chains are a technique used in Design for Test (DFT) to reduce
the time it takes on the tester to determine if a part is good or bad.
‹ All the registers in the design are connected in one or more scan
chains so that their inputs can be controlled and their outputs can be
observed.
‹ Flip-flops have an extra signal called scan enable.
‰ When Scan Enable is de-asserted, the flip-flop behaves normally and
passes the data.
‰ When Scan Enable is asserted, all the flip-flops are connected into a long
shift register, with one end of the chain as primary input and the other end
as primary output.

Scan Chain Re-Order


‹ Testing is done by putting flops into this test mode, shifting in a test
vector, switching back to normal mode to clock (capture) the data, and
finally switching back to test mode to shift out the resulting flop values.
The resulting vector is compared with a known “good” vector to
determine if the chip is functioning correctly.
‹ Why do we need to re-order the scan chain?
‰ During placement, cells are placed to meet functional timing and minimize
congestion, and the scan chain connectivity is ignored.
‰ This results in long, inefficient routing between flops in the chain and
causes routing congestion.
‹ Re-ordering the scan chain reduces congestion by connecting the
cells based on their placement.
‰ It may cause hold time violations in the chain, and buffers may need to be
inserted to fix them.


Discussion Question
In the following example, the scan chain after logic synthesis was ordered
alphanumerically by instance name.
‹ How would you reorder the scan chain after initial placement to get the optimal
result?
[Diagram: flip-flops DFF U1 through DFF U10 scattered across the die]
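One simple way to think about the question is a greedy nearest-neighbor ordering: start at one flop and repeatedly hop to the closest flop not yet in the chain. The sketch below illustrates this with made-up coordinates, since the positions in the diagram are not recoverable. Greedy ordering is a heuristic and is not guaranteed to be optimal, and real re-ordering also has to respect chain start and end points and hold timing.

```python
def reorder_chain(flops, start):
    """Greedy nearest-neighbor ordering. flops: name -> (x, y)."""
    remaining = dict(flops)
    order = [start]
    x, y = remaining.pop(start)
    while remaining:
        # Closest remaining flop by Manhattan distance; ties broken by name.
        name = min(
            remaining,
            key=lambda n: (abs(remaining[n][0] - x) + abs(remaining[n][1] - y), n),
        )
        x, y = remaining.pop(name)
        order.append(name)
    return order

def chain_length(flops, order):
    """Total Manhattan length of the scan connections in the given order."""
    return sum(
        abs(flops[a][0] - flops[b][0]) + abs(flops[a][1] - flops[b][1])
        for a, b in zip(order, order[1:])
    )

# Hypothetical placement of five flops.
flops = {"U1": (0, 0), "U2": (9, 1), "U3": (1, 1), "U4": (8, 0), "U5": (2, 0)}
by_name = sorted(flops)
by_place = reorder_chain(flops, "U1")
print(by_name, chain_length(flops, by_name))
print(by_place, chain_length(flops, by_place))
```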

Summary
‹ The back-end flow starts with floorplanning. Here is where we get to
see the physical chip.
‹ Floorplanning is a puzzle-fitting stage, where we have to fit modules
into the die area.
‹ Plan the power network with a view to distribute power efficiently
throughout the chip and meet the current requirements.
‹ Placement of the cells and macros into the core area is to be done
with the ultimate goal of meeting timing and reducing congestion.
‹ Each step affects the overall goals of meeting timing and power
requirements. Quality time spent in floorplanning and power network
implementation reduces the number of iterations to achieving a
working chip that meets design specifications.


Testing Your Understanding
True or false
1. Boundary scan adds more functional logic to the existing internal
logic.
2. Timing-driven placement reduces the number of iterations through the
back-end flow.
3. The DEF file contains information on the standard cell library.
4. Timing-driven placement tries to first meet routability on the critical
nets.
5. In floorplanning, a guide is considered to be a rigid constraint.
6. The DEF file is saved in binary format and can only be read in by the
tool.

Learning Activity
In this activity, you will
‹ Study several examples of bad floorplans
‹ Identify the bad practices from each and how you would correct them
‹ Present your results to the class

20 minutes for activity
10 minutes for debriefing

Clock Tree Synthesis

Module 6

June 16, 2008

What Is the Difference?

[Diagram: two circuits, each built from flip-flops (FF), combinational
logic 1, combinational logic 2, and a CLK signal]

Module Objectives
In this module, you will be able to
‹ Explain a clock tree and why you need to create one
‹ Write a clock tree constraint file based on a given specification
‹ Describe the benefits of using useful skew versus classical zero skew

Discussion Question
Recall the diagram of the design flow steps to take an idea to product (chip).
‹ In what part of the flow does Clock Tree Synthesis (CTS) occur?
[Diagram: design flow steps with their inputs and outputs; the position of
CTS is left open]

Topics in This Module
‹ Clock trees and clock tree synthesis
‹ Clock tree specification
‹ Analyzing CTS reports
‹ Low-power clocking techniques

What Is a Clock Tree?


In a synchronous digital system, a clock signal is used to define a time
reference for the movement of data within that system.

‹ Definition: A network of buffers inserted into the clock signal path
in such a way that the overall delay from the generator to all
destinations is minimized.
‹ Example: Instead of one electrical signal path being optimized, the
path in the design is broken up and strategically buffered to
minimize the delay. The resulting network resembles a tree in that
the central clock signal branches throughout the chip using these
buffers and ends up with the clock signal reaching all of the leaf cells.


Need for a Clock Tree
‹ When complexity (i.e., number of gates in a design) increases, the
need to distribute clock signals in a controlled manner becomes more
important.
‹ Reasons why we need to build a clock tree:
‰ Large chip area
‰ Different flop densities
‰ Non-uniform distribution of flops
‰ All flops need to get clock signal at the same time
‰ Power budget
‰ Clock routing: hard problem
‹ The clock distribution network distributes the clock signal(s) from a
common point to all the elements that need it.

Ideal Clock
‹ All flip-flops are clocked together
‹ Simplifies clock analysis over hierarchical boundaries
‹ Used prior to clock tree insertion and place and route for timing analysis
[Diagram: an ideal CLK supplies CLK1, CLK2, and CLK3 for data paths A, B,
and C across Block1 and Block2, with waveforms of the ideal CLK, CLK1,
CLK2, and CLK3]

Note: This diagram assumes zero clock skew and insertion delay.

Propagated Clock
‹ Clock delays are extracted from clock tree routing
‹ Clock skew is correctly modeled using propagated delay
‹ More accurate and used in final timing closure
[Diagram: a propagated CLK supplies CLK1, CLK2, and CLK3 for data paths A,
B, and C across Block1 and Block2, with waveforms of the propagated CLK,
CLK1, CLK2, and CLK3]

Issues Involved in Clocking


‹ Clock delivered to the memory elements from a signal pin
‹ Different net lengths means different arrival time of clock at each flip flop
‹ Delay and transition time affected by large number of elements connected to one
pin
[Diagram: one clock source driving a 4 by 4 array of flip-flops (FF)]


Effects on Clock Signal
The factors that can cause harm to a clock signal are
‹ Clock skew
‹ Clock latency
‹ Clock jitter

What Is Clock Skew?


‹ The measure of the difference of delay between the minimum and
maximum time it takes the clock to reach different leaf cells
‰ Typically hurts performance of the design, although in some cases
helps achieve timing targets (useful skew)
‹ Caused by a clock tree with unbalanced branches occurring due to
‰ Different types of buffers
‰ Varying capacitance and resistance values of nets
‰ Gating components
‰ Off-chip or on-chip variations
[Diagram: a clock source drives flip-flops through CTS-inserted buffers,
with the minimum and maximum insertion delay paths marked; clocks from the
same source reach FF1 and FF2 at different arrival times]
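Given the clock arrival time at each leaf cell, skew follows directly from the definition above as the maximum arrival time minus the minimum. The sketch below uses made-up arrival times; the function name is hypothetical.

```python
def skew_and_latency(arrival_ns):
    """arrival_ns: leaf name -> clock arrival time (insertion delay) in ns.

    Returns (skew, minimum insertion delay, maximum insertion delay).
    """
    min_delay = min(arrival_ns.values())
    max_delay = max(arrival_ns.values())
    return max_delay - min_delay, min_delay, max_delay

# Hypothetical arrival times at three flip-flop clock pins.
arrivals = {"FF1": 1.20, "FF2": 1.45, "FF3": 1.30}
skew, t_min, t_max = skew_and_latency(arrivals)
print(skew, t_min, t_max)
```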


What Is Zero Skew?
The conventional approach to clock tree generation is called the zero skew
or classical skew approach.
‹ The clock tree is treated as ideal.
‹ All combinational blocks must fit into the same fixed time period.
‹ All registers are clocked at the same time.
‹ You do not need knowledge of signal timing.
‹ Clock skew is made as small as possible to take advantage of full
clock period.
‹ A good classical skew minimization strategy does not necessarily
correlate with good performance.

What Is Useful Skew?


‹ Useful skew is a technique that takes advantage of the difference of
arrival time at flip-flops to correct datapath timing violations.
‹ Increased latency but decreased clock period provides a net timing
advantage.
‹ Helps meet setup and hold time.
‹ Some combinational paths require more time than the allowed clock
period.
‹ Adjusting clock delays to registers allows allocation of more time to
some paths and less on others.
‹ Time is borrowed from neighboring paths that have positive slack.
‹ Useful skew can be done pre-CTS or post-CTS.


Example of Useful Skew
[Diagram, top: propagated clock with a time period of 4 ns driving three
flip-flops (FF); the combinational delay is 5 ns between the first and
second FF and 2 ns between the second and third FF]
[Diagram, bottom: propagated clock with useful skew; a 1 ns margin is
obtained by speeding up the source clock by 1 ns]

Example of Useful Skew (continued)


[Diagram, top: propagated clock with a time period of 4 ns driving three
flip-flops (FF); the combinational delay is 5 ns between the first and
second FF and 2 ns between the second and third FF]
[Diagram, bottom: propagated clock with useful skew; 1 ns is obtained by
delaying the target clock by 1 ns]

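The numbers in the example can be checked with a simple setup slack calculation. The sketch below is a simplified model: it ignores flip-flop setup time and clock-to-Q delay, which is an assumption made to keep the arithmetic aligned with the slide, and the function name is hypothetical. The second case delays the clock of the middle flip-flop by 1 ns, as in the "delaying target clock" diagram.

```python
def setup_slack(period, launch_latency, capture_latency, path_delay):
    """Setup slack in ns, ignoring setup time and clock-to-Q delay."""
    return period + (capture_latency - launch_latency) - path_delay

PERIOD = 4.0          # ns, from the example
DELAYS = (5.0, 2.0)   # FF1 -> FF2 and FF2 -> FF3 combinational delays

def path_slacks(latencies):
    """latencies: clock arrival times at FF1, FF2, FF3 in ns."""
    l1, l2, l3 = latencies
    return (setup_slack(PERIOD, l1, l2, DELAYS[0]),
            setup_slack(PERIOD, l2, l3, DELAYS[1]))

print(path_slacks((0.0, 0.0, 0.0)))   # balanced clock
print(path_slacks((0.0, 1.0, 0.0)))   # FF2 clock delayed by 1 ns
```

With a balanced clock the first path misses timing by 1 ns while the second has 2 ns to spare. Delaying the middle flip-flop's clock moves 1 ns of that spare time to the first path, so both paths meet timing.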
Discussion Questions
In a circuit after clock tree synthesis with a clock period of 5 ns, there is a
-1 ns worst-case negative slack.
‹ Will useful skew always improve the timing path?
‹ How could you check to see if useful skew could be of benefit?
‹ What other ways could you improve the timing of this path?

What Is Insertion Delay?


‹ Insertion delay is the time the clock signal (rise or fall) takes to propagate from the
clock definition point (root) to a register clock pin (leaf cell).
‹ Insertion delay is also known as clock network latency.
[Diagram: a clock source drives flip-flops (FF) through CTS-inserted
buffers, with the minimum and maximum insertion delay paths marked]


What Is Clock Jitter?
Jitter is the undesired variation or fluctuation of a
signal with respect to its ideal position in time.
The common sources of jitter are
‹ Internal circuitry of the phase-locked loop (PLL)
‹ Random thermal or mechanical noise from a crystal vibration
‹ Other resonating devices
‹ Signal transmitters
‹ Crosstalk
‹ VCC sag
‹ Ground bounce
‹ Electromagnetic interference from nearby devices
There are three types of clock jitter:
‹ Period jitter
‹ Cycle-to-cycle jitter
‹ Long-term jitter
[Diagram labels: Ideal clock period, Jitter]

What Is Period Jitter?


‹ The deviation in the output clock transition from the ideal position. The
deviation is either leading or lagging the ideal position.

‹ Measured and expressed in time or frequency

‹ Used to calculate timing margins in systems

[Figure: clock waveform showing the ideal clock period, the ideal clock edge location, and the period jitter.]

What Is Cycle-to-Cycle Jitter?
‹ Change in a clock’s output transition from its corresponding position in the
previous cycle.

‹ Large cycle-to-cycle jitter can cause a system to fail.

‹ Most difficult type of jitter to measure.

[Figure: an ideal clock period (T), followed by a lesser clock period (T1), followed by an ideal clock period (T); Jitter = T - T1.]

What Is Long-Term Jitter?


‹ Also known as phase jitter
‹ Measures the maximum change in a clock’s output transition from its ideal
position over a large number of cycles

[Figure: clock waveform showing the ideal clock period, the ideal clock edge locations at cycle 0 and cycle N, and the long-term jitter at cycle N.]

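The three jitter types can be compared on one set of edge timestamps. The following is a minimal sketch with hypothetical measurements in picoseconds. It interprets period jitter as the deviation of each measured period from the ideal period, cycle-to-cycle jitter as the change between adjacent periods, and long-term jitter as the deviation of an edge from its ideal location after N cycles; these interpretations are assumptions of this example.

```python
IDEAL_PERIOD_PS = 1000
# Hypothetical measured rising-edge times (ps) for cycles 0..5.
edges_ps = [0, 1000, 1990, 3000, 4010, 5000]

# Measured period of each cycle.
periods = [b - a for a, b in zip(edges_ps, edges_ps[1:])]

# Period jitter: worst deviation of a single period from the ideal period.
period_jitter = max(abs(p - IDEAL_PERIOD_PS) for p in periods)

# Cycle-to-cycle jitter: worst change between two adjacent periods.
cycle_to_cycle_jitter = max(abs(b - a) for a, b in zip(periods, periods[1:]))

# Long-term jitter: worst deviation of an edge from its ideal location n * T.
long_term_jitter = max(
    abs(t - n * IDEAL_PERIOD_PS) for n, t in enumerate(edges_ps)
)

print(periods)                # [1000, 990, 1010, 1010, 990]
print(period_jitter)          # 10
print(cycle_to_cycle_jitter)  # 20
print(long_term_jitter)       # 10
```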
Types of Clock Trees
A clock tree can be implemented in the following styles:

‹ Binary tree

‹ H tree

Clock Trees: Binary Tree


‹ Clock delay is identical for all elements.

‹ Length of a to d = Length of a to g
‰ Same length, so same delay.

‹ Results in a clock skew between the clock signals at d and g.
‹ Drawback
‰ The branch effect: The clock signals from b to e and f contribute a capacitance
that would actually increase the delay from a to g.
‰ As the size of the clock distribution tree increases, the effects on the clock
signal become worse.

[Figure: conceptual structure and physical structure of a binary clock tree with nodes a through g.]


Clock Trees: H Tree
‹ First two stages resemble the letter “H”

‹ Maintains distributed interconnects

‹ Provides equal propagation delays to each part of the chip

‹ Minimizes skew by making connections to the memory elements in equal lengths

‹ Drawback
‰ Total wire length is much greater compared to a standard clock tree
‰ Increased capacitance of the H-tree structure

[Figure: H tree with the clock source at the center.]
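The equal-length property of an H tree can be illustrated with a small recursive sketch. It assumes an idealized, perfectly symmetric H tree in which each stage halves its arm length; the function name `h_tree_leaves` and the dimensions are hypothetical.

```python
def h_tree_leaves(x, y, half, levels, path=0.0):
    """Return (x, y, path_length) for every leaf of an idealized H tree.

    Each stage drives four branch points: one horizontal arm of length
    `half` followed by one vertical arm of length `half`.
    """
    if levels == 0:
        return [(x, y, path)]
    leaves = []
    for dx in (-half, half):        # horizontal bar of the "H"
        for dy in (-half, half):    # vertical legs of the "H"
            leaves += h_tree_leaves(
                x + dx, y + dy, half / 2, levels - 1,
                path + abs(dx) + abs(dy),
            )
    return leaves


leaves = h_tree_leaves(0.0, 0.0, 4.0, levels=2)
lengths = {length for _, _, length in leaves}

print(len(leaves))  # 16
print(lengths)      # {12.0}
```

Every leaf sits at the same wire distance from the clock source, which is why the structure minimizes skew. The cost is wire length: the tree must reach all parts of the chip regardless of where the sinks actually are.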

What Is Clock Tree Synthesis?


To ensure that the system works correctly at the required clock
frequency, a clock tree needs to be designed to synchronize memory
elements such as RAMs and flops.

‹ Definition: Process of inserting buffers in the clock path, with the goal
of minimizing clock skew and latency to optimize timing
‹ Example: We ran clock tree synthesis on the example block and saw
a large clock skew due to bad clock constraints. We ended up rerunning
clock tree synthesis with better constraints to get an optimal result.

Need for Clock Tree Synthesis
‹ Clock signals are typically loaded with the greatest fanout.

‹ Differences and uncertainty in the arrival times of the clock signals can
severely limit the maximum performance of the entire system.

‹ The clock needs to operate at the highest speed of any signal in the design.
‹ Clock signals are affected by technology scaling (Moore’s law).
Long global interconnect lines become significantly more resistive as
line dimensions are decreased.

‹ Catastrophic race conditions can be created in which an incorrect data
signal may latch within a register.

CTS: Inputs and Outputs


‹ Inputs
‰ Clock tree specification file
‰ Verilog® netlist
‰ Timing library, which contains the timing information for each discrete logic
gate or macro
‰ Physical library, which contains information about the shape and connectivity
of the technology library cells
‰ Placement information such as a DEF file

‹ Outputs
‰ Netlist with clock tree inserted
‰ Reports on the results of the run in ASCII text or HTML format
‰ Routing guide files for clock tree preroutes to be used during trial routing
‰ Macro model files for partitions or modules

[Figure: clock tree synthesis takes the netlist, tech file, clock tree specification file, DEF file, and physical library as inputs and produces the netlist, reports, routing guides, and macro models.]


Where Does CTS Fit in the Implementation Flow?

[Figure: implementation flow from RTL through logic synthesis to gates, then place and route (floorplanning, power planning, placement, clock tree synthesis, route) to GDSII, with timing closure, static timing analysis, and test alongside. Clock tree synthesis sits between placement and route.]

After CTS
‹ Clock buffer tree is built to balance output loads and minimize clock skew.

‹ Buffers can be added to the network to meet the minimum insertion delay.

[Figure: a clock source driving groups of flip-flops (FF) through a tree of buffers.]

Goals of CTS
‹ Deliver clock to all memory elements with
‰ Acceptable skew
‰ Least amount of insertion delay

‹ Deliver clock edges with acceptable sharpness

Steps Involved in CTS


‹ An initial placement of the logic cells should be completed.
‰ This ensures that the timing performance of the core logic is met.
‹ First define/understand scope or extent of the clock tree
‰ This would include items such as total load, routing area, distance the clock
has to travel, available routing layers, and routing restrictions.

[Figure: CTS step flow: initial placement of core logic, scope of clock tree, define clock tree constraints, define clock tree topology, insert clock tree, routing of clock tree.]

Steps Involved in CTS (continued)
‹ Define the constraints that the clock tree must satisfy.
‰ Include minimum and maximum insertion delay and maximum skew
‹ This is part of the clock tree specification file.

Steps Involved in CTS (continued)


‹ Define the way the clock tree topology will be generated, including
‰ Number of levels or buffer stages in the tree
‰ Type of buffers/inverters
‰ Fanout limit at each level
‹ The topology can be defined manually by the designer or automatically by a
clock tree generator tool.
‹ This is part of the clock tree specification file.
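The relationship between the fanout limit and the number of levels can be sketched as follows. This is a back-of-the-envelope estimate that assumes a single uniform fanout limit and ignores placement, load, and timing; it is not how any particular CTS tool builds its topology, and the function name `buffers_per_level` is hypothetical.

```python
import math


def buffers_per_level(num_sinks, max_fanout):
    """Estimate the buffer count at each level, root level first.

    Assumption: every buffer may drive at most `max_fanout` loads, and
    levels are added until a single root buffer remains.
    """
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need at least one sink and a fanout limit of 2 or more")
    levels = []
    loads = num_sinks
    while True:
        buffers = math.ceil(loads / max_fanout)
        levels.append(buffers)
        if buffers == 1:
            break
        loads = buffers  # the next level up must drive these buffers
    levels.reverse()
    return levels


print(buffers_per_level(343, 30))   # [1, 12]
print(buffers_per_level(1000, 10))  # [1, 10, 100]
```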


Steps Involved in CTS (continued)
‹ The clock tree is inserted, taking into account the location of the logic cells.

‹ The buffers are placed or inserted in strategic places to minimize the clock
delay and routing.

Steps Involved in CTS (continued)


‹ The routing is completed for all clock signals simultaneously along with
optimization for meeting all timing goals.

‹ This step is optional and can be done along with CTS or with the routing phase.

CTS Operation Modes
There are two modes for running CTS:

‹ Manual CTS allows the user to control
‰ Number of levels
‰ Number of buffers
‰ Types of buffer at each level

‹ Automatic CTS automatically determines the number of levels and buffers
‰ Numbers depend on the timing constraints in the clock tree specification file.
‰ CTS traces the clock net through buffers, inverters, and gated elements.

In most cases, you would use automatic CTS. If you have issues with
the clock tree (skew, etc.), you can specify the CTS manually. In some
cases where the design is very regular or very high speed, an experienced
designer will manually specify the CTS constraints to better control the
output.

Topics in This Module


‹ Clock trees and clock tree synthesis

‹ Clock tree specification

‹ Analyzing CTS reports

‹ Low-power clocking techniques

CTS Guidelines
‹ There are two CTS modes for specifying the clock tree:
‰ Manual CTS
‰ Automatic CTS

‹ Both modes require a clock specification file to create the clock tree.

‹ In manual CTS mode, the clock tree structure has to be specified by
the user.

‹ In automatic CTS mode, the tool automatically creates the clock tree
structure from the specification file.

‹ Automatic CTS is the preferred method of creating the clock tree.

CTS Guidelines (continued)


A clock tree specification file can be created by one of three methods:

‹ Using the Create Clock Tree Spec form (GUI)

‹ Using the createClockTreeSpec command

‹ Using the specifyClockTree command with the -template parameter. This
method creates a basic clock tree specification template file,
template.ctstch.

The methods are similar, and each allows the user to easily create a clock tree
specification. The first option uses the GUI, whereas the other two
use the command line.

The GUI form allows the user to fill in all of the values, and then a
clock specification file is generated. With the commands, a template is
created and the user must modify the values.


Contents of the Specification File
The sections of the clock tree specification file must appear in the
order given below. Individual statements within each section can appear in any
order. The contents of the specification file are

‹ Timing constraint file (optional)
‹ Naming attributes (optional)
‹ Macro model data (optional)
‹ Clock grouping data (optional)
‹ Attributes used by the routing tool (optional)
‹ Requirements for manual CTS or automatic, gated CTS

Timing Constraint File


‹ Defines the timing constraints for use during CTS
‹ Must be the first statement of the clock tree specification file
‹ Example

TimingConstraintFile /path/cts.tcl

Naming Attributes
‹ Allows the user to customize the name delimiter that CTS uses when inserting
buffers and updating clock root and net names
‹ The UseSingleDelim command instructs CTS to use a single character, instead of
multiple characters, for the given delimiter.
‰ Default: clk__L3_I2
‰ With UseSingleDelim YES: clk_L3_I2
‹ Example

UseSingleDelim YES
NameDelimiter #

Macro Model Data


‹ A macro model is a block with synthesized clock trees, and thus delays have to
be specified for its pins.
‹ Example

MacroModel pin m1/clk 20ps 18ps 20ps 18ps 30ff

Clock Grouping Data
‹ Specifies two or more clock domains for which you want CTS to balance the skew
‹ The arguments are the clock root pin names.
‹ Example

ClkGroup
+ U1/CGEN_1
+ U2/CGEN_2

Router Attributes
‹ Defines attributes that CTS passes to the router for routing the clock net.
‹ Example

RouteTypeName CK1
NonDefaultRule rule1
PreferredExtraSpace 1
TopPreferredExtraSpace 1
BottomPreferredLayer 5

Requirements for Manual/Automatic CTS and Gated CTS
‹ All of the “optional” clock tree specification sections were mentioned in the
previous sections.
‹ In the next few slides, we will discuss the requirements for the following
‰ Manual CTS
‰ Automatic CTS
‰ Clock Gated CTS

CTS Operation Mode: Manual


Manual CTS specification file

ClockNetName CK
LevelNumber 2
LevelSpec 1 2 BUFX2
LevelSpec 2 16 BUFX3
PostOpt YES
OptAddBuffer YES
End

[Figure: clock net CK driving level 1 (2 BUFX2), which drives level 2 (16 BUFX3), which drives the flip-flops.]


CTS Operation Mode: Automatic
Automatic CTS specification file

AutoCTSRootPin clk_out/Y
MaxDelay 5ns
MinDelay 0ns
MaxFanout 30
SinkMaxTran 500ps
BufMaxTran 500ps
MaxSkew 600ps
NoGating NO
MaxDepth 10
RouteType CLK1_ROUTE
DetailReport YES
RouteClkNet YES
PostOpt YES
OptAddBuffer YES
Buffer BUFX2 BUFX4 BUFX8 INVX1 INVX2 INVX4
End

[Figure: clock tree rooted at AutoCTSRootPin clk_out/Y, with CTS buffers 1 through 5 driving flip-flops and the pins FPU/CORE/A and XPU/CAM/C (a sink can also be a std cell). Annotations mark phase delay 1, phase delay 2, max skew, sink input transition time, and buffer input transition time.]
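To make the structure of the automatic specification concrete, the sketch below reads the example into a dictionary and converts the delay values to picoseconds. It is a hypothetical helper written for illustration only; it is not part of any Cadence tool, and it handles only the simple "keyword value" lines shown on the slide.

```python
SPEC_TEXT = """\
AutoCTSRootPin clk_out/Y
MaxDelay 5ns
MinDelay 0ns
MaxFanout 30
SinkMaxTran 500ps
BufMaxTran 500ps
MaxSkew 600ps
NoGating NO
MaxDepth 10
RouteType CLK1_ROUTE
DetailReport YES
RouteClkNet YES
PostOpt YES
OptAddBuffer YES
Buffer BUFX2 BUFX4 BUFX8 INVX1 INVX2 INVX4
End
"""

UNIT_TO_PS = {"ns": 1000.0, "ps": 1.0}


def to_ps(value):
    """Convert a string such as '5ns' or '600ps' to picoseconds."""
    for unit, scale in UNIT_TO_PS.items():
        if value.endswith(unit):
            return float(value[: -len(unit)]) * scale
    raise ValueError(f"no ns/ps unit in {value!r}")


def parse_auto_cts_spec(text):
    """Collect 'keyword value' lines up to 'End' into a dictionary."""
    spec = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "End":
            break
        key, values = tokens[0], tokens[1:]
        # Buffer takes a list of cell names; the other statements take one value.
        spec[key] = values if key == "Buffer" else values[0]
    return spec


spec = parse_auto_cts_spec(SPEC_TEXT)

# Simple sanity check on the constraints before running CTS.
assert to_ps(spec["MinDelay"]) <= to_ps(spec["MaxDelay"])

print(spec["AutoCTSRootPin"])   # clk_out/Y
print(to_ps(spec["MaxDelay"]))  # 5000.0
print(spec["Buffer"])
```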

Example: Clock Specification File


AutoCTSRootPin clockRootPinName
‹ Specifies the name of the clock root pin from which to start tracing

Example: AutoCTSRootPin clk_out/Y


Example: Clock Specification File (continued)
MaxDelay number{ns|ps}
‹ Specifies the maximum insertion delay. If this statement is not specified,
the tool automatically sets the delay to 10 ns.

Example: MaxDelay 5ns

Example: Clock Specification File (continued)


MinDelay number{ns|ps}
‹ Specifies the minimum insertion delay. If this statement is not specified, the
tool automatically sets the delay to 0.0 ns.

Example: MinDelay 0ns


Example: Clock Specification File (continued)
MaxFanout integer
‹ Limits the number of leaf cells connected to the clock buffer at the last stage
of the clock tree.

Example: MaxFanout 30

Example: Clock Specification File (continued)


SinkMaxTran number{ns|ps}
‹ Specifies the maximum input transition time constraint for the sinks. The
maximum value is 10,000 ns. The default value is 400 ps.

Example: SinkMaxTran 500ps


Example: Clock Specification File (continued)
BufMaxTran number{ns|ps}
‹ Specifies the maximum input transition time constraint for buffers. The
maximum value is 10,000 ns. The default value is 400 ps.

Example: BufMaxTran 500ps

Example: Clock Specification File (continued)


MaxSkew number{ns|ps}
‹ Specifies the maximum skew between sinks (clock pins). The default value is
300 ps.
‹ The lower the skew, the better the clock tree, and hence the better overall
timing performance for the design.

Example: MaxSkew 600ps


Example: Clock Specification File (continued)
NoGating { rising | falling | NO }
Sets the criteria for tracing through logic gates
‹ Rising: Stops tracing through a gate (including buffers and inverters) and
treats the gate as a rising-edge-triggered flip-flop clock pin.
‹ Falling: Stops tracing through a gate (including buffers and inverters) and
treats the gate as a falling-edge-triggered flip-flop clock pin.
‹ NO: Default behavior for gated-clock designs. Allows CTS to trace through
clock gating logic.

Example: NoGating NO

Example: Clock Specification File (continued)


MaxDepth number
‹ Sets the maximum depth of clock tree tracing. The default value is 1024, i.e.,
CTS limits the number of levels of clock tree tracing to 1024.
‹ Tracing is done by CTS (before inserting buffers) to understand the logical
structure of the design and see that there are no feedback loops.

Example: MaxDepth 10


Example: Clock Specification File (continued)
RouteType routeTypeName
‹ Specifies the name of the clock whose routing attributes are being defined.

Example: RouteType CLK1_ROUTE

Example: Clock Specification File (continued)


DetailReport YES | NO
‹ Determines whether CTS provides a detailed report, which includes timing
information for every component in the design. Default behavior is not to
generate a detailed report.

Example: DetailReport YES


Example: Clock Specification File (continued)
RouteClkNet YES | NO
‹ Determines whether CTS routes the clock nets. Default behavior is not to
route the clock net.

Example: RouteClkNet YES

Example: Clock Specification File (continued)


PostOpt YES | NO
‹ Specifies whether CTS runs optimization, i.e., it resizes buffers or
inverters, refines placements, and corrects routing for signal and clock
wires. Default: YES.

Example: PostOpt YES


Example: Clock Specification File (continued)
OptAddBuffer YES | NO
‹ Controls whether CTS adds buffers during optimization. Effective only if
PostOpt YES is specified.
‹ Tries to meet the trigger edge skew constraints as defined in the clock tree
specification file.

Example: OptAddBuffer YES

Example: Clock Specification File (continued)


Buffer cell1 cell2 cell3
‹ Specifies the names of buffer cells to use during automatic gated CTS.

Example: Buffer BUFX2 BUFX4 BUFX8 INVX1 INVX2 INVX4


Topics in This Module
‹ What is clock tree synthesis

‹ How to create a clock tree specification file

‹ Analyzing a CTS report

‹ Low-power clocking techniques

CTS Report
‹ After running CTS on a design, a report is created containing
information about the clock tree constructed. The report contains
several sections.
‰ Library Information: The process information used to create the clock tree.

Example
#
# Complete Clock Tree Timing Report
#
# CLOCK: cgen/i_5/Y
#
# Mode: preRoute
# Library Name : slow
# Operating Condition : slow
# Process : 1
# Voltage : 1.62
# Temperature : 125


CTS Report (continued)
‹ Clock Tree Structure Information: Gives details on the number of
buffers, subtrees, sinks, levels

Example
Nr. of Subtrees : 1
Nr. of Sinks : 343
Nr. of Buffer : 9
Nr. of Level (including gates) : 2
Max trig. edge delay at sink(F): TPRAM/mod1/CK 477.7(ps)
Min trig. edge delay at sink(R): TPRAM/mod2/CK 459.6(ps)

CTS Report (continued)


‹ Delay, skew, and transition information

Example
                        (Actual)            (Required)
Rise Phase Delay      : 459.6~477.7(ps)     0~5000(ps)
Fall Phase Delay      : 432.8~446.7(ps)     0~5000(ps)
Trig. Edge Skew       : 18.1(ps)            250(ps)
Rise Skew             : 18.1(ps)
Fall Skew             : 13.9(ps)
Max. Rise Buffer Tran : 238.5(ps)           550(ps)
Max. Fall Buffer Tran : 141.4(ps)           550(ps)
Max. Rise Sink Tran   : 366.2(ps)           550(ps)
Max. Fall Sink Tran   : 204.5(ps)           550(ps)
Min. Rise Buffer Tran : 120(ps)             0(ps)
Min. Fall Buffer Tran : 120(ps)             0(ps)
Min. Rise Sink Tran   : 340.6(ps)           0(ps)
Min. Fall Sink Tran   : 192(ps)             0(ps)
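The skew figures in this report section follow directly from the phase delay ranges. The sketch below recomputes them from the values in the example and compares them with the required limits. The variable names are illustrative, and treating skew as the width of the phase delay range is an assumption that happens to match the reported numbers.

```python
import math

# Values copied from the example report (ps).
rise_phase_delay = (459.6, 477.7)
fall_phase_delay = (432.8, 446.7)
required_delay = (0.0, 5000.0)
required_skew = 250.0


def range_width(delay_range):
    """Skew taken as max minus min of a phase delay range, rounded to 0.1 ps."""
    low, high = delay_range
    return round(high - low, 1)


rise_skew = range_width(rise_phase_delay)
fall_skew = range_width(fall_phase_delay)

# Check the actual values against the required columns of the report.
delay_ok = all(
    required_delay[0] <= low and high <= required_delay[1]
    for low, high in (rise_phase_delay, fall_phase_delay)
)
skew_ok = max(rise_skew, fall_skew) <= required_skew

print(rise_skew, fall_skew)  # 18.1 13.9
print(delay_ok, skew_ok)     # True True
```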


CTS Report (continued)
‹ Maximum transition time violation

Example
***** Max Transition Time Violation *****
Pin Name          (Actual)           (Required)
-----------------------------------------------------------------
reg/CK            [406 353.5](ps)    400(ps)
reg2/CK           [406 353.4](ps)    400(ps)
clk0__L6_I2/A     [345.5 288.1](ps)  300(ps)
clk0__L7_I4/A     [346.2 296.3](ps)  300(ps)
clk0__L9_I11/A    [351.6 299.9](ps)  300(ps)
clk0__L9_I10/A    [361.5 305.9](ps)  300(ps)
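A violation table like this one can be ranked with a few lines of script. The sketch below parses the rows from the example and reports the overshoot per pin. It assumes the two bracketed numbers are the rise and fall transition times, which the slide does not state explicitly; the regular expression and names are illustrative.

```python
import math
import re

REPORT_ROWS = """\
reg/CK [406 353.5](ps) 400(ps)
reg2/CK [406 353.4](ps) 400(ps)
clk0__L6_I2/A [345.5 288.1](ps) 300(ps)
clk0__L7_I4/A [346.2 296.3](ps) 300(ps)
clk0__L9_I11/A [351.6 299.9](ps) 300(ps)
clk0__L9_I10/A [361.5 305.9](ps) 300(ps)
"""

# pin name, two actual values in brackets, then the required limit
ROW = re.compile(r"(\S+)\s+\[([\d.]+)\s+([\d.]+)\]\(ps\)\s+([\d.]+)\(ps\)")

violations = []
for line in REPORT_ROWS.splitlines():
    match = ROW.match(line)
    if not match:
        continue
    pin = match.group(1)
    first, second, required = (float(g) for g in match.group(2, 3, 4))
    violations.append({
        "pin": pin,
        "overshoot": max(first, second) - required,  # worst edge vs. limit
        "both_edges": first > required and second > required,
    })

worst = max(violations, key=lambda v: v["overshoot"])
both = [v["pin"] for v in violations if v["both_edges"]]

print(worst["pin"], worst["overshoot"])  # clk0__L9_I10/A 61.5
print(both)                              # ['clk0__L9_I10/A']
```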

CTS Report (continued)


‹ Skew distribution information

Example
cgen/i_5/Y delay[0 0] ( CK__L1_I0/A )
********** Skew Distribution **********
LEVEL 1 Buffer:
Input Delay Range     Nr of Buffers
[0.6 0.6]             1
(max, min, avg, skew) = (0.6(ps) 0.6(ps) 0.6(ps) 0(ps))
-------------------------------------------------------
Output Delay Range    Nr of Buffers
[195.5 195.5]         1
(max, min, avg, skew) = (195.5(ps) 195.5(ps) 195.5(ps) 0(ps))
LEVEL 2 Buffer:
-------------------------------------------------------
Input Delay Range     Nr of Buffers
[212.8 212.8]         1
(max, min, avg, skew) = (212.8(ps) 212.8(ps) 212.8(ps) 0(ps))


Topics in This Module
‹ What is clock tree synthesis

‹ How to create a clock tree specification file

‹ Analyzing a CTS report

‹ Low-power clocking techniques

Need for Low-Power Clocking


‹ Clock distribution network takes a significant fraction of the power consumed
by a chip.

‹ Significant power can be wasted in transitions within blocks, even when their
output is not needed.

Low-Power Clocking Technique
Gated clocks
‹ Involves adding logic gates to the clock distribution tree

‹ Prevents switching in the areas of the chip not being used

‹ Exact savings are very design dependent, but around 20-30% is often
achievable.

[Figure: a clock source driving flip-flops (FF), with one branch of the tree marked as the gated clock section.]

Summary
‹ Clock signal is needed to synchronize all memory elements in a chip.

‹ Clock tree has to be created to provide clock signal with the least
amount of skew and insertion delay.

‹ Skew affects the amount of clock period available for the
combinational logic.

‹ Useful skew takes advantage of the difference of arrival time at
flip-flops to correct datapath timing violations.

‹ Clock tree has to provide an acceptable input transition to all the
flip-flops.

‹ Low-power designs make use of gated clocks.

Testing Your Understanding
True or false

1. Clock tree adds more wire into the design as compared to a clock
mesh.

2. Propagated clock signal arrives at all flip-flops within a design at the
exact same time.

3. Skew can be good or bad for a design.

4. A clock tree specification file is needed in the case of both manual
and automatic CTS.

5. Default behavior of clock tree synthesis is to only place the clock
buffers into the design and not route them.

Routing

Module 7

June 16, 2008

What Is the Difference?

[Figure: two versions of a circuit built from DFF, BUF, NAND, NOR, INV, and CKBUF cells, shown one above the other for comparison.]

Module Objectives
In this module, you will be able to

‹ Analyze benefits of timing-driven versus congestion-driven routing

‹ Predict downstream detail routing issues by running trial routing

‹ Explain how trial routing works and its benefits

‹ Explain what is meant by incremental routing, process antenna, and SI
fixing

Discussion Question
Recall the flowchart diagram of the design flow steps to take an idea to
product (chip)

‹ In what part of the flow does routing occur?

[Figure: design flow diagram with the steps and inputs/outputs around routing left blank.]

Topics in This Module
‹ Routing

‹ Various types of routing

‹ Comparison of different types of routing

Routing
‹ Definition

‹ Routing inputs and outputs

‹ Router

‹ Types of routers

‹ Routing tracks

‹ Goals of routing

‹ Steps involved in routing

‹ Congestion

What Is Routing?
After placement of the individual standard cells and macros, the
connections between the pins of the cells need to be formed using metal
wires and vias. All wires connecting the placed components have to obey the
design rules.

Definition: Process of connecting the pins of the standard cells, macros, and
IOs of a digital design using specific metal layers in the process technology to
match the schematic.

Routing: Inputs and Outputs


‹ Inputs
‰ Verilog® netlist
‰ Timing library (.lib) contains the timing information for each discrete
logic gate or macro
‰ Technology library (LEF) contains information about the routing layers and
their rules
‰ Physical cell library (LEF) contains information about the shape and
connectivity of the technology library cells
‰ Placement information such as a DEF file
‰ Routing guides from clock tree synthesis (CTS) (optional)

‹ Outputs
‰ Routed design
‰ Congestion table

[Figure: routing takes the netlist, tech file, DEF file, and physical library as inputs and produces the routed design and the congestion table.]


Where Does Routing Fit into the Implementation Flow?

[Figure: implementation flow from RTL through logic synthesis to gates, then place and route (floorplanning, power planning, placement, clock tree synthesis, route) to GDSII, with timing closure, static timing analysis, and test alongside. Route follows clock tree synthesis.]

Router
To handle the various cost functions and constraints of deep submicron
layouts, a router needs the capability to handle
‹ Variable wire widths

‹ Variable spacing requirements

‹ Shielding and interleaving

‹ Minimum area rules

‹ Process antenna rules

Types of Routers: Grid-Based
‹ Most commonly used router, because it is fast and mature
‹ Performs well for flat designs of less than 3 million gates and for 130 nm
and larger designs
‹ Used for block-based designs
‹ Relatively high-speed router
‹ Superimposes a mesh-like grid running horizontally and vertically over
the routing area
‹ Each vertical and horizontal grid intersection point on the mesh is
maintained as a pointer in memory
‹ The larger the design grows or the smaller the process geometry, the
more grid points need to be allocated in memory and the more time
routing takes
‹ A trial router is a type of grid-based router used to quickly perform
global and detail routing to estimate congestion and timing at the early
stages of the physical implementation flow
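The memory argument for grid-based routers can be made concrete by counting grid intersections. The sketch below is an illustration only: the die size, the pitches, and the function name `grid_points` are hypothetical, and a uniform pitch in both directions is assumed.

```python
def grid_points(width_nm, height_nm, pitch_nm, layers=1):
    """Count grid intersections on a uniform routing grid.

    Assumption: the same pitch in x and y, with grid lines on both die edges.
    """
    columns = width_nm // pitch_nm + 1
    rows = height_nm // pitch_nm + 1
    return columns * rows * layers


DIE_NM = 1_000_000  # hypothetical 1 mm x 1 mm routing area

coarse = grid_points(DIE_NM, DIE_NM, pitch_nm=400)
fine = grid_points(DIE_NM, DIE_NM, pitch_nm=200)

print(coarse)  # 6255001
print(fine)    # 25010001
```

Halving the pitch roughly quadruples the number of points per layer, which is the scaling pressure described in the bullets above.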

Types of Routers: Shape-Based


‹ Limited to small designs or top-level designs with approximately
20,000 to 30,000 nets

‹ Does not need to adhere to the concept of grid and so is not limited by
grid restrictions

‹ Preferred solution for top-level routing and can handle complex and
custom requirements

Types of Routers: Graph-Based
‹ Combines the performance characteristics of a grid-based router with the flexibility of a shape-based router
‹ Fast tool capable of handling all aspects of routing complex multi-million-gate designs, both at block level and top level
‹ Views a design similarly to a grid-based router in that there are grid lines in both the vertical and horizontal directions; however, it considers these grids only as a guideline for routing
‹ Does not require that every grid intersection on the design be allocated a pointer in memory; only the grid points in the vicinity of the routing task are considered as needed
‹ Through efficient memory handling, graph-based routers can handle significantly larger design sizes

Types of Routers
[Figure: router types plotted by speed and capacity versus flexibility]
‹ Graph-based router: super threading, multi-CPU; 100-million-gate SoC designs with hierarchy; 65-nm variable pitch
‹ Grid-based routers: designs of one million standard cells; best for flat 130 nm and above
‹ Shape-based routers: 60–80K nets; structured custom (top level)


Routing Tracks
‹ Metal routes must meet minimum width and spacing design rules to prevent open and short circuits during fabrication
‹ In grid-based routing systems, design rules determine the minimum center-to-center distance for each metal layer
‹ Congestion occurs if there are more wires to be routed than available tracks
‹ A detailed routing track is a track for actual wire locations
‹ A global routing track is a coarser track for global routing
[Figure: detailed routing track and global routing track]

Goals of Routing
‹ Responsible for functionally connecting all signal nets, power nets, and buses in a design
‹ Route the design quickly and leave it free from design rule check (DRC), layout versus schematic (LVS), and signal integrity (SI) errors
‹ Effectively meet design for manufacturability and overall timing specifications

Steps Involved in Routing
‹ Global routing
‰ Assigns nets to specific metal layers and global routing cells
‰ Tries to avoid congested global cells while minimizing detours
‰ Tries to avoid prerouted power and ground signals, placement blockages, and routing blockages
‹ Track assignment
‰ Assigns each net to a specific track
‰ Tries to avoid a large number of vias
‰ Operates on the entire design at once
‹ Detail routing
‰ Tries to fix DRC violations using a fixed-size, small area known as an SBox
‰ Traverses the whole design box by box until the entire routing pass is complete
‹ Search and repair
‰ Fixes any shorts or violations that are present

What Is Congestion?
‹ Congestion occurs when
‰ Design is densely routed
‰ More wires are needed at a location than the number of available tracks
‹ Congestion is shown as red diamond-shaped markers after an initial trial route

Analyzing Congestion
Actions to consider:
‹ Block placements can be adjusted to make sure that connecting pins face each other.
‹ Check for obstructions that may cause the congestion in the area.
‹ A partial placement blockage can be used to lower the congestion in a specific area.
‹ Read the log files for congestion information during global route, as well as violation and iteration information during detail route.

Congestion Analysis Table


NanoRoute Groute congestion analysis table from the encounter.log file (the worst case is on Metal 2):
# Congestion Analysis:
#
#            OverCon        OverCon       OverCon       OverCon
#            #Gcell         #Gcell        #Gcell        #Gcell        %Gcell
# Layer      (1-2)          (3-4)         (5-6)         (7-17)        OverCon
# ---------------------------------------------------------------------------
# Metal 1    1625(2.35%)    34(0.05%)     0(0.00%)      0(0.00%)      (2.40%)
# Metal 2    11546(16.7%)   6353(9.19%)   4728(6.84%)   3787(5.48%)   (38.2%)
# Metal 3    8500(12.3%)    904(1.31%)    37(0.05%)     1(0.00%)      (13.7%)
# Metal 4    14951(21.6%)   764(1.11%)    20(0.03%)     0(0.00%)      (22.8%)
# Metal 5    8473(12.3%)    37(0.05%)     0(0.00%)      0(0.00%)      (12.3%)
# Metal 6    854(1.24%)     0(0.00%)      0(0.00%)      0(0.00%)      (1.24%)
# ---------------------------------------------------------------------------
# Total      45949(11.1%)   8092(1.95%)   4785(1.15%)   3788(0.91%)   (15.1%)
#
# The worst congested Gcell OverCon (routing demand over resource in number of tracks) = 17
Note: Overflow/OverCon = (Demand – Supply) per gcell
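The overflow definition in the note, and the way the table groups gcells into overflow ranges, can be sketched in a few lines of Python. This is only an illustration: the function names and the demand and supply values are hypothetical, not data from the log, and clamping the overflow at zero is an assumption about how a gcell with spare tracks is counted.

```python
from collections import Counter

# Overflow ranges as they appear in the table header: (1-2), (3-4), (5-6), (7-17).
BUCKETS = [(1, 2), (3, 4), (5, 6), (7, 17)]


def gcell_overflow(demand, supply):
    """Overflow (OverCon) of one gcell, in tracks: demand minus supply.

    Clamping at zero is an assumption: a gcell with spare tracks is
    treated as not over-congested.
    """
    return max(0, demand - supply)


def bucket_overflows(gcells):
    """Count over-congested gcells per overflow range.

    gcells is a list of (demand, supply) pairs for one metal layer.
    """
    counts = Counter()
    for demand, supply in gcells:
        over = gcell_overflow(demand, supply)
        for low, high in BUCKETS:
            if low <= over <= high:
                counts[(low, high)] += 1
    return counts


# Hypothetical gcells: (tracks demanded, tracks available).
example = [(10, 10), (12, 10), (14, 10), (27, 10), (8, 10)]
counts = bucket_overflows(example)
worst = max(gcell_overflow(d, s) for d, s in example)
print(counts, worst)
```

In this made-up layer, one gcell falls into each of the (1-2), (3-4), and (7-17) ranges, and the worst overflow is 17 tracks, the same kind of figure the last log line reports.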


Violation Trends
#start 19th optimization iteration ...
# completing 10% with 98611 violations
# completing 20% with 98648 violations
# completing 100% with 98663 violations
# number of violations = 98663
#Complete Detail Routing.
#Total wire length = 559006793 um.
#Total half perimeter of net bounding box = 471147199 um.
#Total wire length on LAYER MT1 = 18600362 um.
…
#Total number of vias = 31662170
#Total number of vias on LAYER MT1 to MT2 = 11200471
#Total number of DRC violations = 98663
#Total number of violations on LAYER MT1 = 61945
#Total number of violations on LAYER MT2 = 5161
#Total number of violations on LAYER MT3 = 280
#Total number of violations on LAYER MT4 = 258
#Total number of violations on LAYER MT5 = 124
#Total number of violations on LAYER MT6 = 414
#Total number of violations on LAYER MT7 = 30481

Steps
‹ Run the search and repair up to the 19th iteration.
‹ Check the log file to see on which layers the violations occur.
‹ Check the violations graphically if there are many violations (>1000).
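Checking on which layers the violations occur can be scripted. The sketch below assumes the per-layer lines have exactly the wording shown in the log excerpt on this slide; the regular expression and the function name are illustrative and are not part of any tool.

```python
import re

# Per-layer summary lines copied from the log excerpt on this slide.
LOG = """\
#Total number of DRC violations = 98663
#Total number of violations on LAYER MT1 = 61945
#Total number of violations on LAYER MT2 = 5161
#Total number of violations on LAYER MT3 = 280
#Total number of violations on LAYER MT4 = 258
#Total number of violations on LAYER MT5 = 124
#Total number of violations on LAYER MT6 = 414
#Total number of violations on LAYER MT7 = 30481
"""

# Matches only the per-layer lines, not the overall DRC total.
PER_LAYER = re.compile(r"#Total number of violations on LAYER (\w+) = (\d+)")


def violations_by_layer(log_text):
    """Return a {layer name: violation count} mapping parsed from the log text."""
    return {layer: int(count) for layer, count in PER_LAYER.findall(log_text)}


by_layer = violations_by_layer(LOG)
total = sum(by_layer.values())
# Layers sorted from most to fewest violations; keep the two worst.
worst_two = sorted(by_layer, key=by_layer.get, reverse=True)[:2]
print(total, worst_two)
```

For this excerpt the per-layer counts add up to the reported total of 98663, and MT1 and MT7 carry most of the violations, which suggests looking at those two layers first.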

Topics in This Module


‹ Routing
‹ Various types of routing
‹ Comparison of different types of routing

Various Types of Routing
‹ Trial route
‹ Global routing
‹ Detail routing
‹ Timing-driven routing
‹ Congestion-driven routing
‹ Incremental routing
‹ Process Antenna Effect (PAE)-aware routing
‹ SI-aware routing
‹ Clock routing
‹ Super-threading routing
‹ Diagonal routing

What Is Trial Route?


Performs quick global and detailed routing to
‹ Estimate and view routing congestion: Produces a congestion map that is viewed to get early feedback on whether a design is routable
‹ Estimate parasitic values for optimization and timing analysis: Creates actual wires to get a good representation of RC and coupling

Trial Route Effort Level
‹ Prototyping
‰ Runs quickly to gauge the feasibility of the netlist
‹ Medium effort
‰ Default selection
‰ Components in the design might not be routed at legal locations
‹ High effort
‰ Runs additional iterations to lower congestion
‹ Low effort
‰ For quick routing; it completes without congestion detouring
‰ At this effort level, you throw away the route information
‰ This mode is typically used only when you partition the design
‹ Use “Prototype” or “Low” effort if you want to have a very quick look at the routability of the design. Use “Medium” effort in most cases, and “High” effort if “Medium” is showing congestion.

Trial Route

Advantages
‹ Routes the design quickly
‹ Estimates congestion and parasitic data early in the design cycle
‹ Creates a congestion table showing the amount of congestion in each metal layer
Disadvantages
‹ Does not fix DRC violations or give DRC clean routing results
‹ Routes are only used to estimate parasitic values for timing analysis and not signal integrity analysis
‹ Cannot fix timing violations

Discussion Questions
‹ What are the benefits of running trial route?
‹ What issues can be predicted by running trial route?

What Is Global Routing?


‹ Guides the detailed router in large designs
‰ Creates a coarse routing plan for detailed router to follow
‰ Does not create actual routing wires
‹ May perform quick, initial detail routing
‹ Commonly used in cell-based design, chip assembly, and datapath
‹ Also used in floorplanning and placement

Global Routing Goals
‹ Minimize the wire length
‰ Total wire length calculated by global router should be within a few percentage points of that estimated by the placer
‹ Minimize worst congestion value
‰ Congestion value is associated with each boundary crossing (edge) between adjacent global routing cells (gcells) on a specific layer
‹ Optimize routes for timing and signal integrity
‰ Tries to meet hold and setup timing
‰ Minimizes design rule violations

Global Routing Steps


‹ Router breaks the routing portion of the design into rectangles called gcells
‹ Router then assigns the signal nets to the gcells
‹ Router attempts to find the shortest path through the gcells
‰ No actual connections are made
‰ No nets are assigned to specific tracks within the gcells
[Figure: grid of gcells with a net routed from Start to End]


Global Routing Steps (continued)
‹ Tries to avoid assigning more nets to a gcell than the tracks can accommodate
[Figure: four nets routed from Start to End across the gcell grid]

Global Routing Steps (continued)


‹ Router then generates a map of the gcells (congestion map)
‰ Congestion map uses colors to indicate whether there are too few, too many, or the correct number of nets assigned to the gcells
‰ gcells are marked over-congested if router assigns too many nets to a gcell
[Figure: four nets routed from Start to End across the gcell grid; one edge has 3 crossings and another edge has 2 crossings]
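The edge-crossing counts in the figure can be reproduced with a small sketch. It assumes each net has already been assigned a path through the gcells, written as a list of (column, row) coordinates. The net names, the paths, and the edge capacity of 2 are hypothetical, and this is not how any particular router stores its data.

```python
from collections import Counter


def edge_crossings(net_paths):
    """Count how many nets cross each boundary (edge) between adjacent gcells."""
    crossings = Counter()
    for path in net_paths.values():
        for a, b in zip(path, path[1:]):
            # Sort the two gcells so the crossing direction does not matter.
            crossings[tuple(sorted((a, b)))] += 1
    return crossings


def overcongested_edges(crossings, capacity):
    """Return the edges whose crossing count exceeds the assumed capacity."""
    return {edge: n for edge, n in crossings.items() if n > capacity}


# Hypothetical global routes: each net is a list of gcell coordinates.
paths = {
    "n1": [(0, 0), (1, 0), (2, 0)],
    "n2": [(0, 0), (1, 0), (1, 1)],
    "n3": [(1, 0), (0, 0), (0, 1)],
}
crossings = edge_crossings(paths)
hot = overcongested_edges(crossings, capacity=2)
print(crossings, hot)
```

All three nets cross the boundary between gcells (0, 0) and (1, 0), so with an assumed capacity of two that edge is the one a congestion map would flag.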


Congestion Map GUI
Congestion maps from trial route and global route are displayed differently
[Figure: Trial Route View and Global Route View]

What Is Detail Routing?


‹ Connects all pins in each net
‹ Must understand most or all design rules
‹ Necessary in all applications
‹ Goal is to complete all of the required interconnects without violations
‹ All nets will be routed, even if they contain violations (It is better to have a route with a violation than no route at all.)

Detail Routing Steps
‹ Router divides the chip into areas called switch boxes (SBoxes)
‰ SBoxes align with gcell boundaries
‹ Router follows global routing plan
‰ Lays down actual wires that connect the pins to the corresponding nets
‰ Creates shorts or spacing violations rather than leave unconnected nets
‹ Router runs search and repair
‰ Locates the shorts and spacing violations
‰ Reroutes affected areas to eliminate as many violations as possible
‹ Runs post-route optimization
‰ Runs rigorous search-and-repair steps
‹ Stops once it cannot make further progress on routing the design

Timing-Driven Routing
‹ Routing along the timing-critical path is given priority
‹ Creates shorter and faster connections along the critical path
‹ Non-critical paths are routed around critical paths
‰ Reduces routing congestion problems for critical paths
‰ Does not adversely impact timing of non-critical paths
‹ Input files needed for timing-driven routing
‰ Physical libraries in LEF
‰ Timing library in .lib format
‰ Timing constraints in .sdc format or a timing graph
‰ Extended capacitance table
‰ Verilog netlist
‰ Placed design in DEF


Congestion-Driven Routing
‹ Router tries to reduce congestion
‰ Routing occurs based on a cost function
‰ Congestion reduction is given the highest priority
‹ Nets that are in the congested area are spread apart and routed through other areas

Discussion Questions
‹ Why would you run timing-driven routing?
‹ Why would you run congestion-driven routing?
‹ What are the tradeoffs between the two?
‹ What if your design is both congested and not meeting timing? Which routing type would you run first and why?

What Is Incremental Routing?
‹ Provides an incremental rip-up and reroute capability
‹ Reroutes partial routes and nets without routes
‹ Retains fully prerouted nets and pin-to-pin paths
‹ Might use dangling paths to complete routes, but removes dangling wires left over from global routing
‹ Keeps connectivity within the bounding box, but does not constrain layers or positions
‰ The router might change the routing path of another net and route it on a different layer or in a different position.
‰ The router does not support rerouting of wires with the FIXED keyword. Change FIXED to ROUTED to reroute these wires.

PAE-Aware Routing
‹ During manufacturing, static charge builds up on metal traces
‰ Metal with static charge accumulated on it, when connected, will discharge onto a gate, passing high current through it.
‰ The discharge can damage the oxide that insulates the gate and cause the chip to fail.
‰ The antenna ratio is the ratio of metal area to gate area; the design rules set the maximum allowable value.
‰ The router calculates the antenna ratio to determine the extent of PAE.
‹ Process antenna violations are fixed when the router finds a net with an antenna ratio for a specified layer that exceeds the maximum allowed value.
‹ Router fixes process antenna violations by
‰ Inserting diodes to provide an alternate path to discharge static charge and protect the gate
‰ Changing (jogging) the routing layers connected to a gate to decrease the area of a metal layer connected to the gate to meet the antenna ratio
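The antenna check described on this slide reduces to a ratio and a comparison. The sketch below is a deliberate simplification: it treats one metal layer and one gate at a time, ignores cumulative or per-layer rule variants, and uses made-up areas and a made-up limit of 300. The function names are hypothetical.

```python
def antenna_ratio(metal_area, gate_area):
    """Ratio of metal area connected to a gate to the area of that gate."""
    if gate_area <= 0:
        raise ValueError("gate area must be positive")
    return metal_area / gate_area


def violates_antenna_rule(metal_area, gate_area, max_ratio):
    """True when the ratio exceeds the maximum allowed value for the layer."""
    return antenna_ratio(metal_area, gate_area) > max_ratio


# Hypothetical numbers: areas in square microns, limit taken as 300.
MAX_RATIO = 300.0
before = antenna_ratio(metal_area=600.0, gate_area=1.5)
# Jogging to another layer shortens the metal on this layer, for example
# halving the area that is connected to the gate.
after = antenna_ratio(metal_area=300.0, gate_area=1.5)
print(before, violates_antenna_rule(600.0, 1.5, MAX_RATIO))
print(after, violates_antenna_rule(300.0, 1.5, MAX_RATIO))
```

With these illustrative numbers the ratio drops from 400 to 200 after the jog, which takes the net from violating to meeting the assumed limit.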


Tradeoffs of Fixing PAE
‹ Reverse biased diode insertion
‰ Causes leakage
‰ Increases area
‰ Timing penalties
‹ Bridging (breaking antenna by hopping to higher layer)
‰ Extra wiring
‰ Congestion
‰ More vias are created

What Are SI Effects?


‹ Nanometer designs (130 nm or less) suffer from increased sensitivity to signal integrity (SI) effects such as
‰ Crosstalk-induced delay changes
‰ Functional failures caused by crosstalk glitches
‹ Caused by
‰ Coupling capacitance
‰ Decreased interconnect pitch and feature size
‰ Higher clock frequencies
‰ Lower supply voltages

SI-Aware Routing
‹ Crosstalk effects such as glitch and delay are measured after the physical wires are made available.
‹ Router tries to reduce crosstalk between wires.
‹ Creates routes with reduced coupling capacitances by
‰ Parallel wire minimization: Limiting the distance that two wires travel adjacent to each other
‰ Layer switching: Changing the track assignment for a wire so that potential victim nets can be moved away from a strongly driven signal net
‰ Net shielding: Using power and ground lines to shield critical high-speed signals such as clocks
‰ Track reassignment: Assign tracks to parallel wires that are further apart with in-between tracks assigned to shorter, less noise-sensitive wires
‰ Soft spacing: Making use of available free space to spread wire segments apart

Discussion Questions
‹ Why would you use incremental routing?
‹ For process antennas, how is the router constrained to fix these?
‹ How can a router make choices that will reduce the effect of signal integrity?

Clock Routing
‹ Usually routed the same way as signals, but we can choose to route clock nets by themselves before routing other nets.
‹ Clock nets are given priority during global routing.
‹ Clock nets are routed as straight as possible.
‹ When clock nets are routed, one track of spacing can be added around these nets to reduce coupling capacitance. Shielding can also be added to a clock net for additional signal integrity.
‹ Clock routes can be marked as fixed, so that post-route optimization will not reroute the clocks and alter the clock skew, timing, etc.

Wide Wire Routing


‹ Routing is done with wider wires for post-route yield optimization.
‹ Router widens wires where resources are available.
‰ Does not add DRC violations
‰ Does not affect timing
‰ Does not add antenna violations
‹ Wire widening uses non-default rules.

Super-Threading Routing
Portions of the design flow can be accelerated using multiple-CPU processing. There are three modes:
‹ Multiple threading
‰ Job is divided into several threads
‰ Multiple processors in a single machine process each thread concurrently
‹ Distributed processing
‰ Job is processed by two or more networked computers running concurrently
‹ Super threading
‰ Combination of multithreading and distributed processing
‰ Delivers scalable performance and capacity
[Figure: jobs, threads, and processors under Multiple Threading, Distributed Processing, and Super Threading]

Super-Threading Routing (continued)


‹ Combines advantages of multi-threaded routing with flexibility of distributed parallel routing
‹ Boosts routing performance on 600K to 400M gate designs by up to 10X
‹ Reduces design cycles significantly without sacrificing quality
‹ Tasks are partitioned among different CPUs automatically by the router
‹ Speedup is nearly linear as the number of CPUs grows

What Is Diagonal Routing?
‹ Some routers take advantage of 45-degree “diagonal” routes on certain metal layers.
‹ M1 and M2 are “orthogonal” so that the connections to the standard cells are preserved, while M7 and M8 (the top layers) are also orthogonal for power grid creation.
‹ The middle layers can be 45 degrees offset and alternate direction between metal layers.
Layer directions shown in the figure:
M8 – Vertical
M7 – Horizontal
M6 – 45 Degree Left
M5 – 45 Degree Right
M4 – 45 Degree Left
M3 – 45 Degree Right
M2 – Vertical
M1 – Horizontal

Diagonal Routing: Pros and Cons


Pros
‹ Can achieve good routing quality, avoid routing congestion, and improve timing
‹ May decrease the overall area of the design because of the routing efficiency
Cons
‹ Must have a special library and vendor who will accept the diagonal routes
‹ Must have special tools for place/route, as well as physical verification
[Figure: metal layer stack with M1 horizontal and M2 vertical, M3 through M6 alternating 45 degrees right and left, M7 horizontal, and M8 vertical]


Topics in This Module
‹ Routing
‹ Various types of routing
‹ Comparison of different types of routing

Global Route vs. Detail Route

Global Route | Detail Route
Runs on the entire design. | Can route the entire design, an area, or selected nets.
Finds generalized pathways without laying down actual wires. | Lays down physical wires based on the global routing plan.
Iterative passes are made to optimize global routing, shorten wire length, and minimize use of vias. | Fixes DRC violations during search and repair routing.
Congestion map is updated. | If antenna rules are included in the LEF, antenna repair will also be done during detail routing.


Timing-Driven vs. Congestion-Driven Routing

Timing-Driven | Congestion-Driven
Router routes critical nets to meet timing constraint. | Router routes nets keeping low congestion as a high priority.
A critical net will be forced to be routed as short as possible. | Nets will be forced to be spread apart from a heavily congested area.
May create congestion if many critical nets have to be forced into a small channel. |

Summary
‹ After placement, a trial route is run to get an estimate of congestion and parasitic values.
‹ Early detailed routing provides physical information necessary for prevention of problems for physical synthesis.
‹ Congestion is displayed as a red diamond after trial route, and colored lines after global route.
‹ Clock nets are routed first and fixed into position so that the router does not alter them in subsequent runs.
‹ The number of wires assigned to a gcell should not exceed the number of tracks available.

Testing Your Understanding
True or false
1. Congestion map is created after running detail routing.
2. Shape-based routers are limited to small designs and are the preferred solution for top-level routing.
3. Global router provides guidance to the detailed router.
4. Detailed routing stops once the congestion map is created.
5. Super threading increases the runtime for the routing phase.

Learning Activity
In this activity, you will
‹ Study metrics from several routing log files
‹ Identify potential downstream issues
‹ Present your results to the class
20 minutes for activity
10 minutes for debriefing

Power Consumption and
Power Grid Analysis
Module 8

June 16, 2008

How Has Power Influenced Technology?

Module Objectives
In this module, you will
‹ Identify the inputs and outputs of power-consumption and power grid analysis tools
‹ Explain the three components of power (leakage, switching, internal)
‹ Articulate the difference between static and dynamic power consumption
‹ Identify the types of power grid analysis, the difference between static and dynamic power grid analysis, and what each is used for
‹ Recognize low-power design issues and apply three power-saving design techniques

Discussion Questions
‹ What affects power consumption in a chip?
‹ How does power affect the cost of a chip?

Topics in This Module
‹ Power consumption and analysis (PowerMeter power calculation functionality)
‹ Low-power design
‹ Power grid analysis (VoltageStorm® power and power rail verification)

Power Consumption and Analysis


‹ What is power consumption?
‹ Inputs and outputs for power consumption calculation
‹ Static power consumption
‹ Dynamic power consumption

What Is Power Consumption?
Power consumption is a critical design criterion. Today, for most system-on-a-chip (SoC) designs, the power budget is one of the most important design goals of the project.
‹ Definition: Power consumption is the amount of energy over time that must be supplied to a circuit to maintain normal operation. Power consumption is measured in watts (W).
‹ Example: The increasing speed and complexity of today’s microprocessor chips have resulted in a significant increase in the power requirement, which determines the battery life in hours for portable devices.

Why Is Power Consumption an Issue?


‹ Exponential increase in chip density
‰ Tens of millions of gates implemented on a reasonably small die
‰ Increase in power density and total power dissipation
‰ Limits of what packaging, cooling, and other infrastructure can support exceeded
‰ Battery life has declined as features have been added faster than power (per feature) has been reduced.
‹ Deep submicron technology, 90 nm and below
‰ Leakage current is increasing dramatically.
‰ Microprocessor chips can dissipate up to 100-150W of power.
‰ Power density causes a large number of local hot spots on the die.
 Poses reliability problems (mean time to failure decreases exponentially with temperature)
 Timing degrades and leakage increases with temperature
‹ These problems are all expected to get worse as we move to the next technology nodes.
[Figure: leakage power (W) and active power (W) versus technology node (250, 180, 130, 90, 70 nm), with the power axis running from 0 to 250 W. Source: Intel]


Benefits of Reducing Power Consumption
Reducing system power consumption
‹ Extends battery life in portable systems
‹ Reduces system temperature
‰ Improves timing
‰ Reduces leakage
‹ Reduces system fan noise (on some models)
‹ Provides better reliability
‹ Lowers cooling cost
‹ Simplifies power supply and delivery

Components of Power Consumption


‹ Static power component due to leakage
‰ Leakage power: Power consumed when cells are not switching
‹ Dynamic power component: Related to charging and discharging of load capacitance and due to a path from Vdd to ground
‰ Switching power: Power consumed through charge and discharge of gate capacitance. The total gate capacitance consists of the sum of the capacitance of internal gate nodes and capacitance of the gate output load.
‰ Short circuit power: Power consumed when both N and P devices are ON at the same time, so that a current path is established from power rail to ground. It is a function of output load and input slew.

Ptotal = Pstatic + Pdynamic

Power Consumption and Analysis
‹ What is power consumption?
‹ Inputs and outputs for power consumption calculation
‹ Static power consumption
‹ Dynamic power consumption

What Is a Power Library?


‹ Definition: A power library is a collection of cells described in a particular format to represent the power characteristics for those cells.
‹ Example: The ASIC (application-specific integrated circuit) vendor of a design library (standard cells and macros) provides its customer with a Liberty (.lib) version of its cells, which, apart from timing information, contains power information that the power analysis tool can use to calculate leakage and active power consumption for the cells.
Example: power information in a .lib file
    cell (INVXL) {
      cell_footprint : inv;
      area : 6.6528;
      pin(A) {
        direction : input;
        capacitance : 0.00270;
      }
      pin(Y) {
        direction : output;
        capacitance : 0.0;
        function : "(!A)";
        internal_power() {
          related_pin : "A";
          rise_power(energy_template_7x7) {
            index_1 ("0.0250, 0.0800, 0.3000, 0.7000, 1.2000, 1.7000, 2.3000");
            index_2 ("0.00018, 0.01050, 0.01925, 0.04200, 0.07350, 0.11550, 0.15575");
            values ( ……… );
          }
          fall_power(energy_template_7x7) {
            index_1 ("0.0250, 0.0800, 0.3000, 0.7000, 1.2000, 1.7000, 2.3000");
            index_2 ("0.00018, 0.01050, 0.01925, 0.04200, 0.07350, 0.11550, 0.15575");
            values ( ……… );
          }
        }
        timing() {
          …
        }
        max_capacitance : 0.15575;
      }
      cell_leakage_power : 0.0173;
    }


Inputs and Outputs for Power Consumption Calculation
‹ Power libraries provide the power analysis tool with the following information:
‰ Functional information (.lib)
‰ Pin capacitances (.lib)
‰ Leakage power (.lib)
‰ Internal power tables (.lib)
‰ Internal decoupling cap (.cl)
‰ Physical size and location of power ports (.cl)
‰ Internal power net resistance (.cl)
‰ Tap currents (.cl)
‹ Function of a power analysis tool
‰ Calculates instance-based static and dynamic power consumption
‰ Runs in two modes:
 Vector driven: Use actual switching activity from a VCD file
 Vector-less: Probabilistically project the activity throughout a design
‹ We use the results of the power consumption tool to perform static and/or dynamic power grid analysis
‹ Produce reports on the power consumed by each cell, cell type, or hierarchical block in the design
[Figure: .libs, .cl, SPEF, SDC, TWF, and VCD feed the Power Consumption Tool; its Power Consumption output goes to the Power Grid Analysis Tool]

What Is Switching Activity?


‹ Switching activity (α) is the number of transitions (0-to-1 and 1-to-0) for every net in a circuit when input stimuli are applied.
‹ Within a given CLK period, how often will an input switch in relationship to that period?
[Figure: waveforms of the clock cycles and net A]
‹ In the example, net A switches two times but the clock switches six times.
‹ Activity of the clock is set at 1 since the clock is always switching.
‹ Calculation would be 2/6 = 0.333 (net A’s activity).
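The 2/6 calculation can be reproduced by counting transitions in sampled waveforms. The sample values below are invented to match the counts stated on the slide (two transitions on net A, six on the clock), since the original waveform figure is not recoverable. The function names are hypothetical.

```python
def count_transitions(samples):
    """Number of 0-to-1 and 1-to-0 transitions in a sequence of logic samples."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def switching_activity(net_samples, clock_samples):
    """Net transitions relative to clock transitions over the same window."""
    return count_transitions(net_samples) / count_transitions(clock_samples)


# Illustrative waveforms sampled at every clock edge.
clock = [0, 1, 0, 1, 0, 1, 0]  # six transitions
net_a = [0, 0, 1, 1, 1, 0, 0]  # two transitions
activity = switching_activity(net_a, clock)
print(round(activity, 3))
```

Applying the same function to the clock against itself gives an activity of 1, consistent with the slide's statement that the clock activity is set at 1.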


Input and Output, Format
‹ Input
‰ Gate-level netlist in the Verilog® language and/or DEF (tool dependent)
‰ Power characterized libraries in tool-specific format
‰ Timing libraries in Liberty (.lib) format
‰ Timing constraints in SDC format
‰ Extraction data in SPEF format
‰ Timing windows file (TWF)
‰ Value-change-dump file (VCD)
‹ Output
‰ Textual output
 Reports on the power consumed by each cell or block (.pwr file)
‰ Graphical output
 Instance-based power and power density
 Power consumption of the clock distribution network
[Figure: Gates + DEF, SDC, SPEF, TWF, VCD, logical libraries, and power libraries feed Power Analysis, which produces reports]

Power Consumption and Analysis


‹ What is power consumption?
‹ Inputs and outputs for power consumption calculation
‹ Static power consumption
‹ Dynamic power consumption

Static Power Consumption

Pstatic = VDD x Ileakage

‹ Silicon devices are not ideal switches.
‹ Static power dissipation is the power that is lost while circuit signals are not actively switching.
‹ This power dissipation includes leakage and standby power dissipation (i.e., leakage power when voltage is applied even if circuit is not switching).
‹ Static power consumption is the summation of leakage, state dependent leakage, and averaging of internal and switching over time.

What Is Static Power Analysis?


‹ Static power analysis is the calculation of leakage power
‹ Computes average power consumption based upon various assumptions
‹ Much faster than simulation
‹ It is a full-chip and instance-based power consumption analysis
‹ Less accurate than simulation
‰ Hard to model real delays
‰ Probabilities model the environment in a less accurate way than simulation vectors

How Is Static Power Analysis Done?
‹ No simulation is done to determine actual net activity. Vector-independent (probabilistic activity-based with optional VCD)
‹ By understanding the logic functionality and the activity at the input pins, the activity at the output pins is predicted.
‹ Analysis types
‰ Area-based
 A power per unit area is assumed and multiplied by the total die area
 Easy, but not very accurate and is used in floorplanning
‰ Cell-based
 Power for each cell is taken from the library entry
 More accurate and is used by synthesis tools prior to place and route
‰ Instance-based
 Takes into consideration the output load of each instance
 Calculates power from library tables
 Most accurate, but requires information from place and route

[Figure: .libs, .cl, SPEF, SDC, TWF, and VCD feed the Power Consumption Tool, which produces Power Consumption data and Static Power Analysis Reports]
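
Predicting output activity from input activity can be illustrated with signal probabilities. The sketch below is not the algorithm of any particular tool; it assumes that gate inputs are statistically independent and that a signal's values in consecutive cycles are independent, which is a common textbook simplification. All function names are chosen for this example.

```python
# Sketch of probabilistic (vector-independent) activity propagation.
# Assumptions: inputs are independent, and consecutive cycles are independent.

def and_gate(p_a, p_b):
    """Probability that an AND output is 1, given the input 1-probabilities."""
    return p_a * p_b

def or_gate(p_a, p_b):
    """Probability that an OR output is 1, given the input 1-probabilities."""
    return p_a + p_b - p_a * p_b

def toggle_probability(p_one):
    """Probability that a signal changes value between two cycles: 2 * p * (1 - p)."""
    return 2.0 * p_one * (1.0 - p_one)

# y = (a AND b) OR c, with every input equally likely to be 0 or 1
p_ab = and_gate(0.5, 0.5)
p_y = or_gate(p_ab, 0.5)
print(p_ab, p_y, toggle_probability(p_y))
```

No simulation vectors are needed here; only the default input probabilities are supplied.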

Power Consumption and Analysis


‹ What is power consumption?

‹ Inputs and outputs for power consumption calculation

‹ Static power consumption

‹ Dynamic power consumption

Dynamic Power Consumption

Pdynamic = α × CL × VDD² × f

Where,
α – Switching activity
CL – Load capacitance
f – Operating frequency

‹ A circuit does not draw constant current. Current draw increases when a cell is switching.

‹ Dynamic power consists of power dissipated inside a cell (mostly due to short-circuit current during switching) and power dissipated to charge/discharge net capacitance.

‹ Dynamic power is a function of voltage, toggle rate, and net loading.

‹ Dynamic power consumption is the power of each instance over time, taking into account simultaneous switching activity.
‹ Timing Window File (TWF) provides windows when nets are switching relative to clock edges; default input activity or VCD provides switching activity (toggle rate).
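
As a worked example of Pdynamic = α × CL × VDD² × f, the short sketch below evaluates the formula for one net. The numbers (activity 0.2, 10 fF load, 1.2 V, 500 MHz) are illustrative assumptions, not values from the course.

```python
# Sketch: evaluate Pdynamic = alpha * CL * VDD^2 * f for a single net.
# All numeric values are hypothetical.

def dynamic_power_w(alpha, c_load_f, vdd_v, freq_hz):
    """Switching power in watts for one net."""
    return alpha * c_load_f * vdd_v ** 2 * freq_hz

p_net = dynamic_power_w(alpha=0.2, c_load_f=10e-15, vdd_v=1.2, freq_hz=500e6)
print(f"dynamic power: {p_net * 1e6:.2f} uW")
```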

What Is Dynamic Power Analysis?


‹ Computes actual power consumption using actual net activity derived from a simulation
‰ Best analysis since it takes into account that not all nets are driven at the same frequency
‰ Dependent on the actual test vectors used to derive the net activity
‰ Requires significant CPU time in simulation
‹ Gate-level analysis
‰ Net activity information from simulation vectors
‰ Time-based input slew and output load for each cell
‰ Cell power characterization from the library
‰ Usually performed during analysis since it is faster, but not very accurate
‹ Transistor-level analysis
‰ Simulation vectors for at least the I/Os (such as running SPICE on a full design)
‰ Performed at signoff since it takes a long time, but is very accurate

What if vectors are not available for simulation?


How Is Dynamic Power Analysis Done?
‹ SIMULATION with a representative set of vectors
‰ Derived by designer
‰ Vector based
 Uses VCD for switching activity and timing and TWF for input slews
 Most accurate solution if “right” vectors are provided by user
‰ Vector-less
 Uses TWF for input slews and timing
 Best approach to obtain full-chip transient information
‹ Transistor level
‰ Very accurate
‰ Much faster than SPICE
‹ Gate level
‰ Faster than transistor level
‰ Still very accurate due to good modeling of power dissipation at cell level

[Figure: Simulation produces Toggle Rates for Simulation Driven Power Analysis, which produces a Power Report that goes to the Power Grid Analysis Tool]

Difference between Static and Dynamic Power Analysis

Static Power Analysis | Dynamic Power Analysis
It is the average power over time for each instance, resulting in one power number. | It is the average or peak power over time resulting in current waveforms, i.e., at each time step across the simulation window.
Calculates average IR drop. | Calculates the worst case IR drop transients.
Static IR drop analysis is a first-order approximation that uses the total power dissipation to calculate a constant current draw. | Dynamic IR drop analysis deals with the voltage drop of current surges.
Is a fast process available early in the design phase and provides correct information on power grid issues. | Provides visibility of simultaneous switching, decap optimization to control leakage power, and the effect of packaging.


Static vs. Dynamic IR Drop Analysis

[Figure: static (average) IR drop plot beside dynamic (worst-case) IR drop plot, showing a 17 mV increase in IR drop due to switching]

Review Questions
1. What are the three components of power consumption?
2. What is the purpose of a power library?
3. What is the difference between static power consumption and dynamic power consumption?

Topics in This Module
‹ Power consumption and analysis

‹ Power grid analysis

‹ Low-power design

Power Grid Analysis


‹ What is power grid analysis?

‹ Inputs and outputs of power grid analysis

‹ Tasks of power grid analysis

‹ Static power grid analysis

‹ Dynamic power grid analysis

What Is a Power Grid?
IC power distribution systems are designed to provide needed voltages and currents to the transistors that perform the logic functions of a chip.

‹ Definition: The power grid is the network of wires that distributes the needed voltages and currents evenly throughout the chip to ensure correct logic functioning.

‹ Example: In our design, we over-engineered our power grid to avoid IR-drop problems, but in doing so, we did not have enough resources to properly route our design.

[Figure: chip floorplan with blocks 1, 3, 4, and 5 surrounded by alternating power (V) and ground (G) pads]

What Is Power Grid Analysis?


‹ Voltage drops occur in the power distribution network because of interconnect resistance.

‹ Power grid analysis evaluates how power is distributed from the voltage source to the transistors and gates in the design.

‹ It is the analysis of the power grid and not power consumption in a design.

[Figure: voltage source connected to VDD_1 and VDD_2 through the resistance of the interconnect]


What Is the Purpose of a Power Grid Analysis?
‹ Checks the integrity of the supply voltage
‰ Detects voltage (IR) drops on VDD nets
‰ Detects ground bounce on VSS nets
‹ Reduces the effect of affected nets on a design's overall timing and functionality
‹ Reduces causes of silicon failure
‰ Reduces electromigration (EM) effects

How Is Power Grid Analysis Done?


‹ Power grid analysis at transistor-level
‰ Transistors are modeled as current sources attached to the power grid.
‰ A tap current (currents arising from transistor to power grid connection) data file provides the details for each current source.
‰ These currents are used to perform either a simple steady-state analysis or a dynamic analysis of the power grid.

[Figure: transistors modeled as current sources attached to VDD]

How Is Power Grid Analysis Done? (continued)
‹ Power grid analysis at cell/gate-level
‰ The current distribution within a cell or a block is done on an instance-by-instance basis.
‰ An instance-based power consumption file or current data file supplies the power consumed on an instance-by-instance basis in watts.
‰ Current source applied as black box or gray box.

[Figure: cell connected to VDD, shown in port view and detailed view]
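
Because the instance data arrives as power in watts while the grid is driven by current sources, a conversion is implied. A minimal sketch, assuming the average current of each instance is simply its power divided by the supply voltage (I = P / VDD); the instance names and values are hypothetical.

```python
# Sketch: turn per-instance power (watts) into average tap currents (amperes).
# Assumes I = P / VDD; instance names and numbers are hypothetical.

def tap_current_a(power_w, vdd_v):
    """Average current drawn by an instance that dissipates power_w at vdd_v."""
    return power_w / vdd_v

instance_power_w = {"u_cpu/alu": 3.6e-3, "u_usb/phy": 1.2e-3}
tap_currents = {name: tap_current_a(p, 1.2) for name, p in instance_power_w.items()}
for name, amps in tap_currents.items():
    print(f"{name}: {amps * 1e3:.2f} mA")
```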

Power Grid Analysis


‹ What is power grid analysis?

‹ Inputs and outputs of power grid analysis

‹ Tasks of power grid analysis

‹ Static power grid analysis

‹ Dynamic power grid analysis

Input and Output, Format
‹ Input
‰ Gate-level netlist in the Verilog language + DEF
‰ Power grid cell view library
‰ Power consumption data
‹ Output
‰ Graphical display
‰ Plots
‰ Reports

[Figure: tool flow. Labels: LEF/GDSII; .lib, SPEF, etc.; Library view generator; Power consumption analysis tool; Power Grid View Library; Power Consumption; DEF/GDSII; Hierarchical power-grid analysis tool; Analysis Results]

Typical Sequence to Run Power Grid Analysis


‹ Create the power grid view libraries or get them from your library provider
‹ Create the top-level DEF/GDS of your design
‹ Create power consumption data
‰ Provide power consumed on per-instance basis
‰ Provide power consumed on a per-cell basis
‰ Area-based power distribution based on total number
‰ More data for cells = more accurate power consumption data
‹ Run power grid analysis
‰ Link to the power grid view libraries
‰ Load the power consumption data
‰ Set up and run the analysis

[Figure: tool flow. Labels: LEF/GDSII; .lib, SPEF, etc.; Library view generator; Power consumption analysis tool; Power Grid View Library; Power Consumption; DEF/GDSII; Hierarchical power-grid analysis tool; Analysis Results]

Power Grid Analysis
‹ What is power grid analysis?

‹ Inputs and outputs of power grid analysis

‹ Tasks of power grid analysis

‹ Static power grid analysis

‹ Dynamic power grid analysis

Tasks of Power Grid Analysis


‹ IR drop and ground bounce

‹ Electromigration (EM)
‰ Current density
‰ Joule heating or wire self-heat (signal nets)

‹ Hot electron effects

[Callout: Neither of these two is handled by power grid analysis, but both are factors that affect the overall power analysis]

What Is Voltage (IR) Drop and Ground Bounce?
‹ IR drop
The voltage drop caused by current flowing from the power source through the resistive power network to the on-chip devices is called IR drop.

‹ Ground bounce
Voltage spikes caused by current flowing from on-chip devices through the resistive ground network to the ground pins (or bumps)

‹ IR drop and ground bounce combine to impact silicon performance.

[Figure: clock (CLK) network with supply labels VDD = 1.20V, VDD = 1.17V, and VDD = 1.1V]

IR Drop Impacts on Setup and Hold Time


‹ In the case where the IR drop occurs within the signal path, the signal is slowed, potentially causing setup time violations for this signal path.

[Figure: Setup Time Violation. Latch-to-latch path with IR drop in the DATA path; waveforms of CLK, DATA, and DATA + IR drop against the setup window]

‹ In the case where IR drop occurs on a clock buffer, the clock signal beyond this buffer is slowed, potentially causing hold time violations for all signals clocked by this clock branch.

[Figure: Hold Time Violation. Latch-to-latch path with IR drop on a clock buffer; waveforms of CLK, DATA, and CLK + IR drop against the hold window]


How Does a Power Rail IR Drop Occur?

1. Input signal switches.
2. Load capacitance charges up.
3. Current through a resistor causes voltage drop (Ohm’s Law).
4. IR drop reduces operating voltage and impacts circuit performance.

[Figure: circuit drawing current from VDD; the rail voltage falls from 1.2V to 1.1V]
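
Step 3 (Ohm's Law) can be made concrete with a power rail fed from one end. The sketch below assumes a simple series rail: each segment carries the current of every tap at or beyond it, so the drop accumulates toward the far end. Segment resistances and tap currents are hypothetical.

```python
# Sketch: IR drop along a rail fed from a single pad (series resistive segments).
# Segment k sits between tap k-1 (or the pad, for k = 0) and tap k.
# Resistances and currents are hypothetical illustration values.

def rail_voltages(vdd_v, segment_ohms, tap_currents_a):
    """Return the voltage seen at each tap, from the pad outward."""
    voltages = []
    v = vdd_v
    for k, r in enumerate(segment_ohms):
        downstream = sum(tap_currents_a[k:])  # current flowing through segment k
        v -= r * downstream                   # Ohm's Law: V = I * R
        voltages.append(v)
    return voltages

taps = rail_voltages(1.2, [0.5, 0.5, 0.5], [0.02, 0.02, 0.02])
print([round(v, 3) for v in taps])
```

The tap farthest from the pad sees the lowest voltage, which is why cells far from the supply pins are the usual IR drop hot spots.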

Example Colors for IR Drop

[Figure: IR drop color legend from VDD (3.300 volts) down to VSS (0.0 volts). Colors 8 through 1 mark successive voltage bands, with labeled values of 3.266, 3.062, and 3.000 volts and incremental values for colors 2-7; the lowest band is marked as below transistor operating voltage.]


IR Drop Example for Chip

[Figure: full-chip IR drop plot with three callouts]
‹ Abrupt IR drop color change shows locations where there is a discontinuity in the power grid.
‹ These RAMs connect well to the power grid.
‹ These RAMs do not connect well to the grid.

What Is Electromigration?
Electromigration is a wear-out mechanism of metal wires.

‹ Metal atoms migrate over a period of time, causing open circuits, short circuits, or unacceptable increases in resistance.

‹ There are two main causes of electromigration failure:
‰ High (DC) current densities
‰ Joule heating, which is caused by high alternating currents

‹ These wear-out mechanisms can take extended periods of time.

[Figure: wire showing a void (open) and migrated ions (short hazard)]


Causes of Electromigration
‹ Electromigration is mechanical failure in the wire caused by frequently varying thermal conditions.

‹ As pulses go through the wire, the power dissipated by the wire causes it to heat above oxide temperature.

‹ The difference in the thermal constants between the oxide and the wire causes mechanical stress, and the wire can eventually fail resulting in chip failure in the field.

EM failures as seen through a scanning electron microscope (SEM)
[Figure: FESEM micrograph of aluminum lines exhibiting classic electromigration voiding.]
[Figure: Hillocks formed in a Cu line during electromigration test. Source: www.nd.edu]

Electromigration Damages

[Figure: electromigration damage showing voids and a hillock. Source: www.diei.unipg.it/RICERCA/www_em/voidhill.gif]


High (DC) Current Densities
‹ Physical migration of metal atoms due to “electron wind” can eventually create a break in a wire.
‰ MTTF (mean time to failure) ∝ 1/J² where J = current density (Black's Equation)
‰ Current density must not exceed specification → wire Ii/wi < Jspec
‰ Specified as mA per μm wire width (e.g., 1 mA/μm) or mA per via cut

‹ EM occurs both in signal (AC = bidirectional) and power wires (DC = unidirectional)
‰ Much worse for DC than AC; DC occurs inside cells and in power buses
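
The two rules above can be sketched in a few lines. This is a simplified illustration: it uses only the MTTF ∝ 1/J² proportionality from the slide (the temperature-dependent term of the full Black's equation is left out), and the limit of 1 mA/μm and the wire values are example assumptions.

```python
# Sketch: current-density rule check (I / w against Jspec) and the
# MTTF proportional to 1/J^2 relation. Numbers are hypothetical.

def current_density_ma_per_um(current_ma, width_um):
    """Current density expressed as mA per micron of wire width."""
    return current_ma / width_um

def violates_em_limit(current_ma, width_um, j_spec_ma_per_um):
    """True when the wire exceeds the specified current density."""
    return current_density_ma_per_um(current_ma, width_um) > j_spec_ma_per_um

def mttf_ratio(j_old, j_new):
    """MTTF_new / MTTF_old when current density changes from j_old to j_new."""
    return (j_old / j_new) ** 2

print(violates_em_limit(3.0, 2.0, 1.0))  # 1.5 mA/um against a 1 mA/um limit
print(mttf_ratio(1.0, 2.0))              # lifetime ratio when J doubles
```

Under this proportionality, doubling the current density cuts the expected lifetime to one quarter, which is why narrow straps are checked so carefully.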

Example: Current Density


There is a high current density due to a narrow metal3 power grid strap connecting to the internal RAM.

A failure here is catastrophic.

What Is Joule Heating?
Wire Self-Heat (WSH)
‹ May also be called signal wire electromigration, or Joule heating, since it is related to the power that is dissipated into the interconnect.
‹ WSH is the rise in temperature due to the electron movement within a conductor, i.e., wire heats above oxide temperature as pulses go through.
‹ Depends on metal composition, signal frequency, wire sizes, slew rates, and amount of capacitance driven
‹ Self-heating = More EM
‰ Since SH increases temperature, self-heating on a metal line can aggravate EM effects.
‰ SH on a line can also increase EM effects on neighboring lines.
‹ Because self-heating contributes to electromigration, failures are typically labeled as EM, not SH.

[Figure: Wire self-heat]

Hot Electron Effect (Short Channel Effect)


‹ Caused by extremely high electric fields between source and drain
‰ Occurs when voltages are not scaled as fast as dimensions

‹ Electrons pick up speed in the channel

‹ Fastest electrons damage the oxide and interface near the drain

‹ Transistor threshold and mobility change over the life of the part, i.e., threshold eventually moves to a point where the device no longer meets specifications

[Figure: transistor cross-section with gate and N+ diffusion. Callouts: Electrons pick up speed in channel; “hot” electrons are the fastest of a statistically fast bunch. Impact ionization occurs here. Oxide and/or interface is damaged here.]


Power Grid Analysis
‹ What is Power Grid Analysis?

‹ Inputs and Outputs of Power Grid Analysis

‹ Tasks of Power Grid Analysis

‹ Static Power Grid Analysis

‹ Dynamic Power Grid Analysis

What Is Static Power Grid Analysis?


‹ Simple approach providing comprehensive coverage without the requirement of extensive circuit simulations
‹ Solves Ohm's and Kirchhoff's laws for a given power network while ignoring localized switching effects on the power grid
‹ Detects major supply grid problems so that they can be fixed

‹ Main challenge of the static approach is accuracy

‹ Local dynamic effects are not accounted for

How Is Static Power Grid Analysis Done?
‹ Select the power grid view libraries to be used in the power-rail analysis.
‹ The parasitic resistance of the power grid is extracted, and a resistor matrix of the power grid is built.
‹ An average current for each transistor or gate connected to the power grid is calculated.
‹ The average currents are distributed around the resistance matrix based on the physical location of the transistor or gate.
‹ At every VDD I/O pin, a source of VDD is applied to the matrix.
‹ A static matrix solve is then used to calculate the currents and IR drops throughout the resistance matrix.
‹ An instance-based static power consumption calculation is done, which contains the instance-based power-consumption data for all instances of each cell and block in the design.

[Figure: flow. Read in the power grid views, extract power grid parasitic information, create resistor matrix, calculate average current, calculate current and IR drop]
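
The resistor-matrix steps above amount to a nodal analysis: pad nodes are held at VDD, each remaining node gets one Kirchhoff current equation, and the linear system is solved once. The sketch below is a toy version in plain Python (Gaussian elimination on a four-node grid), not the solver of any production tool; the node numbering, resistances, and tap currents are hypothetical.

```python
# Sketch of a static power grid solve. Hypothetical data; node 0 is the VDD pad.

def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def static_ir_drop(vdd, resistors, pads, tap_currents, n_nodes):
    """Return {node: IR drop in volts} for every node that is not a pad.

    resistors: list of (node_a, node_b, ohms); pads: nodes held at vdd;
    tap_currents: {node: average current drawn, in amperes}.
    """
    free = [n for n in range(n_nodes) if n not in pads]
    index = {n: k for k, n in enumerate(free)}
    g = [[0.0] * len(free) for _ in free]               # conductance matrix
    rhs = [-tap_currents.get(n, 0.0) for n in free]     # current leaving each node
    for a, b, ohms in resistors:
        cond = 1.0 / ohms
        for x, y in ((a, b), (b, a)):
            if x in index:
                g[index[x]][index[x]] += cond
                if y in index:
                    g[index[x]][index[y]] -= cond
                else:
                    rhs[index[x]] += cond * vdd         # neighbor is a pad at VDD
    volts = solve_linear(g, rhs)
    return {n: vdd - volts[index[n]] for n in free}

resistors = [(0, 1, 0.5), (1, 2, 0.5), (2, 3, 0.5)]
drops = static_ir_drop(1.2, resistors, {0}, {1: 0.02, 2: 0.02, 3: 0.02}, 4)
print({n: round(d, 3) for n, d in drops.items()})
```

Because only average currents enter the right-hand side, the result is a single IR drop number per node, with no information about transients.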

What Does Static Power Grid Analysis Find?


Static IR drop analysis finds power grid weakness caused by
‹ Missing vias
‹ Insufficient vias
‹ Missing power connections
‹ Insufficient power route widths
‹ Power planning decisions

Power grid electromigration analysis lets you do the following:
‹ Run a comprehensive analysis that is not vector dependent
‹ Find problems in both vias and routing
‹ Run checks against current density rules
‹ Run analysis checks against Black’s equation


Power Grid Analysis
‹ What is power grid analysis?

‹ Inputs and outputs of power grid analysis

‹ Tasks of power grid analysis

‹ Static power grid analysis

‹ Dynamic power grid analysis

What Is Dynamic Power Grid Analysis?


‹ Involves a comprehensive dynamic circuit simulation of the power grid network, which includes localized switching effects

‹ Analysis requires that both resistance and capacitance of the power grid be extracted

‹ Localized dynamic and package inductance effects are taken into account

‹ Results can be extremely accurate

How Is Dynamic Power Grid Analysis Done?
‹ Select the power grid view libraries to be used in the power-rail analysis.
‹ The parasitic resistance and capacitance of the power grid and the signal nets are extracted.
‹ The dynamic tap currents are passed from the power calculation tool.
‹ Power calculator calculates the currents over time.
‹ Rail analysis calculates where the current is varying over time based on the calculations of the power calculator.

[Figure: flow. Read in the power grid views, extract power grid and signal-net parasitic information, calculate dynamic current, perform rail analysis]

Purpose of Dynamic Power Grid Analysis


The purpose is to obtain a quantitative analysis, measured against vectors.
Some specific reasons for dynamic analysis are to

‹ Simulate a specific test vector

‹ Calculate the power grid characteristics over time

‹ Identify which specific test vector activated an implementation weakness

‹ Examine the time correlation of tap current

‹ Obtain a better estimation of the precise magnitude of IR drop

How do you identify the test vectors?
In addition to using vectors in a dynamic power grid analysis, there are methods that do not require vectors, but use a timing window file (TWF) instead.


Analysis Output from Power Grid Analysis

[Figure: analysis output plots for IR drop, transistor device currents, current congestion, and electromigration]

Method of Reducing IR Drop


Adding decoupling capacitors makes a static approach more accurate. Decoupling capacitors act as a local charge source.

[Figure: waveforms of the input, the current drawn from VDD, and VDD. Without decoupling, VDD dips from 1.2V to 1.1V (IR drop); with decoupling capacitors, VDD is shown at 1.2V.]


Method of Reducing IR Drop (continued)
‹ The red area means a voltage drop of more than 10% of the nominal supply voltage. The solution is to use wider power stripes or use more metal on higher levels.

‹ Additional power stripes are added to the design and are marked in cyan and magenta.

‹ This IR drop plot is made after an increase of the number of power stripes.
‹ This plot shows a very low voltage drop, which is required for a functional chip.

Review Questions
‹ What is a power grid?

‹ What are the tasks of power grid analysis?

‹ What is the difference between static power grid analysis and dynamic power grid analysis?

Topics in This Module
‹ Power consumption and analysis

‹ Power grid analysis

‹ Low-power design

Low-Power Design
‹ Need for low-power design

‹ Low-power design techniques
‰ Clock gating
‰ Multi-threshold logic
‰ Multi-voltage with shut-off

Need for Low-Power Design
‹ Exponential increase in chip density.

‹ In deep submicron technology (130 nm, 90 nm, and below), leakage current increases dramatically.

‹ In some 65 nm designs, leakage current is nearly as large as dynamic current.

Where and When to Save Power


‹ Power is a constraint like timing and area -> good optimization potential.

‹ Switching intensive networking applications use 50% -> watch the clock tree and its sequential elements.

‹ The earlier in the design process power consumption is addressed, the bigger the impact.

‹ At higher levels of abstraction, there are more degrees of freedom for large changes to the design implementation.

Power Saving Techniques
Some of the low-power design techniques discussed today are
‹ Circuit and chip design
‰ Clock gating
‰ Multi-voltage with shut-off

‹ Process
‰ Multi-threshold logic

[Figure: design flow stages (RTL, Synthesis, Floorplanning, Physical Implementation) annotated with: RTL clock-gating for dynamic; Power grid planning for multi-voltage; IR drop and EM analysis; Multi-Vdd optimization; Dual-Vth optimization for leakage; Physical clock gating]

Low-Power Design
‹ Need for low-power design

‹ Low-power design techniques
‰ Clock gating
‰ Multi-threshold logic
‰ Multi-voltage with shut-off

Clock Gating
‹ Clock distribution network contributes to a significant portion of total power consumption
‹ Clock buffers have the highest toggle rate, and often have a high drive strength to minimize clock delay
‹ Flip-flops with an active clock dissipate some dynamic power even if the inputs and outputs are unchanged.
‹ Shut off the clock during periods of inactivity to avoid unnecessary power consumption

Clock-Gating Styles
Designer has the following control:
‹ Latch-based or latch-free gating style
‹ Which register banks to gate or exclude from gating
‹ Positive (AND) or negative (OR) gating logic
‹ Minimal bit-width of gated registers

[Figure: three clock-gating circuits, each with EN and CLK inputs and a GCLK output: Latch-free {OR}; Latch-free {INV NAND BUF}; Latch-based {NAND INV}]


Implementation of Clock Gating
Clock gating is a two-step process:
‹ Step 1: Identify enable conditions
‹ Step 2: Insert clock-gating cells into the clock path using the enable logic
‹ Done using
‰ Simple combinational logic (output hold on a register)
‰ More complex sequential logic that spans multiple clocks
‹ Commercially available synthesis tools accomplish the second task automatically.

[Figure: non-optimized register (D_in, D_out) compared with a register driven through a clock gate (CG); combinational clock gating; sequential clock-gating]
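
The behavior of an inserted clock-gating cell can be sketched as a small behavioral model. The code below assumes an AND-style gate with active-high enable and models the latch-based style as a latch that is transparent while CLK is low; it works on sampled waveforms (one sample per clock phase) and is only an illustration, not a library cell description.

```python
# Sketch: behavioral models of latch-free and latch-based AND-style clock gating.
# Waveforms are lists of 0/1 samples, one sample per clock phase (hypothetical data).

def latch_free_gate(clk_samples, en_samples):
    """GCLK = CLK AND EN, with no protection against EN changing while CLK is high."""
    return [clk & en for clk, en in zip(clk_samples, en_samples)]

def latch_based_gate(clk_samples, en_samples):
    """GCLK = CLK AND latched EN; the latch only follows EN while CLK is low."""
    latched = 0
    gclk = []
    for clk, en in zip(clk_samples, en_samples):
        if clk == 0:
            latched = en      # latch is transparent during the low phase
        gclk.append(clk & latched)
    return gclk

clk = [0, 1, 0, 1, 0, 1]
en = [1, 0, 0, 1, 0, 0]      # EN changes during high phases of CLK
print(latch_free_gate(clk, en))
print(latch_based_gate(clk, en))
```

In this example the latch-based version decides during the low phase whether the next pulse passes, so an enable that changes during the high phase cannot cut a pulse short or create a new one.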

Clock Gating Advantages and Disadvantages


Advantages
‹ Reduces the dynamic power consumption of the clock network
‹ Reduces internal power consumption at the clock-gated flip-flops
‹ No need for muxes to re-circulate the data for these flip-flops (saves power and area)

Disadvantages
‹ No effect on leakage
‹ May result in setup time or hold time violations
‹ Clock gating has to be inserted before clock tree synthesis (CTS) in most power design flows and hence presents design issues
‹ Affects testability by introducing multiple clock domains (solved if we use a latch-based design)
‹ Adding clock gating may not always be accompanied by reduced power
‹ Clock gating adds logic that consumes power


Multi-Threshold Logic
‹ Using libraries with multiple VT has become a common way of reducing leakage current as geometries have shrunk (130 nm, 90 nm)

‹ Sub-threshold leakage depends exponentially on VT.

‹ Today, many libraries offer two or three versions of their cells: Low VT, Standard VT, and High VT.

‹ The implementation tools can take advantage of these libraries to optimize timing and power simultaneously.

[Figure: Leakage vs. Delay at 90 nm. Bar chart of leakage and delay (0% to 100%) for LVt, SVt, and HVt cells]

Implementing a Multi-Threshold Logic


‹ A “Dual VT” flow is common during synthesis.

‹ Minimize the total number of fast, leaky low VT transistors by deploying them only when required to meet timing.

‹ Involves an initial synthesis targeting a primary library followed by an optimization step targeting additional libraries with differing thresholds.
‹ Examples
‰ Goal: High performance
Synthesize with the high-performance, high-leakage library first and then relax any cells not on the critical path by swapping them for lower performing, lower leakage equivalents
‰ Goal: Minimum leakage
Target the low-leakage library first and then swap in higher performing, high-leakage equivalents to meet timing in critical paths
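
The minimum-leakage flow can be sketched as a greedy swap on a single timing path. This is a toy illustration under stated assumptions, not how a synthesis tool works internally: real tools operate on the full timing graph, while this sketch looks at one path, starts with every cell at high VT, and swaps in low VT cells (best delay saving per unit of added leakage first) until the target delay is met. Cell data is hypothetical, with delays in ps and leakage in nW, and each low VT cell is assumed to leak more than its high VT equivalent.

```python
# Sketch: greedy dual-VT assignment along one path (hypothetical cell data).
# Delays are in ps, leakage in nW; LVT cells are faster but leakier than HVT cells.

def assign_vt(cells, target_delay):
    """Return (choice per cell, resulting path delay), starting from all HVT."""
    choice = ["HVT"] * len(cells)
    delay = sum(c["hvt_delay"] for c in cells)

    def gain(i):
        saved = cells[i]["hvt_delay"] - cells[i]["lvt_delay"]
        added_leak = cells[i]["lvt_leak"] - cells[i]["hvt_leak"]
        return saved / added_leak

    for i in sorted(range(len(cells)), key=gain, reverse=True):
        if delay <= target_delay:
            break                      # timing is met; keep the rest at HVT
        choice[i] = "LVT"
        delay -= cells[i]["hvt_delay"] - cells[i]["lvt_delay"]
    return choice, delay

path = [
    {"hvt_delay": 100, "lvt_delay": 70, "hvt_leak": 1, "lvt_leak": 10},
    {"hvt_delay": 80, "lvt_delay": 60, "hvt_leak": 1, "lvt_leak": 10},
    {"hvt_delay": 120, "lvt_delay": 80, "hvt_leak": 2, "lvt_leak": 12},
]
choice, delay = assign_vt(path, target_delay=240)
print(choice, delay)
```

Only the cells needed to reach the target are swapped, so the number of leaky low VT cells stays as small as this simple heuristic allows.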


Multi-Threshold Logic: Advantage and Disadvantages
Advantages
‹ Can reduce leakage power without compromising performance.
‹ Delay has a much weaker dependence on VT.

Disadvantages
‹ Leakage current increases exponentially with VT reduction.
‹ In terms of cost, requires one additional mask.
‹ Reducing leakage power may compromise performance.

Multi-Voltage with Shut-Off


‹ Dynamic power is proportional to VDD², so lowering VDD on selected blocks helps reduce power significantly.

‹ Different blocks have different performance objectives and constraints.

‹ A lower supply rail means that the dynamic and static power will be lower for the cells on this rail.

‹ Partition the internal logic of the chip into multiple voltage regions or power domains, each with its own supply.
‰ For example, a processor needs to run as fast as the semiconductor technology will allow; a high supply voltage is required.
‰ In a USB block run at a relatively slow frequency dictated by protocol, a lower supply rail may be sufficient for the block to meet its timing constraints.

[Figure: Multi-Voltage Architecture. SOC containing Cache RAMS and CPU blocks, with supply labels 1.2V, 0.9V, and 1.0V]
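
The effect of the VDD² dependence is easy to quantify. The sketch below compares dynamic power at two supply voltages, assuming that switching activity, load capacitance, and frequency stay the same; the 1.2 V and 0.9 V values are example numbers.

```python
# Sketch: relative dynamic power after a supply change, with activity,
# capacitance, and frequency held constant (assumed for illustration).

def dynamic_power_scale(vdd_new_v, vdd_old_v):
    """Ratio of new to old dynamic power when only VDD changes."""
    return (vdd_new_v / vdd_old_v) ** 2

scale = dynamic_power_scale(0.9, 1.2)
print(f"dynamic power at 0.9 V is {scale:.1%} of the power at 1.2 V")
```

Moving a block from 1.2 V to 0.9 V therefore removes a little under half of its dynamic power, provided the block still meets timing at the lower voltage.
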
Techniques to Achieve Multi-Voltage
To achieve multi-voltage on a chip, the following techniques are implemented:
‹ Voltage scaling interfaces – level shifters

‹ Power gating
‰ Signal isolation cell
‰ State retention power gates
‰ Sleep transistors

Level Shifters (Voltage Scaling Interfaces)


[Figure: two logic blocks on supplies VDD1 and VDD2 sharing VSS]

‹ Ensure signals going from one domain to another (e.g., 0.9V to 1.2V) will not turn on both the NMOS and PMOS networks, causing crowbar currents.
‹ Domain gets the voltage swings (and rise- and fall-times) that it expects.

High-to-low level shifter cells
Implemented using two inverters in series

Low-to-high level shifter cells
More complex: implemented using a buffered and an inverted form of the lower voltage signal used to drive a cross-coupled transistor structure running at the higher voltage

[Figure: flip-flop outputs crossing between the 1.2V, 1.1V, and 0.9V domains through level shifter cells; circuit labels clk, D, Q, VDDL, VDDH, VSS, OUTL, OUTH]

Power Gating
‹ The technique used to turn off blocks that are not being used is known as power gating.
‹ Reduce the overall leakage power of a chip.
‹ Selectively powering down certain blocks in the chip while keeping other blocks powered up.
‹ Goal: To maximize power savings while minimizing the impact on performance.

[Figure: Activity Profile with Power Gating. Power versus time across alternating SLEEP and WAKE events, showing dynamic power during Activity 1 (e.g., clock gated), Activity 2, and Activity 3, leakage power in between, and power levels of 200 mW, 20 mW, and 10 mW]
SLEEP events – Initiate entry to the low power mode
WAKE events – Initiate return to active mode

Signal Isolation Cells


‹ Powering down regions on a chip should not result in crowbar current or spurious behavior at the inputs of powered-up blocks.
‹ Inputs to the power gated blocks can be driven to valid logic values by powered up blocks without creating electrical (or functional) problems in the powered down block.
‹ The outputs of powered down blocks must be controlled by using an isolation cell to clamp the output to a specific, legal value.
‹ Three basic types of isolation cell
‰ Those that clamp the signal to “0” (use AND gate)
‰ Those that clamp it to “1” (use OR gate), and
‰ Those that latch it to the most recent value

[Figure: power-gated block (Vdd through Pwr Switch) with isolation cells, controlled by Iso, on its outputs]
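
The three isolation styles can be modeled as simple logic functions. The sketch below assumes an active-high isolation enable (`iso`), which is a convention chosen for this example; real isolation cells may use either polarity.

```python
# Sketch: behavioral models of the three isolation cell types.
# Assumes an active-high isolation control; all signals are 0 or 1.

def isolate_low(signal, iso):
    """AND-type cell: output is clamped to 0 while iso is active."""
    return signal & (1 - iso)

def isolate_high(signal, iso):
    """OR-type cell: output is clamped to 1 while iso is active."""
    return signal | iso

class IsolateHold:
    """Latch-type cell: output keeps the most recent value seen before isolation."""
    def __init__(self):
        self.held = 0

    def output(self, signal, iso):
        if not iso:
            self.held = signal   # follow the signal while the block is powered
        return self.held

hold = IsolateHold()
print(isolate_low(1, 1), isolate_high(0, 1), hold.output(1, 0), hold.output(0, 1))
```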


State Retention Power Gates
‹ Retention strategy prevents loss of state information when block is powered down.
‹ On power up, state of block must be restored from external source or built up from reset condition.
‹ Time and power requirement can be significant.
‹ Methods of saving and restoring the internal state of a power gated block
‰ Software approach: Based on reading and writing registers (state info stored in processor memory)
‰ Scan-based approach: Based on using a dedicated set of scan chains to store state of chip
‰ Register-based approach: Uses retention registers (each contains a “shadow” register) to preserve the register's state during power down and restore it at power up

[Figure: SRPG cell with D, Q, Clk, and Ret pins, supplied from Vdd through a Pwr Switch and from VRET, with Vss]

Sleep Transistors: Fine-Grain Power Gating


‹ Switches are embedded inside cells/IP.

‹ A power gating control signal “SLEEP” (or “SLEEPN”) controls the sleep transistor to switch on and off the power supply to the cell.
‹ A PMOS sleep transistor is used to switch VDD supply and is called “header switch.” The NMOS sleep transistor controls VSS supply and is called “footer switch.”

[Figure: cells with INPUTS and OUTPUTS* between VDD and VSS, gated by SLEEP and SLEEPN sleep transistors]

Sleep Transistors: Coarse-Grain Power Gating
‹ Dedicated cells that can switch off the entire power or ground network of a particular row of cells

‹ A power gating control signal “SLEEP” controls the sleep transistors connected in parallel between permanent and virtual power networks

[Figure: SLEEP-controlled switches between VDD and virtual VDD (VVDD), feeding cells with INPUTS and OUTPUTS*]

Sleep Transistors: Advantages and Disadvantages


Advantages
‹ Allows design functionality and performance that would not be achievable without multi-voltage
‹ Minimizes leakage, which provides greatest reduction in power

Disadvantages
‹ Lowering the voltage also increases the delay of the gates in the design.
‹ Mixing blocks at different VDD supplies adds some complexity to the design.
‹ Multiple power domains require more careful and detailed floorplanning.
‹ Power grids become more complex.
‹ Multi-voltage designs require additional resources on the board (additional regulators to provide the additional supplies)
‹ Power up and power down sequencing. There may be a required sequence for powering up the design to avoid deadlock.


Typical Design with Multi-Voltage
[Figure: A general multi-voltage implementation showing libraries for the various power domains on the same chip (0.8 V, 1.0 V, and 1.2 V libraries, with level shifters between domains).]

[Figure: A more detailed block-level diagram showing the various elements that interface between the different power domains: Power Domain 1 (1.2 V), Power Domain 2 (1.2 V), Power Domain 3 (1.0 V), and Power Domain 4 (0.8 V), each with its own library domain, plus a memory, connected through isolation cells (Iso_cell) and level shifters (LS). Libraries are shown in Low Vt (high speed), Normal Vt, and High Vt (low leakage, lower speed) variants.]

Summary Impact of Standard Low-Power Techniques

Technique       Power    Timing    Area      Architecture  Design   Verification  Place and Route
                         Penalty   Penalty   Impact        Impact   Impact        Impact
Clock Gating    Medium   Little    Little    Low           Low      None          Low
Multi Vt        Medium   Little    Little    Low           Low      None          Low
Multi-Voltage   Large    Little    Little    High          Medium   Low           Medium
Power Gating    Large    Little    Medium    High          Medium   Low           Medium
~ Large

Review Questions
‹ What is clock gating?
‹ How is multi-threshold logic implemented?
‹ How is multi-voltage achieved?

Summary
‹ The tasks of a power consumption tool are to calculate static (leakage) and dynamic (switching and internal) power for each instance in the design.
‹ The tasks of a power grid analysis tool are to use the instance power (static) and current (dynamic) results to check for IR drop, ground bounce, and electromigration in a design.
‹ The earlier in the design process power consumption is addressed, the bigger the impact, since there are more degrees of freedom for large changes to the design implementation.
‹ Low-power design helps achieve significant power reduction at the cost of additional design complexity.
Testing Your Understanding
True or false
1. In a power library, look-up tables are implemented by creating multiple templates of common information that can be used to represent internal power.
2. The effect of IR drop on a signal path is that the signal path is slowed, thus causing a hold violation.
3. Wire electromigration is related to the power that is dissipated into the interconnect.
4. Dynamic power consists of power dissipated inside a cell and power dissipated to charge/discharge net capacitance.
5. By using multi-threshold logic, the implementation tool can take advantage of HVT/LVT/SVT libraries to optimize timing and power simultaneously.
6. A lower supply rail means that the dynamic and static power will be lower for the cells on this rail.

Sources
Power Library
‹ Library Compiler™ User Guide: Modeling Timing, Signal Integrity, and Power in Technology Libraries, version A-2007.12, December 2007

Low-Power Design
‹ Voltage Storm Data Prep Manual, version 6.1.2
‹ Low-Power Methodology Manual for System-on-Chip Design by Michael Keating, David Flynn, Robert Aitken, Alan Gibbons, and Kaijian Shi
Reference: Formulae for Power Consumption Calculation
Ptotal = Pstatic + Pdynamic

Pstatic = VDD × Ileakage
‹ Ileakage = [Number of transistors (logic gates + memory array) × Average length of transistor in μm] × [Subthreshold leakage + Gate leakage]
‹ Length of transistor is given in terms of its channel length, denoted by λ, where 1λ = 0.04 μm in this example. It must be converted to μm for calculation purposes.

Pdynamic = α × CL × VDD² × f
‹ Where
α: Switching activity
f: Operating frequency
CL = [Number of transistors (logic gates + memory array) × Average length of transistor in μm] × [Capacitance per μm]

Reference: Example
Operating voltage = 1.2 V
Number of transistors = 200 million
Average logic transistor = 8λ (where 1λ = 0.04 μm)
Subthreshold leakage = 30 nA/μm
Gate leakage = 2 nA/μm

Static power dissipation:
Pstatic = Istatic × VDD
Transistors:
(200 × 10^6) × (8λ × 0.04 μm/λ) = 64 × 10^6 μm
On average, half the transistors are OFF and contribute subthreshold leakage.
Total static current is
(64 × 10^6 μm) × [(30 nA/μm)/2 + (2 nA/μm)] = 1088 mA
1088 mA × 1.2 V = 1305.6 mW

Dynamic power dissipation:
Pdynamic = α × C × VDD² × f
Transistors:
(200 × 10^6) × 8λ × 0.04 μm/λ × 2 fF/μm = 128 nF
Dynamic power consumption per MHz or GHz:
[(0.1 × 12.8 nF) + (0.05 × 25.6 nF)] × (1.2 V)² = 3.68 mW/MHz, or 3.68 W at 1 GHz
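
The arithmetic in this example can be re-run with a short script. The following is only a sketch: the constant and function names are made up for illustration, and the activity/capacitance pairs passed to the dynamic power function are simply the values quoted on the slide.

```python
# Inputs copied from the worked example; names are illustrative only.
VDD = 1.2                      # supply voltage in volts
N_TRANSISTORS = 200e6          # 200 million transistors
SIZE_LAMBDA = 8                # average transistor size in lambda
UM_PER_LAMBDA = 0.04           # 1 lambda = 0.04 um
SUBTHRESHOLD_NA_PER_UM = 30.0  # subthreshold leakage
GATE_NA_PER_UM = 2.0           # gate leakage
CAP_FF_PER_UM = 2.0            # capacitance per um


def total_size_um():
    """Total transistor size of the design in micrometers."""
    return N_TRANSISTORS * SIZE_LAMBDA * UM_PER_LAMBDA


def static_power_mw():
    """Static power in mW, with half the transistors OFF (subthreshold leakage)."""
    leak_na_per_um = SUBTHRESHOLD_NA_PER_UM / 2 + GATE_NA_PER_UM
    current_ma = total_size_um() * leak_na_per_um * 1e-6  # nA -> mA
    return current_ma * VDD


def total_capacitance_nf():
    """Total capacitance in nF."""
    return total_size_um() * CAP_FF_PER_UM * 1e-6  # fF -> nF


def dynamic_power_mw_per_mhz(weighted_caps_nf):
    """Dynamic power per MHz from (activity, capacitance in nF) pairs.

    nF * V^2 is nJ per cycle, and 1 nJ at 1 MHz is 1 mW.
    """
    switched_nf = sum(activity * cap for activity, cap in weighted_caps_nf)
    return switched_nf * VDD ** 2


print(static_power_mw())
print(total_capacitance_nf())
print(dynamic_power_mw_per_mhz([(0.1, 12.8), (0.05, 25.6)]))
```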



Extraction and Delay Calculation

Module 9

June 16, 2008

How Is Delay in a Circuit Estimated or Calculated?

During RTL coding:

reg r1, r2;
always @ (posedge clk)
    r2 <= !r1;

After synthesis:
[Figure: register r1, gate u1, and register r2 in the synthesized netlist]

During place/route:
[Figure: r1, u1, and r2 in the physical layout]


Module Objectives
In this module, you will be able to
‹ Articulate how extraction and delay calculation are run using standard parasitic and delay formats
‰ Compare the different extraction models, including parallel plate, 2.5D, and 3D
‰ State the various sections of a Standard Parasitic Exchange Format (SPEF) file
‰ Describe the concepts of propagation delay, transition time, and slew
‰ State the various sections of a Standard Delay Format (SDF) file
‰ Describe how delays are annotated during various phases of the design flow

Topics In This Module


‹ Parasitic extraction
‹ Delay calculation
Discussion Questions
‹ What is capacitance?
‹ What is resistance?

What Is Capacitance?
[Figure: conductor1 and conductor2 separated by a distance, with the cross-sectional area and the capacitance between them marked]

‹ Definition: Capacitance is a measure of the amount of electric charge stored between two plates for a potential difference (voltage) across the plates.
‹ Capacitance (C) is proportional to the cross-sectional area (A) of the plates, and inversely proportional to the distance (D) between them.
C = K * A/D, where K is the dielectric value of the material between the plates
‹ Example: The long wires in the design incurred a very large capacitance between them, and, therefore, the timing of the design was compromised.
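
The relation C = K * A/D can be tried out numerically. The sketch below assumes that K is the vacuum permittivity multiplied by the relative permittivity of the dielectric and that all inputs are in SI units; the function name and the sample dimensions are illustrative and do not come from any particular process.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def parallel_plate_capacitance(rel_permittivity, area_m2, distance_m):
    """Return C = K * A / D in farads, with K = EPSILON_0 * rel_permittivity."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return EPSILON_0 * rel_permittivity * area_m2 / distance_m


# Hypothetical plates: 1 um x 1 um, 0.1 um apart, relative permittivity 3.9.
base = parallel_plate_capacitance(3.9, 1e-12, 1e-7)
wider = parallel_plate_capacitance(3.9, 2e-12, 1e-7)    # double the area
farther = parallel_plate_capacitance(3.9, 1e-12, 2e-7)  # double the distance
print(base, wider, farther)
```

Doubling the area doubles the capacitance, while doubling the distance halves it, which is why long, closely spaced wires accumulate large capacitance.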



What Is Resistance?
‹ Definition: Electrical resistance is a measure of the degree to which an object opposes an electric current through it.
‹ Resistance (R) is proportional to the length (L) of the wire and inversely proportional to the cross-sectional area (A).
R = K * L/A
‹ Example: For our current technology, wire resistance is estimated with a factor measuring resistance per unit length.

[Figure: conductor1 with its resistance marked]

What Is Parasitic Extraction?


‹ Definition: The process of extracting the capacitance and resistance values for all of the interconnects (wires) in a circuit.
‹ Example: After routing, we ran parasitic extraction and examined the output files to make sure the resistance and capacitance values were below our maximum limit.

[Figure: conductor1 and conductor2 with resistance and capacitance marked]
Parasitic Extraction
‹ Extraction models
‹ SPEF file
‹ Correlation

Interconnects (Wires)
‹ Extraction deals with the wires or connections in a design.
‹ Interconnects (wires) in a given technology will have several rules and specifications associated with each metal layer.
‹ Among the many rules
‰ Width (W)
‰ Pitch (P)
‰ Spacing (S)
‰ Resistance per square unit (RPSQ)

[Figure: two m2 wires annotated with width (W), spacing (S), pitch (P), and RPSQ]
Interconnects (Wires) (continued)
The thickness of the wires in a given technology is assumed to be constant.
‹ Resistance is characterized per square unit (RPSQ).
Most technologies have three different grades of interconnects:
‹ Internal cell routes
‰ M1
‰ Finest width, spacing
‹ Signal routes
‰ M2 to M(N-2)
‰ Medium width, spacing
‹ Global/power routes
‰ M(N-1) to MN
‰ Largest width, spacing
‰ Thick metal

TABLE OF WIRE VALUES FOR 90nm PROCESS
METAL LAYER   minimum width   pitch   spacing   RPSQ
M8            0.42            0.84    0.42      2.7500e-02
M7            0.42            0.84    0.42      2.7500e-02
M6            0.14            0.28    0.14      8.0600e-02
M5            0.14            0.28    0.14      8.0600e-02
M4            0.14            0.28    0.14      8.0600e-02
M3            0.14            0.28    0.14      8.0600e-02
M2            0.14            0.28    0.14      8.0600e-02
M1            0.12            0.28    0.12      1.3000e-01
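
Because the wire thickness is fixed per layer, a wire's resistance follows from its RPSQ value and its number of squares (length divided by width). The sketch below uses the table values and assumes, since the table does not state units, that widths and lengths are in micrometers and RPSQ is in ohms per square. The function name is illustrative.

```python
# Values from the 90nm wire table; units are assumed (um, ohms per square).
RPSQ = {
    "M8": 2.75e-02, "M7": 2.75e-02,
    "M6": 8.06e-02, "M5": 8.06e-02, "M4": 8.06e-02,
    "M3": 8.06e-02, "M2": 8.06e-02,
    "M1": 1.30e-01,
}
MIN_WIDTH = {
    "M8": 0.42, "M7": 0.42,
    "M6": 0.14, "M5": 0.14, "M4": 0.14, "M3": 0.14, "M2": 0.14,
    "M1": 0.12,
}


def wire_resistance(layer, length, width=None):
    """Return R = RPSQ * (length / width); width defaults to the layer minimum."""
    w = MIN_WIDTH[layer] if width is None else width
    if w < MIN_WIDTH[layer]:
        raise ValueError("width is below the minimum for " + layer)
    return RPSQ[layer] * length / w


# A 100-unit-long minimum-width wire on a signal layer and on a thick top layer.
r_m2 = wire_resistance("M2", 100.0)
r_m8 = wire_resistance("M8", 100.0)
print(round(r_m2, 3), round(r_m8, 3))
```

Under these assumptions the same length of minimum-width wire is several times less resistive on M8 than on M2, which is consistent with reserving the upper layers for global and power routes.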


Interconnects (Wires) Examples

[Figure: layout examples of signal routes, power routes (VDD and GND), and internal cell routes]


Resistance and Capacitance
Resistance calculations are typically simple:
‹ Single layer
‹ Vias and via arrays

Capacitance calculations can be very complex:
‹ Multi-layer
‹ Multi-dimension
‹ Coupling capacitances
‰ Line-to-ground (net to substrate)
‰ Line-to-line (nets on same layer)
‰ Crossover (nets on different layers)

[Figure: m1 and m2 wires connected by via12 (resistance); m1 and m2 wires above the substrate (capacitance)]

Parallel Plate or 1D Model


The parallel plate model simply models the “line-to-ground” capacitance.
‹ Very quick extraction and calculation
‹ Typically used in iterations during place/route

[Figure: wire B above the substrate]
Near Body Effects
‹ Near body effects are coupling capacitances between adjacent layers of metal
‹ There are several types:
‰ Area capacitance (Ca)
‰ Coupling capacitance (Cc)
‰ Fringe or sidewall capacitance (Cf)
‰ Crossover capacitance (Cr)

[Figure: wires annotated with Ca, Cc, Cf, and Cr]

2D or 2.5D Model
2D or 2.5D models: Some of the “near-body” effects
‹ Much slower to extract capacitance vs. 1D model because there is more information.
‹ Much more accurate for crosstalk and noise effects because the coupling capacitances that contribute to crosstalk and noise are extracted.
‹ Used for detailed analysis during or after place/route.

[Figure: wire B with neighbors A and D beside it, C above, and E below, over the substrate]


3D Model
3D models: All of the “near-body” effects
‹ Very, very slow
‹ Extremely accurate
‹ Used for critical parts of a design, usually the high-speed areas in need of very accurate analysis

[Figure: wire B with neighbors A and D beside it, C and F above, and E and G below, over the substrate]

Input and Output, Format


Parasitic Extraction
‹ Input
‰ Routed design in the Verilog® language or other HDL + DEF or GDSII
‰ Physical libraries in LEF format
‰ Tool-specific libraries, map files, etc.
‰ Extraction constraints and commands in TCL
‹ Output
‰ SPEF file containing all of the RC information for the routed nets in the design

[Figure: routed design (DEF or GDSII), physical library, and TCL feed Extraction, which writes the SPEF parasitic file]


Parasitic Extraction in Flow
Extraction is performed during various stages of place/route.
‹ Rough estimates based on “virtual” routes after placement
‹ Detailed estimates based on “actual” routes after routing

Output of extraction (SPEF) is used in many other steps in the flow.
‹ Delay calculation for nets
‹ Signal integrity values for nets
‹ Delay values for static timing analysis
‹ Power and reliability analysis during physical verification

[Figure: design flow from specification, micro-architecture, RTL, and logic synthesis through floorplanning and place/route (placement, scan reorder, post-place optimization, CTS, post-CTS optimization, route, post-route optimization) to design verification, mask prep, and GDSII, with physical synthesis, static timing analysis, delay calculation, signal integrity, and extraction applied during place/route]

Parasitic Extraction
‹ Extraction models
‹ SPEF file
‹ Correlation
What Is SPEF?
‹ Definition: IEEE standard for representing parasitic data of wires in a chip in ASCII format
‹ Example: In order to perform signoff, we ran parasitic extraction and wrote out a SPEF file, which contained all of the capacitance and resistance information of our design. We input the SPEF file into our timing and power analysis tools to finalize our specification for performance/Watt.

Note: SPEF also contains “inductance” information, which is used for advanced processes or highly detailed analysis. We will not discuss inductance in this course.

*SPEF "IEEE 1481-1999"
*DESIGN "Sample"
*DATE "13:03:59 Monday December 18, 2007"
*VENDOR "Sample Tool Vendor"
*PROGRAM "Parasitics Generator"
*VERSION "1.1.0"
*DESIGN_FLOW "EXTERNAL_LOADS"
*DIVIDER /
*DELIMITER :
*BUS_DELIMITER [ ]
*T_UNIT 1 NS
*C_UNIT 1 PF
*R_UNIT 1 OHM
*L_UNIT 1 HENRY

*POWER_NETS VDD
*GROUND_NETS VSS

*PORTS
CONTROL O *L 30 *S 0 0
FARLOAD O *L 30 *S 0 0
INVX1FNTC_IN I *L 30 *S 5 5
NEARLOAD O *L 30 *S 0 0
TREE O *L 30 *S 0 0

*D_NET INVX1FNTC_IN 0.033

IEEE Std 1481-1999


This is from the IEEE specification for SPEF.
‹ 9.1 Introduction
The Standard Parasitic Exchange Format (SPEF) provides a standard medium to pass parasitic information between EDA tools during any stage in the design process. Parasitics can be represented on a net-by-net basis in many different levels of sophistication, from a simple lumped capacitance, to a fully distributed RC tree, to a multiple pole AWE representation.
IEEE Std 1481-1999 (continued)
‹ 9.2 Targeted applications for SPEF
SPEF is suitable for use in many different tool combinations. Because parasitics can be represented in various levels of sophistication, SPEF_files can communicate parasitic information throughout the design flow process. A design can be distributed between multiple SPEF_files. The files can also communicate information such as slews and the “routing confidence” indicating at what stage of the design process and/or how the parasitics were generated. A diagram of how SPEF interfaces with various example applications is shown in Figure 15.

Discussion Questions
‹ Where does SPEF come from?
‹ Where is it used?
What’s in an SPEF File?
Here are the basic elements of an SPEF file
‹ Header
‰ Contains all of the basic information of the SPEF file’s origin and specifications
‹ Name map
‰ Substitution of net names for symbols
‹ Power and ground nets
‰ Names of the power and ground nets
‹ Externals, ports
‰ Specifies the port name, direction, coordinates, capacitive load, slew, etc.
‹ Internals
‰ Detailed or reduced view of signal and power nets in the design
‹ Hierarchical entities
‰ Used to reference instantiated components with a sub-module SPEF

[Figure: SPEF file layout, top to bottom: Header; Name Map; Power and Ground Nets; Externals, Ports; Internals; Hierarchical Entities]

What’s in the Header Section?


The header of the SPEF file includes origin information, design specifics, and unit definitions.
‹ SPEF_version
‹ design_name
‹ date
‹ vendor
‹ program_name
‹ program_version
‹ unit_def
‹ Pin/bus/hierarchy definitions

The SPEF version is important, since syntax will change and tools will support different versions of SPEF.
Also, the program name and version are important for debugging problems, possibly with faulty tool versions.


What’s in the Name Map Section?
The name map section simply has aliases for long net names.

name_map ::= *NAME_MAP name_map_entry {name_map_entry}
name_map_entry ::= index mapped_item
index ::= *<pos_integer>
mapped_item ::= identifier | bit_identifier | path | name | physical_ref

Example:

*NAME_MAP
*1 NET_1
*2 NET_2
…
*20 NET_20

Name maps are optional and reduce the overall text in the SPEF.
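
When reading a SPEF file that uses a name map, it helps to translate the `*<index>` aliases back into names. The following is a minimal sketch, not a full SPEF parser: the function names are made up for illustration, and it assumes one `index name` pair per line, with the section ending at the first line that is not such a pair.

```python
import re

SAMPLE = """\
*NAME_MAP
*1 IN1
*2 net1a
*3 blk1
*4 net3b
*5 OUT1

*PORTS
*5 O *L 0.05
"""


def parse_name_map(spef_text):
    """Return {"*<index>": name} for the entries that follow *NAME_MAP."""
    name_map = {}
    in_map = False
    for line in spef_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "*NAME_MAP":
            in_map = True
        elif in_map:
            if len(tokens) == 2 and re.fullmatch(r"\*\d+", tokens[0]):
                name_map[tokens[0]] = tokens[1]
            else:
                in_map = False  # first non-entry line closes the section
    return name_map


def expand(token, name_map):
    """Replace a leading *<index> with its name, keeping suffixes such as ':1'."""
    match = re.match(r"(\*\d+)(.*)", token)
    if match and match.group(1) in name_map:
        return name_map[match.group(1)] + match.group(2)
    return token


names = parse_name_map(SAMPLE)
print(expand("*3:OUT2", names), expand("*4:1", names), expand("I104:I", names))
```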

What’s in the Power and Ground Nets Section?


This section simply states the names of the power and ground nets.

Example:

*POWER_NETS VDD
*GROUND_NETS VSS
What’s in the Externals, Ports Section?
The externals and ports section describes the interfaces to the design, including name, direction (I or O), capacitive load (L), slew (S), and other timing information.

Example:

*PORTS
A O *L 30 *S 0.0 0.0
B O *L 30 *S 0.0 0.0
C O *L 30 *S 0.0 0.0
D O *L 30 *S 0.0 0.0
E I *L 30 *S 5000 5000

A,B,C,D,E = Port
I/O = Input or Output
L = Load
S = Slew

What’s in the Internals Section?


Internals describe the signal and power nets in the design and can be of the following types:
‹ d_net
‹ r_net
‹ d_pnet
‹ r_pnet

d_net and r_net are detailed and reduced representations for signal nets.
d_pnet and r_pnet are detailed and reduced representations for power nets.

The d_net representations are detailed and have much more information, while the r_net representations are more compact and less accurate. Use the appropriate type for the part of the flow: d_net for signoff, r_net for intermediate analysis.


Internals
Syntax

internal_def ::= nets {nets}
nets ::= d_net | r_net | d_pnet | r_pnet
d_net ::=
*D_NET net_ref total_cap
[routing_conf] [conn_sec] [cap_sec] [res_sec] [induc_sec] *END
r_net ::=
*R_NET net_ref total_cap [routing_conf] {driver_reduc} *END
d_pnet ::=
*D_PNET pnet_ref total_cap
[routing_conf] [pconn_sec] [pcap_sec] [pres_sec] [pinduc_sec] *END
r_pnet ::=
*R_PNET pnet_ref total_cap [routing_conf] {pdriver_reduc} *END

We will show examples of “d_net” and “r_net” in the next few slides, and omit the “pnet” examples.

Internals: d_net
A d_net is a detailed description of a net in a design.
It comprises several sections, among them
‹ *D_NET declaration
‹ Net reference
‹ Total capacitance
‹ Connectivity (*CONN) section
‹ Capacitance (*CAP) section
‹ Resistance (*RES) section

In the case where a specific net has a very high capacitance, you can search through the section to see if the value is reasonable.

// d_net example for SPEF
*D_NET INVX1FNTC 2.033341
*CONN
*I FL_1281:X O *L 0.0
*I I1184:A I *L 0.343
*I FL_1000:A I *L 0.343
*I NL_1000:A I *L 0.343
*I TR_1000:A I *L 0.343
*CAP
216 FL_1000:A 0.346393
217 I1184:A 0.344053
218 INVX1FNTC_IN 0
219 INVX1FNTC_IN:10 0.0154198
220 INVX1FNTC_IN:11 0.0117827
*RES
152 INVX1FNTC_IN INVX1FNTC_IN:18 8.39117
153 INVX1FNTC_IN INVX1FNTC_IN:5 25.1397
154 INVX1FNTC_IN:11 INVX1FNTC_IN:20 4.59517
155 INVX1FNTC_IN:12 INVX1FNTC_IN:13 3.688
…
*END


Internals: r_net
An r_net is a reduced description of a net in a design.
It comprises several sections, among them
‹ *R_NET declaration
‹ Net reference
‹ Total capacitance
‹ driver information (*DRIVER)
‹ pie_model (*C2_R1_C1)
‹ load information (*LOADS)
‹ RC information (*RC)

// r_net example for SPEF
*R_NET NE_794 2.67137
*DRIVER NL_1039:X
*CELL INVX
*C2_R1_C1 1.0039 367.972 1.66747
*LOADS
*RC NL_1040:A 1.25641
*RC NL_2039:A 714.176
*END

During timing analysis, you may need to inspect sections of the SPEF file, like the r_net section, to make sure the values are reasonable.
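
One quick reasonableness check on an r_net can be scripted. In the example above, the two capacitors of the *C2_R1_C1 model (1.0039 and 1.66747) add up to the total capacitance on the *R_NET line (2.67137). The sketch below tests that for this example only. The parser is a hypothetical helper that handles just the keywords shown on the slide.

```python
R_NET_TEXT = """\
*R_NET NE_794 2.67137
*DRIVER NL_1039:X
*CELL INVX
*C2_R1_C1 1.0039 367.972 1.66747
*LOADS
*RC NL_1040:A 1.25641
*RC NL_2039:A 714.176
*END
"""


def parse_r_net(text):
    """Collect the fields of a single r_net into a dictionary."""
    net = {"loads": {}}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        key = tokens[0]
        if key == "*R_NET":
            net["name"] = tokens[1]
            net["total_cap"] = float(tokens[2])
        elif key == "*DRIVER":
            net["driver"] = tokens[1]
        elif key == "*CELL":
            net["cell"] = tokens[1]
        elif key == "*C2_R1_C1":
            net["c2"], net["r1"], net["c1"] = (float(t) for t in tokens[1:4])
        elif key == "*RC":
            net["loads"][tokens[1]] = float(tokens[2])
        elif key == "*END":
            break
    return net


net = parse_r_net(R_NET_TEXT)
pi_cap = net["c2"] + net["c1"]
print(net["name"], pi_cap, net["total_cap"])
```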


What’s in the Hierarchical Entities Section?


Hierarchical entities are references to submodules that are instantiated in the given design and have their own local SPEF file.

Syntax

define_def ::= define_entry {define_entry}
define_entry ::=
*DEFINE inst_name {inst_name} entity
| *PDEFINE physical_inst entity
entity ::= qstring

Example

*DEFINE blk1 "subBLOCK"


SPEF Example 1: Basic d_net File
// Header
*SPEF "IEEE 1481-1999"
*DESIGN "Sample"
*DATE "13:03:59 Monday December 18, 2007"
*VENDOR "Sample Tool Vendor"
*PROGRAM "Parasitics Generator"
*VERSION "1.1.0"
*DESIGN_FLOW "EXTERNAL_LOADS"
*DIVIDER /
*DELIMITER :
*BUS_DELIMITER [ ]
*T_UNIT 1 NS
*C_UNIT 1 PF
*R_UNIT 1 OHM
*L_UNIT 1 HENRY

// Power and Ground Nets
*POWER_NETS VDD
*GROUND_NETS VSS

// Externals/Ports
*PORTS
CONTROL O *L 30 *S 0 0
FARLOAD O *L 30 *S 0 0
INVX1FNTC_IN I *L 30 *S 5 5
NEARLOAD O *L 30 *S 0 0
TREE O *L 30 *S 0 0

// Internals
*D_NET INVX1FNTC_IN 0.033
*CONN
*P INVX1FNTC_IN I
*I FL_1281:A *L 0.033
*END

*D_NET INVX1FNTC 2.033341
*CONN
*I FL_1281:X O *L 0.0
*I I1184:A I *L 0.343
*I FL_1000:A I *L 0.343
*I NL_1000:A I *L 0.343
*I TR_1000:A I *L 0.343
*CAP
216 FL_1000:A 0.346393
217 I1184:A 0.344053
218 INVX1FNTC_IN 0
219 INVX1FNTC_IN:10 0.0154198
220 INVX1FNTC_IN:11 0.0117827
…
240 NL_1000:A 0.344804
241 TR_1000:A 0.34506
*RES
152 INVX1FNTC_IN INVX1FNTC_IN:18 8.39117
153 INVX1FNTC_IN INVX1FNTC_IN:5 25.1397
154 INVX1FNTC_IN:11 INVX1FNTC_IN:20 4.59517
…
175 INVX1FNTC_IN:9 INVX1FNTC_IN:10 10.8533
176 INVX1FNTC_IN:9 INVX1FNTC_IN:11 1.05164
*END

*D_NET NE_794 1.98538
*CONN
*I NL_1039:X O *L 0 *D INVX
*I NL_2039:A I *L 0.343
*I NL_1040:A I *L 0.343
*CAP
3387 NE_794 0
3388 NE_794:1 0.0792492
…
3413 NL_1040:A 0.344453
3414 NL_2039:A 0.343427
*RES
2879 NE_794:1 NE_794:13 66.1953
2880 NE_794:1 NE_794:2 0.311289
…
2903 NL_1039:X NE_794:25 1.00317
2904 NL_2039:A NE_794:23 0.171175
*END

SPEF Example 2: Basic r_net File


// Header
*SPEF "IEEE 1481-1999"
*DESIGN "Sample"
*DATE "Fri Feb 9 15:29:56 2007"
*VENDOR "Sample Tool Vendor"
*PROGRAM "Parasitics Generator"
*VERSION "1.1.0"
*DESIGN_FLOW "EXTERNAL_LOADS" "EXTERNAL_SLEWS"
*DIVIDER /
*DELIMITER :
*BUS_DELIMITER [ ]
*T_UNIT 1.0 PS
*C_UNIT 1.0 PF
*R_UNIT 1.0 OHM
*L_UNIT 1.0 HENRY

// Power and Ground Nets
*POWER_NETS VDD
*GROUND_NETS VSS

// Externals/Ports
*PORTS
TREE O *L 30 *S 0.0 0.0
FARLOAD O *L 30 *S 0.0 0.0
NEARLOAD O *L 30 *S 0.0 0.0
CONTROL O *L 30 *S 0.0 0.0
INVX1FNTC_IN I *L 30 *S 5000 5000

// Internals
*R_NET NE_794 2.67137
*DRIVER NL_1039:X
*CELL INVX
*C2_R1_C1 1.0039 367.972 1.66747
*LOADS
*RC NL_1040:A 1.25641
*RC NL_2039:A 714.176
*END

*D_NET INVX1FNTC_IN 0.033
*CONN
*P INVX1FNTC_IN I
*I FL_1281:A *L 0.033
*END


SPEF Example 3: Top Level with Name Map
// Header
*SPEF "IEEE 1481-1999"
*DESIGN "topLevel"
*DATE "MON Sep 9 9:34:01 2008"
*VENDOR "Sample Tool Vendor"
*PROGRAM "ParasiticsGenerator"
*VERSION "1.0 ALPHA"
*DESIGN_FLOW "EXTERNAL_SLEWS" "EXTERNAL_LOADS"
*DIVIDER |
*DELIMITER :
*BUS_DELIMITER [ ]
*T_UNIT 1.0 PS
*C_UNIT 1.0 PF
*R_UNIT 1.0 OHM
*L_UNIT 1.0 UH

// Name Map
*NAME_MAP
*1 IN1
*2 net1a
*3 blk1
*4 net3b
*5 OUT1

// Externals/Ports
*PORTS
*5 O *L 0.05
*1 I *S 5000 5000

// Hierarchical Entity
*DEFINE *3 "subBLOCK"

// Internals
*D_NET *4 0.32429
*CONN
*I *3:OUT2 O
*I I104:I I *L 0.044
*CAP
1 *3:OUT2 0.011307
2 I104:I 0.128838
3 *4:1 0.140145
*RES
5 *3:OUT2 *4:1 7.128
6 *4:1 I104:I 2.55215
*END
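
A d_net can be sanity-checked by adding up its sections. In the *D_NET *4 example above, the three *CAP entries (0.28029 in total) plus the 0.044 pin load in *CONN equal the 0.32429 total capacitance on the *D_NET line. This is an observation about this example, not a rule taken from the standard. The sketch below is a hypothetical helper that handles only the constructs shown on the slide. It skips four-token *CAP lines, which would be coupling capacitances between two nodes.

```python
D_NET_TEXT = """\
*D_NET *4 0.32429
*CONN
*I *3:OUT2 O
*I I104:I I *L 0.044
*CAP
1 *3:OUT2 0.011307
2 I104:I 0.128838
3 *4:1 0.140145
*RES
5 *3:OUT2 *4:1 7.128
6 *4:1 I104:I 2.55215
*END
"""


def summarize_d_net(text):
    """Sum the pin loads, grounded capacitances, and resistances of one d_net."""
    summary = {"total_cap": None, "pin_load": 0.0, "ground_cap": 0.0, "total_res": 0.0}
    section = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        head = tokens[0]
        if head == "*D_NET":
            summary["total_cap"] = float(tokens[2])
        elif head in ("*CONN", "*CAP", "*RES"):
            section = head
        elif head == "*END":
            break
        elif section == "*CONN" and "*L" in tokens:
            summary["pin_load"] += float(tokens[tokens.index("*L") + 1])
        elif section == "*CAP" and len(tokens) == 3:
            summary["ground_cap"] += float(tokens[2])  # id, node, value
        elif section == "*RES" and len(tokens) == 4:
            summary["total_res"] += float(tokens[3])   # id, node1, node2, value
    return summary


summary = summarize_d_net(D_NET_TEXT)
print(summary)
```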


Parasitic Extraction
‹ Extraction models
‹ SPEF file
‹ Correlation
Extraction Correlation
There are two types of extraction that are run in the physical implementation flow.
‹ Extraction during optimization
‰ Extraction during optimization is done because it is much faster.
‹ Extraction during signoff
‰ Extraction during signoff is done because it is more accurate, but it is slower and requires special inputs, such as GDSII, tool-specific libraries, and mapping files.

[Figure: the netlist goes through place/route with optimization extraction; the resulting GDSII goes through signoff extraction, which writes the SPEF parasitic file]

Running Extraction with QRC


QRC is Cadence’s extraction tool.

Steps
‹ Create extraction libraries
QRC requires special libraries generated from technology-specific files.
‹ Input routed design
DEF, GDSII, etc.
‹ Create command file
Commands and directives for setup and extraction.
‹ Run extraction
Runs the extraction algorithms with the options specified in the command file.
‹ Generate output file
Generates SPEF or other format.


Running Extraction with QRC (continued)
Command line:
qrc -cmd script.cmd -log logfile.log

Command file includes many options, among them:
‹ process_technology
‹ setup commands (input, output, and extraction)
‹ input_db
‹ output_db
‹ extract
‹ global_nets
‹ capacitance
‹ filter commands

# QRC Command File : GDSII -> SPEF
process_technology \
-technology_library_file assura_tech.lib \
-technology_name tsmc13
output_setup \
-net_name_space schematic \
-temporary_directory_name QRCRun \
-file_name QRC_coupled.spef
extraction_setup \
-max_fracture_length infinite \
-net_name_space layout \
-max_fracture_length_unit micron
input_db \
-type assura \
-directory_name ../rundir \
-run_name EngineX4 \
-format GDS \
-design_file ../routed1.gds \
-design_cell_name EngineX4
output_db -type spef
extract -selection all -type rc_coupled
global_nets -nets VDD VSS
capacitance -decoupling_factor 1.0
filter_coupling_cap \
-coupling_cap_threshold_absolute 0.01
filter_cap \
-exclude_floating_nets true
filter_res \
-remove_dangling_res true \
-merge_parallel_res true

[Figure: routed design (GDSII), physical library, and TCL feed Extraction, which writes the SPEF parasitic file]

Discussion Questions
‹ In a SPEF header, why are the program_name and program_version important?
‹ What is the difference between d_net and r_net? When are they used?
‹ What is the difference between coupling cap and fringe cap? Which kind of capacitance do we need to be concerned about for signal integrity?
‹ What do you do with an SPEF file?
Topics in This Module
‹ Parasitic extraction
‹ Delay calculation

Delay Calculation
‹ Delay calculation fundamentals
‹ SDF
‹ Back-annotation and forward-annotation
What Is Propagation Delay?
‹ Definition: The propagation delay is the time difference between the input signal crossing a voltage threshold and the output signal crossing a voltage threshold.
‹ Example: The inverter had a propagation delay of 10 ps.

[Figure: input and output voltage waveforms of an inverter (INV) versus time, each swinging between VL and VH; the propagation delay, tprop = 10 ps, is measured between the VTH_50 crossings of the input signal and the output signal]

What Is Slew/Transition Time?


‹ Definition: The slew of a signal is the rate of its transition, typically measured in volts/ns. The transition time is the time it takes for the signal to pass between two specified voltage thresholds. The threshold points are usually defined as a certain percentage of the voltage swing.
‹ Example: The slew of the output signal was 0.01 volt/ps, whereas the transition time to go from 10% of VDD to 90% of VDD was 10 ps.

[Figure: voltage versus time for a rising and a falling edge between VL and VH, with the slew marked on each edge and the transition time measured between VTH_10 and VTH_90]


Delays in a Timing Path
The delay of a timing path is the sum of the delays between a start point and an end point.
The delays can include
‹ Cell delay
‹ Interconnect delay

The start points and end points can include
‹ Register clocks, inputs
‹ Ports of the design
‹ Pins of a macro inside the design

[Figure: path delay from the start point (CK->Q) to the end point (D), made up of alternating cell delays and interconnect delays]

What Is Cell Delay (tcell)?


The cell delay (propagation delay) is the delay through the cell as determined by
‹ Cell’s “intrinsic” delay
‹ Load on the cell
‹ Slew of the input signal

tcell = tintrinsic + tload_slew

‹ Intrinsic delay: Cell delay with zero load
‹ Load: The larger the load, the longer the delay
‹ Slew: The larger the input slew, the longer the delay


Load and Slew: tload_slew
The load- and slew-dependent portion of the cell delay is calculated via tables in a technology timing library.
‹ Library vendor characterizes each cell in the library for timing.
‹ Table values serve as boundaries, so the delays can be estimated between the given table values.

[Figure: two-dimensional table indexed by slew and load, with delay values as entries]

# Table for load/slew dependent cell delay
Model(ioDelayRiseModel
(Spline
(Input_Slew_Axis 0.050 0.200 1.000 4.000 20.000)
(Load_Axis 0.0446 0.892 3.568 14.275)
data((0.7210 0.8471 1.2849 3.05673)
(0.8119 0.9380 1.3758 3.1475)
(0.9975 1.1236 1.5612 3.3322)
(1.4293 1.5552 1.9922 3.7609)
(3.3955 3.5204 3.9542 5.7101))
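
Estimating a delay between the given table values can be sketched with bilinear interpolation. Two assumptions apply: each row of the data corresponds to one Input_Slew_Axis value and each column to one Load_Axis value (which matches the 5 x 4 shape), and plain linear interpolation is used as a simplification. The table itself is labeled Spline, so a real delay calculator may interpolate differently. The function names are illustrative.

```python
from bisect import bisect_right

SLEW_AXIS = [0.050, 0.200, 1.000, 4.000, 20.000]
LOAD_AXIS = [0.0446, 0.892, 3.568, 14.275]
DELAY = [  # rows follow SLEW_AXIS, columns follow LOAD_AXIS (assumed)
    [0.7210, 0.8471, 1.2849, 3.05673],
    [0.8119, 0.9380, 1.3758, 3.1475],
    [0.9975, 1.1236, 1.5612, 3.3322],
    [1.4293, 1.5552, 1.9922, 3.7609],
    [3.3955, 3.5204, 3.9542, 5.7101],
]


def _bracket(axis, value):
    """Return (i, t): the lower axis index and the fraction of the way to axis[i + 1]."""
    if not axis[0] <= value <= axis[-1]:
        raise ValueError("value is outside the characterized range")
    i = min(bisect_right(axis, value) - 1, len(axis) - 2)
    t = (value - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def table_delay(slew, load):
    """Bilinearly interpolate the load/slew dependent delay from the table."""
    i, ts = _bracket(SLEW_AXIS, slew)
    j, tl = _bracket(LOAD_AXIS, load)
    low = DELAY[i][j] * (1 - tl) + DELAY[i][j + 1] * tl           # at SLEW_AXIS[i]
    high = DELAY[i + 1][j] * (1 - tl) + DELAY[i + 1][j + 1] * tl  # at SLEW_AXIS[i + 1]
    return low * (1 - ts) + high * ts


print(table_delay(0.125, 0.0446))  # halfway between the first two slew points
```

Values outside the table range raise an error here rather than being extrapolated, since the table only bounds the characterized region.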


What Is Interconnect Delay?


The delays through the nets (interconnects) of a design are calculated from the resistance and capacitance of the nets. These values can be
‹ Estimated: Uses a “wire load model” (WLM) that estimates the net delay based on load and slew
‹ Reduced: Uses a reduced SPEF (r_net) and annotates a lumped RC value to the net
‹ Detailed: Uses a detailed SPEF (d_net) and annotates the detailed RC values to the net


Calculating the Path Delay
So, the path delay in our original example would be the sum of the cell and interconnect delays.
‹ tpath = tc1 + ti1 + tc2 + ti2 + tc3 + ti3 + tc4, where
‰ tc1 is the clock-to-q delay of the starting register
‰ ti1, ti2, and ti3 are the interconnect delays
‰ tc2 and tc3 are the cell delays of the logic in between the registers
‰ tc4 is the setup time of the ending register
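
The sum can be written as a small helper. The function name and the delay values below are hypothetical placeholders (in ps), chosen only to show the bookkeeping: one more cell term than interconnect terms, with the last cell term being the setup time of the ending register.

```python
def path_delay(cell_delays, interconnect_delays):
    """Return tpath = tc1 + ti1 + tc2 + ti2 + ... + tcN."""
    if len(cell_delays) != len(interconnect_delays) + 1:
        raise ValueError("expected one more cell delay than interconnect delays")
    return sum(cell_delays) + sum(interconnect_delays)


# Hypothetical values in ps: tc1 (clock-to-q), tc2, tc3, tc4 (setup), then ti1..ti3.
tpath = path_delay([120, 45, 60, 80], [12, 8, 15])
print(tpath)
```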


[Figure: timing path from the start point (CK->Q) to the end point (D), with tpath spanning tc1, ti1, tc2, ti2, tc3, ti3, and tc4]

Discussion Questions
‹ How do slew and load affect delay?
‹ How does a library vendor get the timing data for its technology libraries?
Input and Output, Format
Delay Calculation
‹ Input
‰ Routed design in the Verilog language or other HDL + DEF
‰ Parasitic extraction file (SPEF)
‰ Logical timing libraries in Liberty format
‰ Optional: Physical libraries in LEF format
‰ Constraints and commands in TCL
‹ Output
‰ Standard Delay Format (SDF) file containing all of the delay information in the design

[Figure: routed design (gates + DEF), SPEF, logical and physical libraries, and TCL feed Delay Calculation, which writes the SDF delay file]

Delay Calculation in Flow


Delay calculation is performed at all stages of the place/route flow, including logic synthesis.
‹ Rough estimates based on wire load models in logic synthesis
‹ Better estimates after floorplanning, placement, and CTS
‹ Best estimates based on extracted parasitics after routing

Output of delay calculation (SDF) is used in many other steps in the flow.
‹ Internally, it is used for optimization during logic synthesis.
‹ In signal integrity, delay calculation creates incremental SDF for timing analysis, based on the SI parasitics.
‹ In static timing analysis, the SDF file is used to annotate timing on cells and nets.

[Figure: design flow from specification, micro-architecture, RTL, and logic synthesis through floorplanning and place/route to design verification, mask prep, and GDSII, with physical synthesis, static timing analysis, delay calculation, signal integrity, and extraction applied during place/route]


Delay Calculation
‹ Delay calculation fundamentals

‹ SDF

‹ Back-annotation and forward-annotation

What Is SDF?
‹ Definition: An IEEE standard for the representation and interpretation of timing data for
use at any stage of an electronic design process
‹ Example: In our design flow, we have a standalone delay calculator that outputs SDF. We
loaded the SDF into our static timing analysis tool to verify our design meets its performance
requirements.

Example SDF File

(DELAYFILE
  (SDFVERSION "3.0")
  (DESIGN "BIGCHIP")
  (DATE "March 12, 1995 09:46")
  (VENDOR "Southwestern ASIC")
  (PROGRAM "Fast program")
  (VERSION "1.2a")
  (DIVIDER /)
  (VOLTAGE 5.5:5.0:4.5)
  (PROCESS "best:nom:worst")
  (TEMPERATURE -40:25:125)
  (TIMESCALE 100 ps)
  (CELL
    (CELLTYPE "BIGCHIP")
    (INSTANCE top)
    (DELAY
      (ABSOLUTE
        (INTERCONNECT mck b/c/clk (.6:.7:.9))
        (INTERCONNECT d[0] b/c/d (.4:.5:.6))
      )
    )
  )
  (CELL
    (CELLTYPE "AND2")
    (INSTANCE top/b/d)
    (DELAY
      (ABSOLUTE
        (IOPATH a y (1.5:2.5:3.4) (2.5:3.6:4.7))
        (IOPATH b y (1.4:2.3:3.2) (2.3:3.4:4.3))
      )
    )
  )
)


SDF Specification Version 3.0
This is from the SDF specification.
Introduction
The SDF file stores the timing data generated by EDA tools for use at any stage in the design
process. The data in the SDF file is represented in a tool-independent way and can include
‹ Delays: Module path, device, interconnect, and port
‹ Timing checks: Setup, hold, recovery, removal, skew, width, period, and nochange
‹ Timing constraints: Path, skew, period, sum, and diff
‹ Incremental and absolute delays
‹ Timing environment: Intended operating timing environment
‹ Conditional and unconditional module path delays and timing checks
‹ Design/instance-specific or type/library-specific data
‹ Scaling, environmental, and technology parameters

Throughout a design process, you can use several different SDF files. Some of these files can
contain pre-layout timing data. Others can contain path constraint or post-layout timing data.

What’s in an SDF File?


Here are the basic elements of an SDF file.
‹ Header
‰ Contains all of the basic information about the SDF file’s origin and specifications
‹ Cell entries
‰ Identify a cell or macro that contains timing data to be applied
‰ Within a cell entry, there can be delay, timing check, and timing environment entries
‹ Delay entries
‰ Identify I/O paths, ports, and interconnects that contain timing data to be applied
‹ Timing check entries
‰ Associate timing check limit values with specific cell instances
‹ Timing environment entries
‰ Contain timing environment information, constraints, etc.

[Figure: SDF file structure: a header followed by cell entries, each of which can hold delay entries, timing check entries, and timing environment entries.]


Header
The header contains basic information about the SDF file’s origin and specifications,
including among others
‹ SDF version
‹ Design name
‹ Vendor, program name, and version
‹ Process information, timescale

Example:

(DELAYFILE
  (SDFVERSION "3.0")
  (DESIGN "MYCHIP")
  (DATE "December 30, 2007 12:08")
  (VENDOR "ASIC_vendor")
  (PROGRAM "SDF_program")
  (VERSION "2.4.1")
  (DIVIDER /)
  (VOLTAGE 1.5:1.3:1.1)
  (PROCESS "best:nom:worst")
  (TEMPERATURE -40:25:125)
  (TIMESCALE 100 ps)

The SDF version, vendor, and program name are important to note for debug reasons. Also, process,
temperature, voltage, and timescale information should be consistent with your timing analysis.

Cell Entries
Cell entries identify the cells and macros in a design with the following information:
‹ Cell type
‹ Cell instance name

Example:

(CELL
  (CELLTYPE "DFF")
  (INSTANCE u1/u2/u3_reg)
  . . .
)

The delay, timing check, and timing environment entries can be located inside of
each cell entry.


Delay Entries
There are three types of delays in a delay entry:
‹ Absolute
‰ SDF delays overwrite existing delays during annotation.
‹ Incremental
‰ SDF delays are added to existing delays during annotation.
‹ Pulse width
‰ Pulse limits are set for specific points.

It is important to differentiate them. For a typical analysis with crosstalk, the
absolute delays are first annotated, then the incremental delays due to crosstalk are
annotated and added to the existing delays.

[Figure: example annotation of absolute, incremental, and pulse width SDF values.]

Delay Entries (continued)


Delay entries associate delay values with the elements of their cell entry and can include
the following delay types.
‹ Absolute
‰ SDF replaces existing delay values in the design during annotation.
‹ Increment
‰ SDF adds to existing delay values in the design during annotation.
‹ Pathpulse
‰ Pulse width limits

Examples:

(DELAY
  (ABSOLUTE
    (IOPATH (posedge clk) q (22:28:33) (25:30:37))
    (PORT clr (32:39:49) (35:41:47))
  )
)
(DELAY
  (INCREMENT
    (IOPATH (posedge clk) q (-4::2) (-7::5))
    (PORT clr (2:3:4) (5:6:7))
  )
)
(DELAY
  (PATHPULSE i1 o1 (13) (21))
)
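The difference between absolute and incremental annotation can be sketched as follows. This is a minimal Python illustration, not how any particular tool implements annotation; `parse_triple` and `annotate` are hypothetical helper names, and the triples are the rise values from the IOPATH examples above.

```python
def parse_triple(text):
    """Parse an SDF 'min:typ:max' triple; an omitted field becomes None."""
    fields = text.strip("()").split(":")
    return tuple(float(f) if f else None for f in fields)


def annotate(existing, value, mode):
    """Apply one SDF delay value to an existing delay.

    ABSOLUTE replaces the existing delay; INCREMENT adds to it.
    """
    if mode == "ABSOLUTE":
        return value
    if mode == "INCREMENT":
        return existing + value
    raise ValueError(f"unknown delay mode: {mode}")


rise_abs = parse_triple("(22:28:33)")  # rise triple from the ABSOLUTE example
rise_inc = parse_triple("(-4::2)")     # rise triple from the INCREMENT example (typ omitted)

# Annotate the max value: first the absolute delay, then the increment on top of it.
delay_max = annotate(0.0, rise_abs[2], "ABSOLUTE")
delay_max = annotate(delay_max, rise_inc[2], "INCREMENT")
print(delay_max)
```

The order matters: an absolute value annotated after an increment would discard the increment, which is why the absolute delays are annotated first.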



Setup, Hold, Recovery, and Removal
Review on timing checks
‹ Setup and hold
‰ Setup: Limit of time where data must remain stable before the clock edge
‰ Hold: Limit of time where data must remain stable after the clock edge
‹ Recovery
‰ Limit of time between the removal of an asynchronous signal (not data) and an active clock edge
‹ Removal
‰ Limit of time between an active clock edge and the removal of an asynchronous signal (not data)

[Figure: clk and a_rstb waveforms marking the recovery and removal windows.]

Timing Check Entries


Timing check entries associate timing check limit values with specific cell instances.
Among the available types are
‹ Setup
‹ Hold
‹ Recovery
‹ Removal

Example:

(TIMINGCHECK
  (SETUP din (posedge clk) (12))
)
(TIMINGCHECK
  (HOLD din (posedge clk) (9.5))
)
(TIMINGCHECK
  (RECOVERY (posedge clearbar) (posedge clk) (11.5))
)
(TIMINGCHECK
  (REMOVAL (posedge clearbar) (posedge clk) (6.3))
)


Timing Environment Entries
Timing environment entries associate constraint values with critical paths, as
well as provide information about the environment in which the circuit will operate.
Among the entries are
‹ Constraints for path, period, skew, etc.
‹ Time for arrival, departure, slack
‹ Waveform

In most cases, all of this information is contained in the Synopsys Design
Constraints (SDC) file.

SDF Example: Basic File


Example

(DELAYFILE
  // Header
  (SDFVERSION "3.0")
  (DESIGN "BIGCHIP")
  (DATE "March 12, 1995 09:46")
  (VENDOR "Southwestern ASIC")
  (PROGRAM "Fast program")
  (VERSION "1.2a")
  (DIVIDER /)
  (VOLTAGE 5.5:5.0:4.5)
  (PROCESS "best:nom:worst")
  (TEMPERATURE -40:25:125)
  (TIMESCALE 100 ps)
  // Cell 1 – Top with interconnects
  (CELL
    (CELLTYPE "BIGCHIP")
    (INSTANCE top)
    (DELAY
      (ABSOLUTE
        (INTERCONNECT mck b/c/clk (.6:.7:.9))
        (INTERCONNECT d[0] b/c/d (.4:.5:.6))
      )
    )
  )
  // Cell 2 – AND gate with delays
  (CELL
    (CELLTYPE "AND2")
    (INSTANCE top/b/d)
    (DELAY
      (ABSOLUTE
        (IOPATH a y (1.5:2.5:3.4) (2.5:3.6:4.7))
        (IOPATH b y (1.4:2.3:3.2) (2.3:3.4:4.3))
      )
    )
  )
  // Cell 3 – Register with delays, setup checks
  (CELL
    (CELLTYPE "DFF")
    (INSTANCE top/b/c)
    (DELAY
      (ABSOLUTE
        (IOPATH (posedge clk) q (2:3:4) (5:6:7))
        (PORT clr (2:3:4) (5:6:7))
      )
    )
    (TIMINGCHECK
      (SETUPHOLD d (posedge clk) (3:4:5) (-1:-1:-1))
      (WIDTH clk (4.4:7.5:11.3))
    )
  )
  // More Cells
  (CELL
    . . .
  )
)
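Because an SDF file is a nest of parenthesized entries, a small script can read it into nested lists and pull out individual delays. The sketch below is an assumption-laden illustration in Python, not a complete SDF reader: `parse` and `iopaths` are hypothetical helpers, comments and conditional paths are not handled, and only IOPATH entries with plain port names and a rise/fall pair are reported.

```python
import re

# Tokens: parentheses, quoted strings, or any run of other non-space characters.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')


def parse(text):
    """Turn parenthesized SDF text into nested Python lists."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok.strip('"'))
    return stack[0][0]


def iopaths(tree):
    """Yield (instance, input, output, rise, fall) for each simple IOPATH."""
    cells = [n for n in tree if isinstance(n, list) and n and n[0] == "CELL"]
    for cell in cells:
        entries = [n for n in cell if isinstance(n, list) and n]
        inst = next(n[1] for n in entries if n[0] == "INSTANCE")
        for delay in (n for n in entries if n[0] == "DELAY"):
            for mode in delay[1:]:              # ABSOLUTE or INCREMENT block
                for entry in mode[1:]:
                    if (isinstance(entry, list) and len(entry) >= 5
                            and entry[0] == "IOPATH" and isinstance(entry[1], str)):
                        yield inst, entry[1], entry[2], entry[3][0], entry[4][0]


sample = """
(DELAYFILE
  (SDFVERSION "3.0")
  (CELL
    (CELLTYPE "AND2")
    (INSTANCE top/b/d)
    (DELAY
      (ABSOLUTE
        (IOPATH a y (1.5:2.5:3.4) (2.5:3.6:4.7))
        (IOPATH b y (1.4:2.3:3.2) (2.3:3.4:4.3))
      )
    )
  )
)
"""

rows = list(iopaths(parse(sample)))
for row in rows:
    print(row)
```

The sample text is the AND2 cell from the example; each reported row carries the instance name, the two port names, and the rise and fall triples as strings.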



Running Delay Calculation
SignalStorm® NDC is Cadence’s delay calculation tool.

Steps
‹ Generate SignalStorm libraries
‰ Generates libraries for more accurate delay calculation
‹ Generate a SignalStorm design database
‰ Imports netlist information
‹ Import SPEF
‰ Imports parasitics
‹ Setup conditions
‰ Sets up the boundary conditions for the design, including slew and load information
‹ Calculate delay
‰ Core algorithm to calculate delay for the design
‹ Generate output file and reports
‰ Generate SDF and reports

Running Delay Calculation (continued)


Command line:

sndc -S script.cmd -L logfile.log

Command file includes many options, among them
‹ Create and open design database
‹ Import SPEF
‹ Import setup commands
‹ Load and link design
‹ Calculate delay
‹ Write output files and reports

# QRC Command File : GDSII -> SPEF
db_open demo
db_install -spef test.spef
db_setup -setup test.st -process worst
db_load TEST_CHIP
db_delay -process worst
db_xtk -process worst
db_report sdf -p worst -report test.sdf -design TEST_CHIP -xtk_min fast -xtk_max slow
db_close

[Figure: TCL, SPEF, the tech libraries, and the routed design (DEF) feed delay calculation, which writes the SDF delay file.]


Delay Calculation
‹ Delay calculation fundamentals

‹ SDF

‹ Back-annotation and forward-annotation

Back-Annotation
‹ Delay calculation produces an SDF timing file, based on technology information, SPEF,
and design information (netlist).
‹ The analysis tool can now read in the SDF, as well as the design and technology information,
to produce its reports.
‹ Since the SDF is already created, all of the timing information can be used by all subsequent
tools in the flow, thus ensuring consistency.

[Figure: SPEF, the tech lib, and the netlist feed the delay calculator, which writes SDF; the analysis tool reads the SDF, the netlist, and the tech lib.]


Forward-Annotation
‹ An analysis tool can take the user constraints and technology library information and create
an SDF file with more granular constraints to drive an implementation tool.
‹ The implementation tool (synthesis, or place/route) can use the information in the
forward-annotated SDF file to more accurately constrain the design and possibly make better
choices to meet its overall constraints.

[Figure: the tech lib and user constraints feed the analysis tool, which writes SDF constraints for the implementation tool.]

Summary
‹ There are various types of extraction models, which trade accuracy against
runtime. They include parallel plate, 2.5D, and 3D models.

‹ The Standard Parasitic Exchange Format (SPEF) file is the IEEE
standard to store parasitic information for a design. It has several
sections, including header, externals, and internals.

‹ Fundamentally, delay calculation is based on the concepts of
propagation delay, transition time, and slew. We saw that delay is a
function of transition time and slew, among other variables.

‹ The Standard Delay Format (SDF) file is the IEEE standard to store
delay information. It has several sections, including header, cell, and
delay entries.

‹ SDF delay data can be back-annotated to analysis tools, whereas
SDF constraint data can be forward-annotated to implementation tools.


Testing Your Understanding
True or false

1. 3D extraction models are used for quick and relatively inaccurate
parasitic calculations.

2. In the header section of SPEF, there is a place to annotate the version
of the tool.

3. Recovery is similar to a hold time check, and removal is similar to a
setup time check.

4. In the “cell entry” of an SDF file, all of the relevant timing information
for the cell is included within its boundaries.

5. One advantage of using a delay calculator’s SDF is that the timing
calculations will be consistent throughout the entire flow.

Learning Activity
In this activity, you will
‹ Study the physical implementation flowchart
‹ Add SDF and SPEF files at the appropriate step in the flow
‹ Present your results to the class

20 minutes for activity
10 minutes for debriefing

Sources
Standard Parasitic Exchange Format (SPEF), IEEE Standard 1481-1999

Standard Delay Format Specification Version 3.0, Open Verilog
International: http://www.eda.org/sdf/sdf_3.0.pdf

Static Timing Analysis and
Signal Integrity Analysis
Module 10

June 16, 2008

How Do You Check if a Circuit Meets Timing?

Module Objective
In this module, you will be able to
‹ Explain static timing and signal integrity (SI) analysis and identify
problems


Discussion Questions
‹ What determines the speed at which a circuit works?

‹ How do you gauge if your circuit works correctly at the required
speed?

Topics in This Module
‹ Static timing analysis (STA)

‹ Signal integrity analysis


Static Timing Analysis


‹ Timing analysis
‹ Timing constraints
‹ Constraint checking and report timing: slacks and violations
‹ Timing exceptions
‹ Setup and hold timing violations
‹ On-chip variation (OCV) and clock path pessimism removal (CPPR)
‹ Multi-mode multi-corner (MMMC) design
‹ Timing correlation
‹ Design rule verification

Purpose for Timing Analysis
‹ The goal of timing analysis is to verify that a design meets timing
requirements under a specified set of timing constraints.

‹ Timing analysis lets you determine how fast a design can run without
incurring timing violations.

‹ The results of timing analysis can be used to fine tune and debug the
speed-limiting, critical paths in a design.

Types of Timing Analysis


‹ Static timing analysis
‰ Adds delays for all elements in a timing path together and compares with given timing constraints
‰ Analyzes all possible timing paths in a short period of time
‰ Ignores functionality of circuit, thus analyzing paths that cannot be exercised and must be eliminated by the designer
‰ Preferred method for signoff
‹ Dynamic timing analysis
‰ Designer creates timing test vectors that are simulated using a gate-level netlist to verify timing
‰ No false paths exist
‰ Easy to miss paths by not including them in vectors
‰ Requires a significant amount of CPU time to do simulations
‰ This is a mandatory late-stage run to ensure that paths not tested by static timing analysis are checked
‹ In this course, we will only cover static timing analysis.

What Is Static Timing Analysis?
The preferred method for timing signoff
‹ Definition: Process of computing the timing of logically related paths for a digital design
without regard to large scale functional behavior
‹ Example: To determine the timing of the design, we ran static timing analysis after
detail route, and saw several paths violating their setup time requirements.

[Figure: design flow from specification, micro-architecture, RTL, and logic synthesis through floorplanning, placement, CTS, and route to layout, design verification, mask prep, and GDSII, with static timing analysis running alongside delay calculation, extraction, and signal integrity.]

Input and Output, Format


Static timing analysis
‹ Input
‰ Design in the Verilog® language or other HDL (Note: STA can be run on a design at any stage of the back-end flow.)
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Logical timing libraries in Liberty (.lib) format
‰ Constraints and commands in TCL
‰ SPEF, SDF, and incremental SDF (SI analysis)
‹ Output
‰ Timing reports, including noise-on-delay effects (SI analysis)

[Figure: SPEF, SDF, incremental SDF, the routed design (gates), SDC, TCL, and the logical library feed static timing analysis, which produces reports.]



What Are Timing Constraints?


‹ Timing constraints represent the performance goals for your designs.

‹ Software tools use the timing constraints to guide the timing-driven
optimization tools in order to meet these goals.

‹ Some of the timing constraints that an STA tool follows are
‰ Clock definitions
‰ Input delay/arrival time
‰ Operating conditions
‰ Output delay/required time

What Is a Clock Definition?
‹ Clock period: The time difference between two consecutive rising or falling
clock edges when they cross a specific reference level

‹ Duty cycle: The ratio between the pulse duration (t) and the period (T) of a
rectangular waveform

Duty Cycle = t/T

[Figure: clock waveform marking the rising edges, falling edges, pulse duration t, and clock period T.]
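The two clock definitions can be worked through with a small Python sketch. The edge times are hypothetical values in ns, and the helper names are illustrative only.

```python
def clock_period(rising_edges):
    """Period T: time between two consecutive rising edges."""
    return rising_edges[1] - rising_edges[0]


def duty_cycle(pulse_duration, period):
    """Duty cycle: ratio of pulse duration t to period T."""
    return pulse_duration / period


# Hypothetical edge times in ns.
rising_edges = [0.0, 10.0]
falling_edge = 4.0

T = clock_period(rising_edges)
t = falling_edge - rising_edges[0]  # high time of the pulse
print(f"T = {T} ns, duty cycle = {duty_cycle(t, T):.0%}")
```

Here a clock that rises at 0 ns, falls at 4 ns, and rises again at 10 ns has a 10 ns period and a 40% duty cycle.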

Types of Clocks
‹ Ideal clocks
‰ To simplify clock analysis, we assume that under ideal conditions all flip-flops are clocked together at a time reference (time = 0 ns).
‰ In ideal mode, the clock tree has zero insertion delay.
‹ Propagated clocks
‰ Insertion delay is the known delay of the clock tree to any given end point.
‰ Clock uncertainty (clock skew + clock jitter) is the unknown variation in clock delays.
‰ Clock delays are calculated from clock tree routing and extracted delays.
‰ Provides more accuracy and is used for final timing closure.

[Figure: ideal clock and propagated clock waveforms at Clk, marking clock insertion delay and clock uncertainty (skew and jitter).]


Pre-CTS and Post-CTS Constraints
‹ Pre-CTS
‰ Ideal clocks with uncertainty are used.
‰ Uncertainty consists of margin (extra delay the design team adds), clock skew, and clock jitter.
‰ Estimated latency is considered.
‹ Post-CTS
‰ Propagated clocks are used.
‰ Uncertainty consists of margin and clock jitter.
‰ Propagated latency is considered.

[Figure: ideal clock and propagated clock paths, each showing source latency from the clock source and network latency through the delay to the clock pin of the logic.]
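The change in the uncertainty budget between the two stages can be sketched as below. The function name and all values are hypothetical; the point is only that the skew term drops out of the uncertainty once propagated clocks are used.

```python
def clock_uncertainty(margin, jitter, skew=0.0):
    """Uncertainty budget in ns; skew is only budgeted while clocks are ideal."""
    return margin + jitter + skew


# Hypothetical budget values in ns.
MARGIN, JITTER, ESTIMATED_SKEW = 0.10, 0.05, 0.15

pre_cts = clock_uncertainty(MARGIN, JITTER, skew=ESTIMATED_SKEW)  # ideal clocks
post_cts = clock_uncertainty(MARGIN, JITTER)                      # propagated clocks

print(f"pre-CTS uncertainty  = {pre_cts:.2f} ns")
print(f"post-CTS uncertainty = {post_cts:.2f} ns")
```

After CTS the skew no longer needs to be estimated, because the propagated clock latencies to the launching and capturing registers already account for it.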

What Are Arrival Time and Required Time?


‹ The input delay time or the arrival time is the time that the data is presented to
the inputs of the module or register, respectively.

‹ The external delay time or the required time is the time determined by external
logic before the next rising edge of the clock.

[Figure: data arrival timing, showing the input delay within a clock period, and data required timing, showing the output delay within a clock period.]


What Is an Operating Condition?
‹ Integrated circuits display performance differences depending on the fabrication process, voltage, and
temperature (PVT) characteristics.
Each wafer batch is made with a slightly different set of process parameters and thus, inherently, the
die will run at different speeds. In fact, there can even be variations across a single die (on-chip
variation, OCV), which will be discussed later.
‹ This constraint describes the process, voltage, and temperature conditions of the design.
There are three conditions: worst, best, and typical.
‹ Operating conditions can be set from a single set of libraries (min, typ, or max) or from multiple
libraries (min and max), and used to perform setup and hold analysis.
‹ The technology libraries contain information on how to scale the cell parameters with variation in
process parameters and operating conditions that can be used to calculate accurate cell delay.

[Figure: STD cell library with four corner labels:
WORST case, HIGH temperature, LOW voltage, BEST process
WORST case, HIGH temperature, LOW voltage, WORST process
BEST case, HIGH temperature, HIGH voltage, WORST process
BEST case, LOW temperature, HIGH voltage, BEST process]

Check Constraints
‹ When a design is loaded into an STA tool and constraints are applied
‰ The tool checks the timing constraints specified for the design for consistency and completeness.
‰ The timing constraints should be complete before running a timing debug.

‹ STA tools come with specific commands that run these checks.

[Figure: check constraints, then test whether they are clean: loop back on NO, proceed to timing on YES.]

Check Constraints (continued)


‹ Some of the common checks are
‰ Connectivity checks for clock and data to ensure that clock and data signals are propagated
‰ Arrival time and required time for each clock in a multiple clock system
‰ Clock gating points
‰ Combinational loop in the design
‰ Constant collision/contradiction on a net connected to the pin
‰ Multiple clocks arriving at a leaf cell
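One of the checks listed above, the combinational loop check, amounts to finding a cycle in the graph of combinational cells. The Python sketch below shows the idea with a depth-first search. It is an illustration only, not the algorithm of any specific STA tool, and the netlist is a hypothetical dictionary mapping each combinational cell to the cells it drives (registers are assumed to be left out, since they break loops).

```python
def find_combinational_loop(fanout):
    """Return one loop as a list of cells, or None if the graph has no cycle."""
    VISITING, DONE = 1, 2
    state = {}
    path = []

    def visit(cell):
        state[cell] = VISITING
        path.append(cell)
        for nxt in fanout.get(cell, []):
            if state.get(nxt) == VISITING:        # back edge: loop found
                return path[path.index(nxt):] + [nxt]
            if nxt not in state:
                loop = visit(nxt)
                if loop:
                    return loop
        path.pop()
        state[cell] = DONE
        return None

    for cell in list(fanout):
        if cell not in state:
            loop = visit(cell)
            if loop:
                return loop
    return None


# Hypothetical netlist: u3 drives u1 again, closing a loop.
netlist = {"u1": ["u2"], "u2": ["u3"], "u3": ["u1", "u4"], "u4": []}
print(find_combinational_loop(netlist))
```

A loop matters to timing analysis because a path through it has no well-defined start point and end point, so the tool has to break the loop before it can compute arrival times.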
What Is a Timing Report?
‹ The timing report is a summary of the final timing information.

‹ There are separate reports for setup timing analysis and hold timing analysis.

‹ The report usually consists of the following parts:
‰ Header
• Start point and end point pair for which timing is being calculated
• End point arrival time calculation
• Slack calculation
‰ Body
Timing information for all paths from:
• An external input pin to an internal register
• An internal register (or input select pin) to an output pin
• An internal register to another internal register (C2C)


Example Timing Report


Header

Endpoint: data_out[4] (^) checked with leading edge of 'vclk1'
Beginpoint: DATA_BUS_MACH_INST/reg_4/Q (^) triggered by leading edge of 'vclk1'
Other End Arrival Time        0.000
+ Source Insertion Delay      3.000
- External Delay              2.000
+ Phase Shift                10.000
- Uncertainty                 0.250
= Required Time              10.750
- Arrival Time                7.447
= Slack Time                  3.303
Clock Rise Edge               0.000
+ Source Insertion Delay      4.000
= Beginpoint Arrival Time     4.000

Body

+-------------------------------------------------------------------------------------+
| Instance                   | Arc           | Cell      | Delay | Arrival | Required |
|                            |               |           |       |  Time   |   Time   |
|----------------------------+---------------+-----------+-------+---------+----------|
| i_150                      | Y ^           |           |       |   4.000 |    7.303 |
| DTMF_INST/m_clk__L1_I1     | A ^ -> Y v    | CLKINVX20 | 0.327 |   4.327 |    7.630 |
| DTMF_INST/m_clk__L2_I2     | A v -> Y ^    | CLKINVX20 | 0.278 |   4.604 |    7.908 |
| DATA_BUS_MACH_INST/reg_4   | CK ^ -> Q ^   | SDFFRHQX1 | 0.507 |   5.112 |    8.415 |
| TDSP_CORE_GLUE_INST/i_9712 | A ^ -> Y v    | INVXL     | 0.135 |   5.247 |    8.550 |
| TDSP_CORE_GLUE_INST/i_9713 | A v -> Y ^    | INVXL     | 0.101 |   5.348 |    8.651 |
| PORT_BUS_MACH_INST/i_9761  | A ^ -> Y v    | INVXL     | 0.095 |   5.443 |    8.747 |
| PORT_BUS_MACH_INST/i_9762  | A v -> Y ^    | INVX2     | 0.122 |   5.566 |    8.869 |
| FE_OFC1146_tdsp_portO_4_   | A ^ -> Y ^    | BUFX12    | 0.172 |   5.738 |    9.041 |
| IOPADS_INST/Ptdspop04      | I ^ -> PAD ^  | PDO04CDG  | 1.709 |   7.447 |   10.750 |
|                            | data_out[4] ^ |           | 0.000 |   7.447 |   10.750 |
+-------------------------------------------------------------------------------------+
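The header arithmetic of this report can be reproduced with a short Python sketch. The numbers are the ones printed in the report; the variable names are illustrative, and the signs follow the +, -, and = prefixes shown in the header.

```python
# Terms of the required-time calculation, signed as in the report header.
required_terms = [
    ("Other End Arrival Time", +0.000),
    ("Source Insertion Delay", +3.000),
    ("External Delay", -2.000),
    ("Phase Shift", +10.000),
    ("Uncertainty", -0.250),
]

required_time = sum(value for _, value in required_terms)
arrival_time = 7.447                      # end point arrival time from the report
slack_time = required_time - arrival_time

print(f"Required Time = {required_time:.3f}")
print(f"Slack Time    = {slack_time:.3f}")
print("MET" if slack_time >= 0 else "VIOLATED")
```

A positive slack, as here, means the data arrives before it is required, so the path meets timing.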


What Are Setup Time and Hold Time?


Synchronous inputs have setup/hold specifications relative to the clock.
‹ Setup Time: The time a synchronous input must be stable before the active clock edge.
‹ Hold Time: The time a synchronous input must be stable after the active clock edge.

[Figure: input data valid window around the clock edge, spanning the setup time before the edge and the hold time after it.]



Setup Time and Hold Time Violations
‹ A setup time violation is when a signal arrives too late and misses the time
when it should advance.

‹ A hold time violation is when a signal arrives too early and advances one
clock cycle before it should.

[Figure: input data changing inside the setup time window (setup time violation) or inside the hold time window (hold time violation) around the clock edge.]


Timing Report for Setup Violations


Path 1: VIOLATED Setup Check with Pin reg_2/CK
Endpoint: reg_2/D (v) checked with leading edge of 'CLK1'
Beginpoint: reg_1/Q (v) triggered by leading edge of 'CLK1'
Other End Arrival Time        0.104
- Setup                       0.167
+ Phase Shift                 2.000
= Required Time               1.937
- Arrival Time                1.946
= Slack Time                 -0.009
Clock Rise Edge               0.000
= Beginpoint Arrival Time     0.000

Instance  Arc           Cell       Delay  Arrival Time  Required Time
          clk ^                           0.000         0.088
ck_0      A ^ -> Y ^    BUFX2      0.091  0.091         0.178
ck_1      A ^ -> Y ^    BUFX2      0.097  0.188         0.275
ck_2      A ^ -> Y ^    BUFX2      0.094  0.282         0.369
ck_3      A ^ -> Y ^    BUFX2      0.092  0.374         0.462
ck_4      A ^ -> Y ^    CLKAND2X2  0.150  0.524         0.612
reg_1     CK ^ -> Q v   DFFRHQX1   0.288  0.812         0.900
t_1       A ^ -> Y ^    BUFX8      0.111  0.923         1.011
t_2       A ^ -> Y ^    BUFX8      0.092  1.015         1.103
t_3       A ^ -> Y ^    BUFX8      0.092  1.107         1.195
t_4       A ^ -> Y ^    BUFX8      0.092  1.199         1.287
t_5       A ^ -> Y ^    BUFX4      0.132  1.331         1.379
t_6       A ^ -> Y ^    BUFX8      0.092  1.423         1.471
t_7       A ^ -> Y ^    BUFX6      0.112  1.535         1.563
t_8       A ^ -> Y ^    BUFX8      0.092  1.627         1.655
t_9       A ^ -> Y ^    BUFX4      0.128  1.755         1.747
t_10      A ^ -> Y ^    BUFX8      0.088  1.843         1.835
t_11      B ^ -> Y ^    NAND2X1    0.066  1.909         1.901
t_12      A ^ -> Y ^    INVX1      0.037  1.946         1.937
reg_2     D v           DFFRHQX1   0.000  1.946         1.937

[Figure: clk drives the clock buffers ck_0 through ck_4 to the CK pins of reg_1 and reg_2; the data path runs from reg_1 Q through t_1 to t_12 into reg_2 D.]



Timing Report for Hold Violations
Path 1: VIOLATED Hold Check with Pin reg_3/CK
Endpoint: reg_3/D (v) checked with leading edge of 'CLK1'
Beginpoint: reg_1/Q (v) triggered by leading edge of 'CLK1'
Other End Arrival Time        0.933
+ Hold                        0.179
+ Phase Shift                 0.000
= Required Time               1.152
Arrival Time                  1.099
= Slack Time                 -0.053
Clock Rise Edge               0.000
= Beginpoint Arrival Time     0.000

Instance  Arc           Cell       Delay  Arrival Time  Required Time
          clk ^                           0.000         0.088
ck_0      A ^ -> Y ^    BUFX2      0.091  0.091         0.178
ck_1      A ^ -> Y ^    BUFX2      0.097  0.188         0.275
ck_2      A ^ -> Y ^    BUFX2      0.094  0.282         0.369
ck_3      A ^ -> Y ^    BUFX2      0.092  0.374         0.462
ck_4      A ^ -> Y ^    CLKAND2X2  0.150  0.524         0.612
reg_1     CK ^ -> Q v   DFFRHQX1   0.288  0.812         0.900
t_1       A ^ -> Y ^    BUFX8      0.092  0.904         0.992
t_2       A ^ -> Y ^    BUFX8      0.092  0.996         1.084
t_15      B ^ -> Y ^    NAND2X1    0.066  1.062         1.115
t_16      A ^ -> Y ^    INVX1      0.037  1.099         1.152
reg_3     D v           DFFRHQX1   0.000  1.099         1.152
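The slack arithmetic behind the setup and hold violation reports can be sketched in Python. The function names are hypothetical. The setup example rebuilds the required time from the terms printed in the setup report; the hold example takes the required time and arrival time as reported rather than recomputing them.

```python
def setup_slack(other_end_arrival, setup, phase_shift, arrival):
    """Setup check: data must arrive before the required time."""
    required = other_end_arrival - setup + phase_shift
    return required, required - arrival


def hold_slack(required, arrival):
    """Hold check: data must arrive after the required time."""
    return arrival - required


# Values from the setup violation report (reg_2/D).
setup_required, s_slack = setup_slack(0.104, 0.167, 2.000, 1.946)

# Values from the hold violation report (reg_3/D), taken as reported.
h_slack = hold_slack(required=1.152, arrival=1.099)

for name, slack in (("setup", s_slack), ("hold", h_slack)):
    status = "VIOLATED" if slack < 0 else "MET"
    print(f"{name} slack = {slack:.3f} ({status})")
```

Note the opposite sign conventions: a setup check fails when the data is late (arrival after required), while a hold check fails when the data is early (arrival before required). In both reports the negative slack is what marks the path as VIOLATED.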


Techniques to Reduce Timing Violations


‹ To fix a setup violation, we need to speed up the path causing the violation by
‰ Increasing cell drivability by upsizing the cell
‰ Adding buffers to optimize the critical path and reducing the load on complex gates with large fanout

‹ To fix a hold violation, we need to slow down the signal path by
‰ Adding delay cells to slow the signal
‰ Reducing drivability of cells

[Figure: upsize cell and insert buffer for setup fixes; insert delay cell and downsize cell for hold fixes.]



Discussion Questions
‹ What are the two types of timing analysis?

‹ What constraints define a clock?

‹ In reading a timing report, how do you know that the design has a
timing violation?


What Are Timing Exceptions?
Paths that are given special consideration by the timing analysis tool
‹ False paths
‰ Paths that are not exercised during operation
‹ Multicycle paths
‰ Paths that take multiple cycles
‰ Result is used every N clock cycles

[Figure: a false path between DFF1 and DFF2, and a multicycle path between DFF1 and DFF2 that takes N cycles.]


Timing Exceptions: False Path


‹ False path
‰ A path that has no functional purpose or a path that does not need to be timing constrained (e.g., a path between two clock domains).
‹ Reasons for false paths
‰ Path is never exercised during circuit operation
‰ Path is only possible in special operation mode (test mode, etc.)
‹ Blocking false paths
‰ Blocking of timing arcs
‰ Blocking the path itself

Example: Multiplexed logic in a test mode

[Figure: an adder computing A+B or C+D, with Sel choosing between the a_b and c_d inputs.]



Timing Exceptions: Multicycle Path
‹ Multicycle paths
‰ The paths that exist between two synchronous clock domains with integral multiples of clock frequency

[Figure: data passing between BlockA and BlockB, clocked by CLK1 and CLK2; waveforms for CLK1, DATA, and CLK2 with cycle times T and T/2.]

Three Timing Analysis Modes
‹ There are three different timing analysis modes.

Timing Analysis Mode                 Description
Single                               Single operating condition used to scale delay value
Best case and worst case (BC-WC)     Analyzes off-chip variation for two extreme operating conditions
On-chip variation (OCV)              OCV is the small difference in the operating parameter value across the chip.

‹ In this course, we will only cover the OCV mode.


What Is On-Chip Variation Analysis?


‹ Process, voltage, and temperature (PVT) cannot be assumed constant across the die. Timing analysis must account for the impact of these variations.
‹ In this analysis mode, the delay calculation for one path may be based on the maximum operating condition while the delay calculation for another path is based on the minimum operating condition, for both setup and hold checks.

[Figure: On-Chip Variation Analysis. Data delay and clock delay each range between best case and worst case; the on-chip variations used for the setup timing check and for the hold timing check are marked]


STA Tools in OCV Analysis Mode
‹ Computes min and max delays for cells and nets by multiplying the annotated delay by the min and max timing de-rate values, respectively.
‹ Applies min and max delays to different paths simultaneously.
‰ For setup check, annotate worst-case SDF. Use max delay for launch path and min delay for capture path.
‰ For hold check, annotate best-case SDF. Use min delay for launch path and max delay for capture path.

[Figure: CLK1 waveform with ideal clock edges at 0, 2, 4, 6, 8, and 10, showing early and late launch and capture edges and the late phase shift]
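The de-rating rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the de-rate factors 0.9 and 1.1, the function names, and the example delays are hypothetical, and the common clock segment is not treated specially.

```python
def setup_slack_ocv(launch_clk, data, capture_clk, period, setup,
                    early=0.9, late=1.1):
    """Setup check: late (max) delays on the launch path, early (min) delays
    on the capture clock path."""
    arrival = (launch_clk + data) * late
    required = period + capture_clk * early - setup
    return required - arrival


def hold_slack_ocv(launch_clk, data, capture_clk, hold,
                   early=0.9, late=1.1):
    """Hold check: early (min) delays on the launch path, late (max) delays
    on the capture clock path."""
    arrival = (launch_clk + data) * early
    required = capture_clk * late + hold
    return arrival - required


# Hypothetical annotated delays in ns.
s = setup_slack_ocv(launch_clk=0.5, data=1.0, capture_clk=0.5, period=2.0, setup=0.1)
h = hold_slack_ocv(launch_clk=0.5, data=1.0, capture_clk=0.5, hold=0.05)
print(round(s, 3), round(h, 3))
```

Both checks pick the combination of early and late delays that makes the slack smallest, which is what makes OCV analysis conservative.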

OCV Mode Setup


[Figure: launch clock and capture clock paths from the clock root, with the early and late paths annotated from the min library and max library]

‹ For setup check, the timing delay values from the Max library are used for the data and the launch clock network delay.
‹ The delay values from the Min library are used for the capturing clock network delay, assuming that the clocks are set in propagated mode.


OCV Mode Hold

[Figure: capture clock and launch clock paths from the clock root, with the late and early paths annotated from the min library and max library]

‹ For hold check, the timing delay values from the Min library are used for the data arrival time and launch clock network delay.
‹ The delay values from the Max library are used for the capturing clock network delay, assuming that the clocks are set in propagated mode.

What Are CPPR and CRPR?


‹ Definition: Clock path pessimism removal (CPPR) and clock re-convergence pessimism removal (CRPR) are names for the process of identifying and removing the pessimism introduced in the slack reports for clock paths when the clock paths have a segment in common.
‹ Example: In the on-chip variation methodology, during setup checks, if the launch clock late path and the capture clock early path share a portion of the clock network, then a pessimism equal to the difference between the maximum and minimum delay values of the common clock network is introduced in the slack values.

[Figure: launch clock and capture clock paths leaving the clock root through a common segment before splitting into the early path and the late path]


CRPR: Pessimism Calculation in OCV Mode
[Figure: the clock root drives a common segment Dcommon (dc), which branches to FF1 through d1 (fast path x Mfast) and to FF2 through d2 (slow path x Mslow)]

No OCV:    path to FF1 = dc + d1
           path to FF2 = dc + d2
With OCV:  path to FF1 = (dc + d1) x Mfast
           path to FF2 = (dc + d2) x Mslow

‹ CRPR: The common path cannot be de-rated by two different values at the same time.
‹ The slack calculation is too pessimistic.
‹ The pessimism is P = dc x Mslow - dc x Mfast.
‹ New slack = slack(w/o CRPR) + P.
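The pessimism formula above can be worked through numerically. The sketch below assumes a 0.5 ns common segment and de-rate factors Mfast = 0.9 and Mslow = 1.1; these values and the function names are illustrative only.

```python
def crpr_pessimism(dc, m_fast, m_slow):
    """P = dc x Mslow - dc x Mfast: the spread created by de-rating the
    shared clock segment two different ways at once."""
    return dc * m_slow - dc * m_fast


def slack_with_crpr(slack_without_crpr, dc, m_fast, m_slow):
    """New slack = slack(w/o CRPR) + P."""
    return slack_without_crpr + crpr_pessimism(dc, m_fast, m_slow)


p = crpr_pessimism(dc=0.5, m_fast=0.9, m_slow=1.1)
new_slack = slack_with_crpr(-0.04, dc=0.5, m_fast=0.9, m_slow=1.1)
print(round(p, 3), round(new_slack, 3))
```

In this hypothetical case a path that appears to fail by 0.04 ns actually passes once the 0.1 ns of pessimism on the common segment is credited back.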

Timing Report with Clock Pessimism


Path 1: MET Setup Check with Pin reg_2/CK
Endpoint:   reg_2/D (v) checked with leading edge of ’CLK1’
Beginpoint: reg_1/Q (v) triggered by leading edge of ’CLK1’
Other End Arrival Time       0.104
- Setup                      0.167
+ Phase Shift                2.000
+ CPPR Adjustment            0.420
= Required Time              2.358
- Arrival Time               1.946
= Slack Time                 0.412
Clock Rise Edge              0.000
= Beginpoint Arrival Time    0.000

Instance  Arc          Cell        Delay   Arrival Time   Required Time
          clk ^                            0.000          0.508
ck_0      A ^ -> Y ^   BUFX2       0.091   0.091          0.598
ck_1      A ^ -> Y ^   BUFX2       0.097   0.188          0.695
ck_2      A ^ -> Y ^   BUFX2       0.094   0.282          0.789
ck_3      A ^ -> Y ^   BUFX2       0.092   0.374          0.882
ck_4      A ^ -> Y ^   CLKAND2X2   0.150   0.524          1.032
reg_1     CK^ -> Q v   DFFRHQX1    0.288   0.812          1.320
t_1       A ^ -> Y ^   BUFX8       0.111   0.923          1.431
t_2       A ^ -> Y ^   BUFX8       0.092   1.015          1.523
t_3       A ^ -> Y ^   BUFX8       0.092   1.107          1.615
t_4       A ^ -> Y ^   BUFX8       0.092   1.199          1.707
t_5       A ^ -> Y ^   BUFX4       0.132   1.331          1.799
t_6       A ^ -> Y ^   BUFX8       0.092   1.423          1.891
t_7       A ^ -> Y ^   BUFX6       0.112   1.535          1.983
t_8       A ^ -> Y ^   BUFX8       0.092   1.627          2.075
t_9       A ^ -> Y ^   BUFX4       0.128   1.755          2.167
t_10      A ^ -> Y ^   BUFX8       0.088   1.843          2.255
t_11      B ^ -> Y ^   NAND2X1     0.066   1.909          2.321
t_12      A ^ -> Y ^   INVX1       0.037   1.946          2.358
reg_2     D v          DFFRHQX1    0.000   1.946          2.358
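The summary arithmetic of the report can be re-derived from its own numbers. The sketch below only replays the values printed in the report; the variable names are chosen for the example. The recomputed required time and slack differ from the printed ones by 0.001, presumably because the tool rounds each printed value to three decimals.

```python
# Values copied from the report summary.
other_end_arrival = 0.104
setup = 0.167
phase_shift = 2.000
cppr_adjustment = 0.420
arrival = 1.946

# Required time = other end arrival - setup + phase shift + CPPR adjustment.
required = other_end_arrival - setup + phase_shift + cppr_adjustment
# Slack = required time - arrival time.
slack = required - arrival

# Cell delays from the Delay column, ck_0 through t_12.
delays = [0.091, 0.097, 0.094, 0.092, 0.150, 0.288, 0.111, 0.092, 0.092,
          0.092, 0.132, 0.092, 0.112, 0.092, 0.128, 0.088, 0.066, 0.037]
path_arrival = sum(delays)

print(round(required, 3), round(slack, 3), round(path_arrival, 3))
```

Without the CPPR adjustment the same path would show a slack of about -0.008, so in this report the pessimism removal is the difference between a failing and a passing path.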


Static Timing Analysis
‹ Timing analysis
‹ Timing constraints
‹ Constraint checking and report timing: slacks and violations
‹ Setup and hold timing violations
‹ Timing exceptions
‹ On-chip variation (OCV) and clock path pessimism removal (CPPR)
‹ Multi-mode multi-corner (MMMC) design
‹ Timing correlation
‹ Design rule verification

MMMC Design
‹ Today’s chips include
‰ Multiple standards support (GPRS, EDGE, WCDMA)
‰ Multiple functionalities (MP3, camera, gaming)
‰ Multiple power profiles (awake, doze, sleep)
‰ Multiple test modes (scan, BIST, OPMISR)
 Results in multiple constraint sets
‹ It becomes more difficult below 90 nm to
‰ Determine worst-case corner combinations
‰ Determine RC corners
‰ Determine constraint modes
‹ MMMC provides the ability to concurrently support multiple combinations of modes and corners.
‹ Example: Cell phone chips typically need to be designed for 20 mode/corner scenarios.

[Figure: Mode 1 (functionality), Mode 2 (test), and Mode 3 (power), each with Min and Max corners and its own constraint file: SDC1, SDC2, SDC3]


What Is MMMC Analysis?
‹ In deep submicron processes
‰ Cell and wire delay behave differently depending on process variation
‰ Analysis needs to be done at more than just a single min corner and a single max corner
‰ Identifying a single worst-case corner and fixing violations becomes difficult because conditions differ
‰ Multi-corner capability enables you to analyze and optimize at all these corner cases.
‹ Multi-mode timing analysis
‰ A design can have multiple modes of operation and each mode can have different, even conflicting, constraints
‰ Allows concurrent analysis and optimization of multiple modes, eliminating iterations for timing closure
‹ Multi-corner timing analysis
‰ Used to resolve different timing problems that appear at different processes, voltages, and temperatures

[Figure: min and max delays along a DFF1, DFF2, DFF3 path]
[Figure: Analysis Views are built from Delay Corner and Constraint Mode pointers. Delay calculation corners: RC corner, timing libs, cdB libs, PVT setting, de-rating, SDF, RC controls. Constraint mode descriptions (SDC): clock defs, constants, exceptions]
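An analysis view pairs one constraint mode with one delay corner, so the set of views is the cross product of the two lists. The sketch below illustrates that bookkeeping only; the mode names, corner names, and dictionary layout are hypothetical and do not correspond to any tool's command syntax.

```python
from itertools import product

# Hypothetical constraint modes, each pointing at its own SDC file.
modes = {"functional": "SDC1", "test": "SDC2", "power": "SDC3"}
# Hypothetical delay corners.
corners = ["min", "max"]


def analysis_views(modes, corners):
    """Build one view per (constraint mode, delay corner) combination."""
    return [
        {"name": f"{mode}_{corner}", "mode": mode, "sdc": sdc, "corner": corner}
        for (mode, sdc), corner in product(modes.items(), corners)
    ]


views = analysis_views(modes, corners)
for view in views:
    print(view["name"], view["sdc"], view["corner"])
```

Three modes and two corners already give six views, which shows how quickly the scenario count grows toward the 20 mode/corner scenarios mentioned for cell phone chips.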

How Is MMMC Analysis Achieved?


‹ To achieve multi-corner analysis and optimization
1. Set up the environment
2. Define the scenarios
3. Load the SDC file
4. Analyze the timing reports from multiple scenarios
5. Determine which scenario to optimize
‹ To analyze by using the sequential multimode scenario
1. Define the current scenarios
2. Identify the critical scenario based on the timing report generated by the STA tool
3. Define the most critical scenario as the first scenario in the current scenario definition
4. Run optimizations such as clock tree optimization, post placement optimization, or routing optimizations
Repeat steps 2 through 4 until timing is satisfactory

[Figure: synthesis produces SDC files for normal mode, mode 1, and mode 2. STA tool flow: load design, create scenario, set operating conditions and constraints per scenario, identify the most critical scenario, analyze, optimize]
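The sequential multimode loop (steps 2 through 4) can be sketched as a small Python loop. Everything here is a toy model: slacks are integers in picoseconds, the scenario names are invented, and `toy_optimize` stands in for a real optimization run by simply adding a fixed gain to the scenario it targets.

```python
def most_critical(slacks):
    """Step 2: the scenario with the smallest (worst) slack."""
    return min(slacks, key=slacks.get)


def toy_optimize(slacks, scenario, gain_ps=200):
    """Stand-in for step 4: pretend optimization recovers `gain_ps` of slack."""
    updated = dict(slacks)
    updated[scenario] += gain_ps
    return updated


def close_timing(slacks, optimize, max_iters=10):
    """Repeat steps 2 through 4 until no scenario has negative slack."""
    order = []
    for _ in range(max_iters):
        worst = most_critical(slacks)
        if slacks[worst] >= 0:
            break
        order.append(worst)  # step 3: the worst scenario goes first
        slacks = optimize(slacks, worst)
    return slacks, order


final, order = close_timing(
    {"func_max": -300, "test_max": -120, "func_min": 50}, toy_optimize)
print(order, final)
```

In a real flow an optimization aimed at one scenario can also degrade another, which is why the critical scenario is re-identified on every pass rather than fixed once.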


Static Timing Analysis
‹ Timing analysis
‹ Timing constraints
‹ Constraint checking and report timing: slacks and violations
‹ Setup and hold timing violations
‹ Timing exceptions
‹ On-chip variation (OCV) and clock path pessimism removal (CPPR)
‹ Multi-mode multi-corner (MMMC) design
‹ Timing correlation
‹ Design rule verification

What Is Correlation?
‹ Synthesis, place and route, and the sign-off tools are different (usually).
‰ Synthesis uses wire load models to estimate the physical design.
 Need to adjust wire load model coefficients.
‰ Place and route uses more realistic numbers for physical design.
‰ Timing becomes more accurate as the flow progresses.
‰ Different timing engines used at different stages use different techniques to calculate timing.
 Do the optimizer and placer see the same worst paths as the static timer?
‹ Correlation is an indication of the relationship between two variables.

[Figure: flow from Design Entry through Synthesis, Place, and Route (each with its own timing engine) to RC Extraction, SI Analysis, and Static Timing Analysis (Sign-off)]


Why Correlate?
‹ The majority of today's design flows use two timing-analysis tools.
‰ One for implementation, and a second for signoff
‰ Implementation tools have a built-in extraction engine, which differs from sign-off extraction tools.
 Extracted output will be different
‰ Both tools should see the same information and provide the same results.
 Prevents additional work at the time of sign-off
‹ At 130 and 90 nm, parasitic effects are small, and there is not that much that you need to correlate.
‹ At 45 nm, correlation between different timing infrastructures is nearly impossible because of the number of complex effects.

[Figure: delay variation using the sign-off tool compared with delay variation using the implementation tool, plotted against technology node: 250 nm, 180 nm, 130 nm, 90 nm, 65 nm]

How to Achieve Correlation


‹ To correlate native extraction results with sign-off extraction
‰ Compare SPEF files from basic and sign-off extraction
‰ Generate the scaling factors or the de-rating factors for
 Total capacitance
 Cross-coupling capacitance
 Resistance
‹ The timing scaling factors affect the path delay values generated in the timing reports.
‹ Scaling factors are set for data paths, clock paths, minimum and maximum operating conditions.

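One simple way to derive a scaling factor from two extractions is a least-squares fit through the origin over per-net values. The sketch below is an assumption about how such a factor could be computed, not the method used by any specific tool; the net names and capacitance values are hypothetical, and parsing of the SPEF files is left out.

```python
def scaling_factor(native, signoff):
    """Least-squares factor k that minimizes sum((signoff - k * native)^2)
    over the nets present in both extractions."""
    common = native.keys() & signoff.keys()
    num = sum(signoff[n] * native[n] for n in common)
    den = sum(native[n] ** 2 for n in common)
    return num / den


# Hypothetical total capacitance per net (fF) from the two SPEF files.
native_cap = {"n1": 10.0, "n2": 20.0, "n3": 40.0}
signoff_cap = {"n1": 11.0, "n2": 22.0, "n3": 44.0}

k = scaling_factor(native_cap, signoff_cap)
print(round(k, 3))
```

The same function could be applied separately to cross-coupling capacitance and resistance, giving one factor per quantity as the slide lists.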
Post-CTS and Post-Route Correlation
‹ Post-CTS
‰ Actual clock tree delays propagated
‰ Actual clock net delays used instead of estimates done at pre-CTS
‹ Post-route
‰ All cells and nets have fixed location on design
‰ Generates a more realistic timing result
‰ Effects due to congestion are taken into account
‰ Effects due to signal integrity can be taken into account
‰ Account for mismatch between pre-route and post-route delays

[Figure: Post-CTS and Post Route views of the clock path, from the clock source through source latency and network latency to the clock pin, alongside the combinational logic delay]

Static Timing Analysis


‹ Timing analysis
‹ Timing constraints
‹ Constraint checking and report timing: slacks and violations
‹ Setup and hold timing violations
‹ Timing exceptions
‹ On-chip variation (OCV) and clock path pessimism removal (CPPR)
‹ Multi-mode multi-corner (MMMC) design
‹ Timing correlation
‹ Design rule verification

What Are Design Rule Constraints?
‹ Design rule constraints are requirements that depend on the technology library.
‹ Default constraints always exist implicitly because of the selected target libraries.
‹ These rules are established by the library vendor for the proper functioning of the fabricated circuit; they must not be violated.
‹ The user can set more restrictive values (the explicit values) but cannot remove implicit design rule constraint attributes.

[Figure: the .lib supplies the constraints Max Tran, Max fanout, and Max Cap]

Some Design Rule Constraints


‹ What constraints are there for the outputs of logic gates?
‹ Output of every gate usually has one or more of the following design rule constraints:
‰ Max transition
‰ Max fanout
‰ Max capacitance

What Is Maximum Transition Time?
‹ The maximum transition time for a net is the longest time allowed for its driving pin to change logic values.
‹ A violation is typically fixed by upsizing the driver or buffering the output of the driving gate.

[Figure: Upsized Driver or Added Buffers. Before optimization, a 1x driver violates the maximum transition rule; after optimization, a 2x driver or added 1x buffers meet the rule]

What Is Maximum Fanout?


‹ The fanout of a net is the number of logic gate inputs to which an output is connected.
‹ To prevent routing congestion, as well as to help the synthesis tool meet maximum transition and capacitance constraints, we need to specify the maximum fanout limit for the design.
‹ Most technology libraries place fanout restrictions on driving pins, creating an implicit fanout constraint for every driving pin in designs using that library.

What Is Maximum Capacitance?
‹ Maximum capacitance specifies the maximum capacitance allowed on the output pin of a cell.
‹ The maximum capacitance design rule constraint allows you to control the capacitance of nets directly.
‹ The design rule constraints max_fanout and max_transition limit the actual capacitance of nets indirectly.
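The relationship between implicit library limits and explicit user limits can be sketched as below: the effective limit is the tighter of the two, so a looser user value never relaxes the library rule. The function names, rule names as dictionary keys, and all numeric values are assumptions for illustration.

```python
def effective_limit(library_limit, user_limit=None):
    """A user value can only tighten the implicit library limit."""
    if user_limit is None:
        return library_limit
    return min(library_limit, user_limit)


def violations(measured, library_limits, user_limits=None):
    """Return {rule: (measured value, effective limit)} for each violated rule."""
    user_limits = user_limits or {}
    found = {}
    for rule, value in measured.items():
        limit = effective_limit(library_limits[rule], user_limits.get(rule))
        if value > limit:
            found[rule] = (value, limit)
    return found


# Hypothetical limits and measurements for one driving pin.
library = {"max_transition": 0.5, "max_capacitance": 0.2, "max_fanout": 16}
user = {"max_fanout": 8, "max_transition": 0.8}  # 0.8 is looser, so it is ignored
pin = {"max_transition": 0.6, "max_capacitance": 0.15, "max_fanout": 10}

result = violations(pin, library, user)
print(result)
```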

Learning Activity
In this activity, you will
‹ Study a timing report which has a critical path which is failing to meet the timing requirement
‹ Analyze the report and identify the problem
‹ Decide which course of action is best suited to fix the critical path so that it meets timing
‹ Present your findings to the class

20 minutes for activity
10 minutes for debriefing


Topics in This Module
‹ Static timing analysis
‹ Signal integrity analysis

What Is Signal Integrity?


‹ Definition: Unintended effects on digital signals, caused by interconnect parasitic resistance or capacitance, that introduce noise and/or change delays
‹ Example: In our example design, we saw SI effects such as noise-on-delay and glitches, due to long nets that were running in parallel.

[Figure: design flow from Specification, Micro-Architecture, RTL, and Logic Synthesis (synthesized gates netlist) through Floorplanning and Place/Route (placement, physical synthesis, scan reorder, pre-CTS design optimization, CTS, post-CTS design optimization, route, post-route design optimization) to the detail routed design, Layout Design Verification, GDSII, and Mask Prep. Static timing analysis, delay calculation, signal integrity, and extraction support the place and route stages]


Input and Output, Format
‹ Input
‰ Routed design in the Verilog language or other HDL + DEF
‰ Constraints in Synopsys Design Constraints (SDC) format
‰ Constraints and commands in TCL
‰ Parasitic extraction file (SPEF)
‰ Logical timing libraries in Liberty (.lib) format
‰ Physical libraries in LEF format
‰ Tool specific SI libraries
‹ Output
‰ Incremental SDF file containing all of the delay information in the design related to noise-on-delay
‰ Reports for glitch nets
‰ List of problem nets that need to be re-routed

[Figure: the Signal Integrity step takes SPEF, the routed design (gates + DEF), SDC, TCL, the logical library, the physical library, and the tool-specific library as inputs and produces an incremental SDF delay file]

SI Problems with Changing Process Technology

[Figure: SI problems plotted against process technology: 0.15, 0.13, 0.09, 0.065]


Why Now?
These SI effects have always existed, but they are worse at deep submicron sizes because of
‹ Finer geometries
‰ Greater wire and via resistance
‰ Higher electric fields (if supply voltage not scaled)
‰ Smaller spacing rules between wires
‹ More metal layers
‰ Higher ratio of cross coupling to grounded capacitance

[Figure: Interconnect: Determining factor for performance, power, and yield. Delay (ps), 0 to 30, for Total, Gate, and Interconnect, plotted against shrinking process: 0.65, 0.5, 0.35, 0.25, 0.18, 0.13, 0.09, 0.065]

Signal Integrity Analysis


‹ Crosstalk (cross coupling): noise on delay
‹ Glitch on functionality
‹ Noise library
‹ ECO repair files
‹ Hierarchical SI analysis: block noise model (XILM)

What Is Crosstalk Effect?
‹ Crosstalk is caused by a transition on an adjoining signal that is capacitively or inductively coupled to a neighboring wire, which can lead to an unintended logic transition.
‰ Victim net: Net on path affected by crosstalk
‰ Aggressor net: Net that affects victim net
‹ Switching window: Time interval when a signal transition may occur. When coupled signals switch
‰ In opposite directions (aggressor), victim line signal delay increases.
‰ In the same direction (helper), victim line signal delay decreases.

[Figure: aggressor net and victim net, each with drive R, wire R, and grounded C, joined by coupling C; the victim receiver has an input noise tolerance]

Effect of Crosstalk on Delay


‹ Crosstalk can change the delay of a net when attacker and victim are changing simultaneously, which may lead to setup or hold time failures.
‹ Timing predictions become inaccurate.
‹ Crosstalk can have two effects on victim nets:
‰ Crosstalk causes signal to slow down.
‰ Crosstalk causes signal to speed up.

[Figure: victim net from in through wire R to in1 at the flip-flop, with grounded C and coupling C to other logic net(s) a1. The cell delay is marked at the driver; the delay on the wire depends on the behavior of other nets]


Crosstalk Causes Signal to Slow Down
When attacker and victim are changing in opposite directions
‹ The cross coupling between the two nets causes the victim to slow down.
‹ This can affect the setup time requirement of the flip-flop if the signal arrives late.

[Figure: victim net (in, wire R, in1, FF) coupled to attacker a1 through coupling C; In1 data and clock waveforms showing the data arriving late relative to the setup time]

Crosstalk Causes Signal to Speed Up


When both attacker and victim are falling or rising simultaneously
‹ The cross coupling between the two nets causes the victim to speed up.
‹ This can affect the hold time requirement for a flip-flop if the signal arrives early.

[Figure: victim net (in, wire R, in1, FF) coupled to attacker a1 through coupling C; In1 data and clock waveforms showing the data arriving early relative to the hold time]


Crosstalk Analysis
Steps
‹ Computes the timing windows and slew rates internally
‹ Uses timing windows and logic constraints to disallow specific simultaneous switching scenarios between victim and attacker nets
‹ Analyzes each valid overlapping attacker subset to determine the worst-case delay change
‹ Outputs either an incremental or full SDF file for all nets

[Figure: flow: compute slew rate, disallow simultaneous switching, find victim with worst delay, generate incremental or full SDF]
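The timing-window filtering step can be sketched as an interval-overlap test: an attacker can only affect the victim's delay if their switching windows overlap. This is a simplified illustration; the window values and net names are hypothetical, and logic constraints are not modeled.

```python
def windows_overlap(a, b):
    """True when two (earliest, latest) switching windows share any time."""
    return a[0] <= b[1] and b[0] <= a[1]


def active_attackers(victim_window, attacker_windows):
    """Keep only the attackers that can switch while the victim is switching."""
    return [name for name, window in attacker_windows.items()
            if windows_overlap(victim_window, window)]


# Hypothetical switching windows in ns.
victim = (1.0, 2.0)
attackers = {"a1": (1.5, 2.5), "a2": (2.1, 3.0), "a3": (0.0, 0.9)}

valid = active_attackers(victim, attackers)
print(valid)
```

Only the surviving attackers need to be considered when searching for the subset that produces the worst-case delay change.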

Signal Integrity Analysis


‹ Crosstalk (cross coupling): noise on delay
‹ Glitch on functionality
‹ Noise library
‹ ECO repair files
‹ Hierarchical SI analysis: block noise model (XILM)

Impact of Noise on Functionality
‹ Coupling noise can cause functional failures.
‹ Slew rate (dv/dt) and capacitance (C) set glitch current (i).
‹ Load impedance sets the glitch voltage.
‰ The attacker causes a significant glitch on the reset signal such that it resets the flip-flop and destroys the stored logic state.
‰ With lower transistor threshold voltages (Vtn and Vtp) for low power design, glitches can lead to unintended switching of transistors.

[Figure: an attacker switching from 0 to 1 injects current i = C dv/dt through coupling capacitance C into the victim net, which drives the reset pin of a flip-flop (d, q, clk, reset) held at 1]
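The relation i = C dv/dt can be evaluated directly. In the sketch below the coupling capacitance, voltage swing, transition time, and load resistance are hypothetical values, and treating the glitch voltage as i x R is a first-order simplification of "load impedance sets the glitch voltage".

```python
def glitch_current(coupling_cap_f, delta_v, transition_s):
    """i = C * dv/dt, with dv/dt approximated as swing / transition time."""
    return coupling_cap_f * delta_v / transition_s


def glitch_voltage(current_a, load_ohms):
    """First-order estimate: glitch voltage = glitch current * load resistance."""
    return current_a * load_ohms


# Hypothetical: 10 fF coupling, 1.0 V attacker swing in 50 ps, 1 kOhm load.
i = glitch_current(10e-15, 1.0, 50e-12)
v = glitch_voltage(i, 1000.0)
print(f"{i * 1e3:.3f} mA, {v:.3f} V")
```

A faster attacker edge or a larger coupling capacitance raises the current, and a weaker victim driver (higher impedance) turns that current into a larger glitch.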

Noise Analysis Flow


Steps
‹ Propagates the noise glitch to see if the noise glitch reaches a storage element (latch or flip-flop)
‹ This reduces the number of potential false alarms as it utilizes the inherent glitch filtering properties of CMOS logic
‹ Measures the height of the glitch after it has propagated to the receiver output
‹ Performs sensitivity analysis, which determines if a glitch will amplify or not
‹ If the glitch does not amplify, it cannot cause a functional failure

[Figure: flow: propagate noise glitch, check if noise reached storage elements, measure glitch height at receiver output, perform sensitivity analysis]

Example Text Glitch Report
Generated with generate_report -sort_by rcvr_peak -slack

*******************************************************************************************
CeltIC Noise Report
Generated: Fri Aug 15 10:22:01 PDT 2007
***************************************************************************
Report Options:
---------------------------------------------------------------------------
Slack : yes
Sort by : noise (receiver input peak)
Threshold : 10.0 (mV)
Level : VH and VL
---------------------------------------------------------------------------
Peak(mV) Level TotalArea %AreaTillPeak Width(ps) VictimNet
1687.614 VL 1067.88 17.17 1265.55 U2DFF:CP {CLK2}

Receiver output peak:
Value ReceiverNet
1559.185 U2DFF/CP (DFQD1)

Constituents:
Source Peak(mV) Offset(ps) Slew(ps) Edge Net TraceBackNet(NoiseType)
Cpl: 1687.614 4950.000 50.000 R CLK1 -

Baselevel: 0.000 - - - - -
---------------------------------------------------------------------------

SI Repair Techniques for Crosstalk Glitch and Delay


Minimizes disturbance to existing place and route by
‹ Increasing the spacing between the affected nets
‹ Upsizing the victim driver so the effect of the aggressor is minimized
‹ Adding a shielding wire between the affected nets; shield is usually VSS

Signal Integrity Analysis
‹ Crosstalk (cross coupling): noise on delay
‹ Glitch on functionality
‹ Noise library
‹ ECO repair files
‹ Hierarchical SI analysis: block noise model (XILM)

Noise Library
‹ Signal integrity analysis requires each cell in the circuit to be modeled (characterized) using a hierarchical model, such as
‰ UDN (user-defined noise)
‰ ECHO (hierarchical block)
‰ XILM (interconnect logic model) or cdB (block)
‹ This pre-characterized information is stored in a noise library using the make_cdb utility.
‹ The characterization determines the sensitivity of the cell library to noise glitches on the inputs.
‹ Factors such as resistance, capacitance, noise tolerance, and output holding strength are to be taken into account during characterization.


What’s in a Noise Library?
‹ Characterized gate-level data
‰ UDN portion
 Input characterization data
 Output characterization data
 Slew characterization data
‹ SPICE transistor description
‰ Copy of transistor-level cell
‰ Cell renamed to _CADMOS_<cellname>

[Figure: Internal Slew Characterization. Labels: cell input slew (-slews); characterized slew on input of last logic stage output (-rise, -fall); internal node (-rise_prop_to, -fall_prop_to); capVal; output pin connected to internal node (-connNL)]

Generating a Noise Library


‹ The make_cdb utility performs characterization and automatically extracts I/O port direction as follows:
‰ As specified in the Synopsys .lib (preferred approach)
‰ As specified by the set_port command
 Ports connected to gates are marked as inputs
 Ports connected to transistor channels are marked as outputs
 Channel connected inputs or bidirects must be marked manually
‹ Records the Vds-Ids curves for each Vgs connected to each cell output
‹ Calculates the noise threshold of each cell input and the I/O pin capacitance

[Figure: make_cdb takes the cell library SPICE netlist(s), the CMOS device model, the Synopsys library (.lib), and a command file (TCL) as inputs and produces the noise library (.cdb)]

What’s in a cdB File?
A block-level cdB contains a cell-level view and a transistor-level view.
‹ The cell-level view contains pin capacitance, calibrated input noise threshold, and nonlinear output drive strength.
‹ The transistor-level view contains an ECHO built with the cells and R/C network connected to each I/O pin. This is different from the .cdB created by make_cdb, which contains a UDN built with transistors, not cells.

[Figure: cdB Structure: SPICE transistor model(s), followed by characterized data and the subckt transistor description for each cell, cell1 through cell N]
[Figure: cell-level view and transistor-level view, each showing a UDN and a noise check]

Signal Integrity Analysis


‹ Crosstalk (cross coupling): noise on delay
‹ Glitch on functionality
‹ Noise Library
‹ ECO repair files
‹ Hierarchical SI analysis: block noise model (XILM)

What Is Engineering Change Order Mode?
ECO mode is used in an SI analysis tool to
‹ Analyze both glitch and delay failures
‹ Fix propagated noise failures
‹ Output a tool-specific ECO command file

ECO Repair Files


‹ An ECO repair file is a tool-specific output command file generated when the tool operates in the ECO mode.
‹ The ECO mode uses the Liberty file (.lib) or user-defined cell equivalents.
‹ The tool can fix glitch and incremental delay failures with the ECO option.
‹ The tool automatically outputs the ECO repair file in a text file and an HTML format, showing the original noise and the new noise after swapping in a new cell.
Victim driver cells can be upsized. (Swapping victim driver cells will not fix the failure if the coupling is caused by a long wire.)

Noise-on-Delay Fixing
Options for ECO analysis on noise failures
‹ Buffer: Buffer insertion
‹ Resize: Driver resizing
‹ Spacing: Wire spacing
‹ Shieldnet: Shield net insertion
‹ Nofix: Do not do ECO analysis for noise failures
‹ Default option is spacing.

[Figure: loop of Place and Route, Extraction, Noise Analysis, and Static Timing Analysis, with the ECO Repair File (Glitch + Delay) fed back to Place and Route]

ECO Repair File Example


The ECO HTML file generated by CeltIC contains a detailed table with information on noise and delay ECOs. It is generated automatically when the ECO option is enabled.

[Figure: example ECO HTML file]

Signal Integrity Analysis
‹ Crosstalk (cross coupling): noise on delay
‹ Glitch on functionality and delay
‹ Noise library
‹ ECO repair files
‹ Hierarchical SI analysis: block noise model (XILM)

Hierarchical Methodology
‹ Design sizes and complexity are increasing
‹ Longer turnaround time and capacity limitations when running designs in a flat hierarchy
‹ To handle complexity, block-based hierarchical design methodologies are used

[Figure: a design partitioned into blocks, one of which is treated as a black box]


What Is an XILM?
‹ XILM is an interconnect logic model that contains all the nets from the boundary to the first latch or flip-flop and the cross-coupling capacitance for noise analysis.
‹ It is created using CeltIC NDC.
‹ The XILM model is used for both hierarchical noise and timing analysis.

[Figure: block with a primary input and a primary output, each connected to a flip-flop (d, q, clk) inside the block; attackers couple onto the victim nets, raising the question of a propagated noise failure]

Advantages of Hierarchical Analysis


‹ Reduction in turnaround time
‹ Less likely to have a capacity limitation
‹ Gives feedback earlier in the design cycle
‹ Supports a continuous convergence methodology

Learning Activity
In this activity, you will
‹ Be given a handout of an SI report which contains violations
‹ Analyze the report and trace the cause of the problem
‹ Decide which strategy is best suited to fix the violation
‹ Present your findings to the class

20 minutes for activity
10 minutes for debriefing

STA Summary
‹ The goal of timing analysis is to verify that a design meets timing requirements under a specified set of timing constraints.
‹ STA ignores functionality of circuit and analyzes the timing for all possible paths.
‹ Timing constraints are used by designer to guide the timing optimization tools in order to meet the timing goals.
‹ The timing reports provide a summary of the final timing information, which reports timing failures (setup and hold) for all paths starting with the worst failing path.
‹ Timing exceptions are set on paths that are not designed to be exercised during normal circuit operation.
‹ Timing analysis modes, such as OCV mode, direct the tool so that it takes into account the small difference in operating parameters across the chip while analyzing the design.
‹ Design rule constraints are the requirements established by the library vendor for the proper functioning of the fabricated circuit.

Signal Integrity Summary
‹ SI issues lead to failure in performance of a circuit due to errors induced in the normal operation of a design through crosstalk and glitches.
‹ A noise library characterizes the cells in a design to determine their sensitivity to noise glitches on their inputs.
‹ An ECO repair file is a command file that provides information used to repair nets that suffer from noise and that should be fixed in the database available after place and route.
‹ XILM is an interconnect logic model that defines the noise propagation up to the first latch/flip-flop from the boundary pins.

Testing Your Understanding


True or false

In static timing analysis, the designer creates timing test vectors that are simulated using a gate-level netlist to verify timing.
Operating conditions are always set from a single set of libraries and never from multiple libraries.
The timing constraints should be complete before running a timing debug.
Design rule constraints are requirements depending on technology library.
Crosstalk is caused by transition on an adjoining signal having a capacitive or inductive coupling between neighboring wires leading to an unintended logic transition.
Coupling noise can cause functional failures.

Terminology
Term                    Description

Constraint-related
Clock Skew              The maximum difference in arrival times of the clock signal at any two latches/FFs fed by the clock network
Clock Jitter            The maximum difference in phase of the clock between any two periods
Clock Latency           The delay along the clock tree (source latency + clock network latency)
Slew Rate               The maximum rate of change of a signal at any point in a circuit
Path Delay              The time taken for a signal to propagate from one point to another

Timing report-related
Beginpoint              Flip-flop or port at which the signal is launched with respect to the clock
Endpoint                Flip-flop or port at which the launched signal is captured with respect to the clock
Other End Arrival Time  The capture clock path delay from the clock source to the capture flip-flop
Slack                   Slack, or timing margin, is the difference between the "required arrival time" and the "actual arrival time"
Phase Shift             The delay adjustment used to calculate the appropriate required time at the path endpoint
Instance                A use of a master cell definition in a design; each use has a unique name
Arc                     Any signal path along a net from one start point to one end point

Operating mode-related
Launch Clock            Clock signal at the starting flip-flop, which launches the data
Capture Clock           Clock signal at the ending flip-flop, which captures the data
Early Signal            Earliest time at which the value on a net/point can change from its previous cycle stable value
Late Signal             Latest time at which the value on a net/point can settle to its final stable value for the current cycle

Design Optimization

Module 11

June 16, 2008

Optimization Process
‹ Optimization is the successive refinement of a product or design.
‹ Usually, it takes several iterations of optimization until a product or
design is complete.
‹ The types of optimizations performed on the product or design depend
on the stage. For example, to make lumber, trees are chopped down, cut
into long strips, sized, and sanded.
‹ In digital design, we also see various optimizations as the design
progresses through the physical implementation flow.

[Figure: trees become lumber through “optimization”]


Module Objective
After completing this module, you will be able to

‹ Explain the value of optimization at the various stages of the design
flow to meet timing

Topics in This Module


‹ Optimization for timing, SI, power, and area

‹ Inserting repeaters to optimize for timing

‹ Pre-CTS, post-CTS, and post-routing optimization
What Is Optimization?
‹ Unless you are an absolute genius, your design will not meet the
timing requirements on the first run.
‹ Optimization is the process of iterating through a design such that it
meets timing, area, and power specifications.
‹ In general, optimization can be broken down into the following areas:
‰ Timing
‰ Signal integrity
‰ Power
‰ Area

What Is Timing Closure?


‹ A placed and routed design achieves timing closure when it meets its
timing specifications while also satisfying electrical, design rule, and
signal integrity constraints.
‹ Timing closure is often one of the greatest causes of ASIC tapeout
schedule slips.
‹ The problem lies in the discrepancy between front-end and back-end
designers’ concept of timing.
‹ Front-end designers use wireload models to predict timing, and
back-end designers use a fully placed design, including its resistance and
capacitance (RC) values.
‹ Who is more accurate?
What Are Wireload Models?
‹ One of the most vexing problems traditional synthesis tools face is
how to predict interconnect parasitic values.
‹ One approach is to develop a lookup table that ties the RC values of a
net to its fanout.
‹ Tools calculate the appropriate wire load block for each net.
‹ These values are derived from statistical analysis of ASIC foundry
data for a given process node.

Sample wireload model file

    wire_load("sample_wl10") {
      resistance : 8.5e-8;
      capacitance : 1.5e-4;
      area : 0.7;
      slope : 66.667;
      fanout_length (1,66.667);
    }
    wire_load("sample_wl20") {
      resistance : 8.5e-8;
      capacitance : 1.5e-4;
      area : 0.7;
      slope : 133.334;
      fanout_length (1,133.334);
    }
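As a rough illustration of how a wireload block can be turned into per-net parasitics, the sketch below uses the sample_wl10 values from the sample file. It assumes the common Liberty-style reading: the estimated length is looked up from fanout_length, extended by slope beyond the last table entry, and then multiplied by the per-unit resistance and capacitance. The function name, the dictionary layout, and the units are illustrative assumptions, not the behavior of any particular tool.

```python
# Hypothetical helper: names and data layout are illustrative assumptions.
SAMPLE_WL10 = {
    "resistance": 8.5e-8,    # per unit length (units are not given on the slide)
    "capacitance": 1.5e-4,   # per unit length
    "slope": 66.667,         # extra length per additional fanout
    "fanout_length": [(1, 66.667)],  # (fanout, estimated length) table
}


def estimate_net_parasitics(model, fanout):
    """Return (length, resistance, capacitance) estimated for one net."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    # Use the largest table entry whose fanout does not exceed the request;
    # fall back to the first entry if the request is below the table.
    table = sorted(model["fanout_length"])
    base_fanout, base_length = table[0]
    for entry_fanout, entry_length in table:
        if entry_fanout <= fanout:
            base_fanout, base_length = entry_fanout, entry_length
    # Extend past the table entry using the slope.
    length = base_length + model["slope"] * max(0, fanout - base_fanout)
    return length, model["resistance"] * length, model["capacitance"] * length


for fanout in (1, 2, 4):
    length, res, cap = estimate_net_parasitics(SAMPLE_WL10, fanout)
    print(f"fanout={fanout}: length={length:.3f} R={res:.3e} C={cap:.3e}")
```

Because the estimate depends only on fanout, two nets with the same fanout get the same parasitics regardless of where their cells end up being placed.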

Drawbacks of Wire Load Models


‹ Since these wireload model selections are based on discrete values of
the wire area, they are generally crude and inaccurate.
‹ In process nodes of around 1 micron, the dominant component of net
delay is the I/O pin delay of standard cells. Therefore, the wireload
delay plays an insignificant role.
‹ As device dimensions shrink, global routes get longer and smaller wire
widths mean more resistance.
‹ The wire load can no longer be relied on to close timing.
‹ A better replacement for wireload models is physical synthesis, where
synthesis and placement are combined to more accurately calculate
the wire delay timing based on physical data.
Optimizing for Timing
‹ There are many ways to reduce delay; we will cover some
fundamental techniques here.
‰ Upsizing gates increases their drive strength and, thus, reduces the time it
takes for that gate to transition based on a given load.
 Upsizing a gate increases its own input capacitance, giving its driver
a higher capacitive load.
 A technique called logical effort was invented to optimize the size of
gates along a path for minimal delay.
 The tool will usually perform calculations for you.
‰ Reduce wire capacitance
 Usually involves shortening the wire lengths of critical paths by moving
cells or inserting buffers
 Switching to a higher metal layer can also reduce capacitance

Optimizing for Timing (continued)


‹ Often, your design will contain an adder or multiplier unit in the logic
path.
‰ For example, for a large number of bits, a carry lookahead adder performs
much better than a ripple carry adder.
‰ Physical synthesis tools optimize datapath elements to meet timing, while
balancing area and power.
‹ If all else fails and the datapath contains too much combinational delay, it is
often viable to simply break the path and insert a register in between,
creating an extra pipeline stage.
‰ An extra pipeline stage means more latency and more area.
‰ Such a change usually requires changing the RTL itself.
Signal Integrity
‹ As technology continues to scale, the aspect ratio of the
horizontal-to-vertical dimensions of the wires is reduced.
‹ This results in increased ratios of coupling capacitance to substrate
capacitance.
‹ The impact on the victim line (Y) is a strong function of the rise time of the
interfering signal on line X and the strength of the gate driving line Y.
‹ A voltage step on line X causes a transient step on Y that decays with
a time constant: $\tau_{XY} = R_Y (C_{XY} + C_Y)$

[Figure: line X, driven by VX, coupled to line Y through the coupling
capacitance CXY; line Y has driver resistance RY and substrate capacitance CY]

Signal Integrity (continued)


‹ Aside from delay issues, crosstalk can also cause functional failures.
‹ When a voltage is applied on line X, there is also a change of voltage
on line Y equal to $\Delta V_Y = \Delta V_X \frac{C_{XY}}{C_Y + C_{XY}}$
‹ If this change in voltage is large enough, it can cause an erroneous
logic value at the load of line Y.

[Figure: line X, driven by VX, coupled to line Y through the coupling
capacitance CXY; line Y has driver resistance RY and substrate capacitance CY]
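To get a feel for the two crosstalk expressions (the decay time constant and the voltage change on line Y), the following sketch evaluates them numerically. The function names and the component values are made up for illustration and are not taken from the course material.

```python
def coupling_time_constant(r_y, c_xy, c_y):
    """tau_XY = R_Y * (C_XY + C_Y): decay time of the transient on line Y."""
    return r_y * (c_xy + c_y)


def victim_glitch(delta_vx, c_xy, c_y):
    """delta_V_Y = delta_V_X * C_XY / (C_Y + C_XY): voltage change on line Y."""
    return delta_vx * c_xy / (c_y + c_xy)


# Illustrative values only (assumptions, not from the slides).
R_Y = 1.0e3        # ohms, resistance of the driver of line Y
C_Y = 60e-15       # farads, substrate capacitance of line Y
DELTA_VX = 1.2     # volts, step on line X

for c_xy in (20e-15, 10e-15):  # e.g. before and after spacing the wires apart
    tau = coupling_time_constant(R_Y, c_xy, C_Y)
    glitch = victim_glitch(DELTA_VX, c_xy, C_Y)
    print(f"C_XY={c_xy:.1e} F: tau={tau:.2e} s, glitch={glitch:.3f} V")
```

Lowering the coupling capacitance shrinks the glitch, and lowering R_Y shortens how long the disturbance lasts.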


Optimizing for Signal Integrity
‹ There are a few ways to reduce the effects of crosstalk. Recall that the
equation for the time constant is $\tau_{XY} = R_Y (C_{XY} + C_Y)$
‰ Reduce RY, which means upsizing the driver of line Y.
‰ Insert a repeater in the line.
‰ Reduce the capacitance, which means separating the wires or changing
metal layers.

Power
‹ Power is a major issue in most chips, especially those that are used in
mobile devices where battery life is limited.
‹ Recall that power is given by the equation $P = f \cdot C \cdot V_{dd}^2$, where f is the
operating frequency, C is the total capacitance of the circuit, and Vdd
is the supply voltage.
‹ Most of the time, the supply voltage and the operating frequency of the
circuit are already determined long before the physical implementation
stage.
‹ How can we reduce the power?
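A small numeric sketch of the power equation follows. With frequency and supply voltage fixed, capacitance is the remaining term a physical designer can influence. The function name and the numbers are illustrative assumptions.

```python
def dynamic_power(frequency_hz, capacitance_f, vdd_v):
    """P = f * C * Vdd^2, as given on the slide."""
    return frequency_hz * capacitance_f * vdd_v ** 2


# Illustrative values only (assumptions, not from the slides).
F_HZ = 500e6
C_TOTAL = 1e-9
VDD = 1.2

baseline = dynamic_power(F_HZ, C_TOTAL, VDD)
less_cap = dynamic_power(F_HZ, 0.9 * C_TOTAL, VDD)   # 10% less capacitance
lower_vdd = dynamic_power(F_HZ, C_TOTAL, 0.9 * VDD)  # 10% lower supply

print(f"baseline:        {baseline:.3f} W")
print(f"10% less C:      {less_cap / baseline:.2%} of baseline")
print(f"10% lower Vdd:   {lower_vdd / baseline:.2%} of baseline")
```

The capacitance term scales power linearly, while the supply voltage scales it quadratically, which is why voltage is usually fixed early at the architecture level.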
Optimizing for Power
‹ To reduce power
‰ Reduce capacitance.
‰ Decrease size of standard cells. Power is also a linear function of the
driving current, and smaller gates output less current.

[Figure: an ANDX10 cell replaced by the smaller ANDX6 cell]

‹ Leakage current is a dominant factor in today’s (90 nm and below)
chips and can account for as much as 30% of the power consumption.
‹ To reduce leakage current, gates with a higher threshold voltage must be
used.

Power and Timing Tradeoff


‹ As we were discussing power optimization, you may have noticed that
some of the techniques are in direct conflict with those in timing
optimization.
For example, downsizing gates leads to less power, but also more
delay.
‹ This is an age-old problem in the development of ICs and there is not
one correct solution.
‹ Every chip has its own priorities regarding power or delay.
For example, a mobile phone processor may not need to run at 2 GHz,
but it must consume as little power as possible.
Visualizing the Tradeoff
‹ In the graphic below, the purple box represents the constraints for
energy (power) and delay (timing) put on your design.
‹ The blue curve represents the highest possible efficiency your design
can have.
‹ Your goal should be to move your design onto the blue curve.
‹ Again, the exact desired location on the blue curve depends on the
application of your chip.
‹ The derivation of this curve is highly theoretical and is beyond the
scope of this class.

Optimizing for Area


‹ The purpose of shrinking device dimensions from 90 nm to 65 nm to
45 nm is to fit more transistors on a die, giving the chip more
functionality for the same area.
‹ Area is therefore a very important specification, especially for chips
used for medical purposes such as hearing aids and pacemakers.
‹ The components that usually take up the most area on a chip are
RAMs and register files.
‹ Shrinking the size of RAMs is an architectural issue and must be
settled with the RTL designer.

[Figure: SRAM]


Optimizing for Area (continued)
‹ Downsizing gates also has a small effect, but comes at a cost of
reduced speed and signal integrity.
‹ Utilization is defined as the percentage of the floorplan area that
has been taken up.
‹ If the utilization is too high, the design may become congested,
making it difficult to route. Longer routes also make it harder to meet
timing.

[Figure: congestion]

Inserting Repeaters
‹ Recall that you may upsize gates to decrease the delay through a
path.
‹ If the fanout of a gate is too high, then it is a viable option to insert
buffers to reduce fanout.
‹ But why would inserting an extra stage in the path decrease the
overall delay?
‹ Take for instance the following circuit; the input capacitance of the
buffer is Cg, and the value of its load is 16Cg

[Figure: a buffer with input capacitance Cg driving a load of 16Cg]

Inserting Repeaters (continued)


‹ Recall that the electrical fanout of a gate is defined as its loading
capacitance divided by its input capacitance.
‹ For the previous circuit, the electrical fanout of the buffer is
$\frac{16 C_g}{C_g} = 16$
‹ Since it is the only buffer in the circuit, the total fanout of the circuit is
also 16.
‹ Let’s insert another buffer and size it to be twice as large as the first
buffer so that its input capacitance is 2Cg.

[Figure: a buffer with input capacitance Cg driving a buffer with input
capacitance 2Cg, which drives a load of 16Cg]

‹ The total electrical fanout of the circuit is now
$\frac{2 C_g}{C_g} + \frac{16 C_g}{2 C_g} = 10$
‹ Recall that since the total delay of the circuit is roughly proportional to
the total electrical fanout of the circuit, we have effectively reduced the
delay of the path.


Inserting Repeaters (continued)
‹ What if we replaced that buffer with a larger one?

[Figure: a buffer with input capacitance Cg driving a buffer with input
capacitance 4Cg, which drives a load of 16Cg]

‹ A quick calculation will show that the total electrical fanout is now 8 instead of
10.
‹ How do we pick the optimal electrical fanout?
‹ This problem can be solved by
‰ Calculating the total delay for N stages of buffers and a total electrical fanout of
F (loading capacitance divided by input capacitance of the first buffer)
‰ Taking the derivative with respect to N
‰ Finding the zero of the derivative (call it N0). The optimal electrical fanout is then
equal to $\sqrt[N_0]{F}$
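The fanout arithmetic from the repeater slides can be checked with a short sketch. It assumes the simplified model used on the slides: delay is proportional to the sum of the per-stage electrical fanouts, with no intrinsic buffer delay. The function names are illustrative, and the stage-count search is a brute-force stand-in for taking the derivative.

```python
def total_electrical_fanout(caps):
    """Sum of per-stage fanouts for a buffer chain.

    caps lists the input capacitance of each buffer in order, followed by
    the final load, all in units of Cg.
    """
    return sum(c_next / c_in for c_in, c_next in zip(caps, caps[1:]))


def best_stage_count(total_fanout, max_stages=10):
    """Stage count N that minimizes N * F**(1/N), assuming equal fanout per
    stage and ignoring intrinsic delay (a simplifying assumption)."""
    return min(range(1, max_stages + 1),
               key=lambda n: n * total_fanout ** (1.0 / n))


# Chains from the slides, in units of Cg.
for chain in ([1, 16], [1, 2, 16], [1, 4, 16]):
    print(chain, "->", total_electrical_fanout(chain))

n0 = best_stage_count(16)
print("stages:", n0, "fanout per stage:", round(16 ** (1.0 / n0), 2))
```

For the three chains the sums come out to 16, 10, and 8. Under this idealized model the best per-stage fanout is lower than the value of roughly 4 that a fuller analysis gives, because intrinsic delay and buffer loading are ignored here.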

Inserting Repeaters (continued)


‹ Although it would be a nice exercise, we will not perform the detailed
calculations here as there are many factors we did not take into
account, such as the intrinsic delay and loading of each buffer.
‹ A numerical analysis of the problem reveals that the optimal electrical
fanout is roughly equal to 4.
‹ This means to achieve optimal delay, every stage in the logic path
should have equal electrical fanout and equal delay.
‹ The method of logical effort, which will not be explained in this class,
explains how to size logic gates of any type.
Restructuring Logic
‹ Logic gates with a high number of inputs are not desirable.
‹ Usually, it is much more effective to restructure a wide gate into
smaller gates.
‹ This allows more flexibility in terms of optimization for each of the
individual gates.

Optimization During the Design Flow
‹ Now that you have learned all of these optimization techniques, where do you
use them?
‹ Below is a typical back-end flow that you may be familiar with from past
lectures.

Netlist
Floorplan
Power Plan
Placement
Clock Tree Synthesis
Routing

Optimization During the Design Flow (continued)


‹ Typically, tools have three stages of optimization within the flow:

Netlist
Floorplan
Power planning
Placement
Pre-CTS Optimization
Clock tree synthesis
Post-CTS Optimization
Routing
Post-Routing Optimization


Pre-CTS Optimization
‹ The first optimization that takes place is right after the placement
stage.
‹ It is here that we have the most freedom.
‹ The techniques that are commonly used here include
‰ Inserting buffers for high fanout nets
‰ Upsizing and downsizing gates
‰ Restructuring logic to meet timing.
‹ Since the metal routes are not in place yet, we cannot perform any
optimization by changing metal layers.

Post-CTS Optimization
‹ When the clock network is put in place, a new element comes into
play called clock skew.
‹ Skew arises because the clock needs to propagate from the center of
the clock tree toward the periphery, so it reaches different registers
at different times.

[Figure: a clock source driving Reg1 and Reg2; the difference between the
clock arrival times at Reg1 and Reg2 is the skew]
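As a minimal sketch of the idea, skew can be computed from the clock arrival time at each register, following the terminology definition (the maximum difference in arrival times between any two registers on the clock network). The function name and the arrival times are illustrative assumptions.

```python
def clock_skew(arrival_times):
    """Maximum difference in clock arrival time between any two registers.

    arrival_times maps a register name to its clock arrival time.
    """
    times = list(arrival_times.values())
    return max(times) - min(times)


# Illustrative arrival times in nanoseconds (assumptions, not from the slides).
arrivals_ns = {"Reg1": 1.20, "Reg2": 1.35}
print(f"skew = {clock_skew(arrivals_ns):.2f} ns")
```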


Post-CTS Optimization (continued)
‹ When harmful skew is added to the timing path, the path can
violate timing depending on the amount of the skew and the
nature of the path.
‹ To mitigate the effects of skew, you can
‰ Insert buffers in the clock tree to lessen the skew
‰ Re-time and use any of the previously mentioned techniques
to fix timing
‹ Once again, no signal routes have been placed, although the clock
signals are often routed during clock tree synthesis.

Post-Routing Optimization
‹ Now that the design is fully placed, routed, powered, and
clocked, it is time to undergo the final phase.
‹ This is the stage to perform fixes on hold violations.
‹ Note, however, that at this stage, there is usually not enough room
to do much modification.
‹ Moving standard cells and macros may require intensive re-routing.
‹ Therefore, the following techniques are usually used:
‰ Changing metal layers
‰ Moving metal routes
‰ Resizing gates


Summary
‹ Achieving timing closure is a difficult task and requires careful
negotiation between front-end and back-end designers.
‹ Wireload models were originally used to determine timing in a design,
but are quickly becoming obsolete with shrinking device dimensions.
‹ Timing can be improved by upsizing gates, shortening wire lengths,
etc.
‹ Signal integrity issues are usually caused by coupling capacitance
between wires that are close to each other.
They can be solved by moving wires and upsizing drivers.
‹ Power can be reduced by downsizing gates and using high-threshold
cells.

Summary (continued)
‹ The power and timing tradeoff is always a critical consideration
depending on the application of the chip.
‹ RAMs and register files should be used sparingly to optimize for area.
A high utilization makes the design difficult to route and
may cause congestion.
‹ Buffers can be inserted to reduce delay in a pattern such that the
electrical fanout for each stage is approximately 4.
‹ Physical implementation usually consists of three stages of
optimization: pre-CTS, post-CTS, and post-routing.
‹ Each successive stage will have less freedom to optimize as metal
layers are being added.


Learning Activity
In this activity, the class will

‹ Study several scenarios (given in the next few slides) within the
optimization flow diagram
‹ Identify which problems can potentially occur as a result of the
scenario shown
‹ Brainstorm the optimization steps that are necessary to mitigate
these problems.

20 minutes for activity
10 minutes for debriefing

Class Activity: Optimization Case 1


‹ Here is a sample netlist in schematic form.
‹ You run physical synthesis on the netlist and find that the gates
highlighted in red are violating timing.
‹ What types of optimization would you perform on this netlist?
Class Activity: Optimization Case 2
‹ Here is a design that has been through CTS.
‹ What are some possible problems with this design?
‹ How would you fix them?

[Figure: a clock PLL and clock buffer driving a “long” clock net to
several registers and an SRAM]

Class Activity: Optimization Case 3


‹ Here is a design that has been through detailed routing.
‹ What problems can you see in the routing?
‹ How would you fix them?

[Figure: routed design with “long” signal nets]


Testing Your Understanding
True or false

1. As power increases on a chip, the delay decreases.

2. Clock skew is generally not desired and should be minimized through
optimization.

3. Buffers can be upsized arbitrarily to optimize delay.

4. Crosstalk optimization can only be performed after routing.

5. Post-route optimization has more options to modify the placement of
cells versus pre-CTS optimization.
Engineering Change Orders, Design
Verification, and Tapeout
Module 12

June 16, 2008

Design Changes
From specification to final implementation, a chip can undergo
changes at various stages.

Functional changes to the specification can require
‹ A restart of the entire implementation process
‹ Changes to the design during the implementation process

These functional changes impact
‹ Schedule
‹ Cost
‹ Features of the product

[Figure: functional changes feeding into the flow: Specification, RTL Coding,
Logical Implementation, Physical Implementation, Design Verification and
Tapeout, Final Implementation]


Module Objectives
After completing this module, you will be able to

‹ Articulate what an Engineering Change Order (ECO) is and what ECO
techniques are used at the different stages of the flow
‹ Articulate the various steps in verification as well as list the tape-out
requirements

Discussion Questions
‹ What are some plans and projects in everyday life that you are
involved with?
‹ Can you give examples of some “last-minute” changes that have
occurred in those plans and projects?
‹ Can you give examples of a “checklist” or some type of process or
documentation to ensure that the plan or project is complete?
Topics in This Module
‹ Engineering change orders (ECOs)

‹ Design verification

‹ Tapeout

What Is an ECO?
‹ Definition: The process of inserting a logic change directly into the
netlist after it has already been processed by an automatic tool
‹ Example: After our final netlist was created, our marketing person
informed the team of a must-have feature for the chip. To incorporate
the feature, we created and implemented an ECO.
ECOs
Implementing ECOs is one of the most challenging aspects of the design
process.

ECOs are necessary to implement important product features, but we must
do so with minimal impact to schedule and cost, while making sure
what we implement is correct.

In the next few slides, we will cover
‹ ECO types
‹ ECO implementation types
‹ Using back-end tools to implement ECOs

ECO Types
Generally, there are two types of ECOs.

‹ Functional ECOs
‰ Changes to the specification to add or remove functionality to the design
‰ The ECO’d netlist and the original RTL do not match functionally
‰ RTL must be modified to match the ECO’d netlist

‹ Timing ECOs
‰ Changes to the netlist, typically late in place/route, that do not change the
function, but try to improve on the timing of the design
‰ The ECO’d netlist and the RTL do match functionally
Functional ECOs
Steps
1. The specification changes to add or remove functionality.
2. The netlist is manually modified either in logical implementation or
physical implementation.
3. The RTL code is modified to match the functionality of the
ECO’d netlist and verified.
4. Once the ECO is verified, the rest of the tapeout process is
completed.

[Figure: the flow (Specification, RTL Coding, Logical Implementation,
Physical Implementation, Design Verification and Tapeout, Final
Implementation) annotated with step 1 at the functional changes to the
specification, step 2 at logical and physical implementation, step 3 at RTL
coding, and step 4 at design verification and tapeout]

Functional ECOs (continued)


‹ To save time, we skip the logical implementation or physical
implementation steps by reusing the information from previous
runs.
‹ The instance names (u1, u2, etc.) of the gates are preserved from
logic synthesis into placement and physical implementation.
‹ If we modified the RTL, then re-synthesized the design, all of the
instance names would be different and our placement information
from the previous runs would be useless.

// RTL Code
always @ (posedge clk)
q <= !((!c ? !b : a) || d);

[Figure: netlist after logic synthesis, with inputs a, b, c, and d feeding
gates u1 through u5, and the same netlist after placement, with u1 through
u5 keeping their instance names]


How Much Is Too Much?
For a functional ECO, how much logic can be implemented?

To simplify, there are three cases:
‹ Easy: Easy to perform an ECO, just a few gates
‹ Medium: ECO will be tough, not impossible
‹ Difficult to impossible: Better off re-synthesizing the design

[Figure: difficulty versus amount of logic. 1-100 gates: ECO. About 1% of
total gates: ECO or re-synthesize. Beyond that: re-synthesize design from RTL.]
Timing ECOs
Timing ECOs typically occur late in the physical implementation process.

Steps
1. Critical paths are analyzed with the place/route tool or a static
timing analysis tool.
2. Suggestions are made by the design engineer or the tool to
implement a timing ECO.
Example: Upsize “u4” to the next higher drive strength.
3. The ECO is done, and timing is re-analyzed.
4. This is iterated until all paths meet timing.
5. Once the design meets timing, the rest of the flow is completed.

[Figure: netlist after logic synthesis (inputs a, b, c, d and gates u1
through u5), netlist after placement, and netlist after the timing ECO, in
which u4 has been upsized]
ECO Implementation Types
There are three ECO implementation types:

‹ Spare gates

‹ Metal fix

‹ Focused ion beam (FIB)

What Are Spare Gates?


‹ Definition: The purposeful insertion of extra logic in the netlist, just in
case an ECO is required. Spare gates can be used to modify or add
logic to an existing design.
‹ Most design teams will have a strategy to include spare gates in their
design, just in case they are needed for ECOs.
‹ Spare gates can be implemented before we tape out a design, and
typically during the physical implementation process.
‹ There are several methods for including spare gates:
‰ Randomly sprinkle gates where available
‰ Instantiate a “pack” of ECO gates at various levels in the design hierarchy
‰ Use “ECO bulk” cells


Random Spare Gates
The simplest method is to randomly add spare gates in the areas that are
available.

‹ This can be a manual process or done using the place/route tool’s
utilities
‹ Simple process
‹ Can be difficult to find the proper gate with the right drive strength
‹ Flip-flops typically are not connected to the clock-tree or
scan-chain, and must be connected if used

[Figure: netlist before random spare cell insertion (u1 through u5) and
after random spare cell insertion (spare cells s1 through s8 placed among
u1 through u5)]

Instantiating an ECO Pack


A more proactive approach is to use an ECO pack.
‹ Instantiate in various quantities throughout the design hierarchy.
‹ The ECO pack can contain flops, muxes, and random gates.
‹ The flops can be connected to the clock-tree and scan-chain during
the normal implementation process.
‹ Design team has better control of the instantiation, contents, and
reuse of the spare gates.

// RTL Code
always @ (posedge clk)
q <= !((!c ? !b : a) || d);

// Instantiate ECO Packs
eco_pack eco_u0 (…);
eco_pack eco_u1 (…);
eco_pack eco_u2 (…);


ECO Bulk Cells
Some vendors have special ECO cells called “bulk” cells.

‹ ECO bulk cells are randomly placed throughout the design.
‹ Can be “programmed” by adding a specific functional cell on top of
the bulk cell.
‹ A single ECO bulk cell can become an inverter, nand, nor,
xor, etc., just by changing the functional connections on top of
the cell.
‹ Gives a lot of flexibility for later stage ECOs.

[Figure: netlist after random ECO bulk cell insertion (bulk cells e1 through
e8 placed among u1 through u5) and after ECO bulk cell modification, in
which e1 becomes e1*, a cell with input a and output z between VDD and VSS]

Implementing ECOs
If carefully planned, an ECO “pack” of cells would be located near every ECO
location.
Unfortunately, there is not enough room on most chips to do so.

Let’s say
‹ The output of u4 is currently connected to the input of u5.
‹ We need to invert the output of u4 and feed it to u5.
‹ We do not have ECO “bulk” cells.
‹ s1 is a spare inverter.
‹ s2 is a spare 2-1 mux.

How do you choose the right cell to implement the ECO?

[Figure: placement showing u1 through u5 and the spare cells s1 and s2]

What Is Metal Fix?


Definition: A metal fix is an ECO, made once the design has gone through
detail route, where only a few metal layers are modified in order to change
connectivity between existing logic in the design.

A metal fix occurs after the design has taped out and is in the midst of
production.

To make changes at this point, it is always best to consider a metal fix or
a “metal-only” fix because we can reuse our previous work as much as possible.

Consider the “masks” for a tapeout:
‹ Each mask represents a layer in our design.
‹ If we make modifications, we would like to minimize the number of layers,
to minimize the number of masks changed.

[Figure: stack of masks, Mask 1 through Mask 10 up to Mask N]


What Is Metal Fix? (continued)
For example, let’s say we needed to create a simple change

‹ Re-route the existing design to use an inverter instead of a buffer
(u2).
‹ Identify a spare cell close by to re-route (s8).
‹ Implement the metal-only changes with as few layers as possible.
‹ Change just two mask layers and continue with production.

[Figure: netlist before and after the metal-only fix, showing u1 through u5
and spare cells s1 through s8, with s8 used in place of u2 after the fix]

What Is a Focused Ion Beam?
Definition: Once the design has gone through the manufacturing process, a
focused ion beam (FIB) machine can be used to etch away or add connections
to a die in order to modify or add logic to an existing design.

An FIB is a specialized machine that can add or remove material with very
high accuracy.

‹ After a chip has been produced, wire connections can be removed
or added to change functionality.
‹ This is an expensive alternative and is done one die at a time.
‹ This is usually done for prototype parts, etc.

[Image: http://en.wikipedia.org/wiki/Image:Fib_tem_sample.jpg]

ECOs with Back-End Tools


In most back-end tools, we can choose to implement ECOs using

‹ ECO by netlist
‰ Create a modified Verilog® netlist and have the back-end tool
incorporate the new cells.
‹ ECO by change list
‰ Create a command file to add or remove cells and connections.

// Original Netlist
buf1x u2 (.a(n1), .z(n2));

// ECO Netlist
// buf1x u2 (.a(n1), .z(n2));
inv1x u2 (.a(n1), .z(n2));

// ECO change list
-buf1x u2
-n1 u2.a
-n2 u2.z
+inv1x u2
+n1 u2.a
+n2 u2.z
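To make the change-list idea concrete, the sketch below applies the slide's change list to a tiny in-memory netlist. The line format it parses (`-`/`+` followed by either `<cell> <instance>` or `<net> <instance>.<pin>`) is inferred from the slide's example only. Real back-end tools each have their own ECO command syntax, so the parser, function name, and data layout here are purely illustrative assumptions.

```python
def apply_change_list(cells, connections, change_list):
    """Apply '+'/'-' edits to copies of the netlist tables and return them.

    cells:        {instance_name: cell_type}
    connections:  {(instance_name, pin_name): net_name}
    """
    cells = dict(cells)
    connections = dict(connections)
    for raw_line in change_list.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        op, fields = line[0], line[1:].split()
        if op not in "+-" or len(fields) != 2:
            raise ValueError(f"unrecognized change: {raw_line!r}")
        first, second = fields
        if "." in second:
            # Connection edit: '<op><net> <instance>.<pin>'
            instance, pin = second.split(".", 1)
            if op == "+":
                connections[(instance, pin)] = first
            else:
                connections.pop((instance, pin), None)
        else:
            # Cell edit: '<op><cell_type> <instance>'
            if op == "+":
                cells[second] = first
            else:
                cells.pop(second, None)
    return cells, connections


CHANGE_LIST = """
// ECO change list
-buf1x u2
-n1 u2.a
-n2 u2.z
+inv1x u2
+n1 u2.a
+n2 u2.z
"""

original_cells = {"u2": "buf1x"}
original_conns = {("u2", "a"): "n1", ("u2", "z"): "n2"}
eco_cells, eco_conns = apply_change_list(original_cells, original_conns, CHANGE_LIST)
print(eco_cells)
print(eco_conns)
```

After the edits, instance u2 keeps its name and its net connections but changes cell type, which is what lets the back-end tool reuse the existing placement.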


Discussion Questions
Situation

‹ Assume you have a single-gate ECO to implement, changing a two-input
AND gate to a two-input OR gate.
‹ You have a spare 2-1 mux near the two-input AND gate.
‹ You have a spare two-input OR gate far from the two-input AND gate.

Questions
‹ How can a 2-1 mux behave like a two-input OR gate?
‹ How would you implement this ECO?
‹ What factors would you consider when choosing the mux or the OR
gate?

Learning Activity
In this activity, you will

‹ Study several scenarios of design at different stages of the
implementation process
‹ Decide which course of action is best suited for your scenario,
including the implementation and verification of your ECO
‹ Present your findings to the class

20 minutes for activity
10 minutes for debriefing
Topics in This Module
‹ Engineering change orders (ECOs)

‹ Design verification

‹ Tapeout

Design Verification
The design verification flow consists of
‹ Formal verification or logic equivalence checking (LEC)
‹ Physical verification
‹ GDSII export to layout
‹ Signoff LVS and DRC

[Figure: design verification flow. The original physical implementation and the
ECO feed physical verification and formal verification (LEC), followed by GDSII
to layout, the layout tool, GDSII for tapeout, signoff LVS and DRC, and mask prep
and manufacturing.]


Logic Equivalence Checking
ECOs involve the manual edit of a netlist.
‹ One task we must perform is to ensure the functionality is consistent between
the RTL and the ECO’d netlist (functional ECO) or between the original netlist
and the ECO’d netlist (timing ECO).
‹ LEC, which is part of formal verification, can be used to verify these cases.

[Figure: functional ECO and timing ECO flows. Functional ECO: the design engineer
produces ECO’d RTL from the RTL and an ECO’d netlist from the synthesized netlist;
formal verification (LEC) compares them. Timing ECO: the design engineer produces
an ECO’d netlist from the synthesized netlist; formal verification (LEC) compares
the two netlists.]
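The idea behind equivalence checking can be illustrated on a tiny combinational block by comparing two implementations over every input combination. This Python sketch is only an illustration of the concept: production LEC tools rely on formal techniques rather than exhaustive enumeration, and the three example functions are hypothetical.

```python
from itertools import product


def equivalent(f, g, num_inputs):
    """Return (True, None), or (False, first differing input vector)."""
    for bits in product((0, 1), repeat=num_inputs):
        if f(*bits) != g(*bits):
            return False, bits
    return True, None


def original(a, b, c):
    return (a & b) | (a & c)  # and-or structure


def timing_eco(a, b, c):
    return a & (b | c)  # restructured for timing, same function


def functional_eco(a, b, c):
    return a | (b & c)  # intentionally changed function


print(equivalent(original, timing_eco, 3))
print(equivalent(original, functional_eco, 3))
```

A timing ECO should compare as equivalent to the original netlist. A functional ECO is expected to differ from the original netlist, which is why it is compared against the ECO'd RTL instead.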

Verifying Functional ECOs


Edits so far
‹ Manual edits to the netlist during physical implementation (1)
‹ Corresponding edits to the RTL to reflect the functional changes (2)

To verify that the RTL code matches the netlist, we can run either
‹ Simulation of the RTL code vs. the netlist and compare results (3)
‹ Formal verification or equivalence checking of the RTL code vs. the
ECO’d netlist (4)

[Figure: design flow from specification through RTL coding, logical
implementation, physical implementation, and design verification and tapeout to
final implementation, annotated with the netlist edits (1), the RTL edits (2),
simulation (3), and formal verification (4).]


Verifying Timing ECOs
Edits so far
‹ Manual edits to the netlist during physical implementation (1)

Since the functionality has not been changed, we can functionally verify the
ECO’d netlist against the original netlist.
‹ Simulation of the original netlist vs. the ECO’d netlist and compare
results (2)
‹ Formal verification or equivalence checking of the original netlist vs.
the ECO’d netlist (3)

[Figure: design flow from specification through RTL coding, logical
implementation, physical implementation (original and ECO), and design
verification and tapeout to final implementation, annotated with the netlist
edits (1), simulation (2), and formal verification (3).]

Design Verification
The design verification flow consists of
‹ Formal verification or LEC
‹ Physical verification
‹ GDSII export to layout
‹ Signoff LVS and DRC


Physical Verification
Physical verification involves several checks from within the place/route
environment before the GDSII generation.

These checks include

‹ Connectivity
‹ Geometry
‹ Antenna
‹ Manufacturability

What Are Connectivity Checks?


Verify the connectivity of your design to detect and report various
conditions, including
‹ Opens
‹ Unconnected pins
‹ Dangling wires
‹ Loops
‹ Partial routing

In the SOC Encounter® environment, use the command

verifyConnectivity

Then, view the resulting violations in the Violation Browser.
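To make three of these conditions concrete, here is a small Python sketch that classifies one net as open, finds unconnected pins, and finds dangling wire ends. It assumes a simplified, hypothetical data model (named nodes joined by two-point wire segments) and says nothing about how `verifyConnectivity` is implemented.

```python
from collections import defaultdict


def check_net(pins, segments):
    """pins: set of pin names; segments: list of (node_a, node_b) wires."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    degree = defaultdict(int)
    for a, b in segments:
        parent[find(a)] = find(b)  # merge the two connected groups
        degree[a] += 1
        degree[b] += 1

    groups = {find(pin) for pin in pins}
    return {
        # More than one group of pins means the net is not fully routed.
        "open": len(groups) > 1,
        # Pins that no wire segment touches.
        "unconnected_pins": sorted(p for p in pins if degree.get(p, 0) == 0),
        # Wire ends that touch only one segment and are not pins.
        "dangling_ends": sorted(n for n, d in degree.items() if d == 1 and n not in pins),
    }


PINS = {"u1.z", "u2.a", "u3.a"}
SEGMENTS = [("u1.z", "j1"), ("j1", "u2.a"), ("j1", "stub")]
print(check_net(PINS, SEGMENTS))
```

In this example net, pin `u3.a` is never reached and the wire ending at `stub` goes nowhere, so the net is reported as open with one unconnected pin and one dangling end.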


What Are Geometry Checks?
Like a design rule checker (DRC), this checks the physical layout of the
design, including the following violations for nets.
‹ Width
‹ Length
‹ Spacing
‹ Area
‹ Overlap
‹ Enclosure
‹ Wire extension
‹ Via stacking

In the SOC Encounter environment, use the command

verifyGeometry
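As an illustration of the first and third items, the Python sketch below checks minimum width and minimum spacing for rectangles on a single layer. Coordinates are in nanometers, and the rule values (140 nm width, 200 nm spacing) are hypothetical; real rule sets come from the foundry and cover many more cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    net: str
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def width(self):
        # The narrower dimension of the wire shape.
        return min(self.x2 - self.x1, self.y2 - self.y1)


def spacing(a, b):
    """Edge-to-edge distance between two rectangles (0 if they touch or overlap)."""
    dx = max(b.x1 - a.x2, a.x1 - b.x2, 0)
    dy = max(b.y1 - a.y2, a.y1 - b.y2, 0)
    return (dx * dx + dy * dy) ** 0.5


def check_layer(rects, min_width, min_spacing):
    violations = [("width", r.net) for r in rects if r.width < min_width]
    for i, a in enumerate(rects):
        for b in rects[i + 1:]:
            # Shapes on the same net may touch; different nets must keep clear.
            if a.net != b.net and spacing(a, b) < min_spacing:
                violations.append(("spacing", a.net, b.net))
    return violations


RECTS = [
    Rect("n1", 0, 0, 10000, 200),
    Rect("n2", 0, 350, 10000, 450),    # only 100 nm wide, 150 nm from n1
    Rect("n3", 0, 1000, 10000, 1200),
]
print(check_layer(RECTS, min_width=140, min_spacing=200))
```

Here wire `n2` is flagged twice: once for being too narrow and once for sitting too close to `n1`.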

What Are Antenna Checks?


During fabrication, excess charge can build up in a long wire and break down the
thin gate oxide of the load connected to it.

[Figure: a driver and a load connected through M1 and M2 wiring. Panels: circuit
after fabrication; circuit during fabrication, where only the M1 segments exist
and charge on the wire breaks down the gate oxide at the load.]


What Are Antenna Checks? (continued)
In each technology process, there are rules, per metal layer, that dictate how
much metal area can be connected to a pin. If the area exceeds that limit, it
is an antenna violation. To fix the violation, one can
‹ Change metal layers so that the rule is met
‹ Add a diode so that there is a discharge path for the excess charge

[Figure: two fixes for the driver-to-load wire. Panels: circuit after metal layer
change (M1 and M2 segments); circuit after diode insertion (a diode added near the
load).]

What Are Antenna Checks? (continued)


‹ Check the charge that builds up on pins caused by routing that does
not have a discharge path to a gate.
‹ Check for pin routing that violates the maximum antenna charge for
the pins, and report violations on pins that have an antenna ratio
larger than the maximum allowed antenna ratio specified for the
routing layer.
‹ Check for unconnected metal segments that violate the maximum
area specified in the technology file.

In the SOC Encounter environment, use the command

verifyProcessAntenna

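The antenna ratio compares the metal area connected to a gate pin with the gate area of that pin. The Python sketch below is a simplified illustration, assuming one ratio per layer, a hypothetical maximum ratio of 400, and the simplification that a diode on the pin waives the check. Actual technology rules (for example cumulative ratios or diode-dependent limits) are more involved, and this is not the algorithm of `verifyProcessAntenna`.

```python
def antenna_violations(pins, max_ratio):
    """Return (pin, layer, ratio) for every pin whose ratio exceeds the layer limit.

    pins: {pin: {"gate_area": float, "metal": {layer: area}, "has_diode": bool}}
    max_ratio: {layer: maximum allowed metal-area / gate-area ratio}
    """
    violations = []
    for pin, info in pins.items():
        if info.get("has_diode"):
            continue  # simplification: the diode provides a discharge path
        for layer, metal_area in info["metal"].items():
            ratio = metal_area / info["gate_area"]
            if ratio > max_ratio[layer]:
                violations.append((pin, layer, ratio))
    return violations


MAX_RATIO = {"M1": 400, "M2": 400}  # hypothetical limits
PINS = {
    "u5.a": {"gate_area": 0.5, "metal": {"M1": 250.0}},
    "u6.a": {"gate_area": 0.5, "metal": {"M1": 100.0, "M2": 150.0}},
    "u7.a": {"gate_area": 0.5, "metal": {"M1": 300.0}, "has_diode": True},
}
print(antenna_violations(PINS, MAX_RATIO))
```

Only `u5.a` is reported: its M1 ratio is 500, `u6.a` stays below the limit on both layers, and `u7.a` is skipped because of its diode.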
What Are Manufacturability Checks?
Alpha particles can cause problems during manufacture:
‹ Via defects
‹ Cell defects
‹ Wire defects

[Figure: three defect types. Via defects: alpha particle blocks a via. Cell
defects: alpha particle blocks a gate pin. Wire defects: alpha particle causes a
short.]

What Are Manufacturability Checks? (continued)


To improve the manufacturability of the design, design teams should consider the
following:
‹ Use redundant vias.
‹ Use “yield-hardened” library cells.
‹ Use thicker wires with more spacing on critical nets.

[Figure: three techniques. Redundant vias: improve via reliability.
Yield-hardened cells: cells are slower, but safer. Thicker wires and more spacing:
wires and spacing take up more space, but are safer.]

Manufacturability
The manufacturability check calculates the probability of yield loss due to the
following effects:
‹ Cell failures
‹ Via failures
‹ Wire opens and shorts

These effects are caused by random particles that land on the die during
fabrication, causing defects.

In the SOC Encounter environment, use the command

reportYield

Note: This accounts for only a portion of actual yield loss. There is also parametric
yield loss due to RC variation and systematic yield loss due to lithography problems.
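A rough way to see why redundant vias help is to treat each element as failing independently with a small probability and multiply the survival probabilities. The Python sketch below does that for vias only. The independence assumption, the per-via failure probability of 1e-7, and the via count are all hypothetical, and this is not the model used by `reportYield`.

```python
import math


def random_defect_yield(counts, fail_prob):
    """Yield = product over element types of (1 - p) ** n, assuming independent failures."""
    log_yield = sum(n * math.log1p(-fail_prob[kind]) for kind, n in counts.items())
    return math.exp(log_yield)


VIA_COUNT = 1_000_000
VIA_FAIL_PROB = 1e-7  # hypothetical probability that a single via is defective

single = random_defect_yield({"via": VIA_COUNT}, {"via": VIA_FAIL_PROB})
# A doubled via connection fails only if both vias fail.
redundant = random_defect_yield({"via": VIA_COUNT}, {"via": VIA_FAIL_PROB ** 2})

print(f"single vias:    {single:.4f}")
print(f"redundant vias: {redundant:.8f}")
```

Under these assumptions, doubling every via moves the via-limited yield from roughly 90 percent to almost 100 percent, at the cost of extra routing resources.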

Design Verification
The design verification flow consists of
‹ Formal verification or LEC
‹ Physical verification
‹ GDSII export to layout
‹ Signoff LVS and DRC


GDSII Export to Layout
The GDSII exported from the place/route tool does not have all of the
necessary information for manufacturing and final LVS/DRC sign-off.

A layout tool, such as Virtuoso, is required to produce the final GDSII for
tapeout and final LVS/DRC sign-off.

Design Verification
The design verification flow consists of
‹ Formal verification or LEC
‹ Physical verification
‹ GDSII export to layout
‹ Signoff LVS and DRC


What Are LVS and DRC?
Definition: Layout Versus Schematic (LVS) and Design Rule Check (DRC) are
sign-off checks run to ensure the integrity, functionality, and
manufacturability of the chip.
‹ LVS is a comparison of the Verilog® netlist vs. GDSII to ensure the
functionality of the design.
‹ DRC is a detailed check of the routed design against the technology’s
set of rules.

Input and Output, Format


LVS
‹ Input
‰ Gate-level netlist in the Verilog language
‰ GDSII
‰ Rule deck
‰ SPICE libraries
‰ Commands in Tcl
‹ Output
‰ LVS reports

[Figure: Tcl commands, Verilog gates, GDSII, the rule deck, and SPICE libraries
feed LVS, which produces reports.]

Input and Output, Format (continued)
DRC
‹ Input
‰ GDSII
‰ Rule deck
‰ Commands in Tcl
‹ Output
‰ DRC reports

[Figure: Tcl commands, GDSII, and the rule deck feed DRC, which produces reports.]

Layout vs. Schematic: LVS


‹ Schematic refers to the gate-level netlist that once was a schematic
diagram and now comes from the synthesis tool.
‹ A flat layout is produced, and active devices and routing are
determined from that layout.
‰ Where poly overlaps diffusion, a transistor is assumed.
‰ All poly, diffusion, and metal layers are conductive and are assumed to be
able to route signals.
‹ The netlist extracted from the layout is compared to the original
gate-level netlist to verify that they are the same.
‹ This is a double-check on the place and route process.
‹ An LVS check should be done on the final layout of all ICs.
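The rule "where poly overlaps diffusion, a transistor is assumed" can be sketched as a rectangle-intersection test. The Python below is a minimal illustration under a hypothetical data model (axis-aligned rectangles with integer coordinates and made-up shape names); production LVS extraction handles arbitrary polygons, many layers, and device parameters.

```python
from typing import NamedTuple


class Shape(NamedTuple):
    name: str
    x1: int
    y1: int
    x2: int
    y2: int


def overlap(a, b):
    """Return the overlapping region as (x1, y1, x2, y2), or None if there is none."""
    x1, y1 = max(a.x1, b.x1), max(a.y1, b.y1)
    x2, y2 = min(a.x2, b.x2), min(a.y2, b.y2)
    if x1 < x2 and y1 < y2:
        return (x1, y1, x2, y2)
    return None


def recognize_transistors(poly, diffusion):
    """Assume one transistor wherever a poly shape crosses a diffusion shape."""
    devices = []
    for gate in poly:
        for diff in diffusion:
            channel = overlap(gate, diff)
            if channel:
                devices.append({"gate": gate.name, "diffusion": diff.name, "channel": channel})
    return devices


# One vertical poly stripe crossing a P diffusion and an N diffusion.
POLY = [Shape("IN1", 4, 0, 6, 20)]
DIFFUSION = [Shape("pdiff", 0, 12, 10, 18), Shape("ndiff", 0, 2, 10, 8)]

for device in recognize_transistors(POLY, DIFFUSION):
    print(device)
```

The single poly stripe yields two recognized devices, one per diffusion region, which is the pattern expected for an inverter-like layout.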


LVS Analysis Process

Steps:
• From Gates to Transistors
• Primary I/Os Identified
• Connectivity Traced
• Device Recognition

[Figure: three views of the same circuit. Design netlist: IN1 through Net1, Net2,
and Net3 to O1. Design layout: instances I1 to I4 placed between VDD and GND.
Design transistors: I1 and I3 sized 2/1, I2 and I4 sized 1/1, between IN1 and O1.]

Design Rule Checks


Goals
‹ Increase yield and reliability of ICs
‹ Allow automated design checks
‹ Simplify the design process for the designer

Typical design rules
‹ Minimum width and spacing on each layer
‹ Overlap of metal over via
‹ Metal coverage/slotting rules

Rules are created by the foundry for each manufacturing process.
‹ Most rules are generated through process characterization
‹ Some rules are derived from consistent failure modes of ICs


Design Rule Checks (continued)
Design rules for the layout assure that the IC will work when it is
manufactured.
‹ Example design rules
‰ Minimum width for all layers
‰ Minimum spacing for all layers
‰ Minimum spacing between layers: diffusion to well boundary
‰ Overlap between layers for vias, contacts, and transistors
‰ Percent coverage of a layer for metal layers

‹ Design rule files are rules as drawn.
‰ Photographic processes can be positive or negative.
‰ Widths as manufactured may be larger or smaller than as drawn.

‹ A DRC must be performed on the final layout.
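The "percent coverage" rule can be illustrated by measuring how much of a checking window is covered by metal. The Python sketch below assumes non-overlapping rectangles given as (x1, y1, x2, y2) tuples and hypothetical limits of 20 to 80 percent; actual coverage rules and window sizes come from the foundry rule deck.

```python
def coverage_percent(rects, window):
    """Percent of the window area covered by the (non-overlapping) rectangles."""
    wx1, wy1, wx2, wy2 = window
    covered = 0
    for x1, y1, x2, y2 in rects:
        # Clip each rectangle to the window before adding its area.
        w = min(x2, wx2) - max(x1, wx1)
        h = min(y2, wy2) - max(y1, wy1)
        if w > 0 and h > 0:
            covered += w * h
    return 100.0 * covered / ((wx2 - wx1) * (wy2 - wy1))


def check_density(rects, window, min_pct, max_pct):
    pct = coverage_percent(rects, window)
    if pct < min_pct:
        return "below minimum"
    if pct > max_pct:
        return "above maximum"
    return "ok"


WINDOW = (0, 0, 100, 100)
METAL = [(0, 0, 100, 10), (0, 50, 50, 60)]

print(coverage_percent(METAL, WINDOW))
print(check_density(METAL, WINDOW, min_pct=20, max_pct=80))
```

A window that falls below the minimum is typically a candidate for metal fill, while one above the maximum may call for slotting, in line with the coverage and slotting rules mentioned earlier.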

Example Design Rules


[Figure: example design rules on a Metal 1 / via / Metal 2 layout. Labels: Metal 1
width, Metal 1 to Metal 1 spacing, Metal 1 overlap of via, via width, Metal 2
width, Metal 2 overlap of via.]


Topics in This Module
‹ Engineering change orders (ECOs)

‹ Design verification

‹ Tapeout

Tapeout
‹ Tapeout checklist

‹ Mask preparation

‹ Chip manufacturing

Tapeout Checklist
Design teams need to have a checklist to ensure that all processes and
procedures were covered during the design, implementation, and
verification phases.

Important areas to check
‹ RTL code and netlist information
‹ All related timing, constraint, power, and signal integrity information
‹ All related design-for-test (DFT) information
‹ All related simulation information (RTL, gate)
‹ All related vendor-specific requirements
‹ All related package, board, software, and system information
‹ All related sign-off criteria

Tapeout Checklist: Example


There are many tasks to track and record for tapeout. They
include, among others,

Make sure starting RTL code and netlist are noted
‹ RTL Code Freeze and Version Noted
‹ Synthesis Netlist Version Noted

Make sure simulations pass
‹ Testbench Versions Noted
‹ Functional Verification Passed

Validate early timing and SDCs
‹ Pre-Layout Timing Analysis
‹ SDC Validity Checked

Ensure all DFT processes are complete
‹ Boundary Scan Checked
‹ Memory BIST Checked
‹ Scan Chain Insertion Checked

Validate early place/route power and timing
‹ Floorplan Version Noted
‹ Power Grid Analysis Checked
‹ Place/Route with Timing Closure Done

Ensure all sign-off criteria are met
‹ Signoff
‰ Physical Verification (LVS/DRC/Antenna)
‰ Formal Verification
‰ Static Timing Analysis
‰ IR Drop
‰ EM Check

Create ATPG vectors, and make sure gate-level simulations pass
‹ ATPG Done
‹ Gate-Level Verification Done
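A checklist like this is easy to track in a small script so that open items are visible before tapeout. The Python sketch below is one possible way to do that; the item subset and the done/not-done values are hypothetical, and `open_items` and `ready_for_tapeout` are illustrative helper names rather than part of any tool.

```python
# Item names follow the example checklist; the status values are made up.
CHECKLIST = {
    "RTL code freeze and version noted": True,
    "Synthesis netlist version noted": True,
    "Functional verification passed": True,
    "Scan chain insertion checked": True,
    "Place/route with timing closure done": True,
    "Signoff: physical verification (LVS/DRC/antenna)": False,
    "Signoff: formal verification": True,
    "Signoff: static timing analysis": False,
    "ATPG done": True,
}


def open_items(checklist):
    """Return the items that are not yet complete, in checklist order."""
    return [item for item, done in checklist.items() if not done]


def ready_for_tapeout(checklist):
    return not open_items(checklist)


for item in open_items(CHECKLIST):
    print("OPEN:", item)
print("Ready for tapeout:", ready_for_tapeout(CHECKLIST))
```

Keeping the list in one place also makes it straightforward to record versions or sign-off owners alongside each item if a team wants that.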


What Are Masks, Wafers, and Photolithography?
‹ Masks (or Photomasks)
‰ Definition: An opaque plate with holes or transparencies that allow light to
shine through in a defined pattern, commonly used in photolithography.
‹ Wafers
‰ Definition: Thin slice of semiconducting material, such as a silicon
crystal, on which microcircuits are constructed.
‹ Photolithography
‰ Definition: A process used in the fabrication of integrated circuits to
selectively remove parts of a thin film.
‰ Example: In the fabrication of semiconductor devices, masks are used to
create custom patterns of different materials on wafers using
photolithography.

Mask Preparation
With advanced geometries, there are problems with the creation of masks due to
the very small sizes of wires and gates.

The light sources used to create the masks themselves are not accurate enough,
causing errors in the mask.

[Figure: the same layout as rendered at 0.25µ, 0.18µ, 0.13µ, 90 nm, and 65 nm.
Figures courtesy Synopsys Inc.]


What Are OPC and PSMs?
‹ Optical proximity correction (OPC)
‰ Definition: A photolithography enhancement technique commonly used to
compensate for image errors due to diffraction or process effects.
‹ Phase Shifting Masks (PSMs)
‰ Definition: Photomasks that take advantage of the interference generated
by phase differences to improve image resolution in photolithography.
‹ Example: OPC and PSM are used in advanced geometries to
improve the printability of wires during mask creation.

Advanced Mask Technologies


To address these issues, several advanced mask technologies have been
developed, including
‹ OPC
‹ PSM

OPC
‹ OPC is the manipulation of the mask itself to create extra patterns
to compensate for the errors due to the photolithographic process.
‹ As technologies advance to smaller geometries, the wavelength of the
light used in the photolithographic process is actually bigger than the
mask shapes themselves, causing errors.
‹ The extra shapes modify the mask to compensate for these effects.

[Figure: Optical Proximity Correction (OPC). Design and wafer shapes compared for
two cases: no OPC and OPC.]
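The relationship between feature size and wavelength is often summarized with the Rayleigh criterion, feature size = k1 × wavelength / NA, where NA is the numerical aperture of the exposure optics. The Python sketch below computes k1 for the feature sizes named in the mask preparation figure. The 193 nm wavelength and NA of 0.75 are assumptions chosen for illustration and applied to every node, which does not reflect the equipment actually used at each node.

```python
def k1_factor(feature_nm, wavelength_nm, numerical_aperture):
    """Rayleigh criterion rearranged: k1 = feature * NA / wavelength."""
    return feature_nm * numerical_aperture / wavelength_nm


WAVELENGTH_NM = 193.0       # assumed exposure wavelength
NUMERICAL_APERTURE = 0.75   # assumed lens numerical aperture
FEATURES_NM = [250, 180, 130, 90, 65]

for feature in FEATURES_NM:
    k1 = k1_factor(feature, WAVELENGTH_NM, NUMERICAL_APERTURE)
    note = "smaller than wavelength" if feature < WAVELENGTH_NM else ""
    print(f"{feature:>4} nm  k1 = {k1:.2f}  {note}")
```

As k1 falls, the printed image degrades, and corrections such as OPC and PSM become increasingly necessary to reproduce the drawn shapes.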

PSMs
‹ Like OPC, PSMs serve to compensate for errors in the
photolithographic process.
‹ PSM relies on the interference created by mask modifications to
achieve its goal.

[Figure: Phase Shifting Masks (PSM). (a) Regular mask, (b) alternating PSM mask,
(c) attenuating PSM mask. Both (b) and (c) have the effect of improving the
contrast on some parts of the wafer, which could improve the resolution, as is
done with OPC.]

Chip Manufacturing
Masks and wafers are processed to create integrated circuits.

[Figure: masks and wafers go through chemical and other processing to produce
processed wafers, from which the integrated circuits are obtained.]

Photolithography

1. Start with wafer at current step
2. Spin on a photoresist
3. Pattern photoresist with mask
4. Step-specific processing (etch, implant, etc.)
5. Wash off resist

Courtesy K. Yang, UCLA


Processed Wafers
‹ Most processed wafers contain many copies of the same integrated circuit.
‹ Some processed wafers, called shuttles, contain one copy each of many
different integrated circuits.
‹ Shuttles are used for prototypes or test chips.

[Figure: processed wafer. Courtesy D. Bouldin, U. Tennessee]

Packaging Process
The last step in the fabrication of a semiconductor device is packaging.

Steps
‹ Die cut: Each individual die is cut from the wafer.
‹ Die attachment: The die is mounted to the package or support structure.
‹ IC bonding: The die I/O is interconnected with the package I/O.
‹ IC encapsulation: The die is enclosed with ceramic, plastic, or epoxy to
prevent physical damage or corrosion.

[Figure: packaging flow from processed wafer to integrated circuits: die cut, die
attachment, IC bonding, IC encapsulation.]


Learning Activity
In this activity, you will
‹ List as many of the tapeout requirements as you can, without looking back at
the lecture material or your notes

10 minutes for activity
10 minutes for debriefing

Summary
‹ ECOs are a vital part of the design process. Design teams have to add
critical functionality with the least impact on schedule and cost using
ECOs. They do this by carefully planning for ECOs up front.

‹ Design verification involves several checks to ensure that the
functionality, integrity, and manufacturability of the chip are verified.

‹ Tapeout involves ensuring all of the important steps in the overall
process are accounted for, through mask preparation and the final
manufacturing steps.

Testing Your Understanding
True or false

1. Design teams can plan for ECOs very early in the design process.

2. When using a register as a spare gate for an ECO, you can simply
connect it up like a regular logic gate.

3. LVS and DRC are run on a netlist just after logic synthesis.

4. Alpha particles cause random errors during the manufacture of a chip.

5. When creating a tapeout checklist, it is important to note all of the
sign-off criteria.

Sources
Gennari, Frank. Overview of OPC.
http://www.cs.berkeley.edu/~ejr/GSI/cs267-s04/homework-0/results/gennari/
