Professional Documents
Culture Documents
Sam Palermo
Analog & Mixed-Signal Center
Texas A&M University
Announcements
• Lab 5 Report and Prelab 6 due Mar. 31
2
Agenda
• Clocking Architectures
• PLLs
• Modeling
• Noise transfer functions
3
References
• High-speed link clocking tutorial paper, PLL
analysis paper, and PLL thesis posted on
website
• Posted PLL models in project section
• Website has additional links on PLL and
jitter tutorials
• Majority of today’s PLL material comes
from Fischette tutorial and M. Mansuri’s
PhD thesis (UCLA)
4
High-Speed Electrical Link System
5
Clocking Terminology
Synchronous
• Every chip gets same frequency AND phase
• Used in low-speed busses
Mesochronous
• Same frequency, but unknown phase
• Requires phase recovery circuitry
• Can do with or without full CDR
• Used in fast memories, internal system interfaces,
MAC/Packet interfaces
Plesiochronous
• Almost the same frequency, resulting in slowly
drifting phase
• Requires CDR
• Widely used in high-speed links
Asynchronous
• No clocks at all
• Request/acknowledge handshake procedure
• Used in embeddded systems, Unix, Linux
[Poulton] 6
I/O Clocking Architectures
• Three basic I/O architectures
• Common Clock (Synchronous)
• Forward Clock (Source Synchronous)
• Embedded Clock (Clock Recovery)
7
Common Clock I/O Architecture
[Krauter]
[Krauter]
9
Common Clock I/O Limitations
• Difficult to control clock skew and propagation delay
• Need to have tight control of absolute delay to meet a given
cycle time
• Sensitive to delay variations in on-chip circuits and board
routes
• Hard to compensate for delay variations due to low
correlation between on-chip and off-chip delays
• While commonly used in on-chip communication, offers
limited speed in I/O applications
10
Forward Clock I/O Architecture
• Common high-speed reference
clock is forwarded from TX chip
to RX chip
• Mesochronous system
• Used in processor-memory
interfaces and multi-processor
communication
• Intel QPI
• Hypertransport
• Requires one extra clock
channel
• “Coherent” clocking allows low-
to-high frequency jitter
tracking
• Need good clock receive
amplifier as the forwarded clock
is attenuated by the channel 11
Forward Clock I/O Limitations
• Clock skew can limited forward
clock I/O performance
• Driver strength and loading
mismatches
• Interconnect length
mismatches
• Duty-Cycle variations of
forwarded clock
12
Forward Clock I/O De-Skew
• Per-channel de-skew allows for
significant data rate increases
• Sample clock adjusted to center
clock on the incoming data eye
• Implementations
• Delay-Locked Loop and Phase
Interpolators
• Injection-Locked Oscillators
14
Embedded Clock I/O Architecture
• Can be used in mesochronous
or plesiochronous systems
• CDR Implementations
• Per-channel PLL-based
• Dual-loop w/ Global PLL &
• Local DLL/PI
• Local Phase-Rotator
PLLs 15
Embedded Clock I/O Limitations
• Jitter tracking limited by
CDR bandwidth
• Technology scaling allows
CDRs with higher
bandwidths which can
achieve higher frequency
jitter tracking
16
Embedded Clock I/O Circuits
• TX PLL
• TX Clock Distribution
• CDR
• Per-channel PLL-based
• Dual-loop w/ Global PLL &
• Local DLL/PI
• Local Phase-Rotator PLLs
• Global PLL requires RX
clock distribution to
individual channels
17
Xilinx 0.5-32Gb/s Transceiver Clocking
Channels 1-4
Frac-N LC PLL1, 2 Technology CMOS 16nm FinFET
Power Supply (Vavcc, Vavtt, Vaux) 0.9 V, 1. 2V, 1.8 V
∑ Transmitter Frequency range 500 Mb/s – 32.75 Gb/s
DCC Transceiver Quad area 2.625 mm × 2.218 mm
VCOLB LC PLL range 8-16.375 GHz
Ring PLL range 2-6.25 GHz
VCOHB Receiver
TX PRBS7 jitter at 32.75Gb/s TJ: 5.39 ps, RJ: 190 fs
PI (D,X,S) 32.75Gb/s RX JTOL @ 30MHz 0.45 UI
PPF 2 IQ CAL @ 100MHz 0.6 UI
Channel loss at 32.75Gb/s 30 dB
DCC < 10-15
Measured BER at 32.75Gb/s
Ring PLL Power at 32.75Gb/s with DFE 577mW/ch (17.6pJ/b)
I/Q1, I/Q2
Active Inductor based Clock Distribution
[Upadhyaya VLSI 2016]
19
PLL Block Diagram
[Perrott]
20
PLL Applications
• PLLs applications
• Frequency synthesis
• Multiplying a 100MHz reference clock to 10GHz
• Skew cancellation
• Phase aligning an internal clock to an I/O clock
• Clock recovery
• Extract from incoming data stream the clock frequency and
optimum phase of high-speed sampling clocks
• Modulation/De-modulation
• Wireless systems
• Spread-spectrum clocking
21
Forward Clock I/O Circuits
• TX PLL
• TX Clock Distribution
• Replica TX Clock Driver
• Channel
• Forward Clock Amplifier
• RX Clock Distribution
• De-Skew Circuit
• DLL/PI
• Injection-Locked Oscillator
22
Embedded Clock I/O Circuits
• TX PLL
• TX Clock Distribution
• CDR
• Per-channel PLL-based
• Dual-loop w/ Global PLL &
• Local DLL/PI
• Local Phase-Rotator PLLs
• Global PLL requires RX
clock distribution to
individual channels
23
PLL Design Challenges
• Board-level reference clock frequencies don’t scale often
• 156MHz is a common frequency
• RX CDR bandwidth is hard to scale with PAM4 signaling and
ADC-based front-ends
• Typically 2-4MHz
• PLL bandwidth must be kept less than 10MHz for stability
and to filter reference jitter
• VCO phase noise at low-frequency offsets due to flicker
noise must be suppressed
32.75Gbps Transceiver PLL Simulated Jitter Numbers
Receiver Type PLL PN @1MHz CDR BW RJ RJ in UI
Analog based RX -92.4dBc/Hz 12.7MHz 160.7fs 5.26mUI
ADC based RX -92.4dBc/Hz 2MHz 407fs 13.3mUI
Divider
1/N
e KVCO
in F(s) Vctrl out
KPD s
fb
1
N
Divider
𝐾 PD𝐾 VCO 1
𝐶2 𝑠 + 𝑅𝐶
𝐻 𝑠 = 𝜙out 𝑠 = 𝐶1 + 𝐶2
1
𝐾 PD𝐾 VCO
𝜙 in 𝑠 𝑠3 + 𝑅𝐶1𝐶2 𝑠 + 𝐾 PD𝐾 VCO
2
𝑁𝐶2 𝑠 + 𝑁𝑅𝐶 𝐶
1 2
26
Understanding PLL Frequency Response
• Linear “small-signal” analysis is useful for understand PLL dynamics if
• PLL is locked (or near lock)
• Input phase deviation amplitude is small enough to maintain operation in
lock range
• Frequency domain analysis can tell us how well the PLL tracks the input
phase as it changes at a certain frequency
• PLL transfer function is different depending on which point in the loop
the output is responding to
Input phase response VCO output
response
[Fischette]
27
Input Noise Transfer Function
Phase Detector
Loop Filter VCO
e K VCO
n,in F(s) Vctrl out
K PD s
fb
1
N
Divider
𝐾 PD𝐾 VCO 1
𝜙out 𝑠 𝐶2 𝑠 + 𝑅𝐶 1
Input Phase Noise: 𝐻n I N 𝑠 = 𝑠 = 3 𝐶1 + 𝐶2 𝐾 PD𝐾 VCO
𝑠 + 𝐾 PD𝐾 VCO
𝑅𝐶1𝐶2 𝑠 +
𝜙n I N 2
𝑁𝐶2 𝑠 + 𝑁𝑅𝐶 𝐶
1 2
28
Input Noise Transfer Function
Parameter
Fref 156MHz
N 90
Fvco 14GHz
Icp 100uA
R 22.7k
C1 7.0pF
C2 500fF
Kvco 2π*1GHz/V
5.9MHz
f3dB
60°
Phase Margin
𝐾 PD𝐾 VCO 1
𝜙out 𝑠 𝐶2 𝑠 + 𝑅𝐶 1
Input Phase Noise: 𝐻n I N 𝑠 = 𝑠 = 3 𝐶1 + 𝐶2 𝐾 PD𝐾 VCO
𝑠 + 𝐾 PD𝐾 VCO
𝑅𝐶1𝐶2 𝑠 +
𝜙n I N 2
𝑁𝐶2 𝑠 + 𝑁𝑅𝐶 𝐶
1 2
29
VCO Noise Transfer Function
Phase Detector
Loop Filter VCO
vn,vco n,vco
e KVCO
in F(s) Vctrl out
KPD s
fb
1
N
KVCO is different if the input is at the
Divider
Vcntrl input (KVCO) or supply (KVdd)
𝐶1 + 𝐶2
𝑠2 𝑠 +
VCO Phase Noise: 𝐻n 𝑠 = 𝜙 out 𝑠 = 𝐶1 + 𝐶2 𝐾 PD𝐾 VCO
7C0 𝑠
𝑠 + 𝑅𝐶1𝐶2 𝑠 +𝑅𝐶1𝐶2𝑁𝐶2 𝐾 PD𝐾 VCO
𝜙n 7 C 0 3 2
𝑠 + 𝑁𝑅𝐶 𝐶
1 2
𝐶1 + 𝐶2
𝐾VCO 𝑠 𝑠 +
Voltage Noise on 𝜙 out 𝑠 𝑅𝐶1 𝐶2
𝑇 𝑠 = = 𝐶1 + 𝐶2 𝐾 PD𝐾 VCO
VCO Inputs: n 7C0 𝑠
𝑠 + 𝐾 PD𝐾 VCO
𝑅𝐶1𝐶2 𝑠 +
𝑣n 7 C 0 3 2
𝑁𝐶2 𝑠 + 𝑁𝑅𝐶 𝐶
1 2
30
VCO Noise Transfer Function
Parameter
Fref 156MHz
N 90
Fvco 14GHz
Icp 100uA
R 22.7k
C1 7.0pF
C2 500fF
Kvco 2π*1GHz/V
5.9MHz
f3dB
60°
Phase Margin
𝐶1 + 𝐶2
𝑠2 𝑠 +
VCO Phase Noise: 𝐻n 𝑠 = 𝜙 out 𝑠 = 𝐶1 + 𝐶2 𝐾 PD𝐾 VCO
7C0 𝑠
𝑠 + 𝑅𝐶1𝐶2 𝑠 +𝑅𝐶1𝐶2𝑁𝐶2 𝐾 PD𝐾 VCO
𝜙n 7 C 0 3 2
𝑠 + 𝑁𝑅𝐶 𝐶
1 2
𝐶1 + 𝐶2
𝐾VCO 𝑠 𝑠 +
Voltage Noise on 𝜙 out 𝑠
𝑅𝐶1 𝐶2
𝑇 𝑠 = = 𝐶1 + 𝐶2 𝐾 PD𝐾 VCO
VCO Inputs: n7C0 𝑠
𝑠 + 𝐾 PD𝐾 VCO
𝑅𝐶1𝐶2 𝑠 +
𝑣n 7 C 0 3 2
𝑁𝐶2 𝑠 + 𝑁𝑅𝐶 𝐶
1 2 31
28GHz PLL Phase Noise Example
Phase Noise (dBc/Hz)
[McNeill] 34
Oscillator Phase Noise Model
[Perrott]
T
T OL T T
[McNeill]
• Generally, we care about the jitter w.r.t. the ref. clock (x)
• However, may be easier to measure w.r.t. delayed version of output clk
• Due to noise on both edges, this will be increased by a sqrt(2) factor
to the reference clock-referred jitter
relative 39
Converting Phase Noise to Jitter
[Mansuri]
œ
8
• RMS jitter for T accumulation 2
𝜎OT = 2ƒ 𝑆$ 𝑓 𝑠𝑖𝑛 2
𝜋𝑓Δ𝑇 𝑑𝑓
𝜔
o 0
• As T goes to ∞ 2 2
R 0
2
T S f df o2 0
o2
• Actual integration range depends on application bandwidth
• fmin set by assumed CDR tracking bandwidth
• fmax set by Nyquist frequency (f0/2)
f0
2 2
• Most exact approach T2 S f
H sys f
o2 0
2
2
df
where H sys f is the system jitter transfer 40
function
112Gb/s PAM4 PLL Phase Noise Measurement
Measured Phase Noise at 28GHz TX Output
<100fs Integrated RJRMS
-60
Phase Noise (dBC/Hz)
-80
-
100
-120 EL1
EL1
-140
LoopB W3 kHz
3dB = 1.47Mrad/s
out(s) out(s)
20log 10 20log 10
ref(s) vcon (s)
42
PLL Linear Phase Model:
Frequency Step Response
ref
out
43
PLL Behavioral Model
• From my MSc thesis:
http://www.ece.tamu.edu/~spalermo/docs/msc_thesis_sam_palermo.pdf
• Written in SpectreHDL
• Also look at CppSim: http://www.cppsim.com/
timedom tran stop=20u step=20p ic=all maxstep=20p skipdc=yes relref=alllocal
// Multi-Band Phase Locked Loop Frequency Synthesizer Macromodel
simulator lang=spice
// Main Spectre File
.ic vd=0
// Samuel Palermo
save vd control fref fvco nv_temp vd_gnd
simulator lang=spectre
.OPTIONS rawfmt=psfbin save=selected diagnose=yes vabstol=.01
include "/home/samuel/research/pll/macromodels/pd/dig_pfd/dig_pfd.def"
+ reltol=.99
include "/home/samuel/research/pll/macromodels/lpf/lpf.def"
ahdl_include "/home/samuel/research/pll/macromodels/vco/vco.def" ***********************************************************************
ahdl_include "/home/samuel/research/pll/macromodels/vco/switch_vco.def" // Digital Phase Frequency Detector Macromodel
ahdl_include // Samuel Palermo
+ "/home/samuel/research/pll/macromodels/divider/divider.def" subckt dig_pfd (gnd dd fref fvco up upbar down downbar)
include "/home/samuel/research/pll/macromodels/vco/reference.def" ahdl_include "/home/samuel/research/pll/macromodels/pd/dig_pfd/dff.def"
// Power Supply ahdl_include
vdd dd 0 vsource dc=1 + "/home/samuel/research/pll/macromodels/pd/dig_pfd/nand.def"
// Reference Signal xdffup gnd dd fref up upbar r dff
xref 0 control fref reference xdffdown gnd dd fvco down downbar r dff
vcontrol control 0 vsource type=pwl wave=[0 0.64 1u 0.64] xnand gnd up down r nand
// Digital Tri-State Phase/Frequency Comparator ends dig_pfd
xdig_pfd 0 dd fref fvco up upbar down downbar dig_pfd
// Charge Pump
****************************************
iup dd 1 isource dc=25u *******************************
idown 2 0 isource dc=25u // D Flip Flop Macromodel
gup 1 vd up 0 relay vt1=0 vt2=1 ropen=100M rclosed=1m // Samuel Palermo
gupbar 1 0 upbar 0 relay vt1=0 vt2=1 ropen=100M rclosed=1m module dff(gnd, D, CLK, Q, QBAR, R) ()
gdown vd 2 down 0 relay vt1=0 vt2=1 ropen=100M rclosed=1m node [V, I] gnd, D, CLK, Q, QBAR, R ;
gdownbar dd 2 downbar 0 relay vt1=0 vt2=1 ropen=100M {
rclosed=1m real Q_temp;
// Loop Filter real QBAR_temp;
xfilter 0 vd lpf initial {
gvdgnd vd 0 Q_temp=0;
vd_gnd 0 relay QBAR_temp=1;
vt1=0 vt2=1 }
ropen=100M analog {
rclosed=10 if ($threshold (V(CLK, gnd)-1, 1)) {
// Voltage if (V(D,gnd)==1) {
Controlled Q_temp=1;
Oscillator QBAR_te
//xvco vd out vco (gain=40e6 fc=256e6) mp=0;
xvco 0 vd out nv_temp vd_gnd }
+ switch_vco (u=0.8 d=-0.8 gain=40e6 else {
fc=256e6) Q_temp=0;
// Divider QBAR_temp=1;
xdivider 0 out fvco buffer n_temp divider (divisor=32) }
op dc }
if (V(R,
gnd)==0) 44
{
Q_temp=0;
PLL Frequency Step Response:
Linear vs Behavioral Model
Cycle Slips
45
Next Time
• CDRs
46
Open-Loop PLL Transfer Function
[Mansuri]
47
Open-Loop PLL Transfer Function
w/o C1 (2nd order) with C1 (3rd order)
[Mansuri]
48
Closed-Loop PLL Transfer Function
[Mansuri]
ignoring C1:
49
PLL Natural Frequency and Damping Factor
Natural Frequency:
Damping Factor:
a 2 2 1 n N 4 n N
K PD K VCO K PD VCO
K 50
Damping Factor Impact
[Fischette]
= 0.5
= 0.707
=1
=2
KPD = 25uA/2
K VCO = 240MHz/V
N = 32
n = 1 (normalized)
52
Charge-Pump PLL Circuits
• Phase Detector
• Charge-Pump
• Loop Filter ref out
• VCO fb
• Divider
53
Phase Detector
e
ref
fb
[Razavi]
55
XOR Phase Detector
[Razavi]
57
Cycle Slipping
• If there is a frequency difference between the input
reference and PLL feedback signals the phase detector can
jump between regions of different gain
• PLL is no longer acting as a linear system
[Perrott]
Cycle Slipping
[Perrott]
60
PFD Transfer Characteristic
UP=1 & DN=-1
[Perrott]
[Fischette]
62
PFD Operation
[Fischette]
63
Charge-Pump PLL Circuits
• Phase Detector
• Charge-Pump
• Loop Filter
• VCO
• Divider
64
Charge Pump
VDD
I
Charging VCO Control
UP
Voltage
C1 C2
DOWN Discharging
I R
VSS
[Razavi]
• Issues
• Switch resistance can impact UP/DN current matching as a function of Vctrl
• Clock feedthrough and charge injection from switches onto Vctrl
• Charge sharing between current source drain nodes’ capacitance and Vctrl
66
Charge Pump Mismatch
Ideal locked condition, but CP mismatch Actual locked condition w/ CP mismatch
[Razavi]
• PLL will lock with static phase error
• Extra “ripple” on Vctrl
• Results in frequency domain spurs at the reference clock frequency
offset from the carrier
67
Charge Pump w/ Improved Matching
69
Charge-Pump PLL Circuits
• Phase Detector
• Charge-Pump
• Loop Filter
• VCO
• Divider
70
Loop Filter
VDD
I
Charging VCO Control
Voltage
C1 C2
Discharging
I R
F(s)
VSS
72
Loop Filter Transfer Function
• With secondary capacitor, C2
73
Why have C2?
• Secondary capacitor smoothes control voltage ripple
• Can’t make too big or loop will go unstable
• C2 < C1/10 for stability
• C2 > C1/50 for low jitter
74
Filter Capacitors
• To minimize area, we would like to use highest density caps
• Thin oxide MOS cap gate leakage can be an issue
• Similar to adding a non-linear parallel resistor to the capacitor
• Leakage is voltage and temperature dependent
• Will result in excess phase noise and spurs
75
Charge-Pump PLL Circuits
• Phase Detector
• Charge-Pump
• Loop Filter
• VCO
• Divider
76
Voltage-Controlled Oscillator
KVCO
0 1
VDD/2
VDD
out t Laplace
0 out t 0 KVCO vc
Domain Model
out t out t dt KVCO vc t t
• Time-domain
dt phase relationship
out(t)
77
Voltage-Controlled Oscillators (VCO)
• Ring Oscillator
• Easy to integrate
• Wide tuning range (5x)
• Higher phase noise
• LC Oscillator
• Large area
• Narrow tuning range (20-30%)
• Lower phase noise
78
Barkhausen’s Oscillation Criteria
[Sanchez]
H j
Closed-loop transfer function:
• 2 conditions:
• Gain = 1 at oscillation frequency 0
• Total phase shift around loop is n360 at oscillation frequency 0
79
Ring Oscillator Example
[Sanchez]
80
Ring Oscillator Example
• 4-stage oscillator
• A0 = sqrt(2)
• Phase shift = 45
• Easier to make a
larger-stage
oscillator oscillate, as
it requires less gain
and phase shift per
stage
[Sanchez]
81
LC Oscillator Example
• Oscillation phase shift condition
satisfied at the frequency
when the LC (and R) tank load
displays a purely real
impedance, i.e. 0 phase shift
LC tank impedance
RS L1s
Z eq s
1 L C s 2 R C s
1 1
S
1
2 L12
Z eq
s j 1 R S2
LC
2 2
2
R
2 2 2
S 1
C1 1
LC Oscillator Example
• Transforming the series loss
resistor of the inductor to an
equivalent parallel
resistance
2S 2 2
LP L1 1 2 2 , CP C1, R 1 L
R
L1 P RS
RP
1
1 [Razavi]
LPC P
83
LC Oscillator Example
Loop Gain
[Razavi]
1
• Phase condition satisfied at 1
LPC P
• Gain condition satisfied when g m RP
1
2
84
Supply-Tuned Ring Oscillator
[Sanchez]
86
Capacitive-Tuned Ring Oscillator
87
Symmetric Load Ring Oscillator
[Maneatis JSSC 1996 & 2003]
n-well
[Razavi]
90
Charge-Pump PLL Circuits
• Phase Detector
• Charge-Pump
• Loop Filter
• VCO
• Divider
91
Loop Divider
out(t) fb(t)
[Perrott]
• Time-domain model
fb t 1 out t
N
fb t N1 t dt N1 t
out out
92
Basic Divide-by-2
[Perrott]
• Advantages
• Reasonably fast, compact size, and no static power
• Requires only one phase of the clock
• Disadvantages
• Signal needs to propagate through three gates per input cycle
• Need full swing CMOS inputs
• Dynamic flip-flop may have issues at very low frequency operation (test
mode) depending on process leakage
94
Divide-by-2 with CML FF
[Razavi]
• Advantages
• Signal only propagates through two CML gates per input cycle
• Accepts CML input levels
• Disadvantages
• Larger size and dissipates static power
• Requires differential input
• Need tail current biasing
• Additional speedup (>50%) can be achieved with shunt peaking
inductors 95
Binary Dividers:
Asynchronous vs Synchronous
Asynchronous Divider • Advantages
• Each stage runs at lower frequency,
resulting in reduced power
• Reduced high frequency clock
loading
• Disadvantage
• Jitter accumulation
Synchronous Divider
• Advantage
• Reduced jitter
• Disadvantage
• All flip-flops work at maximum
frequency, resulting in high power
• Large loading on high frequency
clock
[Perrott]
96
Jitter in Asynchronous vs Synchronous Dividers
Asynchronous • Jitter accumulates with the
clock-to-Q delays through
the divider
• Extra divider delay can also
degrade PLL phase margin
97
Dual Modulus Prescalers
2/3
MC=0 3
MC=1
2
15/16
[Razavi]
101
Example PLL Design Procedure
• Step 5 – Determine KVCO K vco = 40MHz/V
300
Channel 4 (235 - 300MHz)
Frequency (MHz)
255
• This is a function of the VCO and 235
210
Channel 3 (190 - 255MHz)
145
165
Channel 1 (100 - 165MHz)
range 100
Voltage Range = 1.6V
• Here I use a combination of VCO Control Voltage