
Digital Design

VP AND EXECUTIVE PUBLISHER: BRUCE SPATZ
ASSOCIATE PUBLISHER: DAN SAYRE
SENIOR ACQUISITIONS EDITOR AND PRODUCT MANAGER: CATHERINE FIELDS SHULTZ
PROJECT EDITOR: GLADYS SOTO
SENIOR EDITORIAL ASSISTANT: DANA KELLOGG
MEDIA EDITOR: STEVEN CHASEY
SENIOR PRODUCTION EDITOR: VALERIE A. VARGAS
MARKETING MANAGER: PHYLLIS CERYS
COVER ILLUSTRATION: MICHAEL JUNG
COVER DESIGNER: MADELYN LESURE
PRODUCTION SERVICES: INGRAO ASSOCIATES

This book is printed on acid free paper.

Copyright 2007 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470 or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

To order books or for customer service please call 1-800-CALL WILEY (225-5945).

ISBN 978-0-470-04437-7

Printed in the United States of America
10 9 8 7 6 5 4 3 2

To my family, Amy, Eric, Kelsi, and Maya;
and to those engineers who apply their skills
to build things that improve the human condition.

Preface

TO STUDENTS ABOUT TO STUDY DIGITAL DESIGN

Digital circuits, which form the basis of general-purpose computers as well as special-purpose devices like cell phones or video game consoles, are dramatically changing the world. Studying digital design not only gives you the confidence that comes with fundamentally understanding how digital circuits work, but also introduces you to an exciting and useful possible career direction. This statement applies regardless of whether your major is Electrical Engineering, Computer Engineering, or even Computer Science (in fact, the need for digital designers with strong computer science skills continues to increase). I hope you find this subject to be as interesting, exciting, and useful as I do.

Throughout this book, I have tried not only to introduce concepts in the most intuitive and easy to learn manner, but I have also tried to show how those concepts can be applied to real-world systems, such as pacemakers, ultrasound machines, printers, automobiles, and cell phones.

Young and capable engineering students (including computer science students) sometimes leave their major, claiming they want a job that is more "people oriented." Yet we need those people-oriented students more than ever, as engineering jobs are increasingly people-oriented, in several ways. First, engineers usually work in tightly-integrated groups involving numerous other engineers (rather than "sitting alone in front of a computer all day" as many students believe). Second, engineers often work directly with customers (such as business people, doctors, lawyers, government officials, etc.), and must therefore be able to connect with those non-engineer customers. Third, and in my opinion most importantly, engineers build things that dramatically impact people's lives. Needed are engineers who combine their enthusiasm, creativity, and innovation with their solid engineering skills to conceive and build new products that improve people's quality of life.

I have included "Designer Profiles" at the end of most chapters. The designers, whose experience levels vary from just a year to several decades, and whose companies range from small to huge, share with you their experiences, insights, and advice. You will notice how commonly they discuss the people aspects of their jobs. You may also notice their enthusiasm and passion for their jobs.

TO INSTRUCTORS OF DIGITAL DESIGN

This book breaks from the 1970s/1980s digital design view emphasizing size-limited design, instead emphasizing the 2000s situation of register-transfer-level (RTL) design. By clearly distinguishing the topic of basic design from optimization, two topics previously inseparably intertwined, the book allows a first course on digital design to reach and even emphasize the topic of RTL design. A student exposed to RTL design in a first course will have a more relevant view of the modern digital design field, leading not only


to a better appreciation of modern computers and other digital devices, but a more accurate understanding of careers involving digital design. Such an accurate understanding is critical to attract computing majors to careers involving some amount of digital design, and to create a cadre of engineers with the comfort in both "software" and "hardware" necessary in modern embedded computing system design.

The distinguishing of basic design from optimization should not be interpreted as avoiding a bottom-up approach or glossing over important steps; the book takes a concrete bottom-up approach, starting from transistors, and building incrementally up through gates, flip-flops, registers, controllers, datapath components, etc. Rather, the distinguishing enables the student to initially develop a solid understanding of basic design, before considering the more advanced topic of optimization, akin to how a physics book introduces Newton's laws of motion initially assuming frictionless surfaces and no wind resistance. Furthermore, optimization today involves more than just size minimization, instead requiring a broader understanding of tradeoffs among size, performance, and power, and even of tradeoffs among custom digital circuits and microprocessor software. Again, coverage is kept concrete and appropriate to an introductory digital design course.

Nevertheless, the book distinguishes basic design from optimization in a way that cleanly provides an instructor maximum flexibility to introduce optimization at the times and to the extent desired by the instructor. In particular, the optimization chapter's subsections (Chapter 6) each correspond directly to one earlier chapter, such that Section 6.2 can directly follow Chapter 2, Section 6.3 can follow Chapter 3, 6.4 can follow 4, and 6.5 can follow 5.

Several additional features of the book include:

Extensive use of applied examples and figures. After describing a new concept and providing basic examples, the book provides examples that apply the concept to applications recognizable to a student, like a seat belt unfastened warning system, a computerized checkerboard game, a color printer, or a digital video camera. Furthermore, the end of most chapters includes a product profile, intended to give students an even broader view of the applicability of the concepts, and to introduce clever application-specific concepts the students may find very interesting, like the idea of beamforming in an ultrasound machine or of filtering in a cellular phone. The book extensively uses figures to illustrate concepts, containing over 600 figures.

Learning through discovery. The book emphasizes understanding the need for new concepts, which not only helps students learn and remember the concepts, but develops reasoning skills that can apply the concepts to other domains. For example, rather than just defining a carry-lookahead adder, the book shows intuitive but inefficient approaches to building a faster adder, eventually solving the inefficiencies and leading to ("discovering") the carry-lookahead design.

Introduction to FPGAs. The book includes a fully bottom-up introduction to FPGAs, showing students concretely how a circuit can be converted into a bitstream that programs the individual lookup tables, switch matrices, and other programmable components in an FPGA. This concrete introduction eliminates the mystery of the increasingly-common FPGA devices.

HDL coverage flexibility. The book's organization cleanly allows instructors to cover HDLs (hardware description languages) intermixed with the introduction of design concepts, to cover HDLs later, or to not cover HDLs at all. The HDL chapter's subsections (Chapter 9) each correspond to an earlier chapter, such that Section 9.2 can directly follow Chapter 2, 9.3 can follow 3, 9.4 can follow 4, and 9.5 can follow 5. Furthermore, rather than the book choosing just one of the popular languages (VHDL, Verilog, and the relatively new SystemC), the book provides equal coverage of all three of those HDLs. And we use our extensive experience in synthesis with commercial tools to create HDL descriptions well-suited for synthesis, in addition to being suitable for simulation.

Accompanying HDL-introduction books. Instructors wishing to cover HDLs to an even greater extent can utilize one of our HDL-introduction books specifically designed to accompany this textbook, written by the same author as this textbook. Our HDL-introduction books follow the same chapter structure as, and use examples from, this textbook, eliminating the common situation of students struggling to correlate their distinct, and sometimes contradicting, HDL book and digital design book subjects. Our HDL-introduction books discuss language, simulation, and testing concepts in more depth, providing numerous HDL examples, and are also designed to be usable by themselves for HDL learning or reference. The books emphasize use of the language for real design, clearly distinguishing HDL use for synthesis from HDL use for testing, and include extensive examples and figures throughout to illustrate concepts. Our HDL-introduction books come with complete PowerPoint slides that use graphics and animations to serve as an easy-to-use tutorial on the HDL.

Author-created graphical animated PowerPoint slides. A rich set of PowerPoint slides is available to instructors. The slides were created by the textbook's author, resulting in consistency of perspective and emphasis between the slides and this book. The slides are designed to be a truly effective teaching tool for the instructor. Most slides are graphics based (avoiding slides consisting of just bulleted lists of text). The slides make extensive use of animation where appropriate to gradually unveil concepts or build up circuits, yet even animated slides can be printed out and understood. Nearly every figure, concept, and example from this book is included in the set of almost 500 slides, from which instructors can choose.

Complete solutions manual. Instructors may obtain a complete solutions manual (about 200 pages) containing solutions to every end-of-chapter exercise in this book. The manual extensively utilizes figures to illustrate solutions.

WileyPLUS website. Digital Design is supported by WileyPLUS, a powerful and highly integrated suite of teaching and learning resources designed to bridge the gap between what happens in the classroom and what happens at home. WileyPLUS includes a complete online version of the text, algorithmically generated problems and guided online exercises, additional [illegible] study [illegible], solutions of selected examples, animations of pertinent concepts ([illegible] by Professor Ed Doering of Rose-Hulman Institute), complete solutions manual
and author-created animated PowerPoint slides, plus course and homework management tools, in one easy-to-use website.

To learn how to access these features, go to the Book Companion Site at www.wiley.com/college/vahid, or www.ddvahid.com.

HOW TO USE THIS BOOK

This book was designed to allow flexibility to choose among the most common approaches to material coverage. We describe several approaches below.

RTL-focused approach

An RTL-focused approach would simply cover the first 6 chapters in order:

1. Introduction (Chapter 1)
2. Combinational logic design (Chapter 2)
3. Sequential logic design (Chapter 3)
4. Combinational and sequential component design (Chapter 4)
5. RTL design (Chapter 5)
6. Optimizations and Tradeoffs (Chapter 6), to the extent desired
7. Physical implementation (Chapter 7) and/or Processor design (Chapter 8), to the extent desired.

We think this is a great way to order the material, resulting in students doing interesting RTL designs in about 7 weeks. HDLs can be introduced at the end if time permits, or left for a second course on digital design (as done at UCR), or covered immediately after each chapter; all three approaches appear to be quite common.

Traditional approach with some reordering

This book can be readily used in a traditional approach that introduces optimization along with basic design, with a slight difference from the traditional approach being the swapping of coverage of combinational components and sequential logic, as follows:

1. Introduction (Chapter 1)
2. Combinational logic design (Chapter 2) followed by combinational logic optimization (Section 6.2)
3. Sequential logic design (Chapter 3) followed by sequential logic optimization (Section 6.3)
4. Combinational and sequential component design (Chapter 4) followed by component tradeoffs (Section 6.4)
5. RTL design (Chapter 5) to the extent desired, followed by RTL optimization/tradeoffs (Section 6.5)
6. Physical implementation (Chapter 7) and/or Processor design (Chapter 8), to the extent desired.

This is a very reasonable and effective approach, completing all discussion of one topic (e.g., FSM design as well as optimization) before moving on to the next topic. The reordering from a traditional approach introduces basic sequential design (FSMs and controllers) before combinational components (e.g., adders, comparators, etc.). Such reordering may lead into RTL design more naturally than a traditional approach, following instead an approach of increasing abstraction rather than the traditional approach that separates combinational and sequential design. HDLs can again be introduced at the end, left for another course, or integrated after each chapter. This approach could also be used as an intermediary step when migrating from a fully-traditional approach to an RTL approach. Migrating might involve gradually postponing the Chapter 6 sections; for example, covering Chapters 2 and 3, and then Sections 6.2 and 6.3, before moving on to Chapter 4.

Completely traditional approach

This book could also be used in a completely traditional approach, as follows:

1. Introduction (Chapter 1)
2. Combinational logic design (Chapter 2) followed by combinational logic optimization (Section 6.2)
3. Combinational component design (Sections 4.1, 4.3, 4.4, 4.5, 4.7, 4.8, 4.9) followed by combinational component tradeoffs (Section 6.4 - Adders)
4. Sequential logic design (Chapter 3) followed by sequential logic optimization (Section 6.3)
5. Sequential component design (Chapter 4, Sections 4.2, 4.6, 4.10) followed by sequential component tradeoffs (Section 6.4 - Multipliers)
6. RTL design (Chapter 5) to the extent desired, followed by RTL optimization/tradeoffs (Section 6.5)
7. Physical implementation (Chapter 7) and/or Processor design (Chapter 8), to the extent desired.

This has been the most widespread approach during the past two decades, with the addition of RTL towards the end. Although the emphasized distinction between combinational and sequential design may no longer be relevant in the era of RTL design (where both types of design are intermixed), some people believe that such distinction makes for an easier learning path, which may be true. HDLs can be included at the end, left for a later course, or integrated throughout.

ACKNOWLEDGEMENTS

Many people and organizations contributed to this edition of the book.

Staff members at John Wiley and Sons Publishers have extensively supported the book's development, including Catherine Shultz, Gladys Soto, Dana Kellogg, and Kelly Boyle. Bill Zobrist supported my earlier "Embedded System Design" book, motivated me to write the present book, and provided great advice throughout the development.

Ryan Mannion contributed many items, including the appendices, numerous examples and exercises, several subsections, the complete exercise solutions manual, fact-checking, extensive proofreading, tremendous assistance during production, help with the slides, plenty of ideas during discussions, and much more.

Roman Lysecky developed numerous examples and exercises, contributed most of the content of the HDL chapter, and co-authored our accompanying HDL-introduction books. Roman and Susan Lysecky provided much proofreading assistance.

Numerous reviewers provided outstanding feedback on various versions of the book. Special thanks go to early adopters, such as Nikil Dutt, Shannon Tauro, J. David Gillanders, Sheldon Tan, Travis Doom, Roman Lysecky, and others, who provided excellent feedback from themselves and from their students.

The importance of the support provided to my research and teaching career by the National Science Foundation cannot be overstated. Additional support from the Semiconductor Research Corporation catalyzed industry collaborations that in turn influenced many of the perspectives in this book.

ABOUT THE COVER

The cover's image of shrinking squares graphically depicts the amazing real-life phenomenon of digital circuits ("computer chips") shrinking in size by one half roughly every 18 months, for several decades now, a phenomenon often referred to as Moore's Law. Such shrinking has enabled incredibly powerful computing circuits to fit in tiny devices, like modern cell phones, medical devices, and portable video games. See pages 34 and 35 for a discussion of Moore's Law.

ABOUT THE AUTHOR

Frank Vahid is a Professor of Computer Science & Engineering at the University of California, Riverside. He holds Electrical Engineering and Computer Science degrees; has worked/consulted for Hewlett Packard, AMCC, NEC, Motorola, and medical equipment makers; holds 3 U.S. patents; has received several teaching awards; helped set up UCR's Computer Engineering program; has authored two previous textbooks; and has published over 120 papers on digital design topics (automation, architecture, and low-power).

Reviewers and Evaluators

Rehab Abdel-Kader, Georgia Southern University
Otmane Ait Mohamed, Concordia University
Hussain Al-Asaad, University of California, Davis
Rocio Alba-Flores, University of Minnesota, Duluth
Bassem Alhalabi, Florida Atlantic University
Zekeriya Aliyazicioglu, California Polytechnic State University, Pomona
Vishal Anand, SUNY Brockport
Bevan Baas, University of California, Davis
Noni Bohonak, University of South Carolina, Lancaster
Don Bouldin, University of Tennessee
David Bourner, University of Maryland Baltimore County
Elaheh Bozorgzadeh, University of California, Irvine
Frank Candocia, Florida International University
Ralph Carestia, Oregon Institute of Technology
Rajan M. Chandra, California Polytechnic State University, Pomona
Ghulam Chaudhry, University of Missouri, Kansas City
Michael Chelian, California State University, Long Beach
Russell Clark, Saginaw Valley State University
James Conrad, University of North Carolina, Charlotte
Kevan Croteau, Francis Marion University
Sanjoy Das, Kansas State University
James Davis, University of South Carolina
Edward Doering, Rose-Hulman Institute of Technology
Travis Doom, Wright State University
Jim Duckworth, Worcester Polytechnic Institute
Nikil Dutt, University of California, Irvine
Dennis Fairclough, Utah Valley State College
Paul D. Franzon, North Carolina State University
Subra Ganesan, Oakland University
Zane Gastineau, Harding University
J. David Gillanders, Arkansas State University
Clay Gloster, Howard University
Ardian Greca, Georgia Southern University
Eric Hansen, Dartmouth College
Bruce A. Harvey, FAMU-FSU College of Engineering
John P. Hayes, University of Michigan
Michael Helm, Texas Tech University
William Hoff, Colorado School of Mines
Erh-Wen Hu, William Paterson University of New Jersey
Baback Izadi, SUNY New Paltz
Jeff Jackson, University of Alabama
Anura Jayasumana, Colorado State University
Bruce Johnson, University of Nevada, Reno
Richard Johnston, Lawrence Technological University
Rajiv Kapadia, Minnesota State University, Mankato
Bahadir Karuv, Fairleigh Dickinson University
Robert Klenke, Virginia Commonwealth University
Clint Kohl, Cedarville University
Hermann Krompholz, Texas Tech University
Timothy Kurzweg, Drexel University
Jumoke Ladeji-Osias, Morgan State University
Jeffrey Lillie, Rochester Institute of Technology
David Livingston, Virginia Military Institute
Hong Man, Stevens Institute of Technology
Gihan Mandour, Christopher Newport University
Diana Marculescu, Carnegie Mellon University
Miguel Marin, McGill University
Maryam Moussavi, California State University, Long Beach
Olfa Nasraoui, University of Memphis
Patricia Nava, University of Texas, El Paso
John Nestor, Lafayette College
Rogelio Palomera Garcia, University of Puerto Rico, Mayaguez
James Peckol, University of Washington
Witold Pedrycz, University of Alberta
Andrew Perry, Springfield College
Denis Popel, Baker University
Tariq Qayyum, California Polytechnic State University, Pomona
Gang Qu, University of Maryland
Mihaela Radu, Rose-Hulman Institute of Technology
Suresh Rai, Louisiana State University, Baton Rouge
William Reid, Clemson University
Musoke Sendaula, Temple University
Scott Smith, Boise State University
Gary Spivey, George Fox University
Larry Stephens, University of South Carolina
James Stine, Illinois Institute of Technology
Philip Swain, Purdue University
Shannon Tauro, University of California, Irvine
Carlos Tavora, Gonzaga University
Marc Timmerman, Oregon Institute of Technology
Hariharan Vijayaraghavan, University of Kansas
Bin Wang, Wright State University
M. Chris Wernicki, New York Institute of Technology
Shanchieh Yang, Rochester Institute of Technology
Henry Yeh, California State University, Long Beach
Naeem Zaman, San Joaquin Delta College

Contents

Preface

CHAPTER 1
Introduction
1.1 Digital Systems in the World Around Us
1.2 The World of Digital Systems
1.3 Implementing Digital Systems: Programming Microprocessors versus Designing Digital Circuits
1.4 About this Book
1.5 Exercises

CHAPTER 2
Combinational Logic Design
2.1 Introduction
2.2 Switches
2.3 The CMOS Transistor
2.4 Boolean Logic Gates: Building Blocks for Digital Circuits
2.5 Boolean Algebra
2.6 Representations of Boolean Functions
2.7 Combinational Logic Design Process
2.8 More Gates
2.9 Decoders and Muxes
2.10 Additional Considerations
2.11 Combinational Logic Optimizations and Tradeoffs (See Section 6.2)
2.12 Combinational Logic Description using Hardware Description Languages (See Section 9.2)
2.13 Chapter Summary
2.14 Exercises

CHAPTER 3
Sequential Logic Design: Controllers
3.1 Introduction
3.2 Storing One Bit: Flip-Flops
3.3 Finite-State Machines (FSMs) and Controllers
3.4 Controller Design
3.5 More on Flip-Flops and Controllers
3.6 Sequential Logic Optimizations and Tradeoffs (See Section 6.3)
3.7 Sequential Logic Description using Hardware Description Languages (See Section 9.3)
3.8 Product Profile: Pacemaker
3.9 Chapter Summary
3.10 Exercises

CHAPTER 4
Datapath Components
4.1 Introduction
4.2 Registers
4.3 Adders
4.4 Shifters
4.5 Comparators
4.6 Counters
4.7 Multiplier: Array Style
4.8 Subtractors
4.9 Arithmetic-Logic Units: ALUs
4.10 Register Files
4.11 Datapath Component Tradeoffs (See Section 6.4)
4.12 Datapath Component Description using Hardware Description Languages (See Section 9.4)
4.13 Chapter Summary
4.14 Exercises

CHAPTER 5
Register-Transfer Level (RTL) Design
5.1 Introduction
5.2 RTL Design Method
5.3 RTL Design Examples and Issues
5.4 Determining Clock Frequency
5.5 Behavioral-Level Design: C to Gates (Optional)
5.6 Memory Components
5.7 Queues (FIFOs)
5.8 Hierarchy: A Key Design Concept
5.9 RTL Design Optimizations and Tradeoffs (See Section 6.5)
5.10 RTL Design using Hardware Description Languages (See Section 9.5)
5.11 Product Profile: Cell Phone
5.12 Chapter Summary
5.13 Exercises

CHAPTER 6
Optimizations and Tradeoffs
6.1 Introduction
6.2 Combinational Logic Optimizations and Tradeoffs
6.3 Sequential Logic Optimizations and Tradeoffs
6.4 Datapath Component Tradeoffs
6.5 RTL Design Optimizations and Tradeoffs
6.6 More on Optimizations and Tradeoffs
6.7 Product Profile: Digital Video Player/Recorder
6.8 Chapter Summary
6.9 Exercises

CHAPTER 7
Physical Implementation
7.1 Introduction
7.2 Manufactured IC Technologies
7.3 Programmable IC Technology: FPGA
7.4 Other Technologies
7.5 IC Technology Comparisons
7.6 Product Profile: Giant Video Display
7.7 Chapter Summary
7.8 Exercises

CHAPTER 8
Programmable Processors
8.1 Introduction
8.2 Basic Architecture
8.3 A Three-Instruction Programmable Processor
8.4 A Six-Instruction Programmable Processor
8.5 Example Assembly and Machine Programs
8.6 Further Extensions to the Programmable Processor
8.7 Chapter Summary
8.8 Exercises

CHAPTER 9
Hardware Description Languages
9.1 Introduction
9.2 Combinational Logic Description Using Hardware Description Languages
9.3 Sequential Logic Description Using Hardware Description Languages
9.4 Datapath Component Description Using Hardware Description Languages
9.5 RTL Design Using Hardware Description Languages
9.6 Chapter Summary
9.7 Exercises

APPENDIX A
Boolean Algebras
A.1 Boolean Algebra
A.2 Switching Algebra
A.3 Important Theorems in Boolean Algebra
A.4 Other Examples of Boolean Algebras
A.5 Further Readings

APPENDIX B
Additional Topics in Binary Number Systems
B.1 Introduction
B.2 Real Number Representation
B.3 Fixed Point Arithmetic
B.4 Floating Point Representation
B.5 Exercises

APPENDIX C
Extended RTL Design Example
C.1 Introduction
C.2 Designing the Soda Dispenser Controller
C.3 Understanding the Behavior of the Soda Dispenser Controller and Datapath

Index

1 Introduction

1.1 DIGITAL SYSTEMS IN THE WORLD AROUND US

Meet Arianna. Arianna is a five-year-old girl who lives in California. She's a cheerful, outgoing kid who loves to read, play soccer, dance, and tell jokes that she makes up herself.

One day, Arianna's family was driving home from a soccer game. She was in the middle of excitedly talking about the game when suddenly the van in which she was riding was clipped by a car that had crossed over to the wrong side of the highway. Although the accident wasn't particularly bad, the impact caused a loose item from the rear of the van to project forward inside the van, striking Arianna in the back of the head. She went unconscious.

Arianna was rushed to a hospital. Doctors immediately noticed that her breathing was very weak (a common situation after a severe blow to the head), so they put her onto a ventilator, which is a medical device that assists with breathing. She had sustained brain trauma during the blow to the head, and she remained unconscious for several weeks. All her vital signs were stable, except she continued to require breathing assistance from the ventilator. Patients in such a situation sometimes recover, and sometimes they don't. When they do recover, sometimes that recovery takes many months.
Thanks to the advent of modern portable ventilators, Arianna's parents were given the option of taking her home while they hoped for her recovery, an option they chose. In addition to the remote monitoring of vital signs and the daily at-home visits by a nurse and respiratory therapist, Arianna was surrounded by her parents, brother, sister, cousins, other family, and friends. For the majority of the day, someone was holding her hand, singing to her, whispering in her ear, and encouraging her to recover. Her sister slept nearby. Some studies show that such human interaction can indeed increase the chances of recovery.

And recover she did. One day, several months later, with Arianna's mom sitting at her side, Arianna opened her eyes. Later that day, she was transported back to the hospital. After some time, she was weaned from the ventilator. Then, after a lengthy time of recovery and rehabilitation, Arianna finally went home. Today, six-year-old Arianna shows few signs of the accident that nearly took her life.

What does this story have to do with digital design? Arianna's recovery was aided by a portable ventilator device, which in turn is possible thanks to digital circuits. Over the past three decades, the amount of digital circuitry that can be stored on a single computer chip has increased dramatically, by nearly 100,000 times, believe it or not. Thus, ventilators, along with almost everything else that runs on electricity, can take advantage of incredibly powerful and fast yet inexpensive digital circuits. The ventilator in Arianna's case was the Pulmonetics LTV 1000 ventilator. Whereas a ventilator of the early 1990s might have been the size of a large copy machine and cost perhaps $100,000, the LTV 1000 is not much bigger or heavier than this textbook and costs only a few thousand dollars: small enough, and inexpensive enough, to be carried in medical rescue helicopters and ambulances for life-saving situations, and even to be sent home with a patient. The digital circuits inside continually monitor the patient's breathing, and provide just the right amount of

Portable ventilators help not only trauma victims, but even more commonly help patients with debilitating diseases, like multiple sclerosis, to gain mobility. Such people can today move about in a wheelchair, and hence do things like attend school, visit museums, and take part in a family picnic, experiencing a far better quality of life than was feasible just a decade ago when those people would have been confined to a bed connected to a large, heavy, expensive ventilator. For example, the young girl pictured on the left will likely require a ventilator for the rest of her life, but she will be able to move about in her wheelchair quite freely, rather than being mostly confined to her home.

Photo courtesy of Pulmonetics

The LTV 1000 ventilator described above was conceived and designed by a small group of people, pictured on the left, who sought to build a portable and reliable ventilator in order to help people like Arianna and thousands of others like her (as well as to make some good money doing so!). Those designers probably started off like you, reading textbooks and taking courses on digital design, programming, electronics, and/or other subjects.

Photo courtesy of Pulmonetics

The ventilator is just one of literally thousands of useful devices that have come about and continue to be created thanks to the era of digital circuits. If you stop and think about how many devices in the world around you rely on or are made possible because of digital circuits, you may be quite surprised. A few such devices include:

Antilock brakes, airbags, autofocus cameras, automatic teller machines, aircraft controllers and navigators, camcorders, cash registers, cell phones, computer networks, credit card readers, cruise controllers, defibrillators, digital cameras, DVD players, electric card readers, electronic games, electronic pianos, fax machines, fingerprint identifiers, hearing aids, home security systems, modems, pacemakers, pagers, personal computers, personal digital assistants, photocopiers, portable music players, robotic arms, scanners, televisions, thermostat controllers, TV set-top boxes, ventilators, video game consoles; the list goes on.

Margin note: An indicator of the rate that new inventions are developed is the number of new

Those devices were created by tens of thousands of designers, including computer scientists, computer engineers, electrical engineers, mechanical engineers, and others, working together with scientists, doctors, business people, teachers, etc. One thing that seems clear is that new devices will continue to be invented for the foreseeable future-
air preSSure and volume to Ihe palient. Evel), breath thai ptllelltS gran/ct/-
devices that in another decade will be hundred of times smaller. cheaper. and m re po\\_
170.000 per yellr
the deVice deli vers requ ires 1II;/I;OIlS of compulations for i" the U.S. (llolle! erfu l than today's devi ces. enabling new applications that toda~ \\e don't e\en dream of.
proper delivery, computat ions carri ed ou t by the digital Already. we are seeing amazing new applications that seem futurisric e\en though tbe~
CirCUitS inside.
exisr today. like tiny digital -circuit-controlled medicine tii"pem,ers implant~ under the
g skin. voice-conrrolled ce ll phones and applian es. roboric self-guiding hou, h,'lli \ J uurn
cleaners. laser-guided automobile cruise control. and m reoWhat', not c1e.lf b \\h:u n \\
and exc iting applicat ions will be devel ped in the future. or \\ ho those dey i' S \\ ilIl:>en-
Portable velllilator elit. Future designers. li ke YOllrselr perhaps. \\ ill h Ip dl'tennine th;}t.

1.2 THE WORLD OF DIGITAL SYSTEMS

Digital versus Analog


A digital signal is a signal that at any time can have one of a finite set of possible values, and is also known as a discrete signal. In contrast, an analog signal can have one of an infinite number of possible values, and is also known as a continuous signal. A signal is just some physical phenomenon that has a unique value at every instant of time. An everyday example of an analog signal is the temperature outside, because physical temperature is a continuous value: the temperature may be 92.356666... degrees. An everyday example of a digital signal is the number of fingers you hold up, because the value must be either 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, a finite set of values. In fact, the term "digital" comes from the Latin word for "digit" (digitus), meaning finger.

In computing systems, the most common digital signals are those that can have one of only two possible values, like on or off (often represented as 1 or 0). Such a two-valued representation is known as a binary representation. A digital system is a system that takes digital inputs and generates digital outputs. A digital circuit is a connection of digital components that together comprise a digital system. In this textbook, the term digital will refer to systems with binary-valued signals. A single binary signal is known as a binary digit, or bit for short. Digital electronics became extremely popular in the mid-1900s after the invention of the transistor, an electric switch that can be turned on or off using another electric signal. We'll describe transistors further in the next chapter.

Digital circuits are the basis for computers
The most well-known use of digital circuits in the world around us is probably to build the microprocessors that serve as the brain of general-purpose computers, like the personal computer or laptop computer that you might have at home. General-purpose computers are also used as servers, which operate behind the scenes to implement banking, airline reservation, web search, payroll, and similar such systems. General-purpose computers take digital input data, such as letters and numbers received from files or keyboards, and output new digital data, such as new letters and numbers stored in files or displayed on a monitor. Learning about digital design is therefore useful in understanding how computers work "under the hood," and hence has been required learning for most computing and electrical engineering majors for decades. Based on material in upcoming chapters, we'll design a simple computer in Chapter 8.

[Photo: A general-purpose computer]

Digital circuits are the basis for much more
Increasingly, digital circuits are being used for much more than implementing general-purpose computers. More and more new applications convert analog signals to digital ones, and run those digital signals through customized digital circuits, to achieve numerous benefits. Such applications include cell phones, automobile engine controllers, TV set-top boxes, music instruments, digital cameras and camcorders, video game consoles, and so on. Digital circuits found inside applications other than general-purpose computers are often called embedded systems, because those digital systems are embedded inside another electronic device.

[Photo: Embedded systems]

The world is mostly analog, and therefore many applications were previously implemented with analog circuits. But many implementations have changed or are changing over to digital implementations. To understand why, we might first notice that although the world is mostly analog, humans often obtain advantages when converting analog signals to digital signals before "processing" that information. For example, a car horn is actually an analog signal: the volume can take on infinite possible values, and the volume varies over time due to variations in the battery strength, temperature, etc. But we humans neglect those variations, and we "digitize" the sound we hear into one of two values: the car horn is "off," or the car horn is "on" (get out of the way!).

Converting analog phenomena to digital, for use with digital circuits, can also yield many advantages. Let's illustrate this point by considering one example, audio recording, in some detail. Audio is clearly an analog signal, with infinite possible frequencies and volumes. Consider recording an audio signal, like music, through a microphone, so that the music can later be played over speakers in an electronic stereo system. One type of microphone, a dynamic microphone, works based on a principle of electromagnetism: moving a magnet near a wire causes changing current (and hence voltage) in a nearby wire. The more the magnet moves, the higher the voltage on the wire. So a microphone has a small membrane attached to a magnet near a wire; when sound hits the membrane, the magnet moves, causing current in the wire. Likewise, a speaker works on the same principle in reverse: a changing current in a wire will cause a nearby magnet to move, which if attached to a membrane will create sound. (If you get a chance, open up an old speaker; you'll find a strong magnet inside.) If the microphone is attached directly to the speaker (through an amplifier that strengthens the microphone's output current), then no digitization is required. But what if we want to save the sound on some sort of media, so we can record a song now and play the song back later? We can record sound using analog methods or digital methods, but digital methods have many advantages.

[Figure: sound waves move the membrane, which moves the magnet, which creates current in the nearby wire.]

One advantage of digital methods is lack of deterioration in quality over time. When I was growing up, the audio cassette tape, an analog method, was the most common method for recording songs. Audio tape contains huge numbers of magnetic particles that can be moved to particular orientations using a magnet and that hold that orientation even after the magnet is removed. Thus, using magnetism, we could change the tape's surface, some parts up, some higher, some down, etc. This is similar to how you can spike your hair, some up, some sideways, some down, using hair gel. The possible orientations of the tape's particles, and your hair, are infinite, so the tape is definitely analog. To record onto a tape, we pass the tape under a "head" that generates a magnetic field based on the electric current over the wire coming from a microphone. The tape's particles would thus be moved to particular orientations. To play a recorded song back, we would pass the tape under the head again, but this time the head operates in reverse, generating current over a wire based on the changing magnetic field of the tape, and that current then gets amplified and sent to the speakers.
Figure 1.1 Converting an analog signal to a digital signal (top), and vice versa (bottom). Notice some quality loss in the reproduced signal.

A problem with audio tape is that the orientations of the particles on the tape's surface change over time, just like a spiked hairdo in the morning eventually flattens out throughout the day. Thus, audio tape quality deteriorates over time. Such deterioration is a problem with many analog systems.

Digitizing the audio can reduce such deterioration. Digitized audio works as shown in Figure 1.1. The figure shows an analog signal on a wire during a period of time. We sample that signal at particular time intervals, shown by the dashed lines. Assuming the analog signal can range from 0 Volts to 3 Volts, and that we plan to store each sample using two bits, then we must round each sample to the nearest Volt (0, 1, 2, or 3), shown as points in the figure. We can store 0 Volts as the two bits 00, 1 Volt as the two bits 01, 2 Volts as the two bits 10, and 3 Volts as the two bits 11. Thus, we would convert the shown analog signal into the following digital signal: 0001101011111101101000.

To record this digital signal, we just need to store 0s and 1s on the recording media. We could use regular audio tape, using a short beep to represent a 1 and no beep to represent a 0, for example. While the audio signal on the tape will deteriorate over time, we can still certainly tell the difference between a beep and no beep, just like we can tell the difference between a car horn being on or off. A slightly quieter beep is still a beep. You've likely heard digitized data communicated using a manner similar to such beeps when you've picked up a phone being used by a computer modem or a fax machine. Even better than audio tape, we can record the digital signal using a media specifically designed to store 0s and 1s. For example, the surface of a CD (compact disk) can be configured to either reflect a laser beam to a sensor strongly or weakly, thus storing 1s and 0s easily. Likewise, computer hard disks use magnetic particle orientation to store 0s and 1s, making such disks similar to audio tape, but enabling faster access to random parts of the disk since the head can move sideways across the top of the spinning disk.

To play back this digitized audio signal, we can simply convert the digital value of each sampling period to an analog signal, as shown at the bottom of Figure 1.1. Notice that the reproduced signal is not an exact replica of the original analog signal. However, the faster we sample the analog signal and the more bits we use for each sample, the closer the reproduced analog signal derived from the digitized signal will be to the original analog signal. At some point, humans can't notice the difference between a pure audio signal and one that has been digitized and then converted back to analog.

Another advantage of digitized audio is compression. Suppose that we'll be storing each sample with ten bits, instead of two bits like above, to achieve much better quality due to less rounding. But that's a lot more bits for the same audio: the signal in Figure 1.1 has eleven samples, and at ten bits per sample, that yields one hundred ten bits to store the audio. If we sample hundreds or thousands of times a second, we end up with huge numbers of bits. Suppose, though, that a particular audio recording has many samples that have the value 0000000000 and the value 1111111111. We could compress the digital file by using the following trick: if the first bit of a sample is 0, the next bit being 0 means the sample is actually supposed to be expanded to 0000000000; the next bit being 1 means the sample is 1111111111. So 00 is shorthand for 0000000000, and 01 is shorthand for 1111111111. If the first bit of a sample is 1, then the next ten bits represent the actual sample. So the digitized signal "0000000000 0000000000 0000001111 1111111111" would be compressed to "00 00 10000001111 01." The receiver, which must know the compression scheme, would decompress that signal into the original digitized signal. There are many other tricks that can be used to compress digitized audio. Perhaps the most widely known audio compression scheme is known as MP3, which is popular for compressing digitized songs. A typical song might require many tens of megabytes uncompressed, but compressed usually only requires about 3 or 4 megabytes. An audio CD can store about 20 songs uncompressed, but about 200 songs compressed. Thanks to compression (combined with higher-capacity disks), today's portable music players can store thousands of songs, a capability undreamt of by most people in the 1990s.

Digitized audio is widely used not only in music recording, but also in voice communications. For example, digital cellular telephones digitize your voice and then compress the digital signal before transmitting the signal, enabling far more cell phones to operate in a particular region than possible using analog cell phones.

Figure 1.2 More and more analog products are becoming primarily digital. (Timeline from 1995 to 2007 showing portable music players, satellites, cell phones, DVD players, video recorders, cameras, musical instruments, TVs, and "???".)
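The sampling and rounding step described for Figure 1.1 can be sketched in a few lines of Python. This is only an illustration, not the book's circuit: the function names and the sample voltages are made up here, with the voltages chosen so that they round to the eleven levels shown in the figure.

```python
def quantize(voltage, bits=2, v_max=3.0):
    """Round a voltage in [0, v_max] to the nearest level; return it as a bit string."""
    levels = 2 ** bits - 1                        # 3 steps for two bits: 0, 1, 2, 3 Volts
    clamped = min(max(voltage, 0.0), v_max)       # keep the input inside the allowed range
    level = int(clamped / v_max * levels + 0.5)   # round to the nearest level
    return format(level, "0{}b".format(bits))


def digitize(samples, bits=2, v_max=3.0):
    """Concatenate the encodings of all samples into one digital signal."""
    return "".join(quantize(v, bits, v_max) for v in samples)


def reproduce(bitstring, bits=2, v_max=3.0):
    """Convert each group of bits back to a voltage (the playback direction)."""
    levels = 2 ** bits - 1
    return [int(bitstring[i:i + bits], 2) * v_max / levels
            for i in range(0, len(bitstring), bits)]


# Hypothetical sample voltages (assumed values, not read off the actual figure).
samples = [0.2, 1.1, 1.9, 2.3, 2.8, 3.0, 2.6, 1.4, 1.8, 2.2, 0.3]

digital = digitize(samples)
print(digital)             # 0001101011111101101000
print(reproduce(digital))  # rounded voltages: close to, but not equal to, the samples
```

Raising `bits` shrinks the gap between `samples` and the reproduced values, which is the quality trade-off the text describes.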
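The compression trick described above for ten-bit samples (00 for an all-zeros sample, 01 for an all-ones sample, and a leading 1 followed by the raw bits otherwise) can be sketched as follows. The function names are illustrative, and the sketch assumes every sample is a string of exactly `width` bits.

```python
def compress(samples, width=10):
    """Compress a list of fixed-width bit strings using the two-shorthand scheme."""
    out = []
    for s in samples:
        if s == "0" * width:
            out.append("00")       # shorthand for an all-zeros sample
        elif s == "1" * width:
            out.append("01")       # shorthand for an all-ones sample
        else:
            out.append("1" + s)    # flag bit 1, then the actual sample
    return out


def decompress(bits, width=10):
    """Expand a compressed bit stream back into the original samples."""
    samples = []
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            # The next bit selects all zeros (0) or all ones (1).
            samples.append(bits[i + 1] * width)
            i += 2
        else:
            # A leading 1 means the next `width` bits are the sample itself.
            samples.append(bits[i + 1:i + 1 + width])
            i += 1 + width
    return samples


original = ["0000000000", "0000000000", "0000001111", "1111111111"]
packed = compress(original)
print(" ".join(packed))                         # 00 00 10000001111 01
print(decompress("".join(packed)) == original)  # True
```

In this example 40 bits shrink to 17. The scheme only helps when the two shorthand values are common, since every other sample grows by one bit.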
Pictures and video can be digitized in a manner similar to that described for audio. Digital cameras, for example, store pictures in highly-compressed digital form, and digital video recorders store video onto tapes or disks in compressed form too.

Digitized audio, pictures, and video are just a few of the hundreds of new and future applications that benefit from digitization of analog phenomena. As shown in Figure 1.2, over the past decade, numerous popular products previously based on analog technology have converted primarily to digital technology. Portable music players, for example, switched from cassette tapes to CDs in the middle 1990s, and recently to MP3s and other digital formats. Early cell phones used analog communication, but in the late 1990s digital communication, similar in idea to that shown in Figure 1.1, became dominant. In the early 2000s, analog VHS video players gave way to digital DVD players. Video recorders have begun to digitize video before storing the video onto tape, while cameras have eliminated film entirely and instead store photos using digital cards. Musical instruments are increasingly digital-based, with electronic drums and keyboards increasing in popularity, and electric guitars with digital processing appearing recently. Analog TV is also giving way to digital TV. Hundreds of other devices have converted from analog to digital in past decades, such as clocks and watches, household thermostats, human temperature thermometers (which now work in the ear rather than under the tongue or other places), car engine controllers, gasoline pumps, hearing aids, and so on.

Many devices were never analog, instead being introduced in digital form from the very start. For example, video games have been digital since their inception.

Digitization requires that we encode things into 1s and 0s. Computations using digital circuits require that we represent numbers using 1s and 0s. We introduce these aspects of digital circuits now.

THE TELEPHONE.
The telephone, patented by Alexander Graham Bell in the late 1800s (though invented by Antonio Meucci), operates using the electromagnetic principle described earlier: your speech creates sound waves that move a membrane, which moves a magnet, which creates current on a nearby wire. Run that wire to somewhere far away, put a magnet connected to a membrane near that wire, and the membrane will move, producing sound waves that sound like you talking. Much of the telephone system today digitizes the audio to improve quality and quantity of audio transmissions over long distances. A couple of interesting facts about the telephone:
Believe it or not, Western Union actually turned down Bell's initial proposal to develop the telephone, perhaps thinking that the then-popular telegraph was all people needed.
Bell and his assistant Watson disagreed on how to answer the phone; Watson wanted "Hello," which won, but Bell wanted "Hoy hoy" instead. (Fans of the TV show "The Simpsons" may have noticed that Homer's boss, Mr. Burns, answers the phone with "hoy hoy.")
(Source of some of the above material: www.pbs.org, transcript of "The Telephone").
[Photo: An early-style telephone.]

Digital Encodings and Binary Numbers
The previous section showed an example of a digital system, which involved digitizing an audio signal into bits, which we could then process using a digital circuit to achieve several benefits. Those bits encoded the data of interest. Encoding data into bits is a central task in digital systems. Some of the data we want to process may already be in digital form, while other data may be in analog form (e.g., audio, video, temperature) and thus require conversion to digital data first, as illustrated at the top of Figure 1.3. A digital system takes digital data as input, and produces digital data as output.

Figure 1.3 A typical digital system. (In the figure, a sensor output encoded as 00100001 is read as "it's 33 degrees.")

Encoding analog phenomena
Any analog phenomenon can be digitized, and hence countless applications have evolved and continue to evolve that digitize analog phenomena. Automobiles digitize information about the engine temperature, car speed, fuel level, etc., so that an on-chip computer can monitor and control the vehicle. The ventilator we introduced earlier digitizes the measure of the air flow into the patient, so that a computer can make calculations on how much additional flow to provide. And so on. Digitizing analog phenomena requires:

A sensor that measures the analog physical phenomena and converts the measured value to an analog electrical signal. One example is the microphone (which measures sound) in Figure 1.1. Other common examples include video capture devices (which measure light), thermometers (which measure temperature), and speedometers (which measure speed).

An analog-to-digital converter that converts the electrical signal into binary encodings. The converter must sample (measure) the electrical signal at a particular rate and convert each sample to some value of bits. Such a converter was featured in Figure 1.1, and shown as the A2D component in Figure 1.3.

Likewise, a digital-to-analog converter (shown as D2A in Figure 1.3) converts bits back to an electrical signal, and an actuator converts that electrical signal back to physical phenomena. Sensors and actuators together represent types of devices known as transducers: devices that convert one form of energy to another.

In many examples throughout this book, we will utilize idealized sensors that themselves directly output digitized data. For example, we might utilize a temperature sensor that reads the present temperature and sets its 8-bit output to an encoding that represents the temperature as a binary number (see next sections for binary number encodings).

Encoding digital phenomena
Other phenomena are inherently digital. Such phenomena can only assume one value from a finite set of values.

Some digital phenomena can assume only one of two possible values, and thus can be straightforwardly encoded as a single bit. For example, the following types of sensors may output an electrical signal that assumes one of two values:

Motion sensor: outputs a positive voltage (say +3 V) when motion is sensed, 0 volts when no motion is sensed.
Light sensor: outputs a positive voltage when light is sensed, 0 V when dark.
Button (sensor): outputs a positive voltage when the button is pressed, 0 V when not pressed.

We can straightforwardly encode each sensor's output to a bit, with 1 representing the positive voltage and 0 representing 0 V. In examples throughout this book we will utilize idealized sensors that directly output the encoded bit value.

Other digital phenomena can assume several possible values. For example, a keypad may have four buttons, colored red, blue, green, and black. A designer might create a circuit such that when red is pressed, a three-bit output has the value 001; blue might output 010, green 011, and black 100. If no button is pressed, the output might be 000. Figure 1.4 illustrates such a keypad.

Figure 1.4 Keypad encodings.

An even more general digital phenomenon is the English alphabet. Each character comes from a finite set of characters, so typing on a keyboard results in digital, not analog, data. We can convert the digital data to bits by assigning a bit encoding to each character. A popular encoding of English characters is known as ASCII (standing for American Standard Code for Information Interchange), which encodes each character into seven bits. For example, the ASCII encoding for the uppercase letter 'A' is "1000001," and for 'B' is "1000010." A lowercase 'a' is "1100001," and 'b' is "1100010." Thus, the name "ABBA" would be encoded as "1000001 1000010 1000010 1000001." ASCII defines 7-bit encodings for all 26 letters (upper and lowercase), the numerical symbols 0 through 9, punctuation marks, and even a number of encodings for nonprintable "control" operations. There are 128 encodings total in ASCII. A subset of ASCII encodings is shown in Figure 1.5. Another encoding, Unicode, is increasing in popularity due to its support of international languages. Unicode uses 16 bits per character, instead of just the 7 bits used in ASCII, and represents characters from a diversity of languages in the world.

Symbol   Encoding    Symbol    Encoding
R        1010010     r         1110010
S        1010011     s         1110011
T        1010100     t         1110100
L        1001100     l         1101100
N        1001110     n         1101110
E        1000101     e         1100101
0        0110000     9         0111001
.        0101110     !         0100001
<tab>    0001001     <space>   0100000

Figure 1.5 Sample ASCII encodings.

[Margin note: The web search engine Google's name comes from the word "googol," a 1 followed by 100 zeroes, apparently implying that the engine can search a lot of information.]

Encoding numbers
Perhaps the most important use of digital circuits is to perform arithmetic computations. In fact, a key driver of early digital computer design was the arithmetic computations of ballistic trajectories in World War II. To perform arithmetic computations, we need a way to represent numbers using binary digits: we need binary numbers.

WHY BASE TEN?
Humans have ten fingers, so they chose a numbering system where each digit can represent ten possible values. There's nothing magical about base ten. If humans had nine fingers, we'd probably use a base nine numbering system. It turns out that base twelve was used somewhat in the past too, because by using our thumb, we can easily point to twelve different spots on the remaining four fingers on that thumb's hand: the four tops of those fingers, the four middle parts of those fingers, and the four bottoms of those fingers. That's likely why the number twelve is common in human counting today, like the use of the term "dozen," and the twelve hours of a clock.
(Source: "Ideas and Information," Arno Penzias, W.W. Norton and Company).

To understand binary numbers, we might first ensure that we understand decimal numbers. Decimal numbers use a base ten numbering system. The basic definition of base ten is a numbering system where the rightmost digit represents the number of ones ($10^0$) we have, the next digit represents the number of groups of tens ($10^1$) we have, the next digit represents the number of groups of ten tens ($10^2$) we have, and so on, as illustrated in Figure 1.6. So the digits "523" in base 10 represent $5 \times 10^2 + 2 \times 10^1 + 3 \times 10^0$.

Figure 1.6 Base ten number system.

Because humans have ten fingers, they developed and used a base ten numbering system. They came up with symbols to represent quantities ranging from no fingers (0) to all the fingers but one (9); these are called "ones" rather than "fingers" though, because we aren't always counting fingers. To represent a larger quantity than nine ones, humans introduced another digit to represent the number of groups of all the fingers, called "ten." Note that we don't need a unique symbol for the quantity ten itself, since that quantity can be represented as one group of ten and no ones. To represent more than nine tens, humans introduced yet another digit, to represent the number of groups of ten tens, which are called "hundreds." To represent ten hundreds, they introduced another digit, called "thousands." English (as spoken in America) doesn't have a name for a group representing ten thousands, nor for the group representing ten ten thousands, which is referred to as hundred thousands. The next group is called millions, and further groups that are multiples of one thousand have names too (billions, trillions, quadrillions, etc.).

Now that we better understand base ten numbers, we can introduce base two numbers, known as binary numbers. Since digital circuits work with values that are either "on" or "off," such circuits need only two symbols, rather than ten symbols. Let those two symbols be 0 and 1. If we need to represent a quantity more than 1, we'll use another digit, which will represent the number of groups of $2^1$, which we'll call two. So "10" in base two represents 1 two and 0 ones. Be careful not to call "10" ten; instead, you might say "one-two." If we need a bigger quantity, we'll use another digit, which will represent the number of groups of $2^2$, which we'll call four. The weights of each digit in base two are shown in Figure 1.7.

Figure 1.7 Base two number system.

[Margin note: I saw the following on a T-shirt, and found it rather funny: "There are 10 types of people in the world: those who get binary, and those who don't."]

For example, the number 101 in base two equals $1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$, or 5, in base ten. In other words, 101 can be spoken as "one-four zero-two one-one." Most people comfortable with binary might instead just say "one zero one." To be very clear, you might say "one zero one, base two." But you should definitely NOT say "one-hundred one, base two." 101 is one-hundred one in base ten, but the leftmost 1 does not represent one-hundred in base two.
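The weighted-digit definition used above for both base ten and base two can be written as one short Python function. This is a minimal sketch: the name `positional_value` is illustrative, and it assumes the digits are the characters 0 through 9, so it only covers bases up to ten.

```python
def positional_value(digits, base):
    """Sum digit * weight, where weights are 1, base, base**2, ... from the right."""
    total = 0
    weight = 1                      # the rightmost place has weight base**0 = 1
    for ch in reversed(digits):
        total += int(ch) * weight   # this digit's contribution
        weight *= base              # the next place to the left weighs `base` times more
    return total


print(positional_value("523", 10))  # 523  (5*100 + 2*10 + 3*1)
print(positional_value("101", 2))   # 5    (1*4 + 0*2 + 1*1)
print(positional_value("10", 2))    # 2    ("one-two", not ten)
```

The same loop handles both cases because only the weight multiplier changes between number systems.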
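The bit encodings described earlier under "Encoding digital phenomena" can be checked with a small Python sketch. The 7-bit ASCII values come from Python's built-in `ord`; the dictionary name `KEYPAD_CODES` and the function name `ascii_bits` are illustrative, and the keypad values are simply the ones the text assigns to the four colored buttons.

```python
# Three-bit keypad encoding from the text; None stands for "no button pressed".
KEYPAD_CODES = {
    None: "000",
    "red": "001",
    "blue": "010",
    "green": "011",
    "black": "100",
}


def ascii_bits(text):
    """Encode each character as its 7-bit ASCII code, separated by spaces."""
    codes = []
    for ch in text:
        value = ord(ch)
        if value > 127:
            raise ValueError("not an ASCII character: {!r}".format(ch))
        codes.append(format(value, "07b"))   # pad to seven binary digits
    return " ".join(codes)


print(ascii_bits("ABBA"))     # 1000001 1000010 1000010 1000001
print(KEYPAD_CODES["green"])  # 011
```

Any of the rows in Figure 1.5 can be checked the same way, for example `ascii_bits("R")`.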

COUNTING "CORRECTLY" IN BASE TEN.
The fact that there are names for some of the groups in base ten, but not others, prevents many people from gaining an intuitive understanding of base ten. Further adding to the confusion are the abbreviated names for groups of tens: the numbers 10, 20, 30, ..., 90 should be called one ten, two ten, three ten, ..., nine ten, but instead use abbreviated names: one ten as just "ten," two ten as "twenty," three ten as "thirty," ..., and nine ten as "ninety." You can see how "ninety" is a shortening of "nine ten." Furthermore, short names are also used for the numbers between 10 and 20. 11 should be "one ten one," but is instead "eleven," while 19 should be "one ten nine" but is instead "nineteen." Table 1.1 indicates how to count "correctly" in base ten (where I boldly define "correctly" as counting the way I think makes more sense). Thus, the number 523 would be spoken as "five-hundred two-ten three" rather than "five-hundred twenty-three." I believe that kids have a harder time learning math because of the confusing number naming. For example, carrying a one from the ones column to the tens column makes more sense if the ones column adds to one ten seven rather than to "seventeen": the result obviously adds a one to the tens column. Learning binary is slightly harder for some students due to a lack of a solid understanding of base 10, caused largely by the naming confusion. Perhaps, when a store clerk tells you "that will be ninety-nine cents," you can correct him by saying "you mean nine ten nine cents." If enough of us do this, perhaps it will catch on.

TABLE 1.1 Counting "correctly" in base ten.
0 to 9        As usual: "zero," "one," "two," etc.
10 to 99      10, 11, 12, ..., 19: "one ten," "one ten one," "one ten two," ... "one ten nine"
              20, 21, 22, ..., 29: "two ten," "two ten one," "two ten two," ... "two ten nine"
              30, 40, ..., 90: "three ten," "four ten," ... "nine ten"
100 to 900    As usual: "one hundred," "two hundred," ... "nine hundred." Even better would be to replace the word "hundred" by "ten to the power of 2."
1000 and up   As usual. Even better: replace "thousand" by "ten to the power of 3," "ten thousand" by "ten to the power of 4," etc., eliminating all the names.

When we are writing numbers of different bases and the base of the number is not obvious, we indicate the base with a subscript, as follows: $101_2 = 5_{10}$. We might say this

When converting from binary to decimal, people often find it useful to be comfortable knowing the powers of two, since each successive place to the left in a binary number is two times the previous place. In binary, the first, rightmost place is 1, the second place is 2, then 4, then 8, 16, 32, 64, 128, 256, 512, 1024, 2048, and so on. You might stop at this point to practice counting up by powers of two: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, etc., a few times. Now, when you see the number 10000111, you might move along the number from right to left and count up by powers of two for each bit to determine the weight of the leftmost bit: 1, 2, 4, 8, 16, 32, 64, 128. The next highest 1 has a weight of (counting up again) 1, 2, 4; adding 4 to 128 gives 132. The next 1 has a weight of 2; adding that to 132 gives 134. The rightmost 1 has a weight of 1; adding that to 134 gives 135. Thus, 10000111 equals 135 in base ten.

[Margin note: Knowing powers of two helps in learning binary: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, ...]

EXAMPLE 1.2 Counting in binary
Counting from 0 to 7 in binary looks as follows: 000, 001, 010, 011, 100, 101, 110, 111.

An interesting fact about binary numbers: you can quickly determine whether a binary number is odd just by checking if the rightmost digit has a 1. If the rightmost digit is a 0, the number must be even, since the number is then a sum of even numbers.

Converting between decimal and binary numbers using the subtraction method
As we saw earlier, converting a binary number to decimal is easy: we just add the weights of each digit having a 1. Converting a decimal number to binary takes slightly more effort. One method for converting a decimal number to a binary number that is easy for humans to carry out by hand, which we'll call the subtraction method, is shown in Table 1.2. The method starts with a binary number that is all 0s.

TABLE 1.2 Subtraction method for converting a decimal number to a binary number.
Step                           Description
1. Put 1 in highest place      Put a 1 in the highest binary place whose weight is less than or equal to the decimal number.
as "one zero one in base two equals five in ba e ten." ,
Updale Updale the decimal number by Subtntcling the highesl binary place's \\ eight from
o Note that since bi nary isn' t as popular as decimal. people haven I created short
N
0.
decimal the decimal number. The new decimal number is lhe remaining quanti£)' to be
names for its groups of 21. 22, and so on. like they have for groups in base ten (hundreds, " number
c:;;
16 8 2 converted 10 binary. If Ihe updaled deci mal number is nOI zero. return 10 step I.
Lhousands. millions, etc.). Instead . people just use the equivalent base len name for the
Figure 1.8 Basc two group--a sou rce of some confusion to people just learning binary. Nevertheless, tt may
num ber "'y~ l e l11 .
For example. we can convert the decimal number 12 as shown in Figure 1.9.
sLil1 be eas ier to think of each group in base two uSlllg base 10 names, rather than
increasing powers of two, as show n in Fig ure 1.8.
Decimal Binary
EXAMPLE 1.1 1. Put 1 in highest place 12 )( 1 0 0 0 (current value
Binary to decimal
Try place 16. too big (16)12)
Next place. 8. is highest (8<12)
168421 is 8)
Convertlhe following binary numbers to deci mal numbers: 1. 11 O. 10000. 10000 Ill. and 001 10. -8
2. Update decimal number
0 4
12 is jusl 1*2 . or I/ o. . . Decimal not zero. return to Step 1
110, is 1*2 2 + 1' 2 + 0*20. or 6 10, We mighl lhink of Ihis using the group wetghls shown In
Figure 1.8: 1' 4 + 1*2 + 0*1. or 6. 1. Put 1 highest place 1 1 0 0 (cumm' value
10000, is 1' 16 + 0*8 + 0' 4 + 0' 2 + 0' 1. or 1610, Next place. 4. is highest (4=4)
2. Update decimal number -4 168421 IS 12)
looooi 1h is 1' 128 + 1' 4 + 1*2 + 1' 1 = 135 10, Not ice Ihis lil11e Ihat we didn ' l bother to Decimal number is zero. done.
-0-
write O~I th e groups with a 0 bit.
001 102 is Ihe sal11e as 11 02 above - the leading O's don'l change Ihe value. Figure 1.9 Converting Ihe decimal number 12 10 binary usi ng the ubtntclIon "lethO<l
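The two conversions described so far can be sketched in C. This is only an illustrative sketch: the function names binary_to_decimal and subtraction_method, and the choice to pass binary numbers around as character strings, are assumptions made here and are not part of the text. The second function follows Table 1.2, except that it also writes a 0 for every place it skips so that the result comes out as a complete digit string.

```c
#include <assert.h>
#include <string.h>

/* Binary to decimal: walk the digits right to left, doubling the place
   weight (1, 2, 4, 8, ...) and adding the weight of each place holding a 1.
   (Hypothetical helper; bits is a string such as "10000111".) */
unsigned binary_to_decimal(const char *bits)
{
    assert(bits != NULL);
    unsigned value = 0, weight = 1;
    for (size_t i = strlen(bits); i > 0; i--) {
        if (bits[i - 1] == '1') {
            value += weight;
        }
        weight *= 2;
    }
    return value;
}

/* Subtraction method of Table 1.2 (hypothetical helper). Writes the binary
   digits of n into out, which must have room for 33 characters. */
void subtraction_method(unsigned n, char *out)
{
    assert(out != NULL);
    unsigned weight = 1;
    int len = 0;

    /* Find the highest place whose weight is less than or equal to n. */
    while (weight <= n / 2) {
        weight *= 2;
    }

    /* Visit every place from that one down to the 1's place. */
    for (; weight > 0; weight /= 2) {
        if (weight <= n) {
            out[len++] = '1';   /* step 1: put a 1 in this place */
            n -= weight;        /* step 2: update the decimal number */
        } else {
            out[len++] = '0';   /* place is too big for what remains */
        }
    }
    out[len] = '\0';
}
```

With a 33-character buffer buf, calling subtraction_method(12, buf) leaves "1100" in buf, and binary_to_decimal("1100") returns 12, which is the same round trip that is carried out by hand.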
We can check our work by converting 1100 back to decimal: 1*8 + 1*4 + 0*2 + 0*1 = 12.

As another example, Figure 1.10 illustrates the subtraction method for converting the decimal number 23 to binary. We can check our work by converting the result, 10111, back to decimal: 1*16 + 0*8 + 1*4 + 1*2 + 1*1 = 23.

1. Put 1 in highest place. Place 32 too big, but 16 works. Decimal: 23. Binary (places 16 8 4 2 1): 1 0 0 0 0 (current value is 16).
2. Update decimal number: 23 - 16 = 7. Decimal not zero, return to Step 1.
1. Put 1 in highest place. Next place is 8, too big (8>7). 4 works (4<7). Binary: 1 0 1 0 0 (current value is 20).
2. Update decimal number: 7 - 4 = 3. Decimal number not zero, return to Step 1.
1. Put 1 in highest place. Next place is 2, works (2<3). Binary: 1 0 1 1 0 (current value is 22).
2. Update decimal number: 3 - 2 = 1. Decimal not zero, return to Step 1.
1. Put 1 in highest place. Next place is 1, works (1=1). Binary: 1 0 1 1 1 (current value is 23).
2. Update decimal number: 1 - 1 = 0. Decimal number is zero, done.
Figure 1.10 Converting the decimal number 23 to binary using the subtraction method.

EXAMPLE 1.3 Decimal to binary
Convert the following decimal numbers to binary using the subtraction method: 8, 14, 99.
To convert 8 to binary, we start by putting a 1 in the 8's place, yielding 1000. Since 8 - 8 = 0, we are done: the answer is 1000.
To convert 14 to binary, we start by putting a 1 in the 8's place (16 is too much), yielding 1000. 14 - 8 = 6, so we put a 1 in the 4's place, yielding 1100. 6 - 4 = 2, so we put a 1 in the 2's place, yielding 1110. 2 - 2 = 0, so we are done: the answer is 1110. We can quickly check our work by converting back to decimal: 8 + 4 + 2 = 14.
To convert 99 to binary, we start by putting a 1 in the 64's place (the next higher place, 128, is too big; notice that being able to count by powers of two is handy in this problem), yielding 1000000. 99 - 64 is 35, so we put a 1 in the 32's place, yielding 1100000. 35 - 32 is 3, so we put a 1 in the 2's place, yielding 1100010. 3 - 2 is 1, so we put a 1 in the 1's place, yielding the final answer of 1100011. We can check our work by converting back to decimal: 64 + 32 + 2 + 1 = 99.

Converting between decimal and binary numbers using the divide-by-2 method
An alternative approach for converting a decimal number to binary, perhaps less intuitive than the subtraction method but easier to automate using a computer program, involves repeatedly dividing the decimal number by 2. We'll call this the divide-by-2 method. The remainder at each step (either 0 or 1) becomes a bit in the binary number, starting from the least significant (rightmost) digit. For example, the process of converting the decimal number 12 to binary using the divide-by-2 method is shown in Figure 1.11.

1. Divide decimal number by 2: 12/2 = 6, remainder 0. Insert remainder into the binary number: 0 (current value: 0). Continue since quotient (6) is greater than 0.
2. Divide quotient by 2: 6/2 = 3, remainder 0. Insert remainder into the binary number: 0 0 (places 2 1; current value: 0). Continue since quotient (3) is greater than 0.
3. Divide quotient by 2: 3/2 = 1, remainder 1. Insert remainder into the binary number: 1 0 0 (places 4 2 1; current value: 4). Continue since quotient (1) is greater than 0.
4. Divide quotient by 2: 1/2 = 0, remainder 1. Insert remainder into the binary number: 1 1 0 0 (places 8 4 2 1; current value: 12). Quotient is 0, done.
Figure 1.11 Converting the decimal number 12 to binary using the divide-by-2 method.

EXAMPLE 1.4 Decimal to binary using the divide-by-2 method
Convert the following numbers to binary using the divide-by-2 method: 8, 14, 99.
To convert 8 to binary, we start by dividing 8 by 2: 8/2=4, remainder 0. Then we divide the quotient, 4, by 2: 4/2=2, remainder 0. Then we divide 2 by 2: 2/2=1, remainder 0. Finally, we divide 1 by 2: 1/2=0, remainder 1. We stop dividing because the quotient is now 0. Combining all the remainders, least significant digit first, yields the binary number 1000. We can check this answer by multiplying each binary digit by its weight and adding the terms: $1*2^3 + 0*2^2 + 0*2^1 + 0*2^0 = 8$.
To convert 14 to binary, we follow a similar process: 14/2=7, remainder 0. 7/2=3, remainder 1. 3/2=1, remainder 1. 1/2=0, remainder 1. Combining the remainders gives us the binary number 1110. Checking the answer verifies that 1110 is correct: $1*2^3 + 1*2^2 + 1*2^1 + 0*2^0 = 8 + 4 + 2 + 0 = 14$.
To convert 99 to binary, the process is the same but naturally takes more steps: 99/2=49, remainder 1. 49/2=24, remainder 1. 24/2=12, remainder 0. 12/2=6, remainder 0. 6/2=3, remainder 0. 3/2=1, remainder 1. 1/2=0, remainder 1. Combining the remainders together gives us the binary number 1100011. We know from Example 1.3 that this is the correct answer.

Converting from any base to any other base using the divide-by-n method
We have been dividing by 2 in order to convert to base 2, but we can use the same basic method to convert a base 10 number to a number of any base. To convert a number from base 10 to base n, we simply repeatedly divide the number by n and add the remainder to the new base n number, starting from the least significant digit.

EXAMPLE 1.5 Decimal to arbitrary bases using the divide-by-n method
Convert the number 3439 to base 10 and to base 7.
We know the number 3439 is 3439 in base 10, but let's use the divide-by-n method (where n is 10) to illustrate that the method works for any base. We start by dividing 3439 by 10: 3439/10=343, remainder 9. We then divide the quotient by 10: 343/10=34, remainder 3. We do the same with the new quotient: 34/10=3, remainder 4. Finally, we divide 3 by 10: 3/10=0, remainder 3. Combining the remainders, least significant digit first, gives us the base 10 number 3439.
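Because the divide-by-n method is easy to automate, a short C sketch may help make the procedure concrete. The function names divide_by_n and reverse_in_place are hypothetical, and the sketch assumes bases 2 through 10 only, so that every remainder can be written as a single decimal digit character.

```c
#include <assert.h>
#include <string.h>

/* Reverse a NUL-terminated string in place (hypothetical helper). */
static void reverse_in_place(char *s)
{
    size_t len = strlen(s);
    for (size_t i = 0; i < len / 2; i++) {
        char tmp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
}

/* Divide-by-n method: write value in the given base (2 through 10) into out.
   out must have room for 33 characters, enough for 32 binary digits plus
   the terminator when unsigned is 32 bits wide. */
void divide_by_n(unsigned value, unsigned base, char *out)
{
    assert(base >= 2 && base <= 10);
    int len = 0;
    do {
        unsigned remainder = value % base;      /* next digit, least significant first */
        out[len++] = (char)('0' + remainder);
        value /= base;                          /* the quotient feeds the next step */
    } while (value > 0);                        /* stop once the quotient is 0 */
    out[len] = '\0';
    reverse_in_place(out);                      /* digits were produced right to left */
}
```

With a 33-character buffer buf, divide_by_n(12, 2, buf) leaves "1100" in buf, matching Figure 1.11, and divide_by_n(3439, 10, buf) leaves "3439", matching the division steps just worked by hand.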
To convert 3439 to base 7, the approach is similar, except we now divide by 7. We begin by dividing 3439 by 7: 3439/7=491, remainder 2. Continuing our calculations, we get: 491/7=70, remainder 1. 70/7=10, remainder 0. 10/7=1, remainder 3. 1/7=0, remainder 1. Thus, 3439 in base 7 is 13012. Checking the answer verifies that we have the correct result: $1*7^4 + 3*7^3 + 0*7^2 + 1*7^1 + 2*7^0 = 2401 + 1029 + 7 + 2 = 3439$.

Generally, a number can be converted from one base to another by first converting that number to base ten, then converting the base ten number to the desired base using the divide-by-n method.

Hexadecimal and octal numbers
Base sixteen numbers, known as hexadecimal numbers or just hex, are also popular in digital design, mainly because one base sixteen digit is equivalent to four base two digits, making hexadecimal numbers a nice shorthand representation for binary numbers. In base sixteen, the first digit represents up to fifteen ones; the sixteen symbols commonly used are 0, 1, 2, ..., 9, A, B, C, D, E, F (so A=ten, B=eleven, C=twelve, D=thirteen, E=fourteen, and F=fifteen). The next digit represents the number of groups of $16^1$, the next digit the number of groups of $16^2$, etc., as shown in Figure 1.12. So $\mathrm{8AF}_{16}$ equals $8*16^2 + 10*16^1 + 15*16^0$, or $2223_{10}$.

Since one digit in base 16 represents 16 values, and four digits in base two represents 16 values, each digit in base 16 represents four digits in base two, as shown at the bottom of Figure 1.12. Thus, to convert $\mathrm{8AF}_{16}$ to binary, we convert $8_{16}$ to $1000_2$, $\mathrm{A}_{16}$ to $1010_2$, and $\mathrm{F}_{16}$ to $1111_2$, resulting in $\mathrm{8AF}_{16} = 100010101111_2$. You can see why hexadecimal is a popular shorthand for binary: 8AF is a lot easier on the eye than 100010101111.

To convert a binary number to hexadecimal, we just substitute every four bits with the corresponding hexadecimal digit. Thus, to convert $101101101_2$ to hex, we group the bits into groups of four starting from the right, yielding 1 0110 1101. We then replace each group of four bits with a single hex digit. 1101 is D, 0110 is 6, and 1 is 1, resulting in the hex number $\mathrm{16D}_{16}$.

[Figure 1.12, top: the digits 8 A F under the place weights $16^4$, $16^3$, $16^2$, $16^1$, $16^0$, with the binary equivalent 1000 1010 1111 beneath.]
hex  binary      hex  binary
0    0000        8    1000
1    0001        9    1001
2    0010        A    1010
3    0011        B    1011
4    0100        C    1100
5    0101        D    1101
6    0110        E    1110
7    0111        F    1111
Figure 1.12 Base sixteen number system.

EXAMPLE 1.6 Hexadecimal to/from binary
Convert the following hexadecimal numbers to binary: FF, 1011, A0000. You may find it useful to refer to Figure 1.12 to expand each hexadecimal digit to four bits.
$\mathrm{FF}_{16}$ is 1111 (for the left F) and 1111 (for the right F), or $11111111_2$.
$1011_{16}$ is 0001, 0000, 0001, 0001, or $0001000000010001_2$. Don't be confused by the fact that 1011 didn't have any symbols but 1 and 0 (which makes the number look like a binary number). We said it was base 16, so it was. If we said it was base 10, then 1011 would equal one thousand and eleven.
$\mathrm{A0000}_{16}$ is 1010, 0000, 0000, 0000, 0000, or $10100000000000000000_2$.
Convert the following binary numbers to hexadecimal: 0010, 01111110, 111100.
$0010_2$ is $2_{16}$.
$01111110_2$ is 0111 and 1110, meaning 7 and E, or $\mathrm{7E}_{16}$.
$111100_2$ is 11 and 1100, which is 0011 and 1100, meaning 3 and C, or $\mathrm{3C}_{16}$. Notice that we start grouping bits into groups of four from the right, not the left.

The subtraction or divide-by-16 method can also be used to convert decimal to hexadecimal; however, converting directly from decimal to hexadecimal can be a bit unwieldy for humans since we are not used to working with powers of sixteen. Instead, it is often quicker to convert from decimal to binary using the subtraction or divide-by-2 method and then convert from binary to hexadecimal by grouping sets of 4 bits.

EXAMPLE 1.7 Decimal to hexadecimal
Convert 29 base 10 to base 16.
To perform this conversion, we can first convert 29 to binary and then convert the binary result to hexadecimal.
Converting 29 to binary is straightforward using the divide-by-2 method: 29/2=14, remainder 1. 14/2=7, remainder 0. 7/2=3, remainder 1. 3/2=1, remainder 1. 1/2=0, remainder 1. Thus, 29 is 11101 in base 2.
Converting $11101_2$ to hexadecimal can be done by grouping sets of four bits, so $11101_2$ is $1_2$ and $1101_2$, meaning $1_{16}$ and $\mathrm{D}_{16}$, or $\mathrm{1D}_{16}$.
Of course, we can use the divide-by-16 method to convert directly from decimal to hexadecimal. Starting with 29, we divide by 16: 29/16=1, remainder 13 ($\mathrm{D}_{16}$). 1/16=0, remainder 1. Combining the remainders together gives us $\mathrm{1D}_{16}$. Though this particular conversion was simple, converting larger numbers directly from decimal to hexadecimal can be time-consuming, and the two-step conversion may be preferable.

Base eight numbers, known as octal numbers, are sometimes used as a binary shorthand too, since one base eight digit equals three binary digits. $503_8$ equals $5*8^2 + 0*8^1 + 3*8^0 = 323_{10}$. We can convert $503_8$ directly to binary simply by expanding each digit into three bits, resulting in $503_8$ = 101 000 011, or $101000011_2$. Likewise, we can convert binary to octal by grouping the binary number into groups of three bits starting from the right, and then replacing each group with the corresponding octal digit. Thus, $1011101_2$ yields 1 011 101, or $135_8$.
Appendix A discusses number representations further.

1.3 IMPLEMENTING DIGITAL SYSTEMS: PROGRAMMING MICROPROCESSORS VERSUS DESIGNING DIGITAL CIRCUITS
Designers can implement a digital system for an application using one of two common digital system implementation methods: programming a microprocessor or creating a custom digital circuit (known as digital design).
As a concrete example, consider a simple application that turns on a lamp whenever there is motion in a dark room. Assume a motion detector has an output wire called a that outputs a 1 bit when motion is detected, and a 0 bit otherwise. Assume a light sensor has an output wire b that outputs a 1 bit when light is sensed, and a 0 bit otherwise. And assume a wire F turns on the lamp when F is 1, and turns off the lamp when 0. A drawing of the system is shown in Figure 1.13(a).
The design problem is to determine what goes in the block named Detector. The Detector block takes wires a and b as inputs, and generates a value on F, such that the light turns on when motion is detected when dark. The Detector application is readily implemented as a digital system, as the application's inputs and outputs obviously are
18 Introduction 1,3 Implementing Digital Systems: Programming Micro processors vers us DeSigning Digital Circuits
19
. " ' . h A desioner can implement the Detector
(li gna !. haVing only two pOSS Ible values eac. 3(b"')) 'no 'J custom di oital cirCUIt shown in Figure LI S. The des igner connects the a wire to the microprocessor input pin
block by programming a microprocessor (FIgure I, I or USI ", ' '" 10, the b W, re to Input pin 11 , and output pin PO to the F wire, The designer could then
(Figure 1.13(c)). speCIfy the II1structions for the microprocessor by wri ting the fo llowing C code:
void rnain()

~II
{
Ivhile (1) {

>~
Detector Detecto r Detector
a PO ~ 10 && ! 11 : 1/ F a and ! b ,
Digital F
System PO
Micro-

C is one of several popular lan- motion sensor


guages for describing the desired

~ -- - -- --- ---
----~ instructions to execute on the micro- F

(b) (c) proce sor. The above C code works


(a)
as fo llows. The mi croprocessor. after
Figure 1,13 MOlion-in-lhe-dark-deleclor syslem: (a) sySlem block di agram, (b) implementation being powered lip and reset, executes lamp
using a microprocesso r. (c) implementation using a custom digit al c irc uI t.
the instructions within rna in's cllrl y
brackets ( ). The fi rst instruction is
Software on Microprocessors: The Digital Workhorse "wh i 1e (1) " which simply means
Desioners that need to work with digital phenomena often buy an off-the-shelf micropro- to repeat the insrructions in the
cess;r and write software fo r that microprocessor, rather than design a custom dtgttal while's curly brackets forever. Inside
circui t. Microprocessors are really the "workhorse" of digital systems, handltng most the while's curly brackets is only one
dig it al process ing lasks. instruction "PO = 10 && ! 11," Figure 1,15 Physical motion-in-the-dark
which assigns the microprocessor's detector implementation using a microprocessor.
output pin PO with a 1 if the input pin
10 PO
11
lO is 1 alld (written as &&) the input
;;: P1
12 o· P2 pin I 1 is not 1 (meani ng 11 is 0).
i3 Thus, the output pin PO, which tums
13 '0 P3
A "processor" i3 a
processes. or
14
"~ P4 the lamp on or off, forever gets O --~

tralls/orms, dow. A 15 P5 assigned the appropriate value based 1


Q b
"m icroprocessor" t6 P6
on the input pin values, which come 0-------'
is (l programmllble t7 P7 (b )
proct'ssor fro m the motion and light sensors. 1
implemellted 011 (J (a) Fi gure 1. 16 show an example F
O--~
sillsle compllter
chip-rile "micro" Figure 1.14 Basic microproce sor's in put and output pins. of signals a, b, and F over time, I I I I
J US! meallS sl1Ial/ with time proceeding to the right. 6:00 7:057:06 9:00 9:01 time
here. The A microprocessor is a programmable digital device that executes a user-specified As tim e proceeds, each signal
microproce.uor Figure 1.16 Timing diagram of motion-in-the-<lark
lerm became
sequence of instructions, known as a prog ram or as software, Some of those instructions may be either 0 or 1, illustrated detec tor system.
popular il/ Ihe read the microprocessor's inputs, others write to the microprocessor's outputs, and other by each signal's associated line
19805 whell instructions perfo rm computati ons on the input data, Let's assume we have a bas ic micro- bein g either low or high. We made
processors shrank
dOlvlIfrom
processor wi th eight input pins named 10, 11, ..., !7, and eight output pins named PO, a equal to 0 until 7:05, when we made a become 1. We made a stay 1 until :06.
mulliple cflips to PI, .. ., P7. as shown in Figure 1.l4(a), A photograph of a real microprocessor package
when we made a return bac k to 0, We made a stay 0 until 9:00. when we made a
jusl OIlC. Th e first with such pi ns is show n in Figure L 14(b) (the ninth pin on thi s side is for power, on the
single-chip beco me 1 aga in , and then we made a become 0 at 9:01. On the other hand, we
microprocessor
other side for ground).
made b stan as 0, and then become 1 sometime between 7:06 and 9:00, The
was the Imel 4004 A microprocessor-based solution to Ihe motion-in -the-dark detector application is
chip ill 1971. di agram shows wh at the va lue of F wou ld be given the C program executing on [he
ill ustrated in Figure 1.1 3(b), and a photograph of an actual physical implementation
microprocessor-when a is 1 and b i 0 (from 7:0- to 7:06). F will be 1. A diagram
with time proceeding to the right, and the values of digital signals shown by high or low lines, is known as a timing diagram. We draw the input lines (a and b) to be whatever values we want, but then the output line (F) must describe the behavior of the digital system.

EXAMPLE 1.8 Outdoor motion notifier using a microprocessor
Let's use the basic microprocessor of Figure 1.14 to implement a system that sounds a buzzer when motion is detected at any of three motion sensors outside a house. We connect the motion sensors to microprocessor input pins I0, I1, and I2, and connect output pin P0 to a buzzer (Figure 1.17). (We assume the motion sensors and buzzers have appropriate electronic interface to the microprocessor pins.) We can then write the following C program:

   void main()
   {
      while (1) {
         P0 = I0 || I1 || I2;
      }
   }

Figure 1.17 Motion sensors connected to microprocessor.

The program executes the statement inside the while loop repeatedly. That statement will set P0 to 1 if I0 is 1 or (written as || in the C language) I1 is 1 or I2 is 1; otherwise the statement sets P0 to 0.

EXAMPLE 1.9 Counting the number of active motion sensors
In this example, we'll use the basic microprocessor of Figure 1.14 to implement a simple digital system that outputs in binary the number of motion sensors that presently detect motion. We'll assume two motion sensors, meaning we'll need to output a two-bit binary number, which can represent the possible counts 0 (00), 1 (01), and 2 (10). We'll connect the motion sensors to microprocessor input pins I0 and I1 and output the binary number onto output pins P1 and P0. We can then write the following C program:

   void main()
   {
      while (1) {
         if (!I0 && !I1) {
            P1 = 0; P0 = 0; // output 00, meaning zero
         }
         else if ((I0 && !I1) || (!I0 && I1)) {
            P1 = 0; P0 = 1; // output 01, meaning one
         }
         else if (I0 && I1) {
            P1 = 1; P0 = 0; // output 10, meaning two
         }
      }
   }

Designers like to use microprocessors in their digital systems because microprocessors are readily available, inexpensive, easy to program, and easy to reprogram. It may surprise you to learn that you can buy certain microprocessor chips for under $1. Such microprocessors are found in places like telephone answering machines, microwave ovens, cars, toys, certain medical devices, and even in shoes with blinking lights. Examples include the 8051 (originally designed by Intel), the 68HC11 (made by Motorola), and the PIC (made by MicroChip). Other microprocessors may cost tens of dollars, found in places like cell phones, portable digital assistants, office automation equipment, and medical equipment. Such processors include the ARM (made by the ARM corporation), the MIPS (made by the MIPS corporation), and others. Other microprocessors, like the well-known Pentium processors from Intel, may cost several hundred dollars and may be found in desktop computers. Some microprocessors may cost several thousand dollars and are found in a mainframe computer running perhaps an airline reservation system. There are literally hundreds, possibly even thousands, of different microprocessor types available, differing in performance, cost, power, and other metrics. And many of the small low-power processors cost under $1.

Figure 1.18 Microprocessor chip packages: (a) PIC and 8051 microprocessors, costing about $1 each, (b) a Pentium processor with part of its package cover removed, showing the silicon chip inside.

[Margin note] Intel named their evolving 1980s/90s desktop processors using numbers: 80286, 80386, 80486. As PCs became popular, Intel switched to catchier names: the 80586 was called a Pentium ("penta" means 5), followed by the Pentium Pro, the Pentium II, and others. Eventually, the names dominated over the numbers.

Some readers of this book may be familiar with software programming, others may not. Knowledge of programming is not essential to learning the material in this book. We will on occasion compare custom digital circuits with their corresponding software implementations; the ultimate conclusions of those comparisons can be understood without knowledge of programming itself.

Digital Design: When Microprocessors Aren't Good Enough
With microprocessors readily available, why would anyone ever need to design new digital circuits, other than those relatively few people designing microprocessors themselves? The reason is that software running on a microprocessor often isn't good enough for a particular application. In many cases, software may be too slow. Microprocessors only execute one instruction (or at most a few instructions) at a time. But a custom digital circuit can execute dozens, or hundreds, or even thousands of computations in parallel. Many applications, like picture or video compression, fingerprint recognition, voice command detection, or graphics display, require huge numbers of computations to be done in a short period of time in order to be practical. After all, who wants a voice-controlled phone that requires 5 minutes to decode your voice command, or a digital camera that requires 15 minutes to take each picture? In other cases, microprocessors are too big, or consume too much power, or would be too costly, making custom digital circuits preferable.
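Before looking at a custom circuit, it can help to tabulate what the software versions compute for every combination of inputs. The sketch below is an illustration only: it models the pins as ordinary function arguments so that the logic can run on a desktop computer, and the names detector_F, count_sensors, and print_truth_table are hypothetical. The first function is the motion-in-the-dark expression P0 = I0 && !I1, and the second is the if/else chain of Example 1.9.

```c
#include <assert.h>
#include <stdio.h>

/* Motion-in-the-dark detector: F is 1 when there is motion (a is 1)
   and no light is sensed (b is 0). */
int detector_F(int a, int b)
{
    return a && !b;
}

/* Example 1.9: count how many of two sensors (i0, i1) detect motion.
   The two-bit result goes to *p1 (high bit) and *p0 (low bit). */
void count_sensors(int i0, int i1, int *p1, int *p0)
{
    assert(p1 != NULL && p0 != NULL);
    if (!i0 && !i1) {
        *p1 = 0; *p0 = 0;   /* 00, meaning zero */
    } else if ((i0 && !i1) || (!i0 && i1)) {
        *p1 = 0; *p0 = 1;   /* 01, meaning one */
    } else {
        *p1 = 1; *p0 = 0;   /* 10, meaning two */
    }
}

/* Print one row per input combination: the detector output F and the
   counter outputs P1 P0 for the same pair of input values. */
void print_truth_table(void)
{
    printf("I0 I1 | detector F | counter P1 P0\n");
    for (int i0 = 0; i0 <= 1; i0++) {
        for (int i1 = 0; i1 <= 1; i1++) {
            int p1, p0;
            count_sensors(i0, i1, &p1, &p0);
            printf(" %d  %d |     %d      |    %d  %d\n",
                   i0, i1, detector_F(i0, i1), p1, p0);
        }
    }
}
```

Calling print_truth_table() lists all four input combinations. Only the row with I0 = 1 and I1 = 0 gives a detector output of 1, which is the same condition that holds from 7:05 to 7:06 in the timing diagram of Figure 1.16.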
For the motion-in-the-dark-detector application, an alternative to the microprocessor-based design uses a custom digital circuit inside the Detector block. A circuit is an interconnection of electric components. We must design a circuit that, for each different combination of inputs a and b, generates the proper value on F. One such circuit is shown in Figure 1.13(c). We'll describe the components in that circuit later. But you've now seen one simple example of designing a digital circuit to solve a design problem. The microprocessor also has a circuit inside, but because that circuit is designed to execute programs rather than just detect motion at night, the microprocessor's circuit may contain about ten thousand components, compared to just two components in our custom digital circuit. Thus, our custom digital circuit may be smaller, cheaper, faster, and consume less power than an implementation on a microprocessor.

Many applications use both microprocessors and custom digital designs to attain a system that achieves just the right balance of performance, cost, power, size, design time, flexibility, etc.

EXAMPLE 1.10 Deciding among a microprocessor and custom digital circuit
We must design a digital system to control a fighter jet's aircraft wing. In order to properly control the aircraft, the digital system must execute, 100 times per second, a computation task that adjusts the wing's position based on the aircraft's present and desired speeds, pitch, yaw, and other flight factors. Suppose we estimate that software on a microprocessor would require 50 ms (milliseconds) for each execution of the computation task, whereas a custom digital circuit would require 5 ms per execution.
Executing the computation task 100 times on the microprocessor would require 100 * 50 ms = 5000 ms, or 5 seconds. But we require those 100 executions to be done in 1 second, so the microprocessor is not fast enough. Executing the task 100 times with the custom digital circuit would require 100 * 5 ms = 500 ms, or 0.5 seconds. As 0.5 seconds is less than 1 second, the custom digital circuit can satisfy the system's performance constraint. We thus choose to implement the digital system as a custom digital circuit.

EXAMPLE 1.11 Partitioning tasks in a digital camera
A digital camera captures pictures digitally using several steps. When the shutter button is pressed, a grid of a few million light-sensitive electronic elements captures the image, each element storing a binary number (perhaps 16 bits) representing the intensity of light hitting the element. The camera then performs several tasks: the camera reads the bits of each of these elements, compresses the tens of millions of bits into perhaps a few million bits, and stores the compressed bits as a file in the camera's flash memory, among other tasks. Table 1.3 provides sample task execution times on an inexpensive low-power microprocessor versus a custom digital circuit.

TABLE 1.3 Sample digital camera task execution times (in seconds) on a microprocessor versus a digital circuit.
Task        Microprocessor    Custom digital circuit
Read        5                 0.1
Compress    8                 0.5
Store       1                 0.8

We need to decide which tasks to implement on the microprocessor and which to implement as a custom digital circuit, subject to the constraint that we should strive to minimize the amount of custom digital circuitry in order to reduce chip costs. Such decisions are known as partitioning. Three partitioning options are shown in Figure 1.19. If we implement all three tasks on the microprocessor, the camera will require 5 + 8 + 1 = 14 seconds to take a picture, too much time for the camera to be popular with consumers. We could implement all the tasks as custom digital circuits, resulting in 0.1 + 0.5 + 0.8 = 1.4 seconds. We could instead implement the read and compress tasks with custom digital circuits, while leaving the store task to the microprocessor, resulting in 0.1 + 0.5 + 1, or 1.6 seconds. We might decide on this last implementation option, to save cost without much noticeable time overhead.

Figure 1.19 Digital camera implemented with: (a) a microprocessor, (b) custom circuits, and (c) a combination of custom circuits and a microprocessor.

1.4 ABOUT THIS BOOK
Section 1.1 discussed how digital systems now appear everywhere around us and significantly impact the way we live. Section 1.2 highlighted how learning digital design accomplishes two goals: showing us how microprocessors work "under the hood," and enabling us to implement systems using custom digital circuits rather than or alongside microprocessors to achieve better implementations. This latter goal is becoming increasingly significant since so many analog phenomena, like music and video, are becoming digital. That section also introduced a key method of digitizing analog signals, namely binary numbers, and described how to convert among decimal and binary numbers. Section 1.3 described how designers tend to prefer to implement digital systems by writing software that executes on a microprocessor, yet designers often use custom digital circuits to meet an application's performance requirements or other requirements.

In the remainder of this book you will learn about the exciting and challenging field of digital design, wherein we convert desired system functionality into a custom digital circuit. Chapter 2 will introduce the most basic form of digital circuit, combinational circuits, whose outputs are simply a function of the present values on the circuit's inputs. That chapter will show how to use a form of math called Boolean algebra to describe our desired circuit functionality, and will provide clear steps for converting Boolean equations to circuits. Chapter 3 will introduce a more advanced type of circuit, sequential circuits, whose outputs are a function not only of the present input values, but also of previous input values. In other words, sequential circuits have memory. Such circuits are commonly referred to as controllers. That chapter will show us how to use another
].7
1.12 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 9
(b) 15
(c) 32
(d) 140
1.13 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 19
(b) 30
(c) 64
(d) 128
1.14 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 3
(b) 65
(c) 90
(d) 100
1.15 Convert the following decimal numbers to binary numbers using the divide-by-2 method:
(a) 23
(b) 87
(c) 123
(d) 101
1.16 Convert the following binary numbers to hexadecimal:
(a) 11110000
(b) 11111111
(c) 01011010
(d) 1001101101101
1.17 Convert the following binary numbers to hexadecimal:
(a) 11001101
(b) 10100101
(c) 11110001
(d) 1101101111100
1.18 Convert the following binary numbers to hexadecimal:
(a) 11100111
(b) 11001000
(c) 10100100
(d) (JIll 11'11
1.19 Convert the following hexadecimal numbers to binary:
(a) FF
(b) F0A2
(c) 0F100
(d) 100
1.20 Convert the following hexadecimal numbers to binary:
(a) 4F5E
(b) 3FAD
(c) 3E2A
(d) DEED
1.21 Convert the following hexadecimal numbers to binary:
(a) B0C4
(b) 1EF03
(c) F002
(d) BEEF
1.22 Convert the following hexadecimal numbers to decimal:
(a) FF
(b) F0A2
(c) 0F100
(d) 100
1.23 Convert the following hexadecimal numbers to decimal:
(a) 10
(b) 4E3
(c) FF0
(d) 200
1.24 Convert the decimal number 128 to the following number systems:
(a) binary
(b) hexadecimal
(c) base three
(d) base five
(e) base fifteen
1.25 Compare the number of digits necessary to represent the following decimal numbers in binary, octal, decimal, and hexadecimal representations. You need not determine the actual representations, just the number of required digits. For example, representing the decimal number 12 requires four digits in binary (1100 is the actual representation), two digits in octal (14), two digits in decimal (12), and one digit in hexadecimal (C).
(a) 8
(b) 60
(c) 300
(d) 1000
(e) 999,999
1.26 Determine the decimal number ranges that can be represented in binary, octal, decimal, and hexadecimal using the following numbers of digits. For example, 2 digits can represent decimal number range 0 through 3 in binary (00 through 11), 0 through 63 in octal (00 through 77), 0 through 99 in decimal (00 through 99), and 0 through 255 in hexadecimal (00 through FF).
(a) 1
(b) 3
(c) 6
(d) 8

SECTION 1.3: IMPLEMENTING DIGITAL SYSTEMS: PROGRAMMING MICROPROCESSORS VERSUS DESIGNING DIGITAL CIRCUITS
1.27 Use a microprocessor like that in Figure 1.14 to implement a system that sounds an alarm whenever there is motion detected at the same time in three different rooms. Each room's motion sensor output comes to us on a wire as a bit, 1 meaning motion, 0 meaning no motion. We sound the alarm by setting an output wire "alarm" to 1. Show the connections to and from the microprocessor, and the C code to execute on the microprocessor.
1.28 Use a microprocessor like that in Figure 1.14 to implement a system that counts the number of cars in a parking lot with seven spaces. Each space has a sensor that outputs a 1 if a car is present, and that outputs a 0 otherwise. The output should be written in binary over three wires. Show the connections with the microprocessor and the C code. Hint: use a loop and an integer variable to count the number of cars present, then use an if-else statement or a switch statement to convert the integer to the appropriate 3-bit output.
1.29 Use a microprocessor like that in Figure 1.14 to implement a system that displays the number of people in a waiting room onto an LED display. There are eight LEDs arranged in a row and eight chairs in the waiting room, each equipped with a sensor that will output a 1 when the seat is in use. The number of LEDs lit will correspond to the number of seats being occupied. For instance, if two seats are occupied (regardless of which two seats those are), the first two LEDs will light up; if three seats are occupied, the first three LEDs in the row will light up. Regardless of which particular seats are occupied, the lights will light up incrementally. Show the connections with the microprocessor and the appropriate C code.
1.30 Suppose a particular TV set-top box at a hotel supports encrypted video, and that decrypting each video frame consists of three sub-tasks A, B, and C. The execution times of each task on a microprocessor versus a custom digital circuit are 100 ms versus [?] ms for A, 10 ms versus 2 ms for B, and 15 ms versus 1 ms for C. Partition the tasks among the microprocessor and custom digital circuitry, such that you minimize the amount of custom digital circuitry, while meeting the constraint of decrypting at least 30 frames per second.
1.31 The owner of a baseball stadium wants to eliminate paper tickets for gaining entrance to baseball games. She would like to sell tickets electronically and allow those attending the game to enter by scanning their fingerprint. The owner has two options for installing the fingerprint recognition system. The first option is a system that implements the fingerprint recognition using software executing on a microprocessor. The second option is a custom digital circuit designed specifically for fingerprint recognition. The software system requires 5.5 seconds to recognize an individual's fingerprint and costs $50 per unit, whereas the digital circuit requires 1.3 seconds for recognition and costs $100 per unit. The owner wants to ensure that everyone attending the game will be able to enter the stadium before the game starts, and thus needs to be able to support 100,000 people entering the stadium within 15 minutes. Compare the two alternative systems in terms of how many people per minute each system can support, how many units of each system would be needed to support 100,000 people in 15 minutes, and what the overall cost of installation would be for the two competing systems.
1.32 How many possible partitionings are there of a set of tasks where each task can be implemented on a microprocessor or as a custom digital circuit?
1.33 *Write a program that automatically partitions a set of 10 tasks among a microprocessor and custom digital circuitry. Assume that each task has two associated execution times, one for the microprocessor and the other for custom digital circuitry. Assume also that each task has an associated size number, representing the amount of digital circuitry required to implement that task. Your program should read in the execution times and sizes from a file. Your program should seek to minimize the amount of digital circuitry while meeting a constraint specified on the sum of the task execution times. Your program should output the partitioning (each task's name and whether the task is mapped to the microprocessor or to custom digital circuitry), the total execution time of the tasks for that partitioning, and the total digital circuit size. Hint: you probably can't try all possible partitionings of the 10 tasks, so use a partitioning approach that makes some educated guesses. Your program likely won't be able to guarantee that it finds the best partitioning, but it should at least find a good partitioning.

DESIGNER PROFILE
Kelly first became interested in engineering while attending a talk about engineering at a career fair in high school. "I was dazzled by the interesting ideas and the cool graphs." While in college, though, she learned that "there was much more to engineering than ideas and graphs. Engineers apply their ideas and skills to build things that really make a difference in people's lives, for generations to come."
In her first few years as an engineer, Kelly has worked on a variety of projects "that may help numerous individuals." One project was a ventilator system like the one mentioned earlier in this chapter. "We designed a new control system that may enable people on ventilators to breathe with more comfort while still getting the proper amount of oxygen." In addition, she examined alternative implementations of that control system, including on a microprocessor, as a custom digital circuit, and as a combination of the two. "Today's technologies, like FPGAs, provide so many different options. We examined several options to see what the tradeoffs were among them. Understanding the tradeoffs among the options is quite important if we want to build the best system possible."
She also worked on a project that developed "small self-explanatory electronic blocks that people could connect together to build useful electronic systems involving almost any kind of sensor, like motion or light sensors. Those blocks could be used by kids to learn basic concepts of logic and computers, concepts which are quite important to learn these days. Our hope is that these blocks will be used as teaching tools in schools. The blocks can also be used to help adults set up useful systems in their homes, perhaps to monitor an aging parent, or a child at home sick. The potential for these blocks is great; it will be interesting to see what impact they have."
"My favorite thing about engineering is the variety of skills and creativity involved. We are faced with problems that need to be solved, and we solve them by applying known techniques in creative ways. Engineers must continually learn new technologies, hear new ideas, and track current products, in order to be good designers. It's all very exciting and challenging. Each day at work is different. Each day is exciting and is a learning experience."
"Studying to be an engineer can be a great deal of work but it's worth it. The key is to stay focused, to keep your mind open, and to make good use of available resources. Staying focused means to keep your priorities in order; for example, as a student, studying comes first, recreation second. Keeping your mind open means to always be willing to listen to different ideas and to learn about new technologies. Making good use of resources means to aggressively seek information, from the Internet, from colleagues, from books, and so on. You never know where you are going to get your next important bit of information, and you won't get that information unless you seek it."
2 Combinational Logic Design

2.1 INTRODUCTION
A digital circuit whose outputs depend solely on the present combination of the circuit inputs' values is called a combinational circuit. Combinational circuits are a basic but important class of digital circuits able to implement some systems, but more importantly serving as the basis for more complex classes of circuits. This chapter introduces the design of basic combinational circuits. Later chapters will deal with more advanced combinational circuits and with sequential circuits, whose outputs depend on the sequence (history) of values that have appeared at the circuit's inputs. Figure 2.1 illustrates the difference between combinational and sequential circuits.

[Figure 2.1 Combinational versus sequential digital circuits. Combinational circuit (inputs a and b, output F): if we know the present input bit values, then we can determine the output value; if ab=00, then F is 0; if ab=01, then F is 0; if ab=10, then F is 1; if ab=11, then F is 0. Sequential circuit (inputs a and b, output F): we cannot determine the output value just from looking at the present input values; we must also know the history of input values, e.g., if ab was 00 and then 10, F is 0, but if ab was 11 and then 10, F is 1.]

The chapter will introduce the basic building blocks of combinational circuits, known as logic gates, and will also introduce a form of mathematics, known as Boolean algebra, that is useful for designing combinational circuits.

Electronics 101
Although understanding the electronics underlying digital logic gates is optional, many people find a basic understanding satisfies much curiosity and also helps in understanding some of the non-ideal digital gate behavior later on.
You're probably familiar with the idea of electrons, or let's just say charged particles, flowing through wires and causing lights to illuminate or stereos to blast music. An analogous situation is water flowing through pipes and causing sprinklers to pop up or turbines to turn. We now describe three basic electrical terms:
Voltage is the difference in electric potential between two points. Voltage is measured in volts (V). Convention says that the earth, or ground, is 0 V. Informally, voltage tells us how "eager" the charged particles on one side of a wire are to get to ground (or any lower voltage) on the wire's other side. Voltage is analogous to the pressure of water trying to flow through a pipe: water under higher pressure is more eager to flow, even if the water can't actually flow, perhaps because of a closed faucet.
Current is a measure of the flow of the charged particles. Informally, current tells us the rate that particles are actually flowing. Current is analogous to water flowing through a pipe. Current is measured in amperes (A), or amps for short.
Resistance is the tendency of a wire (or anything, really) to resist the flow of current. Resistance is analogous to a pipe's diameter: a narrow pipe resists water flow, while a wide pipe lets water flow more freely. Electrical resistance is measured in ohms (Ω).
Consider a battery. The particles at the positive terminal want to flow to the negative terminal. How "eager" are they to flow? That depends on the voltage difference between the terminals: a 9 V battery's particles are more eager to flow than a 1.5 V battery's particles, because the 9 V battery's particles have more potential energy. Now suppose you connect the positive terminal through a light bulb back to the negative terminal as shown in Figure 2.2. The 9 V battery will result in more current flowing, and thus a brighter lit light, than the 1.5 V battery. Precisely how much current will flow is determined using the equation:

V = IR (known as Ohm's Law)

where V is voltage, I is current, and R is resistance (in this case, of the light bulb). So if the resistance were 2 ohms, a 9 V battery would result in 4.5 A (since 9 = 4.5*2) of current, while a 1.5 V battery would result in 0.75 A.

[Figure 2.2 9V battery connected to light bulb. Labels: 9 V, 2 ohms, 4.5 A.]

Rewriting the equation as I = V/R might make more intuitive sense: the higher the voltage, the more current; the higher the resistance, the less current. Ohm's Law is perhaps the most fundamental equation in electronics.
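The Ohm's Law arithmetic for the battery and light bulb can be checked with a few lines of Python. This is only an illustrative sketch: the function name `current_amps` is made up for this example, and the bulb is treated as an ideal 2-ohm resistor, as in the text.

```python
def current_amps(voltage_volts: float, resistance_ohms: float) -> float:
    """Return the current I = V / R through an ideal resistive load."""
    if resistance_ohms <= 0:
        raise ValueError("resistance must be positive")
    return voltage_volts / resistance_ohms


# The light-bulb example from the text, assuming R = 2 ohms.
for volts in (9.0, 1.5):
    amps = current_amps(volts, 2.0)
    print(f"{volts} V across 2 ohms gives {amps} A")
```

Doubling the resistance in the call halves the result, which matches the intuition behind I = V/R: more resistance, less current.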
2.2 SWITCHES
Electronic switches form the basis of all digital circuits, so they make a good starting point for the discussion of digital circuits. You use a type of switch, a light switch, whenever you turn lights on or off. To understand a switch, it helps to understand some basic electronics.

The Amazing Shrinking Switch
Now back to switches. Figure 2.3(b) shows that a switch has three parts; let's call them the source input, the output, and the control input. The source input has higher voltage than the output, so current wants to flow from the source input through the switch to the output. The whole purpose of a switch is to block that current when the control sets the switch "off," and to allow that current to flow when control sets the switch "on." For example, when you flip a light switch up to turn the switch on, the switch causes the
source input wire to physically touch the output wire so current flows. When you flip the switch down to turn the switch off, the switch physically separates the source input from the output. In our water analogy, the control input is like a faucet valve that determines whether water flows through a pipe.

[Figure 2.3 (a) The evolution of switches: relays (1930s), vacuum tubes (1940s), discrete transistors (1950s), and integrated circuits (ICs) containing transistors (1960s-present), shown next to a quarter to see the relative size. ICs originally held about ten transistors; now they can hold more than a billion. (b) Simple view of a switch, with a source input, an output, and a control input that sets the switch "off" or "on."]

Switches are what cause digital circuits to utilize binary numbers made from bits: the on or off nature of a switch corresponds to the 1s and 0s in binary. We now discuss the evolution of switches over the 1900s, leading up to the CMOS transistor switches commonly used today in digital circuits.

1930s-Relays
Engineers in the 1930s tried to devise ways to compute using electronically controlled switches, that is, switches whose control input was another voltage. One such switch, an electromagnetic relay like that in Figure 2.3(a), was already being used by the telephone industry for switching telephone calls. A relay has a control input that is a type of magnet, which becomes magnetized when the control has a positive voltage. In one type of relay, that magnet pulls a piece of metal down, resulting in a connection from the source input to the output, akin to pulling down a drawbridge to connect one road to another. When the control input returns to 0 V, the piece of metal returns up again (perhaps pushed by a small spring), disconnecting the source input from the output. In telephone systems, relays enabled calls to be routed from one phone to another, without the need for those nice human operators that previously would manually connect one phone's line to another.

1940s-Vacuum Tubes
Relays relied on metal parts moving up and down, and thus were rather slow. In the 1940s and 1950s, vacuum tubes, shown in Figure 2.3(a) and originally used to amplify weak electric signals like those in a telegraph, began to replace relays in computers. Vacuum tubes had no moving parts, so the tubes were much faster than relays.

"DEBUGGING"
In 1945, a moth got stuck in one of the relays of the Mark II computer at Harvard. To get the computer working properly again, technicians found and removed the bug. Though the term "bug" had been used for decades before by engineers to indicate a defect in mechanical or electrical equipment, the removal of that moth in 1945 is considered to be the origin of the term "debugging" in computer programming. Technicians taped that moth to their written log (shown in the picture to the side), and that moth is now on display at the National Museum of American History in Washington, D.C.

The machine said to be the world's first general-purpose computer, the ENIAC (Electronic Numerical Integrator And Computer), was completed in the U.S. in 1946. ENIAC contained about 18,000 vacuum tubes and 1500 relays, weighed over 30 tons, was 100 feet long and 8 feet high (so it would likely not fit in any room of your house, unless you have an absurdly big house), and consumed 174,000 watts of power. Imagine the heat generated by a room full of 1740 100-watt light bulbs. That's hot. For all that, ENIAC could compute about 5000 operations per second. Compare that to the billions of operations per second of today's personal computers, and even the tens of millions of computations per second by a handheld cell phone.
Although vacuum tubes were faster than relays, they consumed a lot of power, generated a lot of heat, and failed frequently.
Vacuum tubes were commonplace in many electronic appliances in the 1960s and 1970s. I remember taking trips to the store with my dad in the early 1970s to buy replacement tubes for our television set. Vacuum tubes still live today in a few electronic devices. One place you might still find tubes is in electric guitar amplifiers, where the tube's unique-sounding audio amplification is still demanded by rock guitar enthusiasts who want their version of classic rock songs to sound just like the originals.

1950s-Discrete Transistors
The invention of the transistor in 1947, credited to William Shockley, John Bardeen, and Walter Brattain of Bell Laboratories (the research arm of AT&T), resulted in smaller and lower-power computers. A solid-state (discrete) transistor, shown in Figure 2.3(a), uses a small piece of silicon, "doped" with some extra materials, to create a switch. Since these switches used "solid" materials rather than a vacuum or even moving parts in a relay, they were commonly referred to as solid-state transistors. Solid-state transistors were smaller, cheaper, faster, and more reliable than tubes, and became the dominant computer switch in the 1950s and 1960s.

1960s-Integrated Circuits
Jack Kilby of Texas Instruments and Robert Noyce of Fairchild Semiconductors are often credited with each having independently invented the IC.
The invention of the integrated circuit (IC) in 1958 really revolutionized computing. An IC, a.k.a. a chip, packs numerous tiny transistors on a fingernail-sized piece of silicon. So instead of 10 transistors requiring 10 discrete electronic components on your board, 10 transistors can be implemented on one component, the chip. Figure 2.3(a) shows a picture of an IC that has a few million transistors. Though early ICs featured only tens of
transistors, improvements in IC technology have resulted in nearly ONE BILLION transistors on a chip today. IC technology has shrunk transistors down to a totally different scale. A vacuum tube (about 100 mm long) is to a modern IC transistor (about 100 nm) as a skyscraper (about 0.5 km) is to the thickness of a credit card (about 0.5 mm).
I've been working in this field for two decades, and the number of transistors on a chip still amazes me. The number 1 billion is bigger than most of us have an intuitive feel for. Think of pennies, and consider the volume that 1 billion pennies would occupy. Would they fit in your bedroom? The answer is probably no (unless you have a really huge bedroom), since a typical bedroom is about 40 cubic meters, while 1 billion pennies would occupy about 400 cubic meters. So you would need about 10 bedrooms, roughly the size of an entire house, packed from wall to wall, floor to ceiling, with pennies, to store all that money. And if we stacked the pennies, they would reach nearly 1000 miles into the sky; for comparison, a jet flies at an altitude of about 5 miles. That's a lot of pennies. But we manage to fit 1 billion transistors onto silicon chips of just a few square centimeters. Truly amazing.

HOW DO THEY MAKE TRANSISTORS SO SMALL? USING PHOTOGRAPHIC METHODS
If you took a pencil and made the smallest dot that you could on a sheet of paper, that dot's area would hold many thousands of transistors on a modern silicon chip. How can chip makers create such tiny transistors? The key lies in photographic methods. Chip makers lay a special chemical onto the chip, special because the chemical changes when exposed to light. Chip makers then shine light through a lens that focuses the light down to extremely small regions on the chip, similar to how a microscope's lens lets us see tiny things by focusing light, but in reverse. The chemical in the small illuminated region changes, and then a solvent washes away the chemical, but some regions stay because of the light that changed that region. Those remaining regions form parts of transistors. Repeating this process over and over again, with different chemicals at different steps, results not only in transistors, but also wires connecting the transistors, and insulators preventing crossing wires from touching.
[Photograph of a Pentium processor's silicon chip having millions of transistors. Actual size is about 1 cm each side.]
The wires that connect all those transistors on a chip, if straightened into one straight wire, would be several miles long.
IC transistors are much smaller, more reliable, faster, and less power-hungry than discrete transistors. Thus, IC transistors are now by far the most commonly used switch in computing.
ICs of the early 1960s could hold tens of transistors, and are known today as small-scale integration (SSI). As transistor sizes shrank, in the late 1960s and early 1970s, ICs could hold hundreds of transistors, known as medium-scale integration (MSI). The 1970s saw the development of large-scale integration (LSI) ICs with thousands of transistors, while very-large-scale integration (VLSI) chips evolved in the 1980s. Since then, ICs have continued to increase in their capacity, to around 1 billion transistors. To calibrate your understanding of this number, consider that the first Pentium microprocessor of the early 1990s required only about 3 million transistors, and some popular but relatively small microprocessors require only about 100,000 transistors. Many of today's high-end chips therefore contain dozens of microprocessors, and can conceivably contain hundreds of the relatively small microprocessors (or just one or two big microprocessors).
IC density has been doubling roughly every 18 months since the 1960s. The doubling of IC density every 18 months is widely known as Moore's Law, named after Gordon Moore, a co-founder of Intel Corporation, who made predictions back in 1965 that the number of components per IC would double every year or so. At some point, chip makers won't be able to shrink transistors any further. After all, the transistor has to at least be wide enough to let electrons pass through. People have been predicting the end of Moore's Law for over a decade now, but transistors keep shrinking.
Not only do smaller transistors and wires provide for more functionality in a chip, but they also provide for faster circuits, in part because electrons need not travel as far to get from one transistor to the next. This increased speed is the main reason why personal computer clock speeds have improved so drastically over the past few decades, from kilohertz frequencies in the 1970s to gigahertz frequencies in the early 2000s.

A SIGNIFICANT INVENTION
We now know that the invention of the transistor was the start of the amazing computation and communication revolutions that occurred in the latter half of the 20th century, enabling us to today do things like see the world on TV, surf the web, and talk on cell phones. But the implications of the transistor were not known by most people at the time of its invention. Newspapers did not headline the news, and most stories that did appear predicted simply that transistors would improve things like radios and hearing aids. One may wonder what recently invented but unnoticed technology might significantly change the world once again.

2.3 THE CMOS TRANSISTOR
The most popular type of IC transistor is the CMOS transistor. Although a detailed explanation of how a CMOS transistor works is beyond the scope of this book, nevertheless, I've found that a simplified explanation seems to satisfy much curiosity.
A chip is made primarily from the element silicon. A chip, also known as an integrated circuit, or IC, is typically about the size of a fingernail. Even if you open up a computer or other chip-based device, you would not actually see the silicon chip, since chips are actually inside a larger, usually black, protective package. But you certainly should be able to see those black packages, mounted on a printed circuit board, inside a variety of household electronic devices.
Figure 2.4 illustrates a cross section of a tiny part of a silicon chip, showing the side view of one type of CMOS transistor, an nMOS transistor. The transistor has the three parts of a switch: (1) the source input; (2) the output, which is called the drain, I suppose because electric particles flow to the drain like water flows to a drain; and (3) the control input, which is called the gate, I suppose because the gate blocks the current flow like a gate blocks a dog from escaping the backyard. A chip maker creates the source and drain by injecting certain elements into the silicon. Figure 2.4 also shows the electronic symbol of an nMOS transistor.
Suppose the drain was connected to a small positive voltage (modern technologies use about 1 or 2 V) known as the "power supply," and the source was connected through
I D
36 2 Combinational Logic Design
2.3 The CMOS Transistor 37

nMOS~ ~
A positive ... aHracts electrons here,
vol tage here .. turning the channel {
between Source and drain
into a conductor. gate--jl ~ ~l
Figure 2.4 CMOS transistors: (a) transistor on silicon, (b) nMOS transistor symbol with indication of conducting when gate = 1, (c) pMOS transistor symbol, which conducts when gate = 0.

a resistor to ground. Current would thus want to flow from drain to source, and on to ground. (Note: unfortunately, convention is that current flow is defined using positive charge, even though actually negatively charged electrons are flowing, so you may notice that we say current flows from drain to source, even though electrons flow from source to drain.) However, the silicon channel between source and drain is not normally a conductor, or in other words, the channel is normally an insulator. We can think of an insulator as an extremely large resistance. Since I = V/R, then I will essentially be 0. The switch is off.

The really interesting thing about silicon is that we can change the channel from an insulator to a conductor just by applying a small positive voltage to the gate. That gate voltage doesn't result in current flow from the gate to channel, because of the small insulator (oxide) between the gate and the channel. But, that gate voltage does create a positive electric field that attracts electrons, which have a negative charge, from the larger silicon region into the channel region, akin to how you can move paper clips on a tabletop by moving a magnet under the table. When enough electrons gather into the channel, the channel all of a sudden becomes a conductor. A conductor has extremely low resistance, so current flows almost freely between drain and source. The switch is now on. As you can see, silicon is not quite a conductor but not quite an insulator either, rather representing something in between; hence the term semiconductor.

An analogy to the current trying to cross the channel is a person trying to cross a river. Normally, the river might not have enough stepping stones for the person to be able to walk across. But if we could attract stones from other parts of the river into one pathway (the channel), the person could easily walk across the river (Figure 2.5).

Figure 2.5 CMOS transistor operation analogy: A person may not be able to cross a river until just enough stepping stones are attracted into one pathway. Likewise, electrons can't cross the channel between source and drain until just enough electrons are attracted into the channel.

We mentioned that nMOS was one type of CMOS transistor. The other type is pMOS. A pMOS is similar except that the channel has the opposite functionality: the channel is a conductor normally, and then doesn't conduct when the gate has a positive voltage. Figure 2.4 shows the electronic symbol for a pMOS transistor. The use of these two "complementary" types of transistors is where the C comes from in CMOS. The MOS stands for Metal Oxide Semiconductor, but the reasons for that name go beyond the scope of this discussion.

SILICON VALLEY, AND THE SHAPE OF SILICON

Silicon Valley is not a city, but refers to an area in Northern California, about an hour south of San Francisco, that includes several cities like San Jose, Mountain View, Sunnyvale, Milpitas, Palo Alto, and others. The area is heavily populated by computer and other high-technology companies, and to a large extent is the result of Stanford University's (located in Palo Alto) efforts to attract and create such companies. What shape is silicon? Once, as my plane arrived in Silicon Valley, the person next to me (who happened to be a college senior studying Computer Science) asked "What shape is a silicon anyways?" I eventually realized he thought silicon was a type of polygon, like a pentagon or an octagon. Well, the words do sound similar. Silicon is not a shape, but an element, like carbon or aluminum or silver. Silicon has an atomic number of 14, has a chemical symbol of "Si," and is the second most abundant element (next to oxygen) in the earth's crust, found in items like sand and clay. Silicon is used to make mirrors and glass, in addition to chips. In fact, to the naked eye, a silicon chip actually looks like a small mirror.

Photo caption: A chip package with its chip cover removed; you can see the mirror-like silicon chip in the center.
38 2 Combinational Log ic Design
2.4 Boolean Logic Gates- Building Blocks for Dighal Circuits 39
2.4 BOOLEAN LOGIC GATES - BUILDING BLOCKS FOR DIGITAL CIRCUITS

You've seen that CMOS transistors can be used to implement switches on an incredibly tiny scale. However, trying to use switches as our building blocks to build complex digital circuits is akin to trying to use small rocks to build a bridge, as illustrated in Figure 2.6. Sure, you could probably build something from rudimentary building blocks, but the building process would be a real pain. Switches (and small rocks) are just too low-level as building blocks.

Figure 2.6 Having the right building blocks can make all the difference when building things. Top panel: "These blocks... are hard to work with" (transistors are hard to work with). Bottom panel: "The right building blocks... enable greater designs" (the logic gates that we'll soon introduce enable greater designs).

Boolean Algebra and Its Relation to Digital Circuits

Fortunately, Boolean logic gates help us in the design task by representing digital circuit building blocks that are much easier to work with than switches. Boolean logic was developed in the mid-1800s by the mathematician George Boole, not to build digital circuits (which weren't even a glimmer in anyone's eye back then), but rather as a scheme for using algebraic methods to formalize human logic and thought.

An algebra is a branch of mathematics that uses letters or symbols to represent numbers or values, where those letters/symbols can be combined according to a set of known rules. Boolean algebra uses variables whose values can only be 1 or 0 (representing true or false, respectively) and whose operators, like AND, OR, and NOT, operate on such variables and return 1 or 0. So we might declare variables x, y, and z, and then say that z = x OR y, meaning z is 1 only if at least one of x or y is 1. Likewise, we might say z = x AND NOT(y), meaning z is 1 only if x is 1 and y is 0. Contrast Boolean algebra with the regular algebra you're familiar with from perhaps high school, in which variable values could be integers (for example), and operators could be addition, subtraction, and multiplication.

The basic Boolean operators are AND, OR, and NOT:

• AND returns 1 if both its operands are 1. So the result of a AND b is 1 if both a=1 and b=1, otherwise the result is 0.
• OR returns 1 if either or both of its operands are 1. So the result of a OR b is 1 in any of the following cases: ab=01, ab=10, ab=11. Thus, the only time a OR b is 0 is when ab=00.
• NOT returns 1 if its operand is 0. So NOT(a) returns 1 if a is 0, and returns 0 if a is 1.

"ab=01" is shorthand for "a=0, b=1."

We use Boolean logic operators all the time in everyday thought, such as in the statement "I'll go to lunch if Mary goes OR John goes, AND Sally does not go." To represent this using Boolean concepts, let F represent my going to lunch (F=1 means I'll go to lunch, F=0 means I won't go). Let Boolean variables m, j, and s represent Mary, John, and Sally each going to lunch. Then we can translate the above English sentence into the Boolean equation:

   F = (m OR j) AND NOT(s)

So F will equal 1 if either m or j is 1, and s is 0. Now that we've translated the English sentence into a Boolean equation, we can perform several mathematical activities with that equation.

One thing we can do is determine the value of F for different values of m, j, and s:

• m=1, j=0, s=1 → F = (1 OR 0) AND NOT(1) = 1 AND 0 = 0
• m=1, j=1, s=0 → F = (1 OR 1) AND NOT(0) = 1 AND 1 = 1

In the first case, I don't go to lunch; in the second, I do.

A second thing we could do is apply some algebraic rules (which we'll discuss later) to modify the original equation to the equivalent equation:

   F = (m AND NOT(s)) OR (j AND NOT(s))

In other words, I'll go to lunch if Mary goes AND Sally does not go, OR if John goes AND Sally does not go. That statement, as different as it may look from the earlier one, is equivalent to the earlier one.

A third thing we could do is formally prove properties about the equation. For example, we could prove that if Sally goes to lunch (s=1), then I don't go to lunch (F=0) no matter who else goes, using the equation:

   F = (m OR j) AND NOT(1) = (m OR j) AND 0 = 0

No matter what the values of m and j, F will equal 0.

Noting all the mathematical activities we can do using Boolean equations, you can start to see what Boole was trying to accomplish in formalizing human reasoning.

EXAMPLE 2.1 Converting a problem statement to a Boolean equation

Convert the following problem statements to Boolean equations using AND, OR, and NOT operators. F should equal 1 only if:

1. a is 1 and b is 1. Answer: F = a AND b
2. either of a or b is 1. Answer: F = a OR b

3. both a and b are not 0. Answer:
   (a) Option 1: F = NOT(a) AND NOT(b)
   (b) Option 2: F = a OR b
4. a is 1 and b is 0. Answer: F = a AND NOT(b)

Convert the following English problem statements to Boolean equations:

1. A fire sprinkler system should spray water if high heat is sensed and the system is set to enabled. Answer: Let Boolean variable h represent "high heat is sensed," e represent "enabled," and F represent "spraying water." Then an equation is: F = h AND e.
2. A car alarm should sound if the alarm is enabled, and either the car is shaken or the door is opened. Answer: Let a represent "alarm is enabled," s represent "car is shaken," d represent "door is opened," and F represent "alarm sounds." Then an equation is: F = a AND (s OR d).
   (a) Alternatively, assuming that our door sensor d represents "door is closed" instead of open (meaning d=1 when the door is closed, 0 when open), we obtain the following equation: F = a AND (s OR NOT(d)).

EXAMPLE 2.2 Evaluating Boolean equations

Evaluate the Boolean equation F = (a AND b) OR (c AND d) for the given values of variables a, b, c, and d:

• a=1, b=1, c=1, d=0. Answer: F = (1 AND 1) OR (1 AND 0) = 1 OR 0 = 1.
• a=0, b=1, c=0, d=1. Answer: F = (0 AND 1) OR (0 AND 1) = 0 OR 0 = 0.
• a=1, b=1, c=1, d=1. Answer: F = (1 AND 1) OR (1 AND 1) = 1 OR 1 = 1.

One might now be wondering what Boolean algebra has to do with building circuits using switches. In 1938, an MIT graduate student named Claude Shannon wrote a paper (based on his Master's thesis) describing how Boolean algebra could be applied to switch-based circuits, by showing that "on" switches could be treated as a 1 (or true), and "off" switches as a 0 (or false), by connecting those switches in a certain way (Figure 2.7). His thesis is widely considered as the seed that developed into modern digital design. Since Boolean algebra comes with a rich set of axioms, theorems, postulates, and rules, we can use all those things to manipulate digital circuits using algebra. In other words:

   We can build circuits by doing math.

That's an extremely powerful concept. We'll be building circuits by doing math throughout this chapter.

Shannon, by the way, is also considered the father of information theory due to his later work on digital communication.

Figure 2.7 Shannon applied Boolean algebra to switch-based circuits, providing a formal basis to digital circuit design. Figure labels: Boolean algebra (mid-1800s), Boole's intent: formalize human thought; Switches (1930s), for telephone switching and other electronic uses; Shannon (1938), showed application of Boolean algebra to design of switch-based circuits; Digital design.

AND, OR, & NOT Gates

To build digital circuits that can be manipulated using Boolean algebra, we first implement the Boolean operators AND, OR, and NOT using small circuits of switches, and we call those circuits Boolean logic gates. Then, we forget about switches, and instead use Boolean logic gates as our building blocks. Suddenly, we have the power of Boolean algebra at our fingertips when designing more complex circuits! This is akin to first assembling rocks into three shapes of bricks, and then building structures like a bridge from those bricks, as illustrated in Figure 2.6. Trying to build a bridge from small rocks is much harder than building a bridge from the three basic brick shapes. Likewise, trying to build a motion-in-the-dark circuit (or any digital circuit) from switches is harder than building a circuit from Boolean logic gates.

Earlier we said a "gate" was the switch control input of a CMOS transistor, but now we're talking about "logic gates." In an unfortunate naming similarity, the same word (gate) refers to two different things. Don't worry though; after the next section, we'll just be using the word gate to refer to a "logic gate."

Let's first implement Boolean logic gates using CMOS transistors, and then later we'll show you how Boolean algebra helps build better circuits. You really don't have to understand the underlying transistor implementations of logic gates to learn the digital design methods in the rest of this book, and in fact many textbooks omit the transistor discussion entirely. But an understanding of the underlying transistor implementation can be quite satisfying to a student, leaving no "mysteries." Such an understanding can also help in understanding the nonideal behavior of logic gates that one may later have to learn to deal with in digital design.

We'll start by using "1" to represent the power supply's voltage level, which today is usually around 1 to 2 V for CMOS technology (e.g., 0.7 V, or 1.3 V). Let's use "0" to represent ground. Note that we could have chosen any two symbols or words, rather than "1" and "0," to represent power and ground voltage levels. For example, we could have used "true" and "false," or "H" and "L." Remember that the "1" does not necessarily correspond to 1 V, and the "0" does not necessarily correspond to 0 V. In fact, each usually represents a voltage range, like "1" representing any voltage between 1.2 V and 1.4 V.

Figure 2.8 Basic logic gates' symbols, truth tables, and transistor circuits: (a) NOT (inverter) gate, (b) 2-input OR gate, (c) 2-input AND gate. Warning: real AND and OR gates aren't actually built this way, but rather in a more complex manner; see Section 2.8.

NOT Gate

A NOT gate has an input x and an output F. F should always be the opposite, or inverse, of x; for this reason, a NOT gate is commonly called an inverter. We can build a NOT gate using one pMOS and one nMOS transistor, as shown in Figure 2.8(a). The triangle at the top of the transistor circuit represents the positive voltage of the power supply, which we represent as 1. The series of lines at the bottom of the circuit represents ground, which we represent as 0. When the input x is 0, the pMOS transistor will conduct, but the nMOS will not, as shown in Figure 2.9(a). In that case, we can think of the circuit as a wire from 1 to F, so when x=0, F=1. On the other hand, when x is 1, the nMOS will conduct, but the pMOS will not, as shown in Figure 2.9(b). In that case, we can think of the circuit as a wire from 0 to F, so when x=1, F=0. The table in Figure 2.8(a), called a truth table, summarizes the NOT gate's behavior by listing the gate's output for every possible input.

Figure 2.9 Inverter conduction paths when: (a) the input is 0, and (b) the input is 1.

Figure 2.10 shows a timing diagram for an inverter: when the input is 0, the output is 1; when the input is 1, the output is 0.

Figure 2.10 Inverter timing diagram.

Electrically, combining pMOS and nMOS in this way has the benefit of low power. Notice in Figure 2.8(a) that for any value of x, either the pMOS or nMOS transistor will be nonconducting. Thus (conceptually), current can never flow from the power source to ground; this feature will also be true for the AND and OR gates we'll define next. This feature makes CMOS circuits consume far less power than other transistor technologies, and partly explains why CMOS is the most popular logic gate transistor technology today.

OR Gate

A basic OR gate has two inputs x and y and an output F. F should be 1 only if at least one of x or y is 1. We can build an OR gate using two pMOS transistors and two nMOS transistors, as shown in Figure 2.8(b) (although we will see in Section 2.8 that OR gates are actually built in a more complex manner). If at least one of x or y is 1, then we get a connection from 1 to F, but no connection from 0 to F, so F is 1, as shown in Figure 2.11(a). If both x and y are 0, then we get a connection from 0 to F, but no connection from 1 to F, so F is 0, as shown in Figure 2.11(b). The truth table for the OR gate appears in Figure 2.8(b).

Figure 2.11 OR gate conduction paths when: (a) one input is 1, and (b) both inputs are 0.

Figure 2.12 shows a timing diagram for an OR gate. (See Section 1.3 for an introduction to timing diagrams.) We set inputs x and y to each possible combination of values, and show that F will be 1 if either or both inputs is a 1.

Figure 2.12 OR gate timing diagram.

Larger OR gates, having more than two inputs, are also possible. If at least one of the OR gate's inputs is 1, the output is 1. For a three-input OR gate, the transistor circuit of Figure 2.8(b) would have three pMOS transistors on top and three nMOS transistors on the bottom, instead of two transistors of each kind.

AND Gate

A basic AND gate has two inputs x and y and an output F. F should be 1 only if both x and y are 1. We can build an AND gate using two pMOS transistors and two nMOS transistors, as shown in Figure 2.8(c) (again, we will see in Section 2.8 that AND gates are actually built in a more complex manner). If both x and y are 1, then we get a connection from power to F, but no connection from ground to F, so F is 1, as shown in Figure 2.13(a). If at least one of x or y is 0, then we get a connection from ground to F, but no connection from power to F, so F is 0, as shown in Figure 2.13(b). The truth table for the AND gate appears in Figure 2.8(c).

Figure 2.13 AND gate conduction paths when: (a) all inputs are 1, and (b) an input is 0.

Figure 2.14 AND gate timing diagram.

Figure 2.14 shows a timing diagram for an AND gate. We set inputs x and y to each possible combination of values, and show that F will be 1 only if both inputs are a 1.

Larger AND gates, having more than two inputs, are also possible. The output is 1 only if all the inputs are 1. For a three-input AND gate, the transistor circuit of Figure 2.8(c) would have three pMOS transistors on top and three nMOS transistors on the bottom, instead of two transistors of each kind.

Building Simple Circuits Using Gates

Having built logic gate building blocks from transistors, we now show how to build useful circuits from those building blocks. Recall the digital system example of Figure 1.13, the motion-in-the-dark detector. a=1 meant motion, and b=0 meant dark, so we wanted F = a AND NOT(b). We can connect b through an inverter to get NOT(b), and connect the result along with a into an AND gate, whose output is F. The resulting circuit appears in Figure 1.13(c), shown again to the left for convenience. We now provide more examples.

EXAMPLE 2.3 Converting a Boolean equation to a circuit with logic gates

Convert the following equation to a circuit:

   F = a AND NOT(b OR NOT(c))

We start by drawing F on the right, and then working backwards toward the inputs. (We could instead start by drawing the inputs on the left and working toward the output.) The equation for F ANDs two items: a, and the output of a NOT. We thus begin by drawing the circuit of Figure 2.15(a). The NOT's input comes from an OR of two items: b, and NOT(c). We thus complete the drawing in Figure 2.15(b) by including an OR gate and NOT gate as shown.

Figure 2.15 Building the circuit for F: (a) partial, (b) complete.

EXAMPLE 2.4 More examples converting Boolean equations to gates

Figure 2.16 provides two more examples of converting Boolean equations to circuits built from logic gates. We again start from the output and work backwards to the inputs. The figure shows the correspondence between equation operators and gates, and the order in which we placed each gate in the circuit.

Figure 2.16 Examples of converting Boolean equations to circuits. One of the equations shown is F = (a AND NOT(b)) OR (b AND NOT(c)), with its operators numbered in the order the gates were placed.

EXAMPLE 2.5 Using AND and OR gates with more than two inputs

Figure 2.17(a) shows an implementation of the equation F = a AND b AND c, using two-input AND gates. However, designers would typically instead implement such an equation using a single three-input AND gate, shown in Figure 2.17(b). The function is the same, but the three-input AND gate uses fewer transistors, 6 rather than 4 + 4 = 8 (as well as having less delay; more on delay later). Likewise, F = a AND b AND c AND d would typically be implemented using a four-input AND gate.

Figure 2.17 Using multiple-input AND gates: (a) using 2-input AND gates, (b) using a 3-input AND gate.

The same approach applies to OR gates. For example, F = a OR b OR c would typically be implemented using a single three-input OR gate.

We now provide examples starting from English problem descriptions, which we convert to Boolean equations, and then finally implement as a circuit.

EXAMPLE 2.6 Seatbelt warning light

Suppose you want to design a system for an automobile that illuminates a warning light whenever the driver's seatbelt is not fastened and the key is in the ignition. Assume the following sensors:

• a sensor with output s indicates whether the driver's belt is fastened (s=1 means the belt is fastened), and
• a sensor with output k indicates whether the key is in the ignition (k=1 means the key is in).

Assume the warning light has a single input w that illuminates the light when w is 1. So the inputs to our digital system are s and k, and the output is w. w should equal 1 when both of the following occur: s is 0 and k is 1.

Let's first write a simple C program executing on a microprocessor to solve this design problem. If we connect s to I0, k to I1, and w to P0, then our C code inside the C program's main() function would be:

   while (1) {
      P0 = !I0 && I1;
   }

The code repeatedly checks the sensors and updates the warning light.

Now let's write a Boolean equation describing a circuit implementing the design:

   w = NOT(s) AND k
Using the AND and NOT logic gates that we introduced earlier, we can easily complete the design of our first system, by connecting s to a NOT gate, and connecting the resulting NOT(s) and k to the inputs of a 2-input AND gate, as shown in Figure 2.18.

Figure 2.18 Seat belt warning circuit.

Figure 2.19 provides a timing diagram for the circuit. In a timing diagram, we can set the inputs to whatever values we want, but then we must draw the output line to match the circuit's function. In the figure, we set s and k to 00, then 01, then 10, then 11. The only time that the output w will be 1 is when s is 0 and k is 1, as shown in the figure.

Figure 2.19 Timing diagram for seat belt warning circuit.

We stated earlier that logic gates are more appropriate than transistors as building blocks for designing digital circuits. Note, however, that the logic gates are ultimately implemented using transistors, as shown in Figure 2.20. For C programmers, an analogy is that writing software in C is easier than writing in assembly, even though the C ultimately gets implemented using assembly. Notice how much less intuitive and less descriptive is the transistor-based circuit in Figure 2.20 than the equivalent logic gate-based circuit in Figure 2.18.

Figure 2.20 Seat belt warning circuit using transistors.

EXAMPLE 2.7 Seat belt warning light with driver sensor

Let's extend the previous example by adding a sensor, with output p, that detects whether a person is actually sitting in the driver's seat, and by changing the system's behavior to only illuminate the warning when a person is detected in the seat (p=1). So the new circuit equation is:

   w = p AND NOT(s) AND k

In this case, we need a 3-input AND gate. The circuit is shown in Figure 2.21. Be aware that the order of the AND gate's inputs does not matter.

Figure 2.21 Seat belt warning circuit with person sensor.

EXAMPLE 2.8 Seat belt warning light with initial illumination

Let's further extend the previous example. Automobiles typically light up all their warning lights when you first turn the key, so you can check that all the warning lights are working. Assume that our system receives an input t that is 1 for the first 5 seconds after a key is inserted into the ignition, and 0 afterward (don't worry about who or what sets t in that way). So we want w=1 when p=1 and s=0 and k=1, OR when t=1. Note that when t=1, we illuminate the light, regardless of the values of p, s, and k. The new circuit equation is:

   w = (p AND NOT(s) AND k) OR t

The circuit is shown in Figure 2.22.

Figure 2.22 Extended seat belt warning circuit.

Some circuit drawing rules and conventions

There are some rules and conventions that designers commonly follow when drawing circuits of logic gates:

• Logic gates have one or more inputs and one output, but we typically don't label each gate's inputs or output. Remember that the order of the inputs into a gate doesn't impact the logical behavior of the gate.
• Each wire has an implicit direction, going from one gate's output to another gate's input, but we typically don't draw arrows showing each direction.
• A single wire can be branched out into two (or more) wires going to multiple gate inputs; the branches have the same value as the single wire. But two wires can NOT be merged into one wire. What would be the value of that one wire if the incoming two wires had different values?

2.5 BOOLEAN ALGEBRA

Logic gates are useful for implementing circuits, but equations are better for manipulating circuits. The algebraic tools of Boolean algebra enable us to manipulate Boolean equations so we can do things like simplify the equations, check if two equations are equivalent, find the inverse of an equation, prove properties about the equations, etc. Since a Boolean equation consisting of AND, OR, and NOT operations can be straightforwardly transformed into a circuit of AND, OR, and NOT gates, manipulating Boolean equations can be considered as manipulating digital circuits.

We'll informally introduce some of the most useful algebraic tools of Boolean algebra. Appendix A provides a formal definition of Boolean algebra.
Notation and Terminology

We now define some notation and terminology for describing Boolean equations. We'll use these definitions extensively throughout the book.

Operators

Writing out the AND, OR and NOT operators in equations is cumbersome. Thus, Boolean algebra uses simpler notation for those operators.

• "NOT(a)" is typically written as a' or ā. We'll use a', which one speaks of as "a prime." a' is also known as the complement of a, or the inverse of a.
• "a OR b" is typically written as "a + b," specifically intended to look similar to the addition operator in regular algebra. "a + b" is even referred to as the sum of a and b. "a + b" is usually spoken of as "a or b."
• "a AND b" is typically written as "a * b" or "a·b," specifically intended to look similar to the multiplication operator in regular algebra, and even referred to as the product of a and b. Just as in regular algebra, we can even write "ab" for the product of a and b, as long as the fact that a and b are separate variables is clear. "a*b" is usually spoken of as "a and b" or even just as "a b."

Mathematicians often use other notations for Boolean operators, but the above notations seem to be the most popular among engineers, likely due to the intentional similarity of those operators with regular algebra operators.

Using the simpler notation, our earlier seat belt example of:

   w = (p AND NOT(s) AND k) OR t

could be rewritten more concisely as:

   w = ps'k + t

which would be spoken of as "w equals p s prime k, or t."

EXAMPLE 2.9 Speaking Boolean equations

Speak the following equations:

1. F = a'b' + c. Answer: "F equals a prime b prime or c."
2. F = a + b * c'. Answer: "F equals a or b and c prime."

Convert the following spoken equations into written equations:

1. "F equals a b prime c prime." Answer: F = ab'c'.
2. "F equals a b c or d e prime." Answer: F = abc + de'.

The rules of Boolean algebra require that we evaluate expressions using the precedence rule that * has precedence over +, that complementing a variable has precedence over * and +, and that we of course compute what's in parentheses first. We can make the earlier equation's order of evaluation explicit using parentheses as follows: w = (p * (s') * k) + t.

Table 2.1 summarizes Boolean algebra precedence rules.

TABLE 2.1 Boolean algebra precedence, highest precedence first

Symbol   Name          Description
( )      Parentheses   Evaluate expressions nested in parentheses first
'        NOT           Evaluate from left to right
*        AND           Evaluate from left to right
+        OR            Evaluate from left to right

Conventions

Although we borrowed the multiplication and addition operations from regular algebra, and even use the terms sum and product, we don't say "times" for AND or "plus" for OR.

Digital design textbooks typically name each variable using a single character, because using a single character makes for concise equations like the equations above. We'll be writing many equations, so conciseness will aid understanding by preventing equations that wrap across multiple lines or pages. Thus, we'll usually follow the convention of using single characters. However, when you describe digital systems using a hardware description language or a programming language like C, you should probably use much more descriptive names so that your code is readable. So instead of using "s" to represent the output of a seat-belt-fastened sensor, you might instead use "SeatBeltFastened."

EXAMPLE 2.10 Evaluating Boolean equations using precedence rules

Evaluate the following Boolean equations, assuming a=1, b=1, c=0, d=1.

1. F = a * b + c. Answer: * has precedence over +, so we evaluate the equation as F = (1 * 1) + 0 = (1) + 0 = 1 + 0 = 1.
2. F = ab + c. Answer: the problem is identical to the previous problem, using the shorthand notation for *.
3. F = ab'. Answer: we first evaluate b' because NOT has precedence over AND, resulting in F = 1 * (1') = 1 * (0) = 1 * 0 = 0.
4. F = (ac)'. Answer: we first evaluate what is inside the parentheses, then we NOT the result, yielding (1*0)' = (0)' = 0' = 1.
5. F = (a + b') * c + d'. Answer: The parentheses have highest precedence. Inside the parentheses, NOT has highest precedence. So we evaluate the parentheses part as (1 + 1') = (1 + (0)) = (1 + 0) = 1. Next, * has precedence over +, yielding (1 * 0) + 1' = (0) + 1'. The NOT has precedence over the OR, giving (0) + (1') = (0) + (0) = 0 + 0 = 0.

Variables, Literals, Terms, and Sum of Products

Let's define a few more concepts, using the example equation: F(a, b, c) = a'bc + abc' + ab + c.

• Variable: A variable represents a quantity (0 or 1). The above equation has three variables: a, b, and c. We typically use variables in Boolean equations to represent the inputs of our system. Sometimes we explicitly list a function's variables, as above ("F(a, b, c) = ..."). Other times we omit the explicit list ("F = ...").
Literal: A literal is the appearance of a variable, in either true or complemented form. The above equation has 9 literals: a', b, c, a, b, c', a, b, and c.

Product term: A product term is a product of literals. The above equation has four terms: a'bc, abc', ab, and c.

Sum-of-Products: An equation written as an ORing of product terms is known as being in sum-of-products form. The above example equation for F is in sum-of-products form. The following equations are all in sum-of-products form:

abc + abc'
ab + a'c + abc
a + b' + ac (note that a product term can have just one literal)

The following equations are all NOT in sum-of-products form:

(a + b)c
(ab + bc)(b + c)
(a')' + b
a(b + c(d + e))
(ab + bc)'

Some Properties of Boolean Algebra

We now list some of the key rules of Boolean algebra. Assume a, b, and c are Boolean variables, which each hold either the value of 0 or 1.

Basic Properties

The following properties, known as postulates, are assumed to be true:

• Commutative
a + b = b + a
a * b = b * a
This property should be obvious. Just try it for different values of a and b.

• Distributive
a * (b + c) = a * b + a * c
a + (b * c) = (a + b) * (a + c) (this one is tricky!)
Careful, the second one may not be obvious. It's different than regular algebra. But you can verify that both of the distributive properties hold simply by evaluating both sides for all possible values of a, b, and c.

• Associative
(a + b) + c = a + (b + c)
(a * b) * c = a * (b * c)
Again, try it for different values of a, b, and c to see that this holds.

• Identity
0 + a = a + 0 = a
1 * a = a * 1 = a
Makes intuitive sense, right? ORing a with 0 (a+0) just means that the result will be whatever a is. After all, 1+0 is 1, while 0+0 is 0. Likewise, ANDing a with 1 (a*1) results in a. 1*1 is 1, while 0*1 is 0.

• Complement
a + a' = 1
a * a' = 0
This also makes intuitive sense. Regardless of the value of a, a' is the opposite, so you get a 0 and a 1, or you get a 1 and a 0. One of (a, a') will always be a 1, so ORing them (a+a') must yield a 1. Likewise, one of (a, a') will always be a 0, so ANDing them (a*a') must yield a 0.

Let's now apply these basic properties to some digital design examples to see how these properties can help us.

EXAMPLE 2.11 Applying the basic properties of Boolean algebra

Use the properties of Boolean algebra for the following problems:

Show that abc' is equivalent to c'ba.
The commutative property allows us to swap the operands being ANDed, so a*b*c' = a*c'*b = c'*a*b = c'*b*a = c'ba.

Show that abc + abc' = ab.
The first distributive property allows us to factor out the ab term: abc + abc' = ab(c+c'). Then, the complement property allows us to replace the c+c' by 1: ab(c+c') = ab(1). Finally, the identity property allows us to remove the 1 from the AND term: ab(1) = ab*1 = ab.

Show that the equation x + x'z is equivalent to x + z.
The second distributive property (the tricky one) allows us to replace x+x'z by (x+x')*(x+z). The complement property allows us to replace (x+x') by 1, and the identity property allows us to replace 1*(x+z) by x+z.

EXAMPLE 2.12 Simplification of an automatic sliding door system

Suppose you wish to design a system to control an automatic sliding door, like one that might be found at a grocery store's entrance. An input p to our system indicates whether a sensor detects a person in front of the door (p=1 means a person is detected). An input h indicates whether the door should be manually held open (h=1) regardless of whether a person is detected. An input c indicates whether the door should be forced to stay closed (like when the store is closed for business); c=1 means the door should stay closed. The latter two would normally be set by a manager with the proper keys. An output f opens the door when f is 1. We want to open the door if the door is set to be manually held open, OR if the door is not set to be manually held open but a person is detected. However, in either case, we only open the door if the door is not set to stay closed. We can translate these requirements into a Boolean equation as:

f = hc' + h'pc'

Figure 2.23 Initial door opener circuit.
We could build a circuit to implement this equation as in Figure 2.23.

Now let's manipulate the equation using the properties described earlier. Looking at the equation, we believe we can factor out the c'. We might then be able to simplify the remaining "h+h'p" part too. Let's try some transformations, first factoring out c':

f = hc' + h'pc'
f = c'h + c'h'p (by the commutative property)
f = c'(h + h'p) (by the first distributive property)
f = c'((h+h')*(h+p)) (by the 2nd distributive property, the tricky one)
f = c'((1)*(h+p)) (by the complement property)
f = c'(h+p) (by the identity property)

Note that the simpler equation still makes intuitive sense: we open the door only if the door is not set to stay closed (c'), AND either the door is set to be manually held open (h) OR a person is detected (p). A circuit implementing this equation is shown in Figure 2.24. Thus, by applying the algebraic properties, we obtained a simpler circuit. In other words, we used math to simplify the circuit.

Figure 2.24 Simplified door opener circuit.

Simplification of logic circuits will be the focus of Section 2.11.

EXAMPLE 2.13 Equivalence of two automatic sliding door systems

Suppose you found a really cheap device for automatic sliding door systems. The device had inputs c, h, and p and output f, as in Example 2.12, but the device's documentation said that:

f = c'hp + c'hp' + c'h'p

Does that device do the same as that in Example 2.12? One way to check is to see if we can manipulate the above equation into the equation in Example 2.12:

f = c'hp + c'hp' + c'h'p
f = c'h(p + p') + c'h'p (by the distributive property)
f = c'h(1) + c'h'p (by the complement property)
f = c'h + c'h'p (by the identity property)
f = hc' + h'pc' (by the commutative property)

That's the same as the original equation of Example 2.12, so the device should work for us.

Additional Properties

Let's consider some additional properties, which happen to be known as theorems because they can be proven using the above postulates:

• Null elements
a + 1 = 1
a * 0 = 0
These should be fairly obvious. 1 OR anything is going to be 1, while 0 AND anything is going to be 0.

• Idempotent Law
a + a = a
a * a = a
Again, this should be fairly obvious. If a is 1, 1+1=1 and 1*1=1, while if a is 0, 0+0=0 and 0*0=0.

• Involution Law
(a')' = a
Again, fairly obvious. If a is 1, the first negation gives 0, while the second gives 1 again. Likewise, if a is 0, the first negation gives 1, while the second gives 0 again.

• DeMorgan's Law
(a + b)' = a'b'
(ab)' = a' + b'
These are not as obvious. Their proofs are in Appendix A. Let's consider both equations intuitively here. Consider (a + b)' = a'b'. The left side will only be 1 if (a + b) evaluates to 0, which only occurs when both a AND b are 0, meaning a'b', the right side. Likewise, consider (ab)' = a' + b'. The left side will only be 1 if (ab) evaluates to 0, meaning at least one of a OR b must be 0, meaning a' + b', the right side. DeMorgan's Law can be stated in English as follows: The complement of a sum equals the product of the complements; the complement of a product equals the sum of the complements. DeMorgan's Law is widely used, so take the time now to understand it and to remember it.

Let's apply some of these additional properties in more examples.

EXAMPLE 2.14 Applying the additional properties

Convert the equation F = ab(c+d) into sum-of-products form.
The distributive property allows us to "multiply out" the equation to F = abc + abd.

Convert the equation F = wx(x'y + zy' + xy) into sum-of-products form, and make any obvious simplifications.
The distributive property allows us to "multiply out" the equation: wx(x'y + zy' + xy) = wxx'y + wxzy' + wxxy. That equation is in sum-of-products form. The complement property allows us to replace wxx'y by w*0*y, and the null element property means that w*0*y = 0. The idempotent property allows us to replace wxxy by wxy (because xx = x). The resulting equation is 0 + wxzy' + wxy = wxzy' + wxy.

Prove that x(x' + y(x' + y')) can never evaluate to 1.
Repeated application of the first distributive property yields: xx' + xy(x' + y') = xx' + xyx' + xyy'. The complement property tells us that xx'=0 and yy'=0, yielding 0 + 0*y + x*0. The null element property leads to 0 + 0 + 0, which equals 0. Thus, the equation always evaluates to 0, regardless of the actual values of x and y.

Determine the opposite function of F = (ab' + c).
The desired function is G = F' = (ab' + c)'. DeMorgan's Law yields G = (ab')' * c'. Applying DeMorgan's Law again to the first term yields G = (a' + (b')') * c'. The involution property yields (a' + b) * c'. Finally, the distributive property yields G = a'c' + bc'.
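Because a Boolean variable can hold only 0 or 1, each postulate and theorem in this section can be confirmed by trying every combination of values, as the text suggests for the distributive property. The following is a minimal Python sketch and not part of the original text; the names `NOT`, `holds`, `LAWS`, and `RESULTS` are made up for illustration, and Python's `&` and `|` on the integers 0 and 1 are assumed to stand in for AND and OR.

```python
from itertools import product

def NOT(x):
    """Complement of a single bit (0 -> 1, 1 -> 0)."""
    return 1 - x

def holds(law, num_vars):
    """Return True if `law` is true for every combination of 0/1 inputs."""
    return all(law(*bits) for bits in product((0, 1), repeat=num_vars))

# Each entry maps a property name to (check function, number of variables).
LAWS = {
    "commutative +":    (lambda a, b: (a | b) == (b | a), 2),
    "commutative *":    (lambda a, b: (a & b) == (b & a), 2),
    "distributive 1":   (lambda a, b, c: (a & (b | c)) == ((a & b) | (a & c)), 3),
    "distributive 2":   (lambda a, b, c: (a | (b & c)) == ((a | b) & (a | c)), 3),
    "associative +":    (lambda a, b, c: ((a | b) | c) == (a | (b | c)), 3),
    "associative *":    (lambda a, b, c: ((a & b) & c) == (a & (b & c)), 3),
    "identity":         (lambda a: (0 | a) == a and (1 & a) == a, 1),
    "complement":       (lambda a: (a | NOT(a)) == 1 and (a & NOT(a)) == 0, 1),
    "null elements":    (lambda a: (a | 1) == 1 and (a & 0) == 0, 1),
    "idempotent":       (lambda a: (a | a) == a and (a & a) == a, 1),
    "involution":       (lambda a: NOT(NOT(a)) == a, 1),
    "DeMorgan sum":     (lambda a, b: NOT(a | b) == (NOT(a) & NOT(b)), 2),
    "DeMorgan product": (lambda a, b: NOT(a & b) == (NOT(a) | NOT(b)), 2),
}

# Evaluate every property over all input combinations.
RESULTS = {name: holds(law, n) for name, (law, n) in LAWS.items()}

for name, ok in RESULTS.items():
    print(f"{name:18s} {'holds' if ok else 'FAILS'}")
```

Exhaustive checking like this is only practical because the properties involve at most three variables; it confirms a property but does not replace the algebraic proofs the text refers to.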
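Algebraic manipulations such as those in Examples 2.12 through 2.14 are easy to get wrong by hand, so it can help to compare the starting and ending equations over every input combination. The sketch below is an illustration, not part of the original text; the function names are hypothetical, and the equations are transcribed from the examples.

```python
from itertools import product

def NOT(x):
    """Complement of a single bit."""
    return 1 - x

def door_original(c, h, p):
    # Example 2.12 before simplification: f = hc' + h'pc'
    return (h & NOT(c)) | (NOT(h) & p & NOT(c))

def door_simplified(c, h, p):
    # Example 2.12 after simplification: f = c'(h + p)
    return NOT(c) & (h | p)

def door_cheap_device(c, h, p):
    # Example 2.13: f = c'hp + c'hp' + c'h'p
    return (NOT(c) & h & p) | (NOT(c) & h & NOT(p)) | (NOT(c) & NOT(h) & p)

def never_one(x, y):
    # Example 2.14: x(x' + y(x' + y'))
    return x & (NOT(x) | (y & (NOT(x) | NOT(y))))

def f_2_14(a, b, c):
    # Example 2.14: F = ab' + c
    return (a & NOT(b)) | c

def g_2_14(a, b, c):
    # Example 2.14: G = a'c' + bc', claimed to be the opposite of F
    return (NOT(a) & NOT(c)) | (b & NOT(c))

def equivalent(f, g, num_vars):
    """True if f and g produce the same output for every 0/1 input combination."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=num_vars))

print(equivalent(door_original, door_simplified, 3))
print(equivalent(door_cheap_device, door_original, 3))
print(all(never_one(x, y) == 0 for x, y in product((0, 1), repeat=2)))
print(equivalent(g_2_14, lambda a, b, c: NOT(f_2_14(a, b, c)), 3))
```

Each printed value should be `True` if the derivations in the examples are correct.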

EXAMPLE 2.15 Applying DeMorgan's Law in an aircraft lavatory sign

Commercial aircraft typically have an illuminated sign indicating whether a lavatory (bathroom) is available. Suppose an aircraft has three lavatories. Each lavatory has a sensor outputting 1 if the lavatory door is locked, 0 otherwise. Our circuit will have three inputs, a, b, and c, coming from those sensors, as shown in Figure 2.25. If any lavatory door is unlocked (whether one, two, or all three doors are unlocked), we should illuminate the "Available" sign by setting the circuit's output S to 1.

Figure 2.25 Aircraft lavatory sign block.

With this understanding, we recognize that the OR function suits the problem, as OR outputs 1 if any of its inputs are 1, regardless of how many inputs are 1. We begin by writing an equation for S. S should be 1 if a is 0 OR b is 0 OR c is 0. Saying a is 0 is the same as saying a'. Thus, the equation for S is:

S = a' + b' + c'

We translate the equation to the circuit in Figure 2.26.

Figure 2.26 Aircraft lavatory sign circuit.

We can apply DeMorgan's Law (in reverse) to the equation by noting that (abc)' = a' + b' + c', so we can replace the equation by:

S = (abc)'

The circuit for that equation appears in Figure 2.27.

Figure 2.27 Circuit after applying DeMorgan's Law.

EXAMPLE 2.16 Proving a property of the automatic sliding door system

Your boss wants you to prove that the automatic sliding door circuit of Example 2.12 ensures that the door will stay closed when the door is supposed to be forced to stay closed, namely, when c=1. If the function f = c'(h+p) describes the sliding door, you can prove the door will stay closed (f=0) using properties of Boolean algebra:

f = c'(h+p)
Let c = 1 (door forced closed)
f = 1'(h+p)
f = 0(h+p)
f = 0h + 0p (by the distributive property)
f = 0 + 0 (by the null elements property)
f = 0

Therefore, no matter what the values of h and p, if c=1, f will equal 0, and the door will stay closed.

EXAMPLE 2.17 Automatic sliding door with opposite polarity

In Example 2.12, we computed the function to open an automatic sliding door as:

f = c'(h+p)

Suppose your automatic door control has an input with the opposite polarity as what we expect: 0 means open the door, while 1 means close. We can compute the function g that opens the door, and simplify that function, as follows:

g = f'
g = (c'(h+p))' (by substituting the equation for f)
g = (c')' + (h+p)' (by DeMorgan's Law)
g = c + (h+p)' (by the Involution Law)
g = c + h'p' (by DeMorgan's Law)

YOUR PROBLEM IS MY PROBLEM
The use of Boolean algebra for digital design is an example of the powerful general concept of mapping one problem to another. By mapping a new problem (digital design) to an old problem (logic representation), the solutions (Boolean algebra) to the old problem can be applied to the new problem. Immediately, the new problem can benefit from perhaps decades of work of solving the old problem. Mapping one problem to another is extremely common in engineering, especially in computing. After all, why reinvent the wheel?

2.6 REPRESENTATIONS OF BOOLEAN FUNCTIONS

A Boolean function is a mapping of each possible combination of values for the function's variables (the inputs) to either a 0 or 1 (the output). An example of a Boolean function described in regular English is a function F of variables a and b, such that the function outputs 1 when a is 0 and b is 0, or when a is 0 and b is 1. There are several better representations than English for describing a Boolean function, including equations, circuits, and truth tables, as shown in Figure 2.28. Each representation has its own advantages and disadvantages, and each is useful at different times during design. Yet all the representations, as different as they look from one another, represent the very same function. It's like how there are different ways to represent a particular recipe for chocolate chip cookies: written words, pictures, or even a video. But no matter how the recipe is represented, it's the same recipe.

Figure 2.28 Seven representations of the very same function F(a,b): (a) two English descriptions, (b) two equations, (c) two circuits, (d) a truth table.

(a) English 1: "F outputs 1 when a is 0 and b is 0, or when a is 0 and b is 1."
    English 2: "F outputs 1 when a is 0, regardless of b's value."
(b) Equation 1: F(a,b) = a'b' + a'b
    Equation 2: F(a,b) = a'
(c) Circuit 1 and Circuit 2 (circuit diagrams)
(d) Truth table:
    a b | F
    0 0 | 1
    0 1 | 1
    1 0 | 0
    1 1 | 0
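The idea that a Boolean function is a mapping, independent of how it is written down, can be made concrete in a few lines of code. The sketch below is illustrative only and not from the original text: it encodes the two English descriptions of F and the two equations a'b' + a'b and a' (the pair this section uses for F) as hypothetical Python functions and collects each one's input-to-output mapping.

```python
from itertools import product

def english_1(a, b):
    # "F outputs 1 when a is 0 and b is 0, or when a is 0 and b is 1."
    return 1 if (a == 0 and b == 0) or (a == 0 and b == 1) else 0

def english_2(a, b):
    # "F outputs 1 when a is 0, regardless of b's value."
    return 1 if a == 0 else 0

def equation_1(a, b):
    # F(a,b) = a'b' + a'b, with 1 - x standing in for the complement x'
    return ((1 - a) & (1 - b)) | ((1 - a) & b)

def equation_2(a, b):
    # F(a,b) = a'
    return 1 - a

def mapping(func):
    """The function as an explicit mapping from (a, b) to the output."""
    return {(a, b): func(a, b) for a, b in product((0, 1), repeat=2)}

MAPPINGS = [mapping(f) for f in (english_1, english_2, equation_1, equation_2)]
print(MAPPINGS[0])
print(all(m == MAPPINGS[0] for m in MAPPINGS))
```

All four descriptions yield the same dictionary, which is the sense in which they are the same function.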
Equations

One way to represent a Boolean function is by using an equation. An equation is a mathematical statement equating one expression with another. F(a,b) = a'b' + a'b is an example of an equation. The right-hand side of the equation is often referred to as an expression, which evaluates to either 0 or 1.

We've seen that different equations can represent the same function. The equation F(a,b) = a'b' + a'b represents the same function as does the equation F(a,b) = a'. Both equations perform exactly the same mapping of the input values to output values: pick any input values (e.g., a=0 and b=0), and both equations map those input values to the same output value (e.g., a=0 and b=0 would be mapped to F=1 by either equation).

One advantage of an equation as a Boolean function representation compared to other representations is that we can easily manipulate an equation using properties of Boolean algebra, enabling us to simplify an equation, prove that two equations represent the same function, prove properties about a function, and more.

Circuits

A second way to represent a Boolean function is using a circuit of logic gates. A circuit is an interconnection of components. Because each logic gate component has a predefined mapping of input values to output values, and because wires just transmit their values unchanged, a circuit describes a function.

We've seen that different circuits can represent the same function. The two circuits in Figure 2.28 both represent the same function F. The bottom circuit uses fewer gates, but the function is exactly the same as the top circuit.

One advantage of a circuit as a Boolean function representation compared to other representations is that a circuit may represent an actual physical implementation of a Boolean function, and ultimately our goal is to implement digital circuits physically. Another advantage is that a circuit drawn graphically can enable quick and easy comprehension of a function by humans.

Truth Table

A third way to represent a Boolean function is using a truth table. A truth table's left side lists the input variables, and shows all possible value combinations of those inputs, with one row per combination, as shown in Figure 2.29. A truth table's right side would then list the function's output value (1 or 0) for the row's particular combination of input values, as was shown in Figure 2.28(d). Any function of two variables will have those four input combinations on the left side.

Figure 2.29 Truth table structure for a two-input function F(a,b).

Inputs | Output
a b    | F
0 0    |
0 1    |
1 0    |
1 1    |

People usually list the input combinations in order of increasing binary value (00=0, 01=1, 10=2, 11=3), as we've done above, though strictly speaking, we could list the combinations in any order as long as we listed all possible combinations. For any combination of input values (e.g., a=0, b=0), we merely need to look at the corresponding value in the output column (in the case of a=0, b=0, the output shown in Figure 2.28(d) is 1) to determine the function's output. Figure 2.30 shows the truth table structures for a two-input function, a three-input function, and a four-input function.

Figure 2.30 Truth table structure for: (a) a two-input function F(a,b), (b) a three-input function F(a,b,c), and (c) a four-input function F(a,b,c,d). Defining a specific function would involve filling in the rightmost column for F with a 0 or a 1 for each row.

(a) a b | F     (b) a b c | F     (c) a b c d | F
    0 0 |           0 0 0 |           0 0 0 0 |
    0 1 |           0 0 1 |           0 0 0 1 |
    1 0 |           0 1 0 |           0 0 1 0 |
    1 1 |           0 1 1 |           0 0 1 1 |
                    1 0 0 |           0 1 0 0 |
                    1 0 1 |           0 1 0 1 |
                    1 1 0 |           0 1 1 0 |
                    1 1 1 |           0 1 1 1 |
                                      1 0 0 0 |
                                      1 0 0 1 |
                                      1 0 1 0 |
                                      1 0 1 1 |
                                      1 1 0 0 |
                                      1 1 0 1 |
                                      1 1 1 0 |
                                      1 1 1 1 |

Truth tables are not only found in digital design. If you've studied basic biology, you've likely seen a type of truth table describing the outcome of various gene pairs. For example, the table below shows outcomes for different eye color genes. Each person has two genes for eye color, one (labeled M) from the mom, one (labeled D) from the dad. Assuming only two possible values for each gene, blue and brown, the table lists all possible combinations of eye color gene pairs that a person may have. For each combination, the table lists the outcome. Only when a person has two blue eye genes will they have blue eyes; having one or two brown eye genes results in brown eyes (due to the brown eye gene being dominant over the blue eye gene).

Gene pair      | Outcome
M      D       |
blue   blue    | blue
blue   brown   | brown
brown  blue    | brown
brown  brown   | brown

Unlike equations and circuits, a Boolean function has only one truth table representation.

One advantage of a truth table as a Boolean function representation compared to other representations is the fact that a function has only one truth table representation, so we can convert any other Boolean function representation to a truth table to determine if different representations represent the same function: if they represent the same function, their truth tables will be identical. Truth tables are also quite intuitive to human readers, as a truth table clearly shows the output for every possible input. Thus, notice that we used truth tables in Figure 2.8 to describe in an intuitive manner the behavior of basic logic gates.
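The truth table structure described above, with one row per input combination listed in order of increasing binary value, is straightforward to generate programmatically. The following is a small Python sketch under that assumption; `truth_table_rows`, `truth_table`, and `same_function` are hypothetical helper names, not anything defined in the text.

```python
from itertools import product

def truth_table_rows(num_inputs):
    """All input combinations, in order of increasing binary value."""
    return list(product((0, 1), repeat=num_inputs))

def truth_table(func, num_inputs):
    """Pair each input combination with the function's output for it."""
    return [(bits, func(*bits)) for bits in truth_table_rows(num_inputs)]

def same_function(f, g, num_inputs):
    """Two representations describe the same function exactly when
    their truth tables are identical."""
    return truth_table(f, num_inputs) == truth_table(g, num_inputs)

# The two equations used in this section for F(a,b).
def eq1(a, b):
    return ((1 - a) & (1 - b)) | ((1 - a) & b)   # a'b' + a'b

def eq2(a, b):
    return 1 - a                                  # a'

for bits, out in truth_table(eq1, 2):
    print(*bits, "|", out)
print(same_function(eq1, eq2, 2))
print([len(truth_table_rows(n)) for n in (2, 3, 4)])
```

`itertools.product((0, 1), repeat=n)` yields tuples in lexicographic order, which for bits coincides with increasing binary value, so the rows come out in the conventional order.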
A drawback of truth tables is that for a large number of inputs, the number of truth table rows can be extremely large. Given a function with n inputs, the number of input combinations is 2^n. A function with 10 inputs would have 2^10 = 1024 possible input combinations; you can't easily see much of anything in a table having 1024 rows. A function with 16 inputs would have 65,536 rows in its truth table.

EXAMPLE 2.18 Capturing a function as a truth table

Create a truth table describing a function that detects whether a three-bit input's value, representing a binary number, is 5 or greater. Table 2.2 shows a truth table for the function. We first list all possible combinations of the three input bits, which we've labeled a, b, and c. We then enter a 1 in the output row if the inputs represent 5, 6, or 7 in binary. We enter 0s in all remaining rows.

TABLE 2.2 Truth table for 5-or-greater function.

a b c | F
0 0 0 | 0
0 0 1 | 0
0 1 0 | 0
0 1 1 | 0
1 0 0 | 0
1 0 1 | 1
1 1 0 | 1
1 1 1 | 1

Converting among Boolean Function Representations

Given the above representations, we can view combinational logic design as defining the appropriate Boolean function to solve a particular problem, and then creating a circuit representation of that function. Defining the appropriate Boolean function requires not only that we think about what that function should be, but also that we capture that function in some form, typically either as an equation or a truth table. Then, we must convert the captured function representation into a circuit. Thus, combinational logic design requires that we know how to convert from one Boolean function representation to another. For the three representations we have discussed so far (equations, circuits, and truth tables), there are six possible conversions from one representation to another, which we now describe (Figure 2.31).

Figure 2.31 Possible conversions from one Boolean function representation to another.

1. Equations to circuits
Converting an equation to a circuit can be done straightforwardly by using an AND gate for every AND operator, an OR gate for every OR operator, and a NOT gate for every NOT operator. We already gave several examples of such conversions in Section 2.4.

2. Circuits to equations
Converting a circuit into an equation can be done by starting from the circuit's inputs, and then writing the output of each gate as an expression involving the gate's inputs. The expression of the last gate before the output represents the expression for the circuit's function. For example, suppose we are given the circuit in Figure 2.32. To convert to an equation, we start with the inverter, whose output will represent c'. We continue with the OR gate; note that we can't determine the output for the AND gate yet until we create expressions for all that gate's inputs. The OR gate's output represents h+p. Finally, we write the output of the AND as c'(h+p). Thus, the equation F(c,h,p) = c'(h+p) represents the same function as the circuit.

Figure 2.32 Converting a circuit to an equation.

3. Equations to truth tables
Converting an equation to a truth table can be done by first creating a truth table structure appropriate for the number of function input variables, and then evaluating the right-hand side of the equation for each combination of input values. For example, to convert the equation F(a,b) = a'b' + a'b to a truth table, we would first create the truth table structure for a two-input function, as shown in Figure 2.30(a). We would then evaluate the right-hand side of the equation for each row's combination of input values, as follows:

a=0 and b=0, F = 0'*0' + 0'*0 = 1*1 + 1*0 = 1 + 0 = 1
a=0 and b=1, F = 0'*1' + 0'*1 = 1*0 + 1*1 = 0 + 1 = 1
a=1 and b=0, F = 1'*0' + 1'*0 = 0*1 + 0*0 = 0 + 0 = 0
a=1 and b=1, F = 1'*1' + 1'*1 = 0*0 + 0*1 = 0 + 0 = 0

We would therefore fill in the table's right column as shown in Figure 2.33. Note that we applied properties of Boolean algebra (mostly the identity property and null elements property) to evaluate the equations.

Figure 2.33 Truth table for F(a,b)=a'b'+a'b.

Inputs | Output
a b    | F
0 0    | 1
0 1    | 1
1 0    | 0
1 1    | 0

Notice that converting the equation F(a,b) = a' to a truth table results in exactly the same truth table as shown in Figure 2.33. In particular, evaluating the right-hand side of the equation for each row's combination of input values yields:

a=0 and b=0, F = 0' = 1
a=0 and b=1, F = 0' = 1
a=1 and b=0, F = 1' = 0
a=1 and b=1, F = 1' = 0

Some people find it useful to create intermediate columns in the truth table to compute the equation's intermediate values, thus filling each column of the table from left to right, moving to the next column only after filling all rows of the current column. An example for the equation F(a,b) = a'b' + a'b is shown in Figure 2.34.

Figure 2.34 Truth table for F(a,b)=a'b'+a'b with intermediate columns.

Inputs |            | Output
a b    | a'b'  a'b  | F
0 0    | 1     0    | 1
0 1    | 0     1    | 1
1 0    | 0     0    | 0
1 1    | 0     0    | 0
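The equation-to-truth-table conversion with intermediate columns can be sketched in code as follows. This is an illustrative Python example, not part of the original text: `table_with_intermediates` and `five_or_greater` are hypothetical names, and the second function assumes input a is the most significant bit, as the row order of Table 2.2 implies.

```python
def NOT(x):
    """Complement of a single bit."""
    return 1 - x

def table_with_intermediates():
    """Rows of (a, b, a'b', a'b, F) for F(a,b) = a'b' + a'b."""
    rows = []
    for a in (0, 1):
        for b in (0, 1):
            term1 = NOT(a) & NOT(b)    # intermediate column a'b'
            term2 = NOT(a) & b         # intermediate column a'b
            rows.append((a, b, term1, term2, term1 | term2))
    return rows

def five_or_greater(a, b, c):
    """Example 2.18: 1 if the three-bit number abc is 5 or greater."""
    return 1 if 4 * a + 2 * b + c >= 5 else 0

print("a b | a'b' a'b | F")
for a, b, t1, t2, f in table_with_intermediates():
    print(f"{a} {b} |  {t1}    {t2}  | {f}")

# Output column of the 5-or-greater table, rows 000 through 111.
print([five_or_greater(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)])

# Row counts for 10-input and 16-input functions.
print(2 ** 10, 2 ** 16)
```

Filling the intermediate columns first mirrors the left-to-right procedure in the text: each product term is computed per row, and only then ORed into F.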
4. Truth tables to equations
To convert a truth table to an equation, we create a product term for each 1 in the output column, and we then OR all the product terms. For the table in Figure 2.35, we get the terms shown in the rightmost column of that table. ORing those terms yields F = a'b' + a'b.

Figure 2.35 Converting a truth table to an equation.

Inputs | Outputs | Term
a b    | F       | F = sum of
0 0    | 1       | a'b'
0 1    | 1       | a'b
1 0    | 0       |
1 1    | 0       |

5. Circuits to truth tables
We can convert a combinational circuit to a truth table by first converting the circuit to an equation (described earlier), and then converting the equation to a truth table (described earlier).

6. Truth tables to circuits
We can convert a truth table to a circuit by first converting the truth table to an equation (described earlier), and then converting the equation to a circuit (described earlier).

EXAMPLE 2.19 Parity generator circuit design starting from a truth table

Nothing is perfect, and digital circuits are no exception. Sometimes a bit on a wire changes when it's not supposed to. So a 1 becomes a 0, or a 0 becomes a 1. For example, a 0 may be traveling along a wire, when suddenly some electrical noise comes out of nowhere and changes the 0 to a 1. While we can reduce the likelihood of such errors, perhaps by using well-insulated wires, we can't completely prevent such errors, nor can we detect and correct all of them, but we can detect some of them. Designers typically look for situations where errors are likely to occur, such as data being transmitted between two chips over long wires, like from a computer over a printer cable to a printer, or from a computer over a telephone line to another computer. For those situations, designers add circuits that at least try to detect that an error has occurred, in which case the receiving circuit can ask the sending circuit to resend the data.

One common method of detecting an error is called parity. Say we have 7 data bits to transmit. We add an extra bit, called the parity bit, to make 8 bits total. The sender sets the parity bit to a 1 if that would make the total number of 1s even; that's called even parity. For example, if the 7 data bits were 0000001, then the parity bit would be 1, making the total number of 1s equal to 2 (an even number). The complete 8 bits would be 00000011. If the 7 data bits were 1011111, then the parity bit would be 0, making the total number of 1s equal to 6 (an even number). The complete 8 bits would be 10111110.

The receiver now can detect if a bit has changed during transmission by checking that there's an even number of 1s in the 8 bits received. If even, the transmission is assumed correct. If not even, an error occurred during transmission. For example, if the receiver receives 00000011, the transmission is assumed to be correct, and the parity bit can be discarded, leaving 0000001. Suppose instead that an error occurred and the receiver receives 10000011. Seeing the odd number of 1s, the receiver knows that an error occurred; note that the receiver does not know which bit is erroneous. Likewise, 00000010 would represent an error too. Notice in this case that the error occurred in the parity bit, but the receiver doesn't know where the error occurred.

Let's describe a function that generates an even parity bit P for 3 data bits a, b, and c. Starting from an equation is hard: what's the equation? For this example, starting with a truth table is the more natural choice, as shown in Table 2.3. For each configuration of data bits (i.e., for each row in the truth table), we set the parity bit to make the total number of 1s even. From the truth table, we then obtain the following equation for the parity bit:

P = a'b'c + a'bc' + ab'c' + abc

We could then design the circuit using four AND gates and an OR gate.

Margin note: For this example, starting from a truth table is a more natural choice than an equation.

TABLE 2.3 Even parity for 3-bit data.

a b c | P
0 0 0 | 0
0 0 1 | 1
0 1 0 | 1
0 1 1 | 0
1 0 0 | 1
1 0 1 | 0
1 1 0 | 0
1 1 1 | 1

Note that even parity doesn't mean for sure that the data is correct (note that we were careful to say earlier that the transmission was "assumed" to be correct if the parity was correct). In particular, if two errors occur on different bits, then the parity will still be even. For example, the sender may send 0110, but the receiver may receive 1111. 1111 has even parity and thus looks correct. More powerful error detection methods are possible to detect multiple errors like this one, but at the price of adding extra bits.

Odd parity is also a common kind of parity: the parity bit value makes the total number of 1s odd. There's no quality difference between even parity and odd parity; the key is simply that the sender and receiver must both use the same kind of parity, even or odd.

A popular representation of letters and numbers is known as ASCII, which encodes each character into 7 bits. ASCII adds 1 bit for parity, for a total of 8 bits per character.

EXAMPLE 2.20 Converting a combinational circuit to a truth table

Convert the circuit depicted in Figure 2.36(a) into a truth table.

We begin by converting the circuit to an equation. Starting from the gates closest to the inputs (the leftmost AND gate and the inverter in this case), we label each gate's output as an expression of the gate's inputs. We label the leftmost AND gate's output, for example, as ab. Likewise, we label the leftmost inverter's output as c'. Continuing through the circuit's gates, we label the rightmost inverter's output as (ab)'. Finally, we label the rightmost AND gate's output as (ab)'c', which corresponds to the Boolean equation for F. The fully labeled circuit is shown in Figure 2.36(b).

From the Boolean equation, we can now construct the truth table for the combinational circuit. Since our circuit has three inputs (a, b, and c), there are 2^3 = 8 possible combinations of inputs (i.e., abc = 000, 001, 010, 011, 100, 101, 110, 111), so our truth table has the eight rows shown in Figure 2.37. For each input, we compute the value of F and fill in the corresponding entry in the truth table. For example, when a=0, b=0, and c=0, F is (00)' * 0' = (0)' * 1 = 1 * 1 = 1. We compute the circuit's output for the remaining combinations of inputs using a truth table with intermediate values, shown in Figure 2.37.

Figure 2.36 (a) Combinational circuit, and (b) circuit with gates' output expressions labeled.
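The truth-table-to-equation conversion (one product term per output 1, then OR the terms) and the parity scheme of Example 2.19 can be sketched together in Python. This is an illustration under stated assumptions rather than anything from the original text: `even_parity_bit`, `has_even_parity`, and `sum_of_products` are hypothetical names, and equations are produced as strings using the text's prime notation.

```python
from itertools import product

def even_parity_bit(*data_bits):
    """Parity bit that makes the total number of 1s even."""
    return sum(data_bits) % 2

def has_even_parity(bit_string):
    """Receiver-side check on a string of '0'/'1' characters."""
    return bit_string.count("1") % 2 == 0

def sum_of_products(func, names):
    """Build an equation string with one product term per row whose output is 1.

    A variable appears complemented (with a prime) when its bit is 0 in that row.
    """
    terms = []
    for bits in product((0, 1), repeat=len(names)):
        if func(*bits) == 1:
            terms.append("".join(n if bit else n + "'" for n, bit in zip(names, bits)))
    return " + ".join(terms) if terms else "0"

# Parity generator for 3 data bits (Table 2.3).
print("P =", sum_of_products(even_parity_bit, "abc"))

# The two-input function of Figure 2.35.
print("F =", sum_of_products(lambda a, b: 1 - a, "ab"))

# Sender and receiver examples from the text.
print(even_parity_bit(*map(int, "0000001")), even_parity_bit(*map(int, "1011111")))
print(has_even_parity("00000011"), has_even_parity("10000011"), has_even_parity("1111"))
```

The last line illustrates the limitation noted in the example: a word with two flipped bits, such as 1111 received in place of 0110, still passes the even-parity check.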

Figure 2.37 Truth table for the circuit's equation.

Inputs | Outputs
a b c  | ab  (ab)'  c'  F
0 0 0  | 0   1      1   1
0 0 1  | 0   1      0   0
0 1 0  | 0   1      1   1
0 1 1  | 0   1      0   0
1 0 0  | 0   1      1   1
1 0 1  | 0   1      0   0
1 1 0  | 1   0      1   0
1 1 1  | 1   0      0   0

Standard Representation and Canonical Form

Truth tables as a Boolean function standard representation

We stated earlier that, although there are many possible equation representations and circuit representations of the same Boolean function, there is only one possible truth table representation of a Boolean function. Truth tables therefore represent a standard representation of a function: for any function, there may be many possible equations, and many possible circuits, but there is only one truth table. The truth table representation is unique.

One use of a standard representation of a Boolean function is for comparing two functions to see if they are equivalent. Suppose you wanted to check if two Boolean equations were equivalent.

While comparing truth tables works fine when a function has only 2 inputs, what if a function has 5 inputs, or 10, or 32? Creating truth tables becomes increasingly cumbersome, and in many cases just plain unrealistic, since a truth table's number of rows equals 2^n, where n is the number of inputs. 2^n grows very quickly. 2^32 is approximately 4 billion, for example. We can't realistically expect to compare 2 tables of 4 billion rows each.

However, in many cases, the number of output 1s in a truth table may be very small compared to the number of output 0s. For example, consider a function G of 5 variables a, b, c, d, and e: G = abcd + a'bcde. A truth table for this function would have 32 rows, but only three 1s in the output column: one 1 from a'bcde, and two 1s from abcd (which covers rows corresponding to abcde and abcde'). This leads to the question: Is there a more compact but still standard representation of a Boolean function?

Canonical Form: Sum-of-Minterms Equation

The answer to the above question is "yes". The key is to create a standard representation that only describes the situations where the function outputs 1, with the other situations assumed to output 0. An equation, such as G = abcd + a'bcde, is indeed a representation that only describes the situations where G is 1, but that representation is not unique; that is, the representation is not standard. We therefore want to define a standard form of a Boolean equation, known as a canonical form.

You've seen canonical forms in regular algebra. For example, the canonical form of a polynomial of degree two is: ax^2 + bx + c. To check if the equation 9x^2 + 3x + 2 + 1 is equivalent to the equation 3 * (3x^2 + 1 + x), we convert each to canonical form, resulting in 9x^2 + 3x + 3 for both equations.

One way would be to try to manipulate one equation to be the
same as th e ot her equation. like we did in our automatic sliding door example in Example One canonical form for a Boolean function i known as a um-of-minterrns. A
2. 13. But suppose we were not successful in gelling them to be the sa me- is that because mil/term of a function is a product term whose literals include every variable of the func-
they really arc not the same, or because we just didn't manipul ate the equation enough? lion eraclly ollce. in either true or complemented form. The function F (a . b . e l = a' bc
How do we really know the two equations are not the same? + abc ' + ab + c has four terms. The firs t two terms, a' be and abc ' . are minterrns.
A conclusive way to check if two The third term, a b. is not a minterm since c does not appear. Likewise. the fourth term. C.
fu nctions are the same is to create a truth is not a min term, since neither a nor b appears in that term. An equation i in sum-oJ-min-
table for each. and then check whether the F=ab+a ' F = a'b'+
a' b + ab terms Jorm if the equation is in sum-of-product form. and every product term i a mimerm.
truth tables are identical. So to determine b F a b F Convening any equ ation to sum-of-minterms canonical form can be done follo\\i n!!
whether F = a b + a ' is eq uivalent to F o o 1 o o 1 just a few steps: -
= a ' b ' + a ' b + a b. we could gen- 1 o 1 1
erate truth tables for each, using the o o o l. First, we manipulate the eq uation until il i
III um-of-product form . uppo ewe

method described earlier of evaluating the are given the equation F( a . b . e) =( a+b)(a '+ aclb. We manipulate it as
function for eac h output row, as shown to follows:
the right. F = (a +b )( a '+a c) b
We see that the two functi ons are F = ( a+b )(a 'b+ac b l
indeed equivalent, because the outputs are (b. the di triburiYe propenYl
F = (a+b) , F = a ( a ' b+a c b ) + b( a' b+acb) (distributive property)
identical for each input combinati on. Now
let's check if F = ab + a ' is equi valent F = aa ' b + aa cb + ba' b + ba cb (distributi"e propel'!) )
b F a b F
to F = (a+b) ' by comparing truth tables. o o o o F = O*b + a c b + a ' b + acb (complement. commUl3ti\e.
As seen to the right, those two func- 1 1 o 1 idempotent)
tions are clearly not equ ivalent. Comparing o o o o F = acb + a'b + acb (null elements)
truth tables leaves no doubt. o F = a c b + a' b (idempotent)
2. Second, we expand each term until every term is a minterm:
   F = acb + a'b
   F = acb + a'b*1                     (identity)
   F = acb + a'b*(c+c')                (complement)
   F = acb + a'bc + a'bc'              (distributive)

3. (Optional step) For neatness, we can arrange the literals within each term to a consistent order (say alphabetical), and we can also arrange the terms in the order they would appear in a truth table:
   F = a'bc' + a'bc + abc

The equation is now in sum-of-minterms form. The equation is in sum-of-products form, and every product term includes every variable exactly once.

An alternative canonical form is known as product-of-maxterms. A maxterm is a sum term in which every variable appears exactly once in either true or complemented form, such as (a + b + c') for a function of three variables a, b, and c. An equation is in product-of-maxterms form if the equation is the product of sum terms, and every sum term is a maxterm. An example of a function (different from that above) in product-of-maxterms form is J(a,b,c) = (a + b + c')(a' + b' + c'). To avoid confusing the reader, we will not discuss the product-of-maxterms form further here, as sum-of-minterms form is more common in practice, and sufficient for our purposes.

EXAMPLE 2.21 Comparing two functions using canonical form
Suppose we want to determine whether the functions G(a,b,c,d,e) = abcd + a'bcde and H(a,b,c,d,e) = abcde + abcde' + a'bcde + a'bcde(a' + c) are equivalent. We first convert G to sum-of-minterms form:
   G = abcd + a'bcde
   G = abcd(e+e') + a'bcde
   G = abcde + abcde' + a'bcde
   G = a'bcde + abcde' + abcde
We then convert H to sum-of-minterms form:
   H = abcde + abcde' + a'bcde + a'bcde(a' + c)
   H = abcde + abcde' + a'bcde + a'bcdea' + a'bcdec
   H = abcde + abcde' + a'bcde + a'bcde + a'bcde
   H = abcde + abcde' + a'bcde
   H = a'bcde + abcde' + abcde
Clearly, G and H are equivalent.
Note that checking the equivalence using truth tables would have resulted in 2 rather large truth tables having 32 rows each. Using sum of minterms was probably more appropriate here.

Compact sum-of-minterms representation
A more compact representation of sum-of-minterms form involves listing each minterm as a number, with each minterm's number determined from the binary representation of its variables' values. For example, a'bcde corresponds to 01111, or 15; abcde' corresponds to 11110, or 30; and abcde corresponds to 11111, or 31. Thus, we can say that the function H represented by the equation:
   H = a'bcde + abcde' + abcde
is the sum of the minterms 15, 30, and 31, which can be compactly written as:
   H = Σm(15, 30, 31)
The summation symbol means the sum, and then the numbers inside the parentheses represent the minterms being summed on the right side of the equation.

Multiple-Output Combinational Circuits
Many combinational circuits not only involve more than one input, but also involve more than one output. The simplest approach to handling a multiple-output circuit is to treat each output separately, leading to a separate circuit for each output. Actually, the circuits need not be completely separate; they could share common gates. We'll show how to handle multiple-output circuits through examples.

EXAMPLE 2.22 Two-output combinational circuit
Design a circuit to implement the following two equations of three inputs a, b, and c:
   F = ab + c'
   G = ab + bc
We can design the circuit by simply creating two separate circuits, as in Figure 2.38(a).

Figure 2.38 Multiple-output circuit: (a) treated as two separate circuits, and (b) with gate sharing.

We can instead notice that the term ab is common to both equations. Thus, the two circuits can share the gate that computes ab, as shown in Figure 2.38(b).

EXAMPLE 2.23 Binary number to seven-segment display converter
Many electronic appliances display a number for us to read. Example appliances include a clock, a microwave oven, and a telephone answering machine. A very popular and simple device for displaying a single digit number is a seven-segment display, illustrated in Figure 2.39.
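Looking back at Example 2.21 and the compact sum-of-minterms notation, the same comparison can be cross-checked by brute force in Python. This is only a sketch under stated assumptions: the helper name `minterm_numbers` is invented for illustration, the first variable (a) is taken as the most significant bit as in the text's 01111 = 15 example, and G and H are transcribed from Example 2.21. Enumerating all 32 rows is exactly what canonical form lets us avoid by hand, so the code serves as a check rather than a replacement for the algebra.

```python
from itertools import product

def minterm_numbers(func, n_vars):
    """Return the sorted minterm numbers of func (rows where it outputs 1).

    The first argument of func is treated as the most significant bit.
    """
    numbers = []
    for bits in product((0, 1), repeat=n_vars):
        if func(*bits):
            numbers.append(int("".join(str(b) for b in bits), 2))
    return numbers

def G(a, b, c, d, e):
    # G = abcd + a'bcde
    return (a and b and c and d) or ((not a) and b and c and d and e)

def H(a, b, c, d, e):
    # H = abcde + abcde' + a'bcde + a'bcde(a' + c)
    return ((a and b and c and d and e)
            or (a and b and c and d and not e)
            or ((not a) and b and c and d and e)
            or ((not a) and b and c and d and e and ((not a) or c)))

g_minterms = minterm_numbers(G, 5)
h_minterms = minterm_numbers(H, 5)
print("G = Σm" + str(tuple(g_minterms)))
print("H = Σm" + str(tuple(h_minterms)))
print("equivalent:", g_minterms == h_minterms)
```

Two functions are equivalent exactly when their minterm lists match, which is the property that makes sum-of-minterms a canonical form.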

Figure 2.39 Seven-segment display: (a) connections of inputs to segments, (b) input values for numbers 0, 1, and 2 (abcdefg = 1111110, 0110000, and 1101101), and (c) a pair of real seven-segment display components.

The display consists of seven light segments, each of which can be illuminated independently of the others. We can display the desired digit by setting the signals a, b, c, d, e, f, and g appropriately. So to display the digit 8, we set all seven signals to 1. To display the digit 1, we set b and c to 1.

A useful combinational circuit is one that converts a binary number to the seven-segment display signals a-g that display the number as a decimal digit. We need four bits, say w, x, y, and z, to represent the binary values of the ten possible digits 0 to 9. Table 2.4 describes the conversion of each binary number to the seven-segment display's signals. We decided to activate no segments for the numbers 10 through 15.

(Margin note: For this example, starting from a truth table is a more natural choice than an equation.)

TABLE 2.4 4-bit binary number to seven-segment display truth table.

w x y z    a b c d e f g
0 0 0 0    1 1 1 1 1 1 0
0 0 0 1    0 1 1 0 0 0 0
0 0 1 0    1 1 0 1 1 0 1
0 0 1 1    1 1 1 1 0 0 1
0 1 0 0    0 1 1 0 0 1 1
0 1 0 1    1 0 1 1 0 1 1
0 1 1 0    1 0 1 1 1 1 1
0 1 1 1    1 1 1 0 0 0 0
1 0 0 0    1 1 1 1 1 1 1
1 0 0 1    1 1 1 1 0 1 1
1 0 1 0    0 0 0 0 0 0 0
1 0 1 1    0 0 0 0 0 0 0
1 1 0 0    0 0 0 0 0 0 0
1 1 0 1    0 0 0 0 0 0 0
1 1 1 0    0 0 0 0 0 0 0
1 1 1 1    0 0 0 0 0 0 0

We can create a custom logic circuit to implement the converter. Note that the above table is in the form of a truth table having multiple outputs (a through g). We can treat each output separately, so we design a circuit for a, then for b, etc. Looking at the 1s in the a column, we obtain the following equation for a:

   a = w'x'y'z' + w'x'yz' + w'x'yz + w'xy'z + w'xyz' + w'xyz + wx'y'z' + wx'y'z

Looking at the 1s in the b column, we obtain the following equation for b:

   b = w'x'y'z' + w'x'y'z + w'x'yz' + w'x'yz + w'xy'z' + w'xyz + wx'y'z' + wx'y'z

We could then proceed to create equations for the remaining outputs c through g. Finally, we would create a circuit for a having 8 4-input AND gates and an 8-input OR gate, another circuit for b having 8 4-input AND gates and an 8-input OR gate, and so on for c through g. We could, of course, have minimized the logic for each equation before creating each of the circuits.

You may notice that the equations for a and b have several terms in common. For example, the term w'x'y'z' appears in both equations. So it would make sense for both outputs to share one AND gate generating that term. Looking at the truth table, we see that the term w'x'y'z' is in fact needed for outputs a, b, c, d, e, and f (the row for 0000 is 1111110), and thus the one AND gate generating that term could be shared by all six of those outputs. Likewise, each of the other required terms is shared by several outputs, meaning each gate generating each term could be shared among several outputs.

2.7 COMBINATIONAL LOGIC DESIGN PROCESS

Based on the previous sections, we can define a straightforward method for designing combinational logic, summarized in Table 2.5.

TABLE 2.5 Combinational logic design process.

Step 1: Capture the function.
   Create a truth table or equations, whichever is most natural for the given problem, to describe the desired behavior of the combinational logic.
Step 2: Convert to equations.
   This step is only necessary if you captured the function using a truth table instead of equations. Create an equation for each output by ORing all the minterms for that output. Simplify the equations if desired.
Step 3: Implement as a gate-based circuit.
   For each output, create a circuit corresponding to the output equation. (Sharing gates among multiple outputs is OK optionally.)

Gate-based circuits designed such that the inputs feed into a column of AND gates that feeds into a single OR gate are known as two-level logic implementations.

EXAMPLE 2.24 Three 1s pattern detector
We want to implement a circuit that can detect whether a pattern of at least three adjacent 1s occurs anywhere in an 8-bit input, and that outputs a 1 in that case. The inputs are a, b, c, d, e, f, g, and h,
and the output is y. So for an input of abcdefgh = 00011101, y should be 1, since there are three 1s in a row (on inputs d, e, and f). For an input of 10101011, the output should be 0, since there are not three 1s in a row anywhere. An input of 11110000 should result in y = 1, since having more than three 1s in a row should still output a 1. Such a circuit is an extremely simple example of a general class of circuits known as pattern detectors. Pattern detectors are widely used, for example, in image processing to detect things like humans or tanks in a digitized video image, or to detect specific spoken words in a digitized audio stream.

(Margin note: For this example, starting from an equation is a more natural choice than a truth table.)

Step 1: Capture the function. We could capture the function as a rather large truth table, listing out all 256 combinations of inputs and entering a 1 for y in each row where at least three 1s occur. However, a simpler method for capturing this particular function is to create an equation that lists the possible occurrences of three 1s in a row. One possibility is that of abc=111. Another is that of bcd=111. Likewise, if cde=111, def=111, efg=111, or fgh=111, we should output a 1. For each possibility, the values of the other inputs don't matter. So if abc=111, we output a 1, regardless of the values of d, e, f, g, and h. Thus, an equation describing y is simply:

   y = abc + bcd + cde + def + efg + fgh

Step 2: Convert to equations. We can skip this step since we already have an equation.

Step 3: Implement as a gate-based circuit. No simplification of the equation is possible. The resulting circuit is shown in Figure 2.40.

Figure 2.40 Three 1s pattern detector.

EXAMPLE 2.25 Number-of-1s counter
We want to design a circuit that counts the number of 1s present on 3 inputs a, b, c, and outputs that number in binary using 2 outputs, y and z. An input of 110 has two 1s, so our circuit should output 10. The number of 1s on 3 inputs can range from 0 to 3, so a 2-bit output is sufficient, since 2 bits can represent 0 to 3. A number-of-1s counter circuit is useful in a variety of situations, such as detecting the density of electronic particles hitting a collection of sensors by counting how many sensors are activated. As another example, there are airport parking lots that have sensors above each parking spot, coupled with signs that inform drivers of the number of available parking spots on a particular level of a multilevel parking structure (by counting the number of zeros, but that's the same as counting the number of 1s with all inputs first complemented).

(Margin note: For this example, starting from a truth table is a more natural choice than an equation.)

Step 1: Capture the function. Capturing the function for this example is most naturally achieved using a truth table. We list all the possible input combinations, and the desired output number, as in Table 2.6.

TABLE 2.6 Truth table for number-of-1s counter.

Inputs    (# of 1s)    Outputs
a b c                  y z
0 0 0     (0)          0 0
0 0 1     (1)          0 1
0 1 0     (1)          0 1
0 1 1     (2)          1 0
1 0 0     (1)          0 1
1 0 1     (2)          1 0
1 1 0     (2)          1 0
1 1 1     (3)          1 1

Step 2: Convert to equations. We create equations for each output as follows:

   y = a'bc + ab'c + abc' + abc
   z = a'b'c + a'bc' + ab'c' + abc

We can simplify the first equation algebraically:

   y = a'bc + ab'c + ab(c' + c) = a'bc + ab'c + ab

Step 3: Implement as a gate-based circuit. We then create the final circuits for the two outputs, as shown in Figure 2.41.

Figure 2.41 Number-of-1s counter gate-based circuit.

Simplifying circuit notations
We used a couple of new simplifying notations in our circuits in the previous example. One simplifying notation is to list the inputs multiple times, to avoid having lines in our drawing crossing one another. An input listed multiple times is assumed to have been branched from the same input.
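Returning to the number-of-1s counter of Example 2.25, Steps 1 and 2 of the design process (capture a truth table, then OR together the minterms of each output) can be sketched in Python. The helper names below are assumptions made for illustration. The sketch builds the table by counting 1s, generates the sum-of-minterms equation strings for y and z, and checks the algebraically simplified form of y against the table.

```python
from itertools import product

NAMES = "abc"

def minterm(bits):
    """Text form of the minterm for one truth-table row, e.g. (0, 1, 1) -> a'bc."""
    return "".join(n if b else n + "'" for n, b in zip(NAMES, bits))

# Step 1: capture the function as a truth table mapping inputs to (y, z).
table = {}
for bits in product((0, 1), repeat=3):
    count = sum(bits)                                 # number of 1s, 0..3
    table[bits] = ((count >> 1) & 1, count & 1)       # 2-bit binary: y is the high bit

# Step 2: convert to equations by ORing the minterms where each output is 1.
y_eq = " + ".join(minterm(b) for b, (y, z) in table.items() if y)
z_eq = " + ".join(minterm(b) for b, (y, z) in table.items() if z)
print("y =", y_eq)
print("z =", z_eq)

def y_simplified(a, b, c):
    """y = a'bc + ab'c + ab, the simplified form of the first equation."""
    return ((1 - a) & b & c) | (a & (1 - b) & c) | (a & b)
```

The generated strings match the two equations of Step 2, and the simplified y agrees with the table on every row.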
Another simplifying notation is the use of an inversion bubble at the input of a gate, rather than the use of a NOT gate. An input that is inverted into many gates is assumed to feed through a single inverter that is then branched out to those gates. An alternative simplification is to simply include complemented variables, like b', as inputs.

EXAMPLE 2.26 12-button keypad to 4-bit code converter
You've probably seen 12-button keypads in many different places, like on a telephone or at an ATM machine as shown in Figure 2.42. The first row has buttons 1, 2, and 3, the second row has 4, 5, and 6, the third row has 7, 8, and 9, and the last row has *, 0, and #. The outputs of such a keypad consist of seven signals: one for each of the four rows (r1, r2, r3, and r4), and one for each of the three columns (c1, c2, and c3). Pushing a particular button causes exactly two outputs to become 1, corresponding to the row and column of that button. So pushing button "1" causes r1=1 and c1=1, while pushing button "#" causes r4=1 and c3=1. We want to design a circuit that converts the seven signals from the keypad into a 4-bit binary number wxyz indicating which button is pressed. We want buttons "0" to "9" to be coded as 0000 through 1001 (0 through 9 in binary), respectively. Let's encode button "*" as 1010, "#" as 1011, and let's let 1111 mean that no button is pressed. Let's assume for now that only one button can ever be pressed at a given time.

Figure 2.42 12-button keypad.

(Margin note: For this example, starting from equations is a more natural choice than a truth table, although we used an informal table (not a truth table) to help us determine the equations.)

We could capture the functions for w, x, y, and z using a truth table, with the seven inputs on the left side of the table, and the four outputs on the right side, but that table would have 2^7 = 128 rows, and most of those rows would correspond merely to multiple buttons being pressed. Let's try instead to capture the functions using equations. The informal Table 2.7 might help us get started.

TABLE 2.7 Informal table for the 12-button keypad to 4-bit code converter.

Button    Signals    4-bit code outputs
                     w x y z
1         r1 c1      0 0 0 1
2         r1 c2      0 0 1 0
3         r1 c3      0 0 1 1
4         r2 c1      0 1 0 0
5         r2 c2      0 1 0 1
6         r2 c3      0 1 1 0
7         r3 c1      0 1 1 1
8         r3 c2      1 0 0 0
9         r3 c3      1 0 0 1
*         r4 c1      1 0 1 0
0         r4 c2      0 0 0 0
#         r4 c3      1 0 1 1
(none)               1 1 1 1

Using this table, we can derive equations for each of the four outputs, as follows:

   w = r3c2 + r3c3 + r4c1 + r4c3 + r1'r2'r3'r4'c1'c2'c3'
   x = r2c1 + r2c2 + r2c3 + r3c1 + r1'r2'r3'r4'c1'c2'c3'
   y = r1c2 + r1c3 + r2c3 + r3c1 + r4c1 + r4c3 + r1'r2'r3'r4'c1'c2'c3'
   z = r1c1 + r1c3 + r2c2 + r3c1 + r3c3 + r4c3 + r1'r2'r3'r4'c1'c2'c3'

We could then create a circuit for each output. Obviously, the last term of each equation could be shared by all four outputs. Likewise, other terms could be shared too (like r2c3).

Note that this circuit would not work well if multiple buttons can be pressed simultaneously. Our circuit will output either a valid or invalid code in that situation, depending on which buttons were pressed. A preferable circuit would treat multiple buttons being pressed as no button being pressed. We leave the design of that circuit as an exercise.

Circuits similar to what we designed above exist in computer keyboards, except that there are a lot more rows and columns.

SLOW DOWN! THE QWERTY KEYBOARD
Inside a standard computer keyboard is a small microprocessor and a ROM. The microprocessor detects which key is being pressed, looks up the 8-bit code for that key (much like the 12-button keypad in Example 2.26) from the ROM, and sends that code to the computer. There's an interesting story behind the way the keys are arranged in a standard PC keyboard, which is known as a QWERTY keyboard because those are the keys that begin the top left row of letters. The QWERTY arrangement was made in the era of typewriters, which, in case you haven't seen one, had each key connected to an arm that would swing up and press an ink ribbon against paper. An annoying problem with typewriters was that arms would often get jammed side-by-side up near the paper if you typed too fast, like too many people getting jammed side-by-side while trying to simultaneously walk through a doorway. So typewriter keys were arranged in the QWERTY arrangement to slow down typing by separating common letters, since slower typing reduced the occurrences of jammed keys. When PCs were invented, the QWERTY arrangement was the natural choice for PC keyboards, as people were accustomed to that arrangement. Some say the differently-arranged Dvorak keyboard enables faster typing, but that type of keyboard isn't very common, as people are just too accustomed to the QWERTY keyboard.
(Pictures: a typewriter, with labels "Keys connected to arms" and "Arms stuck!")

EXAMPLE 2.27 Sprinkler valve controller
Automatic lawn sprinkler systems use a digital system to control the opening and closing of water valves. A sprinkler system usually supports several different zones, such as the backyard, left side yard, right side yard, front yard, etc. Only one zone's valve can be opened at a time in order to maintain enough water pressure in the sprinklers in that zone. Suppose a sprinkler system supports up to 8 zones. Typically, a sprinkler system is controlled by a small, inexpensive microprocessor executing a program that opens each valve only at specific times of the day and for specific durations. Suppose


the microprocessor only has 4 output pins available to control the valves, not 8 outputs as required for the 8 zones. We can instead program the microprocessor to use 1 pin to indicate whether a valve should be opened, and use the 3 other pins to output the active zone (0, 1, ..., 7) in binary. Thus, we need to design a combinational circuit having 4 inputs, e (the enabler) and a, b, c (the binary value of the active zone), and having 8 outputs d7, d6, ..., d0 (the valve controls), as shown in Figure 2.43. When e=1, the circuit should decode the 3-bit binary input by setting exactly one output to 1.

Figure 2.43 Sprinkler valve controller block diagram.

(Margin note: For this example, starting from equations is a more natural choice than a truth table.)

Step 1: Capture the function. Valve 0 should be active when abc=000 and e=1. So the equation for d0 is:

   d0 = a'b'c'e

Likewise, valve 1 should be active when abc=001 and e=1, so the equation for d1 is:

   d1 = a'b'ce

The equations for the remaining outputs can be determined similarly:

   d2 = a'bc'e
   d3 = a'bce
   d4 = ab'c'e
   d5 = ab'ce
   d6 = abc'e
   d7 = abce

Step 2: Convert to equations. No conversion is needed since we already have equations.

Step 3: Implement as a gate-based circuit. The circuit implementing the equations is shown in Figure 2.44. The circuit we've designed is actually a commonly used component known as a decoder with enable. We'll introduce decoders as a building block in an upcoming section.

Figure 2.44 Sprinkler valve controller circuit (actually a 3x8 decoder with enable).

2.8 MORE GATES

We earlier introduced three basic logic gates: AND, OR, and NOT. Designers commonly use several other types of gates too: NAND, NOR, XOR, and XNOR.

NAND & NOR
A NAND gate (short for "not AND") has the opposite output as an AND gate, outputting a 0 when all inputs are 1, and outputting a 1 if any input is a 0. A NAND gate has the same behavior as an AND gate followed by a NOT gate. Figure 2.45(a) illustrates a NAND gate.

A NOR gate (short for "not OR") has the opposite output as an OR gate, outputting a 0 if at least one input is a 1, and outputting 1 if all inputs are 0. A NOR gate has the same behavior as an OR gate followed by a NOT gate. Figure 2.45(b) illustrates a NOR gate.

We earlier warned you in Section 2.4 that our CMOS transistor implementations of AND and OR gates were not realistic. Here's why. It turns out that pMOS transistors don't actually conduct 0s very well, but they conduct 1s just fine. Likewise, nMOS transistors don't conduct 1s well, but they conduct 0s just fine. The reasons for these asymmetries are beyond this book's scope. But the implications are that the AND and OR gates we built earlier (see Figure 2.8) are not feasible, since they rely on pMOS transistors to conduct 0s (but pMOS conducts 0s poorly) and nMOS transistors to conduct 1s (but nMOS conducts 1s poorly). On the other hand, if we swap power and ground in the AND and OR circuits of Figure 2.8, we obtain the gates shown in Figure 2.45(a) and (b). Those gates have the behavior of NAND and NOR gates, which makes sense since output 1s become replaced by 0s, and 0s by 1s.

x y    NAND  NOR  XOR  XNOR
0 0    1     1    0    1
0 1    1     0    1    0
1 0    1     0    1    0
1 1    0     0    0    1

Figure 2.45 Additional gates: (a) NAND, (b) NOR, (c) XOR, (d) XNOR.
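The behavior of the four gates in Figure 2.45 can be written out as a small Python sketch. The function names are illustrative assumptions. Each function accepts any number of 0/1 inputs, and for XOR with three or more inputs the sketch uses the rule stated in this section's XOR discussion: the output is 1 only when the number of input 1s is odd.

```python
from itertools import product

def nand(*inputs):
    """Opposite of AND: 0 only when all inputs are 1."""
    return 0 if all(inputs) else 1

def nor(*inputs):
    """Opposite of OR: 1 only when all inputs are 0."""
    return 0 if any(inputs) else 1

def xor(*inputs):
    """1 when the number of input 1s is odd."""
    return sum(inputs) % 2

def xnor(*inputs):
    """Opposite of XOR."""
    return 1 - xor(*inputs)

print("x y  NAND NOR XOR XNOR")
for x, y in product((0, 1), repeat=2):
    print(x, y, "", nand(x, y), "  ", nor(x, y), " ", xor(x, y), " ", xnor(x, y))
```

For two inputs, `xor(a, b)` agrees with F = ab' + a'b and `xnor(a, b)` with F = a'b' + ab.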
We can still implement an AND gate in CMOS, but we would do so by appending a NOT gate at the output of a NAND gate (NAND followed by NOT gives us AND), as shown in Figure 2.46. Likewise, we would implement an OR gate by appending a NOT gate at the output of a NOR gate. But that's obviously slower than a circuit directly implemented as NAND and NOR. Fortunately, we can apply straightforward methods to convert any AND/OR/NOT circuit to a NAND-only circuit, or to a NOR-only circuit. We'll describe those methods in Section 7.2.

Figure 2.46 AND gate in CMOS.

EXAMPLE 2.28 Aircraft lavatory sign using a NAND gate
Example 2.15 created a lavatory available sign using the following equation:

   S = (abc)'

Noticing that the term on the right side corresponds to a NAND, we can implement the circuit using a single NAND gate, as shown in Figure 2.47.

Figure 2.47 Circuit using NAND.

XOR & XNOR
A 2-input XOR gate, short for "exclusive or" and pronounced as "ex or," outputs a 1 if exactly one of the two inputs is a 1. So if such a gate has inputs a and b, then the output F is 1 if a=1 and b=0, or if b=1 and a=0. Figure 2.45(c) illustrates an XOR gate (for simplicity, we omit the transistor-level implementation of an XOR gate). For XOR gates with 3 or more inputs, the output is 1 only if the number of input 1s is odd. A 2-input XOR gate is equivalent to the function F = ab' + a'b.

An XNOR gate, short for "exclusive nor" and pronounced "ex nor," is simply the opposite of XOR. A 2-input XNOR is equivalent to F = a'b' + ab. Figure 2.45(d) illustrates an XNOR gate, omitting the transistor-level implementation for simplicity.

Interesting Uses of These Additional Gates

Detecting all 0s using NOR
A NOR gate can detect the situation of a data item equal to 0, since NOR outputs a 1 only when all inputs are 0. For example, suppose a byte (8-bit) input to your system is counting down from 99 to 0, and when the byte reaches 0, you wish to sound an alarm. You can detect the byte being equal to 0 by simply connecting the 8 bits of the byte into an 8-input NOR gate.

Detecting equality using XNOR
XNOR gates can be used to compare two data items for equality, since a 2-input XNOR outputs a 1 only when the inputs are both 0 or are both 1. For example, suppose a byte input A (a7a6a5...a0) to your system is counting down from 99, and you want to sound an alarm when A has the same value as a second byte input B (b7b6b5...b0). You can detect such equality using eight 2-input XNOR gates, by connecting a7 and b7 to the first XNOR gate, a6 and b6 to the second XNOR gate, etc. Each XNOR gate tells us whether the bits in that particular position are equal. By ANDing all the XNOR outputs, we can tell whether every position is equal.

Generating and detecting parity using XOR
An XOR gate can be used to generate a parity bit for a set of data bits (see Example 2.19). XORing the data bits results in a 1 if there's an odd number of 1s in the data, so XOR computes the correct parity bit for even parity, since the XOR's output 1 would make the total number of 1s even. Notice that the truth table we created for generating an even parity bit in Table 2.3 does in fact represent a 3-bit XOR.

Likewise, an XNOR gate can be used to generate an odd parity bit.

XOR can also be used to detect proper parity. XORing the incoming data bits along with the incoming parity bit will yield 1 if the number of 1s is odd. Thus, for even parity, XOR can be used to indicate that an error has occurred, since the number of 1s is supposed to be even.

XNOR can be used to detect an error when odd parity is used.

Completeness of AND/OR/NOT, AND/NOT, OR/NOT, NAND, NOR
It should be fairly obvious that if you have AND gates, OR gates, and NOT gates, you can implement any Boolean function. This is because a Boolean function can be represented as a sum of products, which consists only of AND, OR, and NOT operations.

What might be slightly less obvious is that if you had only AND and NOT gates, you could still implement any Boolean function. Why? Here's a simple explanation: to obtain an OR, simply put NOT gates at the inputs and output of an AND. This works because F = (a'b')' = a'' + b'' (by DeMorgan's Law) = a + b.

Likewise, if you had only OR and NOT gates, you could implement any Boolean function. To obtain an AND, you could simply invert the inputs and output of an OR, since F = (a'+b')' = a''*b'' = ab.

It follows that if you only had NAND gates available to you, you could still implement any Boolean function. Why? Because we can think of a NOT gate as a 1-input NAND gate, and we can implement an AND gate using a NAND gate followed by a 1-input NAND gate. Since we can implement any Boolean function using NOT and AND, we can therefore implement any Boolean function using just NAND. A NAND gate is thus known as a universal gate.

Likewise, if you had only NOR gates, you could implement any Boolean function, because we can implement a NOT gate as a 1-input NOR gate, and an OR gate using a NOR followed by a 1-input NOR. Since NOT and OR can implement any Boolean function, so can NOR. A NOR gate is thus also known as a universal gate.
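The universality argument for NAND can be checked exhaustively with a short Python sketch. The function names are assumptions chosen for illustration. Every gate below is built from `nand` alone, following the text: NOT is a 1-input NAND, AND is a NAND followed by a 1-input NAND, and OR comes from DeMorgan's Law, (a'b')' = a + b.

```python
from itertools import product

def nand(*inputs):
    """NAND: 0 only when every input is 1. With one input it acts as NOT."""
    return 0 if all(inputs) else 1

def not_from_nand(a):
    return nand(a)                      # 1-input NAND

def and_from_nand(a, b):
    return nand(nand(a, b))             # NAND followed by a 1-input NAND

def or_from_nand(a, b):
    return nand(nand(a), nand(b))       # (a'b')' = a + b by DeMorgan's Law

def xor_from_nand(a, b):
    # F = ab' + a'b, composed only of the NAND-based gates above
    return or_from_nand(and_from_nand(a, not_from_nand(b)),
                        and_from_nand(not_from_nand(a), b))

for a, b in product((0, 1), repeat=2):
    print(a, b, and_from_nand(a, b), or_from_nand(a, b), xor_from_nand(a, b))
```

The same pattern works for NOR by swapping the roles of AND and OR: a 1-input NOR is NOT, and a NOR followed by a 1-input NOR is OR.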
Number of Possible Logic Gates
Having seen several different types of basic 2-input logic gates (AND, OR, NAND, NOR, XOR, XNOR), one might wonder how many possible 2-input logic gates exist. That question is the same as asking how many Boolean functions exist for two variables. To answer the question, we first note that a two-variable function's truth table will have $2^2 = 4$ rows. For each row, the function could output one of two possible values (0 or 1). Thus, as illustrated in Figure 2.48, there are $2 \cdot 2 \cdot 2 \cdot 2 = 2^4 = 16$ possible functions.

a  b | F
0  0 | 0 or 1   (2 choices)
0  1 | 0 or 1   (2 choices)
1  0 | 0 or 1   (2 choices)
1  1 | 0 or 1   (2 choices)
$2^4 = 16$ possible functions

Figure 2.48 Counting the number of possible Boolean functions of two variables.

Figure 2.49 lists all 16 of those functions. We indicate the 6 familiar functions in the figure. Some of the other functions are 0, a, b, a', b', and 1. The remaining functions are not necessarily common functions, but each could be useful for some particular application. Thus, we don't necessarily need to build logic gates to represent those functions, but we instead would build those functions as a circuit of the basic logic gates.

Figure 2.49 The 16 possible Boolean functions of two variables (a truth table with inputs a and b and one output column for each of f0 through f15, with the six familiar gates labeled).

A more general question of interest is how many Boolean functions exist for a Boolean function of N variables. We can determine this number by first noting that an N-variable function will have $2^N$ rows in its truth table. Then, we note that each row can output one of two possible values. Thus, the number of possible functions will be $2 \cdot 2 \cdots 2$, with $2^N$ factors. Therefore, the total number of functions is:

$$2^{2^N}$$

So there are $2^{2^3} = 2^8 = 256$ possible Boolean functions of 3 variables, and $2^{2^4} = 2^{16} = 65{,}536$ possible functions of 4 variables.

2.9 DECODERS AND MUXES
Two additional components, a decoder and a multiplexer, are also commonly used as digital circuit building blocks, though they themselves can be built from logic gates.

Decoders
A decoder is a higher-level building block commonly used in digital circuits. A decoder decodes an input n-bit binary number by setting exactly one of the decoder's $2^n$ outputs to 1. For example, a 2-input decoder, illustrated in Figure 2.50, would have $2^2 = 4$ outputs, d3, d2, d1, d0. If the two inputs i1i0 are 00, d0 would be 1 and the remaining outputs would be 0. If i1i0=01, d1 would be 1. If i1i0=10, d2 would be 1. If i1i0=11, d3 would be 1.
The internal design of a decoder is straightforward. Consider a 2x4 decoder. Each output d0, d1, d2, and d3 is a distinct function. d0 should be 1 only when i1=0 and i0=0, so d0 = i1'i0'. Likewise, d1 = i1'i0, d2 = i1i0', and d3 = i1i0. Thus, we build the decoder with one AND gate for each output, connecting the true or complemented values of i1 and i0 to each gate, as shown in Figure 2.50.

Figure 2.50 2x4 decoder: (a) outputs for possible input combinations, (b) internal design.

The internal design of a 3x8 decoder is similar: d0 = i2'i1'i0', d1 = i2'i1'i0, etc.
A decoder often comes with an extra input called enable. When enable is 1, the decoder acts normally. But when enable is 0, the decoder outputs all 0s, so no output is a 1. The enable is useful when you sometimes don't want to activate any of the outputs. Without an enable, one output of the decoder must be a 1, because the decoder has an output for every possible value of the decoder's n-bit input. We created and used a decoder with enable earlier in this chapter. A block diagram of a decoder with enable appears in Figure 2.51.

Figure 2.51 Decoder with enable: (a) e=1: normal decoding, (b) e=0: all outputs 0.
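The decoder behavior described above, one AND term per output plus an optional enable, can be sketched in software. This is a minimal illustrative model in Python, not the book's design notation; the function name decoder and the convention of listing input bits most-significant first are assumptions made for this example.

```python
def decoder(inputs, enable=1):
    """Model an n-input, 2^n-output decoder with an enable input.

    `inputs` lists the input bits most-significant first, e.g. [i1, i0].
    Returns a list `outputs` where outputs[k] models decoder output dk.
    """
    n = len(inputs)
    outputs = []
    for k in range(2 ** n):
        # One AND gate per output: AND together the enable and the true or
        # complemented form of each input bit, chosen by the bits of k.
        term = enable
        for pos, bit in enumerate(inputs):
            want_one = (k >> (n - 1 - pos)) & 1
            term &= bit if want_one else 1 - bit
        outputs.append(term)
    return outputs


# 2x4 decoder: i1i0 = 10 sets d2 to 1 and every other output to 0.
d0, d1, d2, d3 = decoder([1, 0])
```

With enable=0 every AND term is forced to 0, which matches the all-0s behavior of Figure 2.51(b). Calling decoder with five input bits returns a list of 32 outputs, one per possible input value.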
When designing a particular system, we check if part (or all) of the system's functionality could be carried out by a decoder. Using a decoder reduces the amount of combinational logic design that we need to perform, as you'll see in Example 2.30.

EXAMPLE 2.29 Basic questions about decoders
1. What would be a 2x4 decoder's output values when the inputs are 00? Answer: d0=1, d1=0, d2=0, d3=0.
2. What would be a 2x4 decoder's output values when the inputs are 11? Answer: d0=0, d1=0, d2=0, d3=1.
3. What input values of a 2x4 decoder cause more than one of the decoder's outputs to be 1 at the same time? Answer: No such input values exist. Only one of a decoder's outputs can be 1 at a given time.
4. What would the input values of a decoder be if the output values are d0=0, d1=1, d2=0, d3=0? Answer: The input values must be i1=0, i0=1.
5. What would the input values of a decoder be if the output values are d0=1, d1=1, d2=0, d3=0? Answer: This question is not valid. A decoder only has one output equal to 1 at any time.
6. How many outputs would a 5-input decoder have? Answer: $2^5$, or 32.

EXAMPLE 2.30 New Year's Eve countdown display
A New Year's Eve countdown display could make use of a decoder. The display may have 60 light bulbs going up a tall pole. We want one light per second to turn on (with the previous one turning off), starting from bulb 59 at the bottom of the pole, and ending with bulb 0 at the top. We could use a microprocessor to count down from 59 to 0, but the microprocessor probably doesn't have 60 output pins that we could use to control each light. Our microprocessor program could instead output the numbers 59, 58, ..., 2, 1, 0 in binary on a 6-bit output port (thus outputting 111011, 111010, ..., 000010, 000001, 000000). We could connect those six bits to a 6-input, 64 ($2^6$)-output decoder, with decoder output d59 lighting bulb 59, d58 lighting bulb 58, etc.
We'd probably want an enable on our decoder in this example, since we'd want all the lights off until we started the countdown. The microprocessor would initially set enable to 0 so that no lights would be illuminated. When the 60-second countdown begins, the microprocessor would set enable to 1, and then output 59, then 58 (1 second later), then 57, etc. The final system would look like that in Figure 2.52.

Figure 2.52 Using a 6x64 decoder to interface a microprocessor and a column of lights for a New Year's Eve display. The microprocessor sets e=1 when the last-minute countdown begins, and then counts down from 59 to 0 in binary on the pins i5..i0. Note that the microprocessor should never output 60, 61, 62, or 63 on i5..i0, and thus those outputs of the decoder go unused.

Notice that we implemented this system without having to design any gate-level combinational logic; we merely used a decoder and connected it to the appropriate inputs and outputs.

Whenever you have outputs such that exactly one of those outputs should be set to 1 based on the value of inputs representing a binary number, think about using a decoder.

Multiplexer (Mux)
A multiplexer ("mux" for short) is another higher-level building block in digital circuits. An Mx1 multiplexer has M data inputs and 1 output, and allows only one input to pass through to that output. A set of additional inputs, known as select inputs, determines which input to pass through. Multiplexers are sometimes called selectors because they select one input to pass through to the output.
A mux is like a railyard switch that connects multiple input tracks to a single output track, as shown in Figure 2.53. The switch's control lever causes the connection of the appropriate input track to the output track. Whether a train appears at the output depends on whether a train exists on the presently selected input track. For a mux, the switch's control is not a lever, but rather select inputs, which represent the desired connection in binary. Rather than a train appearing or not appearing at the output, a mux outputs a 1 or a 0 depending on whether the connected input has a 1 or a 0.

Figure 2.53 A multiplexer is like a railyard switch, determining which input track connects to the single output track, according to the switch's control lever.

A 2-input multiplexer, known as a 2x1 multiplexer, has two data inputs i1 and i0, one select input s0, and one data output d, as shown in Figure 2.54. If s0=0, i0's value passes through. If s0=1, i1's value passes through.
The internal design of a 2x1 multiplexer is shown in Figure 2.54. When s0=0, the top AND gate outputs 1*i0=i0, and the bottom AND gate outputs 0*i1=0. Thus, the OR gate outputs i0+0=i0, so i0 passes through as desired. Likewise, when s0=1, the bottom gate passes i1 while the top gate outputs 0, resulting in the OR gate passing i1.
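The two-AND, one-OR structure just described can be written as a small gate-level model. This is an illustrative Python sketch only; the function name mux2x1 and the use of 1 - s0 to model the inverter on the select input are assumptions for this example.

```python
def mux2x1(i0, i1, s0):
    """Gate-level model of a 2x1 multiplexer: d = s0'*i0 + s0*i1."""
    top = (1 - s0) & i0      # top AND gate passes i0 when s0 = 0
    bottom = s0 & i1         # bottom AND gate passes i1 when s0 = 1
    return top | bottom      # OR gate combines the two AND outputs


# With s0 = 0 the output follows i0; with s0 = 1 it follows i1.
d_when_s0_is_0 = mux2x1(i0=1, i1=0, s0=0)
d_when_s0_is_1 = mux2x1(i0=1, i1=0, s0=1)
```

For any input values, exactly one AND gate can pass its data input, so the OR gate never has to arbitrate between two selected inputs.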
Figure 2.54 2x1 multiplexer: (a) block symbol, (b) connections for s0=0 and s0=1, and (c) internal design.

A 4-input multiplexer, known as a 4x1 multiplexer, has four data inputs i3, i2, i1, and i0, two select inputs s1 and s0, and one data output d (a mux always has just one data output, no matter how many inputs). A 4x1 mux block diagram is shown in Figure 2.55.

Figure 2.55 4x1 multiplexer: (a) block symbol and (b) internal design.

The internal design of a 4x1 multiplexer is shown in Figure 2.55. When s1s0=00, the top AND gate outputs i0*1*1=i0, the next AND gate outputs i1*0*1=0, the next gate outputs i2*1*0=0, and the bottom gate outputs i3*0*0=0. The OR gate outputs i0+0+0+0=i0. Thus, i0 passes through, as desired. Likewise, when s1s0=01, the second AND gate passes i1, while the remaining AND gates all output 0. When s1s0=10, the third AND gate passes i2, and the other AND gates output 0. When s1s0=11, the bottom AND gate passes i3, and the other AND gates output 0. For any value on s1s0, only one AND gate will have two 1s for its select inputs and will thus pass its data input; each of the other AND gates will have at least one 0 for its select inputs and will thus output 0.
An 8x1 multiplexer would have 8 data inputs i7...i0, 3 select inputs s2, s1, and s0, and one data output. More generally, an Mx1 multiplexer has M data inputs, log2(M) select inputs, and one data output. Remember, a multiplexer always has just one output.

EXAMPLE 2.31 Basic questions about multiplexers
Assume a 4x1 multiplexer's four inputs presently have the following values: i0=1, i1=1, i2=0, and i3=0. What would be the value on the multiplexer's output d for the following select input values?
1. s1s0=01. Answer: Because s1s0=01 passes input i1 through to d, d would have the value of i1, which presently is 1.
2. s1s0=11. Answer: That configuration of select line input values passes i3 through, so d would have the value of i3, which presently is 0.
3. How many select inputs must be present on a 16x1 multiplexer? Answer: Four select inputs would be needed to uniquely identify which of the 16 inputs to pass through to the output, since log2(16)=4.
4. How many select lines are there on a 4x2 multiplexer? Answer: This question is not valid; there is no such thing as a 4x2 multiplexer. A multiplexer has exactly one output.
5. How many inputs are there on a multiplexer having five select inputs? Answer: Five select inputs can uniquely identify one of $2^5 = 32$ inputs to pass through to the output.

EXAMPLE 2.32 Mayor's vote display using a multiplexer
Consider a small town with a very unpopular mayor. During every town meeting, the city manager presents four proposals to the mayor, who then indicates his vote on each proposal (approve or deny). Very consistently, right after the mayor indicates his vote, the town's citizens boo and shout profanities at the mayor, no matter which way he votes. Having had enough of this abuse, the mayor sets up a simple digital system (the mayor happens to have taken a course in digital design), shown in Figure 2.56. He provides himself with four switches that can be positioned up or down, outputting 1 or 0, respectively. When the time comes during the meeting for him to vote on the first proposal, he places the first switch either in the up (accept) or down (deny) position, but nobody else can see the position of the switch. When the time comes to vote on the second proposal, he votes on the second proposal by placing the second switch up or down. And so on. When he has finished casting all his votes, he leaves the meeting and heads out for coffee. With the mayor gone, the city manager powers up a large green/red light. When the input to the light is 0, the light lights up red. When the input is 1, the light lights up green. The city manager controls two switches that can route any of the mayor's switch outputs to the light, and so the manager steps through each configuration of the switches, starting with configuration 00 (and calling out "The mayor's vote on this proposal is ..."), then 01, then 10, and finally 11, causing the light to light either green or red for each configuration depending on the positions of the mayor's switches. The system can easily be implemented using a 4x1 multiplexer, as shown in Figure 2.56.

Figure 2.56 Mayor's vote display system implemented using a 4x1 mux.

N-bit Mx1 multiplexer
Muxes are often used to selectively pass through not just single bits, but N-bit data items. For example, one set of inputs A may consist of four bits a3, a2, a1, a0, and another set of inputs B may also consist of four bits b3, b2, b1, b0. We want to multiplex those inputs to a four-bit output C, consisting of c3, c2, c1, c0. Figure 2.57(a) shows how to accomplish such multiplexing using four 2x1 muxes.
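The general Mx1 structure (one AND gate per data input, feeding one OR gate) and the idea of multiplexing multi-bit items with one mux per bit can be sketched as follows. This is an illustrative Python model under stated assumptions: the names mux and mux_nbit are hypothetical, data inputs are listed as [i0, i1, ...], and select bits are listed most-significant first.

```python
def mux(data, select):
    """Model an Mx1 multiplexer.

    `data` is [i0, i1, ..., i(M-1)]; `select` lists the log2(M) select
    bits most-significant first, e.g. [s1, s0] for a 4x1 mux.
    """
    k = len(select)
    if len(data) != 2 ** k:
        raise ValueError("an Mx1 mux needs M = 2^(number of select bits)")
    d = 0
    for index, value in enumerate(data):
        # One AND gate per data input: the data bit ANDed with the true or
        # complemented select bits that spell out `index` in binary.
        term = value
        for pos, s in enumerate(select):
            want_one = (index >> (k - 1 - pos)) & 1
            term &= s if want_one else 1 - s
        d |= term  # the OR gate collects all AND outputs
    return d


def mux_nbit(items, select):
    """Multiplex N-bit items using N one-bit muxes sharing the select bits.

    `items` is a list of M equal-length bit lists, e.g. [A, B].
    """
    width = len(items[0])
    return [mux([item[b] for item in items], select) for b in range(width)]


# 4-bit 2x1 case: select between A and B with a single select bit.
A = [1, 0, 1, 0]   # a3, a2, a1, a0
B = [0, 1, 1, 1]   # b3, b2, b1, b0
C = mux_nbit([A, B], [1])   # select = 1 passes B through
```

Applying mux to the data values i0=1, i1=1, i2=0, i3=0 with select bits 01 and 11 reproduces the first two answers of Example 2.31.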

Figure 2.57 4-bit 2x1 mux: (a) internal design using four 2x1 muxes for selecting among 4-bit data items A or B, and (b) block diagram of a 4-bit 2x1 mux component. (c) The block diagram uses a common simplifying notation, using one thick wire with a slanted line and the number 4 to represent 4 single wires.

Because muxing data is so common, another common building block is that of an N-bit-wide Mx1 multiplexer. So in our example, we would use a 4-bit 2x1 mux. Don't get confused, though: an N-bit Mx1 multiplexer is really just the same as N separate Mx1 multiplexers, with all those muxes sharing the same select inputs. Figure 2.57(b) provides the symbol for a 4-bit 2x1 mux.

EXAMPLE 2.33 Multiplexed automobile above-mirror display
Some cars come with a display above the rearview mirror, as shown in Figure 2.58. The car's driver can press a button named mode to select among displaying the outside temperature, the average miles-per-gallon of the car, the instantaneous miles-per-gallon, and the approximate miles remaining until the car runs out of gasoline. Assume the car's central computer sends the data to the display as four 8-bit binary numbers, T (the temperature), A (average mpg), I (instantaneous mpg), and M (miles remaining). T consists of 8 bits: t7, t6, t5, t4, t3, t2, t1, t0. Likewise for A, I, and M. Assume the display system has two additional inputs x and y, which always change according to the following sequence: 00, 01, 10, 11, whenever the mode button is pressed (we'll see in a later chapter how to create such a sequence). When xy=00, we want to display T. When xy=01, we want to display A. When xy=10, we want to display I, and when xy=11, we want to display M. Assume the outputs D go to a display that knows how to convert the 8-bit binary number on D to a human-readable displayed number like that in Figure 2.58.

Figure 2.58 Above-mirror display.

We can design the display system using eight 4x1 multiplexers. A simpler representation of that same design uses an 8-bit 4x1 multiplexer, as shown in Figure 2.59.

Figure 2.59 Above-mirror display using an 8-bit 4x1 mux.

Notice how many wires must be run from the car's central computer, which may be under the hood, to the above-mirror display: 8 * 4 = 32 wires. That's a lot of wires. We'll see in a later chapter how to reduce the number of wires.

Notice in the previous example how simple a design can be when we can utilize higher-level building blocks. If we had to use regular 4x1 muxes, we would have 8 of them, and lots of wires drawn. If we had to use gates, we would have 40 of them. Of course, underlying our simple design in Figure 2.59 are in fact eight 4x1 muxes, and underlying those are 40 gates. And underlying those gates are lots more transistors. We see that the higher-level building blocks make our design task much more manageable.

2.10 ADDITIONAL CONSIDERATIONS

Schematic Capture and Simulation
When we design a circuit, how do we know that we designed the circuit correctly? Perhaps we created the truth table wrong, putting a 0 in an output column where we should have put a 1. Or perhaps we wrote down the wrong minterm, writing xyz when we should have written xyz'. For example, consider the number-of-ones counter in Example 2.25. We created a truth table, then equations, and finally a circuit. Is the circuit correct?
One method of checking our work is to reverse engineer the function from the circuit: starting with the circuit, we could convert the circuit to equations, and then the equations to a truth table. If we get the same original truth table, then the circuit should be correct. However, sometimes we start with an equation rather than a truth table, as in Example 2.24. We can reverse engineer the circuit to an equation, but that equation may be different than our original equation, especially if we algebraically manipulated the original equation when designing the circuit. And checking that two equations are equivalent may require converting to canonical form (sum-of-minterms), which may result in huge equations if our function has a large number of inputs.
In fact, even if we didn't make any mistakes in converting our mental understanding of the desired function into a truth table or equation, how do we know that our mental understanding was correct?
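For functions with only a few inputs, the truth-table comparison described above can be automated by brute force. The following Python sketch is illustrative only: the helper names truth_table and equivalent are hypothetical, and the two example functions (an XOR written two different ways) are stand-ins invented for this example rather than equations from the text.

```python
from itertools import product


def truth_table(func, num_inputs):
    """List func's output for every input combination, in counting order."""
    return [func(*bits) for bits in product((0, 1), repeat=num_inputs)]


def equivalent(f, g, num_inputs):
    """Two functions are equivalent if their truth tables are identical."""
    return truth_table(f, num_inputs) == truth_table(g, num_inputs)


# Hypothetical original equation: F = ab' + a'b.
def original(a, b):
    return (a & (1 - b)) | ((1 - a) & b)


# Hypothetical equation reverse engineered from a circuit: F = (a + b)(ab)'.
def from_circuit(a, b):
    return (a | b) & (1 - (a & b))


same = equivalent(original, from_circuit, 2)
```

The table has 2^N rows for N inputs, so this exhaustive comparison is practical only for small N, which mirrors the text's caution about equations growing huge for functions with many inputs.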
A commonly used method for checking that a circuit works as we expect is called simulation. Simulation of a circuit is the process of providing sample inputs to the circuit and running a computer program that computes the circuit's output for the given inputs. We can then check that the output matches what we expect. The computer program that performs simulation is called a simulator.
To use simulation to check a circuit, we must describe the circuit using a method that enables computer programs to read the circuit. One method of describing a circuit is to draw the circuit using a schematic capture tool. A schematic capture tool allows a user to place logic gates on a computer screen and to draw wires connecting those gates. The tool allows users to save their circuit drawings as computer files. All the circuit drawings in this chapter have represented examples of schematics; for example, the circuit drawing in Figure 2.50(b), representing a 2x4 decoder, was an example of a schematic. Figure 2.60 shows a schematic for the same design, drawn using a popular commercial schematic capture tool. Schematic capture is used not only to capture circuits for simulator tools, but also for tools that map our circuits to physical implementations, which will be discussed in Chapter 7.

Figure 2.60 Display snapshot of a commercial schematic capture tool.

Once we've created a circuit using schematic capture, we must provide the simulator with a set of inputs for which we want to check for proper output. One way of providing the inputs is by drawing waveforms for the circuit's inputs. An input's waveform is a line that goes from left to right, representing the value of the input as time proceeds to the right. At different times, we draw the line as high to represent 1, and low to represent 0, as shown in Figure 2.61(a). After we are satisfied with our input waveforms, we instruct the simulator to simulate our circuit for the given input waveforms. The simulator determines what the circuit outputs would be for each unique combination of inputs, and generates waveforms for the outputs, as illustrated in Figure 2.61(b). We can then check that the output waveforms match the output values that we would expect for each input. Such checking can be done visually, or by providing certain checking statements (often called assertions) to the simulator.

Figure 2.61 Simulation: (a) begins with us defining the input signals over time, (b) automatically generates the output waveforms when we ask the simulator to simulate the circuit.

Simulation still does not guarantee that our circuit is correct, but rather increases our confidence that our circuit is correct.

Nonideal Gate Behavior: Delay
Ideally, logic gate outputs would change immediately in response to changes in the gate's inputs. The timing diagrams earlier in this chapter all assumed such ideal zero-delay gates, as shown again in Figure 2.62(a) for an OR gate. Unfortunately, real gate outputs don't change immediately, but rather after some short time delay. After all, even the fastest automobiles can't go from 0 to 60 miles-per-hour in 0 seconds. The delay in gates is due in part to the fact that transistors don't switch from nonconducting to conducting (or vice versa) immediately; it takes some time for electrons to accumulate in the channel of an nMOS transistor, for example. Furthermore, electric signals propagate at a finite speed (at most the speed of light), which, while extremely fast, is still not infinitely fast. Additionally, wires aren't perfect and can slow down electric signals because of "parasitic" characteristics like capacitance and inductance. The timing diagram in Figure 2.62(b) illustrates how a real gate's output changes slightly after changes in the inputs. Modern CMOS gates may take less than 1 nanosecond to respond to input changes, which is extremely fast, but still not zero.

Figure 2.62 OR gate timing diagram: (a) without gate delay, (b) with gate delay.

Demultiplexers and Encoders
Two additional components, demultiplexers and encoders, can also be considered as combinational building blocks. However, those components are far less commonly used than their counterparts of multiplexers and decoders. Nevertheless, for completeness, we'll briefly introduce those additional components here. You may notice throughout this book that demultiplexers and encoders don't appear in many examples, if in any examples at all.

Demultiplexer
A demultiplexer has roughly the opposite functionality of a multiplexer. Specifically, a 1xM demultiplexer has one data input, and based on the value of log2(M) select lines, passes that input through to one of M outputs. The other outputs stay 0.

Encoder
An encoder has the opposite functionality of a decoder. Specifically, an n x log2(n) encoder has n inputs and log2(n) outputs. Of the n inputs, exactly one is assumed to be 1 at any given time (such would be the case if the input consisted of a sliding or rotating switch with n possible positions, for example). The encoder outputs a binary value over the log2(n) outputs indicating which of the n inputs was a 1. For example, a 4x2 encoder would have four inputs d3, d2, d1, d0, and two outputs e1, e0. For an input of 0001, the
output is 00. 0010 yields 01, 0100 yields 10, and 1000 yields 11. In other words, d0=1 results in an output of 0 in binary, d1=1 results in an output of 1 in binary, d2=1 results in an output of 2 in binary, and d3=1 results in an output of 3 in binary.
A priority encoder has similar behavior, but handles situations where more than one input is 1 at the same time. A priority encoder gives priority to the highest input that is a 1, and outputs the binary value of that input. For example, if a 4x2 priority encoder has inputs d3 and d1 both equal to 1 (so the inputs are 1010), the priority encoder gives priority to d3, and hence outputs 11.

2.11 COMBINATIONAL LOGIC OPTIMIZATIONS AND TRADEOFFS (SEE SECTION 6.2)
The earlier sections in this chapter described how to create basic combinational circuits. This section, Section 2.11, physically appears in this book as Section 6.2, and describes how to make those circuits better (smaller, faster, etc.), namely, how to make optimizations and tradeoffs. One use of this book covers combinational logic optimizations and tradeoffs immediately after introducing basic combinational logic design, meaning covering that section now (as Section 2.11). An alternative use of the book covers that section later (as Section 6.2), after also introducing basic sequential design, datapath components, and register-transfer level design, namely, after Chapters 3, 4, and 5.

2.12 COMBINATIONAL LOGIC DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES (SEE SECTION 9.2)
Hardware description languages (HDLs) allow designers to describe their circuits using a textual language rather than as circuit drawings. This section, Section 2.12, introduces the use of HDLs to describe combinational logic. The section physically appears in the book as Section 9.2. One use of this book introduces HDLs now (as Section 2.12), immediately after introducing basic combinational logic. An alternative use of the book introduces HDLs later (as Section 9.2), after mastery of basic combinational, sequential, and register-transfer level design.

2.13 CHAPTER SUMMARY
Section 2.1 introduced the idea of using a custom digital circuit to implement a system's desired functionality and defined combinational logic as a digital circuit whose outputs are a function of the circuit's present inputs. Section 2.2 provided a brief history of digital switches, starting from relays in the 1930s to today's CMOS transistors, with the main trend being the amazing pace at which switch size and delay have continued to shrink for the past several decades, leading to ICs capable of containing a billion transistors or more. Section 2.3 described the basic behavior of a CMOS transistor, just enough information to remove the mystery of how transistors work. Section 2.4 introduced three fundamental building blocks for building digital circuits: AND gates, OR gates, and NOT gates (inverters), which are far easier to work with than transistors. Section 2.5 showed how Boolean algebra could be used to represent circuits built from AND, OR, and NOT gates, enabling us to build and manipulate circuits by using math, an extremely powerful concept. Section 2.6 introduced several different representations of Boolean functions, namely equations, circuits, and truth tables. Section 2.7 described a straightforward three-step process for designing combinational circuits, and gave several examples of building real circuits using the three-step process. Section 2.8 described why NAND and NOR gates are actually more commonly used than AND and OR gates in CMOS technology, and showed that any circuit built from AND, OR, and NOT gates could be built with NAND gates alone or NOR gates alone. That section also introduced two other commonly used gates, XOR and XNOR. Section 2.9 introduced two additional commonly used combinational building blocks, decoders and multiplexers. Section 2.10 introduced schematic capture tools, which allow us to draw our circuits such that computer programs can read those circuits, and also introduced simulation, which generates the output waveforms for user-provided input waveforms to help us verify that we created a circuit correctly. That section also discussed how real gates actually have a small delay between the time that inputs change and the time that the gate's output changes. The section also introduced some less commonly used combinational building blocks, demultiplexers and encoders.

2.14 EXERCISES
Any problem noted with an asterisk (*) represents an especially challenging problem.

SECTION 2.2: SWITCHES
2.1 A microprocessor in 1980 used about 10,000 transistors. How many of those microprocessors would fit in a modern chip having 3 billion transistors?
2.2 The first Pentium microprocessor had about 3 million transistors. How many of those microprocessors would fit in a modern chip having 1 billion transistors?
2.3 Define Moore's Law.
2.4 Assume for a particular year that a particular size chip using state-of-the-art technology can contain 1 billion transistors. Assuming Moore's Law holds, how many transistors will the same size chip be able to contain in 10 years?
2.5 Assume a cell phone contains 50 million transistors. How big would such a cell phone be if the phone used vacuum tubes instead of transistors, assuming a vacuum tube has a volume of 1 cubic inch?
2.6 A modern desktop processor, such as the Pentium 4, has about 300 million transistors. How big would such a processor be if we used vacuum tubes, assuming a vacuum tube has a volume of 1 cubic inch?

SECTION 2.3: THE CMOS TRANSISTOR
2.7 Describe the behavior of the CMOS transistor circuit shown in Figure 2.63, clearly indicating when the transistor circuit conducts.
2.8 If we apply a voltage to the gate of a CMOS transistor, why doesn't the current flow from the gate to the transistor's source or drain?

Figure 2.63 Circuit combining two CMOS transistors.
SECTION 2.4: BOOLEAN LOGIC GATES-BUILDING BLOCKS FOR DIGITAL CIRCUITS
2.9 Which Boolean operation, AND, OR, or NOT, is appropriate for each of the following:
    (a) Detecting motion in any motion sensor surrounding a house (each motion sensor outputs
        1 when motion is detected).
    (b) Detecting that three buttons are being pressed simultaneously (each button outputs 1 when
        a button is being pressed).
    (c) Detecting the absence of light from a light sensor (the light sensor outputs 1 when light is
        sensed).
2.10 Convert the following English problem statements to Boolean equations:
    (a) A flood detector should turn on a pump if water is detected and the system is set to enabled.
    (b) A house energy monitor should sound an alarm if it is night and light is detected inside a
        room but motion is not detected.
    (c) An irrigation system should open the sprinkler's water valve if the system is enabled and
        neither rain nor freezing temperatures are detected.
2.11 Evaluate the Boolean equation F = (a AND b) OR c OR d for the given values of variables
    a, b, c, and d:
    (a) a=1, b=1, c=1, d=0
    (b) a=0, b=1, c=1, d=0
    (c) a=1, b=1, c=0, d=0
    (d) a=1, b=0, c=1, d=1
2.12 Evaluate the Boolean equation F = a AND (b OR c) AND d for the given values of variables
    a, b, c, and d:
    (a) a=1, b=1, c=0, d=1
    (b) a=0, b=0, c=0, d=1
    (c) a=1, b=0, c=0, d=0
    (d) a=1, b=0, c=1, d=1
2.13 Evaluate the Boolean equation F = a AND (b OR (c AND d)) for the given values of
    variables a, b, c, and d:
    (a) a=1, b=1, c=0, d=1
    (b) a=0, b=0, c=0, d=1
    (c) a=1, b=0, c=0, d=0
    (d) a=1, b=0, c=1, d=1
2.14 Show the conduction paths and output value of the OR gate transistor circuit in Figure 2.11
    when: (a) x = 1 and y = 0, (b) x = 1 and y = 1.
2.15 Show the conduction paths and output value of the AND gate transistor circuit in Figure 2.13
    when: (a) x = 1 and y = 0, (b) x = 1 and y = 1.
2.16 Convert each of the following equations directly to gate-level circuits:
    (a) F = ab' + bc + c'
    (b) F = ab + b'c'd'
    (c) F = ((a + b') * (c' + d)) + (c + d + e')
2.17 Convert each of the following equations directly to gate-level circuits:
    (a) F = a'b' + b'c
    (b) F = ab + bc + cd + de
    (c) F = ((ab)' + (c)) + (d + ef)'
2.18 Convert each of the following equations directly to gate-level circuits:
    (a) F = abc + a'bc
    (b) F = a + bcd' + ae + f'
    (c) F = (a + b) + (c' * (d + e + fg))
2.19 We want to design a system that sounds a buzzer inside our home whenever motion outside is
    detected at night. Assume we have a motion sensor with output M that indicates whether
    motion is detected (M=1 means motion detected) and a light sensor with output L that
    indicates if light is detected (L=1 means light is detected). The buzzer inside the home has a
    single input B that when 1 creates a loud warning sound. Using AND, OR, and NOT gates,
    create a simple digital circuit to implement the motion detector at night system.
2.20 A DJ ("disc jockey," meaning someone who plays the music at a party) would like a system to
    automatically control a strobe light and disco ball in a dance hall depending on whether music
    is playing and anyone is dancing. Assume we have a sound sensor with output S that indicates
    whether music is playing (S=1 means music is playing) and a motion sensor M that indicates
    whether people are dancing (M=1 means people are dancing). The strobe light has an input L
    that turns the light on when L is 1, and the disco ball has an input B that turns the ball on
    when B is 1. The DJ wants the disco ball to turn on only when music is playing and nobody
    is dancing, and the DJ wants the strobe light to turn on only when music is playing and people
    are dancing. Using AND, OR, and NOT gates, create a simple digital circuit to activate: (a) the
    disco ball, and (b) the strobe light.
2.21 We want to concisely describe the following situation using a Boolean equation. We want to fire a
    football coach (by setting F=1) if he is mean (represented by M=1). If he is not mean, but has a
    losing season (represented by the Boolean variable L=1), we want to fire him anyway. Write an
    equation that translates the situation directly to a Boolean equation for F, without any
    simplification.

SECTION 2.5: BOOLEAN ALGEBRA
2.22 For the function F = a + a'b + acd + c':
    (a) List all the variables.
    (b) List all the literals.
    (c) List all the product terms.
2.23 For the function F = a'd' + a'c + b'cd' + cd:
    (a) List all the variables.
    (b) List all the literals.
    (c) List all the product terms.
2.24 Let variables T represent being tall, H being heavy, and F being fast. Let's consider anyone
    who is not tall as short, not heavy as light, and not fast as slow. Write a Boolean equation to
    represent the following:
    (a) You may ride a particular amusement park ride only if you are either tall and light, or
        short and heavy.
    (b) You may NOT ride an amusement park ride if you are either tall and light, or short and
        heavy. Use algebra to simplify the equation to sum-of-products.
    (c) You are eligible to play on a particular basketball team if you are tall and fast, or tall and
        slow. Simplify this equation.
    (d) You are NOT eligible to play on a particular football team if you are short and slow, or if
        you are light. Simplify to sum-of-products form.
    (e) You are eligible to play on both the basketball and football teams above, based on the
        above criteria. Hint: combine the two equations into one equation by ANDing them.
2.25 Let variables S represent a package being small, H being heavy, and E being expensive. Let's
    consider a package that is not small as big, not heavy as light, and not expensive as
    inexpensive. Write a Boolean equation to represent the following:
k "'S are either small and c,'<pensive. or big and TABLE 2.9 Truth table.
(a) You can deliver packages on ly if the pac 'age. • . 2.36 Convert the functi on F shown in the truth table in Table 2.9 10 an equation. Don't minimize
b c [he equation.
inexpen sive. . r led above. Use nlgebru to simplify th e eq ua tion
F
(b) You can NOT deli ve r a package Ihal IS 15 .
a o a a 2.37 Use algebraic manipulafion to minimize the equation obtained in Exercise 2.36.
10 sum -of-products . k I 'f Ihe pockoges "rc small and lighl, small
(c) You can load the pac bges into you r truc on ~ I 0 0 1 2.38 Convert Ihe funclion F shown in Ihe lrulh lable in Table 2. 10 10 an equalion. Don'I minimize
the equation.
. S' rfy Ihe equallon .
and heavy. or big and light. IInp I "b d ' bove Simplify to sum -of-products.
OT I d h packaoes descn ea . a a 2.39 Use algebraic manipulation to minimize the equaLion obtained in Exercise 2.38.
(d) You can N oa l e , . o I ' equarion (0 sum-of-products form:
2.26 Use algebraic manipu lation to convert the fol OWing a 2.40 Convert the function F shown in the truth table in Table 2. I I to an equation. Don't minimize
the equat ion.
(b - c)(d ' ) + ac ' ( b + d ) F
+
a aloebrnic h followin o o a
2.27 Use manipu lation to conve rt te d )'o equation to sum-of-products ronn: 2.41 Use algebraic man ipulat ion to minimize rile equation obtained in Exercise 2AO,
a ' b( ; + d ' ) +a ( b ' + c) +a ( b+ c.. +a 'b o 2.42 Creale a lrulh table for Ihe circuil of Figure 2.64.
. f the following equatio n: F ;;;: abe .
2,28 Usc DeMoroan's Law 10 find Ihe IIl verse 0 . F ' = ( a bc + a ' b) ,
Hint' Stan wllh
a 2043 Creale a lrulh table for Ihe circuil of Figure 2.65.
2.4~
e r
Redu ce 10 sum· of-products ,om1. . . ' .F _ ' + a bd' + Convert Ihe funclion F shown in Ihe lrulh lable in Table 2.9 10 a digital circui t.
o '
' 9 Use DeMorg an's Law to find the .In verse 0 f the follo\\llOoe equation .
PLU·S _.- acd . Reduce 10 sum-of-produc ls fom1 .
ac
2.45 Convert Ihe funclion F shown in Ihe lrulh lable in Table 2. 10 10 a digital circuit.
T}BlE 2.10 Truth table. 2.46 Convert the function F shown in the truth table in Table 2, 1J to a digital circuit.
SECTION 2.6: REPRESENTATIO S OF BOOLEAN
FU CTIONS . --.a b c F
2.47 Convert th e following Boolean equa tions to canonical sum-of-m imenns fonn:
(a) F ( a , b , c ) a ' bc + a b
(b) F (a , b , c) a'b
2.30 Convert Ihe following Boolean equalions 10 a digi lal circ u,,: (J 0 a 1
(c) F(a , b, c) a bc + ab + a + b + c
(a) F (a , b , c ) a ' bc + a b
(b) F ( a , b , c ) a' b c> a a (d) F (a , b , c) c'
(c ) F( a , b , c ) abc + ab + a + b + c
Figure 2.64 Combinalional circuit F.
(d) F ( a . b , c ) c' d a 2.48 Delenn ine whelher Ihe Boolean funclions F
( a + b ) ' *a and G - a T b' are
equivalen!. using: (a) algebraic manipulation. and (b) !ruth lables.
2.3 1 Creme a Boolean equation representalion of the dig ital circuit o a
~·S in Figure 2.64.
1 o a
2.49 Detennine whelhcflhe Boolea n funclion s F = a b ' and G = ( a ' + a b) ' are eq ui\'alenl
using: (a) algebraic manipulalion. and (b) lrulh lables.
2.32 Create a Boolean eq uat ion repre entation for the dig ital circuit 2.50 Delennine whelher Ihe Boolean funclion G _
b -----r~ in Figure 2.65.
a a 'b'c + ab' c + a bc ' + abc isequiv-
2.33 Convert each of Ihe Boolean equalions in Exercise 2.30 10 a a alent to the function represented by the circuit b
lrulh table. in Figure 2.66.
o 2.51 Determ ine whether the two circuits in Figure
H
2.34 Convert each of lhe foll owing Boolean equalions 10 a !ruth
G table: 2.67 are eq ui va lent circuits using: (3) algebraic
c:- - - - (a) F ( a , b . c) = a' + bc ' TABtE 2.11 Truth table. manipulalion. and (b) lrulh lables.
(b) F( a , b , c) = ( ab ) ' + a c ' + bc b c F
f igure 2.65 Combinalional circuil C. a Figure 2.66 Combinational irruil H.
(c) F( a , b , c ) ab + a c + a b ' c ' + c '
(d) F ( a , b , c , d ) = a ' bc + d ' o o a a
2.35 Fill in Tab le 2.S's columns for Ihe equalion: F- ab + b ' . o o 1 1
TABLE 2.8 Truth table, a a a G

a a
o o a
a o
Figure 2.67 Combinational circuits F and G.
a
2.52 Figure 2.68 shows two circuits in which the inputs of the circuits are unlabeled.
    (a) Determine whether the two circuits are equivalent. Hint: Try all possible labelings of the
        inputs for both circuits.
    (b) How many circuit comparisons would you need to perform to determine if two circuits with
        10 unlabeled inputs are equivalent?

Figure 2.68 Combinational circuits F and G.

SECTION 2.7: COMBINATIONAL LOGIC DESIGN PROCESS
2.53 A museum has three rooms, each with a motion sensor (m0, m1, and m2) that outputs 1 when
    motion is detected. At night, the only person in the museum is one security guard who walks
    from room to room. Create a circuit that sounds an alarm (by setting an output A to 1) if
    motion is ever detected in more than one room at a time (i.e., in two or three rooms), meaning
    there must be an intruder or intruders in the museum. Start with a truth table.
2.54 Create a circuit for the museum of Exercise 2.53 that detects whether the guard is properly
    patrolling the museum, detected by exactly one motion sensor being 1. (If no motion sensor is
    1, the guard must be sitting or sleeping.)
2.55 Consider the museum security alarm function of Exercise 2.53, but for a museum with 10
    rooms. A truth table is not a good starting point (too many rows), nor is an equation describing
    when the alarm should sound (too many terms). However, the inverse of the alarm function can
    be straightforwardly captured as an equation. Design the circuit for the 10-room security system
    by designing the inverse of the function, and then just adding an inverter before the circuit's
    output.
2.56 A network router connects multiple computers together and allows them to send messages to
    each other. If two or more computers send messages simultaneously, they collide and the
    messages must be resent. Using the combinational design process of Table 2.5, create a collision
    detection circuit for a router that connects 4 computers. The circuit has 4 inputs labeled M0
    through M3 that are 1 when the corresponding computer is sending a message and 0
    otherwise. The circuit has one output labeled C that is 1 when a collision is detected and 0
    otherwise.
2.57 Using the combinational design process of Table 2.5, create a 4-bit prime number detector.
    The circuit has four inputs, N3, N2, N1, and N0, that correspond to a 4-bit number (N3 is the
    most significant bit) and one output named P that outputs a 1 when the input is a prime
    number or 0 otherwise.
2.58 A car has a fuel-level detector that outputs the current fuel level as a 3-bit binary number, with
    000 meaning empty and 111 meaning full. Create a circuit that illuminates a "low fuel"
    indicator light (by setting an output L to 1) when the fuel level drops below level 3.
2.59 A car has a low-tire-pressure sensor that outputs the current tire pressure as a 5-bit binary
    number. Create a circuit that illuminates a "low tire pressure" indicator light (by setting an
    output T to 1) when the tire pressure drops below 16. Hint: You might find it easier to create a
    circuit that detects the inverse function. You can then just append an inverter to the output of
    that circuit.

SECTION 2.8: MORE GATES
2.60 Show the conduction paths and output value of the NAND gate transistor circuit in Figure
    2.4) when: (a) x = 1 and y = 0, (b) x = 1 and y = 1.
2.61 Show the conduction paths and output value of the NOR gate transistor circuit in Figure 2.45
    when: (a) x = 1 and y = 0, (b) x = 0 and y = 0.
2.62 Show the conduction paths and output value of the AND gate transistor circuit in Figure 2.46
    when: (a) x = 1 and y = 1, (b) x = 0 and y = 1.
2.63 Two people, denoted using variables A and B, want to ride with you on your motorcycle. Write
    a Boolean equation that indicates that exactly one of the two people can come (A=1 means A
    can come, A=0 means A can't come). Then use XOR to simplify your equation.
2.64 Simplify the following equation by using XOR wherever possible: F = a'b + ab' +
    cd' + c'd + ac.
2.65 Use XOR to create a circuit that outputs a 1 when the number of 1s on inputs a, b, c, d is
    odd.
2.66 Use XOR or XNOR to create a circuit that detects if all inputs a, b, c, d are 0s.
2.67 Use XOR or XNOR to create a circuit that detects if an even number of the inputs a, b, c, d
    are 1s.
2.68 Show that a 4-bit XOR gate is an odd function (meaning the output is 1 only if the number of
    input 1s is odd).

SECTION 2.9: DECODERS AND MUXES
2.69 Design a 3x8 decoder using AND, OR, and NOT gates.
2.70 Design a 4x16 decoder using AND, OR, and NOT gates.
2.71 Design a 3x8 decoder with enable using AND, OR, and NOT gates.
2.72 Design an 8x1 multiplexer using AND, OR, and NOT gates.
2.73 Design a 16x1 multiplexer using AND, OR, and NOT gates.
2.74 Design a 4-bit 4x1 multiplexer using 4x1 multiplexers.
2.75 Create a circuit that rings a bell whenever motion is detected from one of two motion sensors.
    A switch S determines which sensor to pay attention to: S=0 means ring the bell when there's
    motion at motion sensor 1, S=1 means motion sensor 2.
2.76 A home entertainment center has four different audio sources that can be played over the same
    set of speakers. Each audio source, named A, B, C, and D, is connected using wires on which
    the digitized audio signal is transmitted. The user selects which audio source is to be played
    using a rotary switch with four outputs, S0, S1, S2, S3, of which exactly one will be 1 at
    any given time. If S0 = '1', the audio source A should be played, if S1 = '1', the audio source
    B should be played, and so on. Create a digital circuit with a single-bit output O that will
    output the user's selected audio source.

SECTION 2.10: ADDITIONAL CONSIDERATIONS
2.77 Design a 1x4 demultiplexer using AND, OR, and NOT gates.
2.78 Design a 1x8 demultiplexer using AND, OR, and NOT gates.
2.79 Design a 4x2 encoder using AND, OR, and NOT gates.
2.80 Design an 8x3 encoder using AND, OR, and NOT gates. Assume that only one input will be
    1 at any given time.
2.81 Design a 4x2 priority encoder using AND, OR, and NOT gates. Assume that every input being
    0 is encoded as 00.

DESIGNER PROFILE

Samson enjoyed physics and math in college, and focused his advanced studies on integrated
circuit (IC) design, believing the industry to have a great future. Years later now, he
realizes he was right: "Looking back 20 years in high tech, we have experienced four major
revolutions: the PC revolution, digital revolution, communication revolution, and Internet
revolution, all four enabled by the IC industry. The impact of these revolutions to our daily
life is profound."

He has found his job to be "very challenging, interesting, and exciting. I continually learn
new skills to keep up, and to do my job more efficiently."

One of Samson's key design projects was for digital television, namely, high-definition TV
(HDTV), involving companies like Zenith, Philips, and Intel. In particular, he led the
12-person design team that built Intel's first Liquid Crystal on Silicon (LCoS) chip for
rear-projection HDTV. "Traditional LCoS chips are analog. They apply different analog
voltages on each pixel of the display chip so it can produce an image. But analog LCoS is very
sensitive to noise and temperature variation. We used digital signals to do pulse width
modulation on each pixel." Samson is quite proud of his team's accomplishments: "Our HDTV
picture quality was much better."

Samson also worked on the 200-member design team for Intel's Pentium II processor. That was
a very different experience. "For the smaller team project, each person had more
responsibility, and overall efficiency was high. For the large team project, each person
worked on a specific part of the project: the chip was divided into clusters, each cluster
into units, and each unit had a leader. We relied heavily on design flows and methodologies."

Samson has seen the industry's peaks and valleys during the past two decades: "Like any
industry, the IC job market has its ups and downs." He believes the industry survives the low
points in large part due to "innovation." "Brand names sell products, but without innovation,
markets go elsewhere. So we have to be very innovative, creating new products so that we are
always ahead in the global competition."

But, "innovation doesn't grow on trees," Samson points out. "There are two kinds of
innovations. The first is invention, which requires a good understanding of the physics behind
technology. For example, to make an analog TV into a digital TV, we must know how human eyes
perceive video images, which parts can be digitized, how digital images can be produced on a
silicon chip, etc. The second kind of innovation reuses existing technology for a new
application. For example, we can reuse advanced space technologies in a new non-space product
serving a bigger market. eBay is another example: it reused Internet technology for on-line
auctions. Innovations lead to new products, and thus new jobs for many years."

Thus, Samson points out that "The industry is counting on new engineers from college to be
innovative, so they can continue to drive the high tech industry forward. When you graduate
from college, it's up to you to make things better."

3
Sequential Logic Design-Controllers

3.1 INTRODUCTION

The output of a combinational circuit is a function only of the circuit's present inputs.
A combinational circuit has no memory: we cannot store bits into a combinational
circuit and later read the bits out that we saved. Combinational circuits by themselves
are rather limited in their usefulness. Designers instead typically use combinational
circuits as part of larger circuits called sequential circuits, which are circuits that do
have memory. A sequential circuit is a circuit whose outputs depend not only on the
circuit's present inputs, but also on the circuit's present state, which is all the bits stored
in the circuit. The circuit's state in turn depends on the past sequence of values that have
appeared at the circuit's inputs.

An everyday example of a combinational circuit is a doorbell: push the button (the
input) now, and the bell (the output) rings. Push the button again, and the bell rings again.
Push the button tomorrow, or next week, and the bell rings the same each time. A doorbell
has no state, no memory; its output value (whether the bell rings or not) depends
solely on its present input value (whether the button is pressed or not). In contrast, an
example of a sequential circuit is an automatic garage door system: push the button (the
input) now, and the door opens. Push the button again, and this time the door closes. Push
the button tomorrow, and the door opens again. The system's output (whether the door
opens or closes) depends on the state of the system (whether the door is presently open or
closed), which in turn depends on the sequence of past input values since we turned on
the system.

Most digital systems with which you are familiar involve sequential circuits that
store bits. A handheld calculator must contain a sequential circuit, because the calculator
must store the numbers you enter, in order to operate on those numbers. A digital camera
stores pictures. A traffic light controller stores information indicating which light is
presently green. A circuit that counts down from 59 to 0 stores the present count value, to
know what the next value should be.

In this chapter, we describe basic sequential circuit building blocks, and the design of
a certain class of sequential circuits known as controllers.
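
The doorbell and garage door examples from the introduction can be mimicked in a few lines of Python. This is only an illustrative sketch, not a circuit description: the names `doorbell`, `GarageDoor`, and `press_button` are hypothetical, and the door is assumed to start closed when the system is turned on.

```python
def doorbell(button_pressed):
    """Combinational: the output depends only on the present input."""
    return button_pressed


class GarageDoor:
    """Sequential: one stored bit (is the door open?) is the state."""

    def __init__(self):
        self.is_open = False  # assumption: door is closed at power-on

    def press_button(self):
        # The same input (a button press) gives a different output
        # depending on the stored state, which then changes.
        self.is_open = not self.is_open
        return "opens" if self.is_open else "closes"


door = GarageDoor()
bell_outputs = [doorbell(True) for _ in range(3)]
garage_actions = [door.press_button() for _ in range(3)]
print(bell_outputs)    # identical output for every press
print(garage_actions)  # output alternates with the stored state
```

The only difference between the two is the stored attribute `is_open`: remove it and the garage door could not tell an "open" press from a "close" press.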

.---~ . -- -
96 3 Sequential Logic Design-Contro llers
3.2 STORING ONE BIT-FLIP-FLOPS

To build a sequential circuit, we need a building block that enables us to store a
bit. By store a bit, we mean that we can save a bit in the block (say a 1) and later
come back to see what we saved. As an example, suppose we want to build the
flight attendant call-button system in Figure 3.1. An airline passenger can push
the Call button to turn on a small blue light above the passenger's seat, indicating
to a flight attendant that the passenger needs service. The light stays on even after
the call button is released. The light can be turned off by pressing the Cancel button.
Since the light has to stay on even after the call button is released, we need a way to
"remember" that the call button was pressed. We can remember by using a bit storage
block, and storing a 1 in the block when the call button is pressed, and storing a 0 when
the cancel button is pressed. We connect the output of this bit storage block to the blue
light. The light illuminates when the block's output is 1.

Figure 3.1 Flight attendant call-button system. Pressing Call turns on the light, which
stays on after Call is released. Pressing Cancel turns off the light.

To introduce the internal design of such a bit storage block, we'll introduce several
increasingly complex circuits able to store a bit: a basic SR latch, a level-sensitive SR
latch, a level-sensitive D latch, and an edge-triggered D flip-flop. The D flip-flop will then
be used to create a block capable of storing multiple bits, known as a register, which will
serve as our primary bit storage block in the rest of the book. Each successive circuit
eliminates some problem of the previous one, leading to the robust D flip-flop and then register.

Be aware that designers rarely use bit storage blocks other than D flip-flops. We
introduce the other blocks primarily to provide the reader with the underlying intuition of
the D flip-flop's design.

Feedback: The Basic Storage Method

The basic method used to store a bit in a digital circuit is feedback. You've surely experienced
feedback in the form of audio feedback, when someone talking into a microphone stood in
front of the speaker, causing a loud continuous humming sound to come out of the speakers
(in turn causing everyone to cover their ears and snicker). The talker generated a sound that
was picked up by the microphone, came out the speakers (amplified), was picked up again by
the microphone, came out the speakers again (amplified even more), etc. That's feedback.
Feedback in audio systems is annoying, but in digital systems is extremely useful.

Intuitively, we know that we need to somehow feed the output value of a logic gate back
into the gate itself, so that the stored bit ends up looping around and around, like a dog
chasing its own tail. We might try the circuit in Figure 3.2.

Figure 3.2 First (failed) attempt at using feedback to store a bit.

Suppose initially Q is 0 and S is 0. At some point, suppose we set S to 1. That causes Q
to become 1, and that 1 feeds back into the OR gate, causing Q to be 1, etc. So even when
S returns to 0, Q stays 1. Unfortunately, Q stays 1 from then on, and we have no way of
resetting Q to 0. But hopefully you understand the basic idea of feedback now: we did
successfully store a 1 using feedback.

We draw in Figure 3.3 the timing diagram for our attempted feedback circuit from
Figure 3.2. Note that we assume the OR gate has a small input to output delay, as was
discussed in Section 2.10. Initially, we assume both OR gate inputs are 0 (Figure 3.3(a)).
Then we set S to 1 (Figure 3.3(b)), which causes Q to become 1 slightly later (Figure
3.3(c)), which in turn causes t to become 1 slightly later (Figure 3.3(d)). Finally, when
we change S back to 0 (Figure 3.3(e)), Q will stay 1 because t is 1. The first curved line
with an arrow indicates that the event of S changing from 0 to 1 causes the event of Q
changing from 0 to 1. The second curved line with an arrow indicates that the event of Q
changing from 0 to 1 in turn causes the event of t changing from 0 to 1. And that 1 then
continues to loop around, forever, with no way of S resetting Q to 0.

Figure 3.3 Tracing the behavior of our first attempt at bit storage.

Basic SR Latch

It turns out that the simple circuit in Figure 3.4, called a basic SR latch, implements the
bit storage building block we desire. The circuit consists of just a pair of cross-coupled
NOR gates. Making the circuit's S input equal to 1 causes Q to become 1, while making R
equal to 1 causes Q to become 0. Making both S and R equal to 0 causes whatever Q is to
keep looping around. In other words, S "sets" the latch to 1, and R "resets" the latch to 0,
hence the letters S (for set) and R (for reset).

Let's see why the basic SR latch works as it does. Recall that a NOR gate outputs 1 when
all the gate's inputs equal 0; if at least one input equals 1, the NOR gate outputs 0.

Figure 3.5 SR latch when S=0 and R=1.
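
The NOR rule just recalled is enough to sketch the basic SR latch in software. The following is a minimal sketch under stated assumptions, not a timing-accurate model: the function names `nor` and `settle_sr_latch` are hypothetical, the top gate is taken to compute t = NOR(S, Q) and the bottom gate Q = NOR(R, t), and the gates are updated one after another (bottom first) as a stand-in for real gate delays.

```python
def nor(a, b):
    """NOR gate: outputs 1 only when both inputs are 0."""
    return 0 if (a or b) else 1


def settle_sr_latch(s, r, q, t, max_steps=10):
    """Update the cross-coupled NOR gates until Q and t stop changing.

    q and t are the values currently on the two feedback wires.
    Returns the stable pair (q, t).
    """
    for _ in range(max_steps):
        new_q = nor(r, t)      # bottom gate: inputs R and t
        new_t = nor(s, new_q)  # top gate: inputs S and Q
        if (new_q, new_t) == (q, t):
            return q, t
        q, t = new_q, new_t
    raise RuntimeError("latch did not settle")


# Reset, hold, set, hold. The starting values q=0, t=0 are an
# arbitrary assumption; the first step (R=1) forces Q to 0 anyway.
q, t = 0, 0
trace = []
for s, r in [(0, 1), (0, 0), (1, 0), (0, 0)]:
    q, t = settle_sr_latch(s, r, q, t)
    trace.append((s, r, q))
print(trace)
```

Each tuple in `trace` is (S, R, Q) after the latch settles. Q follows the last set or reset request and keeps that value while S and R are both 0, which is the feedback doing the storing. The sketch deliberately avoids the case where S and R are both 1.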
Suppose that we make S=0 and R=1 as in the SR latch circuit of Figure 3.5, and that
we don't initially know the values of Q and t. Since the bottom gate of the circuit has at
least one input equal to 1 (R), the gate outputs 0; in the timing diagram, R becoming 1
causes Q to become 0. In the circuit, Q's 0 feeds back to the top NOR gate, which will have
both its inputs equal to 0 and its output equal to 1. In the timing diagram, Q becoming 0
causes t to become 1. In the circuit, that 1 feeds back to the bottom NOR gate, which has
at least one input equal to 1 (actually, both inputs equal 1), and so the bottom gate will
continue to output 0. Thus the output Q equals 0, and all values are stable.

Now suppose we make S=0 and R=0, as in Figure 3.6. The bottom gate still has at least
one input equal to 1 (the input coming from the top gate), so the bottom gate continues to
output 0. The top gate continues to have both inputs equal to 0 and continues to output 1.
The output Q will thus still equal 0. Thus the earlier R=1 stored a 0 into the SR latch,
also known as resetting the latch, and that 0 remains stored even when we return R to 0.

Figure 3.6 SR latch when S=0 and R=0, after R equaled 1.

Now let's make S=1 and R=0, as in Figure 3.7. The top gate in the circuit now has one
input equal to 1, so the top gate outputs a 0; the timing diagram shows the change of S
from 0 to 1 causing t to change from 1 to 0. The top gate's 0 output feeds back to the
bottom gate, which now has both inputs equal to 0 and outputs 1; the timing diagram shows
the change of t from 1 to 0 causing Q to change from 0 to 1. The bottom gate's (Q) 1 output
feeds back to the top gate, which has at least one input equal to 1 (actually, both inputs
equal 1 now), so the top gate continues to output 0. The output Q therefore equals 1, and
all values are stable.

Figure 3.7 SR latch when S=1 and R=0.

Now let's make S=0 and R=0 again, as in Figure 3.8. The top gate still has at least one
input equal to 1 (the input coming from the bottom gate), so the top gate continues to
output 0. The bottom gate continues to have both inputs equal to 0 and continues to output 1.
The output Q is still equal to 1. Thus, the earlier S=1 stored a 1 into the SR latch, also
known as setting the latch, and that 1 remains stored even when we return S to 0.

Figure 3.8 SR latch when S=0 and R=0, after S equaled 1.

The basic SR latch can be used to implement the flight attendant call-button system
(Figure 3.9). We connect the call button to S, the cancel button to R, and the light to Q.
Pressing the call button sets Q to 1, thus turning on the light. Q stays 1 even when the call
button is released. Pressing the cancel button resets Q to 0, thus turning off the light. Q
stays 0 even when the cancel button is released.

Figure 3.9 Flight attendant call-button system using a basic SR latch.

Level-Sensitive SR Latch

A problem with the basic SR latch is that S and R both equaling 1 at the same time causes
undefined behavior: we might have stored a 1, we might have stored a 0, or we might
even cause the latch output to oscillate from 1 to 0 to 1 to 0, and so on. Let's see why.

If S=1 and R=1, both gates have at least one input equal to 1, and thus both gates
output 0, as shown in Figure 3.10(a). A problem occurs when we return S and R to 0.
Suppose S and R return to 0 at exactly the same time. Then both gates will have all 0s at
their inputs, so their outputs will change from 0s to 1s, as shown in Figure 3.10(b). Those
1s feed back to the gate inputs, causing the gates to output 0s, as shown in Figure 3.10(c).
Those 0s feed back to the gate inputs again, causing the gates to output 1s. And so on.
Going from 1 to 0 to 1 to 0 and so on is called oscillation. Oscillation is not a desirable
feature of a bit storage block.

Figure 3.10 The situation of S=1 and R=1 causes problems: Q oscillates when SR return to 00.

In a real circuit, the delays of the upper and lower gates and wires would be slightly
different from one another. So after a time of oscillation, one of the gates may get ahead
of the other (outputting a 1 before the other does, then a 0 before the other one does,
etc.), until it gets far enough ahead to cause the circuit to enter a stable situation of either
Q=0 or Q=1; which case will happen, we don't know. Such a situation, in which the final
value of a memory circuit depends on the delays of gates and wires, is known as a race
condition. Figure 3.11 shows a race condition involving oscillation but ending with a stable
situation of Q=1. But we didn't know which value Q would eventually settle into (it could
have settled into Q=0), so the fact that Q=1 is not useful to us in our use of the bit
storage block.

Figure 3.11 Q eventually settles to either 0 or 1, due to race condition.

In our flight attendant call-button system, if the passenger pushes both buttons at the
same time, the result could be that the blue light starts oscillating, and then the light either
ends up on or off.

In summary, S and R should never both equal 1 in an SR latch.

In practice, we would never actually connect buttons directly to an SR latch's inputs
(we did that just for the purpose of an intuitive example). So we can safely assume the S
and R inputs come from a digital circuit. Thus, we can design that digital circuit such that
S and R should never both equal 1. But even if we try to design that circuit such that S
and R should never both be 1, we could still find that S and R inadvertently both become
1 at the same time. For example, consider the simple circuit in Figure 3.12. In theory, S
and R can't both be 1: if X=1, then S=1 but R=0. If X=0, R may equal 1 but S=0. So S
and R can't both be 1, in theory.

A partial solution to this problem is to add an enable input C to the SR latch, as
shown in Figure 3.14. When C=1, the S and R signals propagate through the two
AND gates to the S1 and R1 inputs of the basic SR latch circuit, because S*1=S and
R*1=R. However, when C=0, the two AND gates cause S1 and R1 to be 0, regardless
of the values of S and R. Thus, when C=0, the basic latch's value cannot change. (You
might note that a difference in the top and bottom AND gate delays could result in S1
and R1 both being 1 for a very short time equal to that difference, but that time is too
short to cause a problem.)

Figure 3.14 Level-sensitive SR latch: an SR latch with enable input C.

The introduction of the enable input leads to the idea of setting the enable to 1 only
when we are sure that S and R have stable values. Figure 3.15 shows the inverter/AND
circuit from Figure 3.12, this time using an SR latch with an enable input. If we change
X, we should wait for at least 2 ns before setting the enable input C to 1 in order to ensure
that the SR inputs to the latch are stable and are not equal to 11.
In rea lilY, both 5 and R could both be ] for a short lime in Ihis circuit. because of
the delay of real gales. as introduced in Figure 2.62. Suppose X has been and Y has a
been ] for a long time, so 5:0 and R: l. Then suppose we change X 10 1. 5 wi ll change Level·sensitive SA latch S~~
10 I almost immediately. but R will stay] for a short while as the new value of Xpro!>, 1 '
agates Ihrough the inverter and Ihe AND gate, after which R changes to O. If each
componenl has a delay of I ns (nanosecond). then 5 and R wou ld aClu ally both be I for
R 0 il'--+-----
2 ns (Figure 3. 13). Temporary values on ignals ca used by ga te delays are referred 10 as
glitches.
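The 2 ns overlap of S and R described above can be reproduced with a small discrete-time sketch. The Python below is illustrative only: it assumes the Figure 3.12 circuit is S = X and R = (NOT X) AND Y, samples once per nanosecond, and gives the inverter and the AND gate 1 ns of delay each. All names (`simulate`, `gated`, `x_wave`, and so on) are hypothetical.

```python
def simulate(x_wave, y_wave):
    """Return one (S, R) pair per nanosecond for the assumed circuit
    S = X, R = (NOT X) AND Y, with 1 ns of delay per gate."""
    inv_out = 1 - x_wave[0]      # inverter output, settled before t = 0
    r = inv_out & y_wave[0]      # AND output, settled before t = 0
    trace = []
    for x, y in zip(x_wave, y_wave):
        s = x                    # S is wired straight to X (no gate in the path)
        trace.append((s, r))
        # Each gate output at t+1 depends on that gate's inputs at t.
        r, inv_out = inv_out & y, 1 - x
    return trace


def gated(trace, c_wave):
    """Apply the enable: S1 = S AND C, R1 = R AND C (AND-gate delay ignored)."""
    return [(s & c, r & c) for (s, r), c in zip(trace, c_wave)]


x_wave = [0, 0, 1, 1, 1, 1]      # X rises at t = 2 ns
y_wave = [1, 1, 1, 1, 1, 1]      # Y stays at 1
trace = simulate(x_wave, y_wave)

# Count the nanoseconds during which S and R are both 1 (the glitch).
overlap_ns = sum(1 for s, r in trace if s and r)

# Keep C at 0 until 2 ns after X changed, then enable the latch.
c_wave = [0, 0, 0, 0, 1, 1]
latch_inputs = gated(trace, c_wave)
glitch_reaches_latch = any(s1 and r1 for s1, r1 in latch_inputs)

print("S,R per ns:", trace)
print("overlap (ns):", overlap_ns)
print("SR = 11 seen by latch:", glitch_reaches_latch)
```

With these waveforms the glitch lasts two samples, and holding C at 0 for those two samples keeps SR = 11 away from the basic latch. The sketch leaves out the small delay mismatch between the two enable AND gates that the text mentions.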
Figure 3.12 Conceptually, S and R can't both be 1 in this sample circuit. But in reality, they can, due to the delay of the inverter and AND gate.

Figure 3.13 Gate delays can cause SR = 11.

Figure 3.15 Level-sensitive SR latch: an SR latch with enable input C.

An SR latch with an enable is known as a level-sensitive SR latch, because the latch is only sensitive to its S and R inputs when the level of the enable input is 1. Such a latch is also called a transparent latch, because setting the enable input to 1 makes the internal SR latch transparent to the S and R inputs.

You may have noticed that the top NOR gate of an SR latch outputs the opposite value as the bottom gate, which is connected to the output Q. Thus, we can include an output Q' on an SR latch almost for free, just by connecting the top gate to that output. Most latches do in fact come with both Q and Q' outputs. The symbol for a level-sensitive SR latch with such dual outputs is shown in Figure 3.16.

Figure 3.16 Symbol for dual-output level-sensitive SR latch.
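As a rough illustration of the level-sensitive SR latch with Q and Q' outputs, the following Python sketch models the two cross-coupled NOR gates behind two enable AND gates. It is a behavioral approximation under stated assumptions: gate delays are ignored, the gates are re-evaluated a few times until the feedback settles, the power-up state Q=0 is arbitrary, and the class and method names are hypothetical.

```python
def nor(a, b):
    """2-input NOR on 0/1 values."""
    return 0 if (a or b) else 1


class LevelSensitiveSRLatch:
    def __init__(self):
        self.q, self.q_n = 0, 1          # assumed power-up state (Q = 0, Q' = 1)

    def apply(self, s, r, c):
        """Present S, R, and enable C; return the settled (Q, Q')."""
        s1, r1 = s & c, r & c            # enable AND gates: S1 = S*C, R1 = R*C
        if s1 and r1:
            raise ValueError("S = R = 1 while C = 1 is not allowed")
        for _ in range(3):               # let the feedback loop settle
            self.q_n = nor(s1, self.q)   # top NOR gate drives Q'
            self.q = nor(r1, self.q_n)   # bottom NOR gate drives Q
        return self.q, self.q_n


latch = LevelSensitiveSRLatch()
after_set = latch.apply(s=1, r=0, c=1)       # set while enabled
after_hold = latch.apply(s=0, r=0, c=1)      # S = R = 0 keeps the stored bit
after_ignored = latch.apply(s=0, r=1, c=0)   # reset request ignored: C = 0
after_reset = latch.apply(s=0, r=1, c=1)     # reset while enabled
print(after_set, after_hold, after_ignored, after_reset)
```

The third call shows the point of the enable: with C=0 the latch is not sensitive to R, so the stored 1 survives until C returns to 1.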
Clocks and Synchronous Circuits

The level-sensitive SR latch uses an enable signal C that we must set to 1 after we are sure S and R are stable. But how do we decide when to set the enable C to 1? Most sequential circuits simply use an enable signal that pulses at a constant rate. For example, we could make the enable signal go high for 10 ns, then low for 10 ns, then high for 10 ns, then low for 10 ns, etc., as in Figure 3.17.

Figure 3.17 An example of a clock signal named Clk. Circuit inputs should only change while Clk = 0, such that latch inputs will be stable when Clk = 1. (The timing diagram marks the intervals in which it is safe to change X, Y and those in which X, Y must not change.)

Freq.      Period
100 GHz    0.01 ns
10 GHz     0.1 ns
1 GHz      1 ns
100 MHz    10 ns
10 MHz     100 ns

The time high and time low need not be the same. For example, we could create a signal that is low for 10 ns, high for 1 ns, low for 10 ns, high for 1 ns, etc.

Such a pulsing enable signal is called a clock signal, because the signal ticks (high, low, high, low) like a clock. A circuit whose storage elements (in this case, latches) can only change when a clock signal is active is known as a synchronous sequential circuit, or just synchronous circuit (the sequential aspect is implied; there's no such thing as a synchronous combinational circuit). A sequential circuit that does not use a clock is called an asynchronous circuit. We leave the important but challenging topic of asynchronous circuit design for a more advanced digital design textbook. The majority of sequential circuits designed and used today are synchronous.

Designers typically use an oscillator to generate a clock signal. An oscillator is a circuit that outputs a signal that alternates between 1 and 0 at a constant frequency, like that in Figure 3.17. An oscillator component typically has no inputs (other than power), and has an output representing the clock signal.

A clock signal's period is the time after which the signal repeats itself, or more simply, the time between successive 1s. The signal in Figure 3.17 has a period of 20 ns. A clock cycle refers to one such segment of time, meaning one segment where the clock is 1 and then 0. Figure 3.17 shows three and a half clock cycles. A clock signal's frequency is the number of cycles per second, and is computed as 1/(the clock period). The signal in Figure 3.17 has a frequency of 1/20 ns = 50 MHz. The units of frequency are Hertz, or Hz, where 1 Hz = 1 cycle per second. MHz is short for Megahertz, meaning one million Hz.

A convenient way to mentally convert common computer clock periods to frequencies, and vice versa, is to remember that a 1 ns period equals a 1 GHz (Gigahertz, meaning 1 billion Hz) frequency. Then, if one is slower (or faster) by a factor of 10, the other is slower (or faster) by a factor of 10 also, so a 10 ns period equals 100 MHz, while a 0.1 ns period equals 10 GHz.

HOW DOES IT WORK? - QUARTZ OSCILLATORS.
Conceptually, an oscillator can be thought of as an inverter feeding back to itself, as shown on the left. If C is initially 1, the value will feed back through the inverter and so C will become 0, which feeds back through the inverter causing C to become 1 again, and so on. The oscillation frequency would depend on the delay of the inverter. Real oscillators must regulate the oscillation frequency more precisely. A common type of oscillator uses quartz, a mineral consisting of silicon dioxide in crystal form. Quartz happens to be such that it vibrates if we apply an electric current, and that vibration is at a precise frequency determined by the quartz size and shape. Furthermore, when quartz vibrates, it generates a voltage. So, by making quartz a specific size and shape and then applying a current, we get a precise electronic oscillator. We attach the oscillator to an IC's clock signal input, as shown above. Some ICs come with a built-in oscillator.

D Flip-Flop

While the SR latch is useful for introducing the notion of storing a bit in a digital circuit, most circuits actually use slightly more advanced devices, namely, D latches and D flip-flops, to store bits.

Level-Sensitive D Latch: A Basic Bit Store

The SR latch has the annoying problem of entering an undefined state if the S and R inputs are both 1 when the clock is high. Ensuring that we design circuits that don't set S and R to both 1 imposes a burden on the designer. One way to relieve designers of this burden is to instead use a new type of latch, called a D latch, shown in Figure 3.18.

Figure 3.18 D latch internals.

A D latch stores whatever value is present at the latch's D input when C=1, and holds that value when C=0. Internally, the latch's D input connects to S directly, and to R through an inverter. Figure 3.19 provides a timing diagram of the D latch for sample input values on D and C. When D is 1 and C is 1, the latch is set to 1, because S is 1 and R is 0. When D is 0 and C is 1, the latch is reset to 0, because R is 1 and S is 0. By making R the opposite of S, we are assured that S and R won't both be 1 at the same time, as long as we only change S and R when C is 0.

Figure 3.19 D latch timing diagram.
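The D latch behavior just described (store D while C=1, hold while C=0) can be sketched in a few lines of Python. This is only a behavioral model under simplifying assumptions: gate delays are ignored, the initial stored value of 0 is arbitrary, and the sample waveforms are made up for illustration rather than taken from Figure 3.19.

```python
class DLatch:
    """Behavioral D latch: D feeds S directly and R through an inverter."""

    def __init__(self):
        self.q = 0                      # assumed initial stored bit

    def update(self, d, c):
        s = d & c                       # S = D, gated by the enable C
        r = (1 - d) & c                 # R = D', gated by the enable C
        assert not (s and r)            # S and R can never both be 1 here
        if s:
            self.q = 1                  # set
        elif r:
            self.q = 0                  # reset
        return self.q                   # otherwise (C = 0) hold


# Made-up sample waveforms, one value per time step.
d_wave = [1, 1, 0, 0, 1, 1, 1, 0]
c_wave = [0, 1, 1, 0, 0, 1, 1, 0]

latch = DLatch()
q_wave = [latch.update(d, c) for d, c in zip(d_wave, c_wave)]
print(q_wave)
```

In the resulting trace, Q tracks D in every step where C is 1 and keeps its last value wherever C is 0, including the final step where D drops to 0 after the enable has already gone low.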
The symbol for a D latch with dual outputs (Q and Q') is shown in Figure 3.20.

Figure 3.20 D latch symbol.

Edge-Triggered D Flip-Flop: A Robust Bit Store

The D latch still has a potentially nasty problem that can cause unpredictable circuit behavior, namely, signals can propagate from a latch output to another latch's input while the clock signal is 1. For example, consider the circuit in Figure 3.21. When Clk = 1, the value on Y will be loaded into the first latch and appear at that latch's output. If Clk still equals 1, then that value will also get loaded into the second latch. The value will keep propagating through the latches until Clk returns to 0. Through how many latches will the value propagate? It's hard to say; we would have to know the precise timing delay information of each latch.

Figure 3.21 A problem with latches: through how many latches will Y propagate for each pulse of Clk_A? For Clk_B?

Figure 3.22 illustrates this propagation problem in more detail. Suppose D1 is initially 0 for a long time, changes to 1 long enough to be stable, and then Clk becomes 1. Q1 will thus change from 0 to 1 after about three gate delays, and thus D2 will also change from 0 to 1, as shown in the left timing diagram. If Clk is still 1, then that new value for D2 will propagate through the AND gates of the second latch, causing S2 to change from 0 to 1 and R2 from 1 to 0, thus changing Q2 from 0 to 1, as shown in the left timing diagram. Also note in the left timing diagram that changing D2 while C2=1 causes S2 and R2 to both equal 1 for a short time, due to the extra delay on the path to R2 caused by the inverter, though the time that both are 1 is probably too short to cause a problem.

You might suggest making the clock signal such that the clock is 1 only for a short amount of time, so there's not enough time for the new output of a latch to propagate to the next latch's inputs. But how short is short enough? 50 ns? 10 ns? 1 ns? 0.1 ns? And if we make the clock's time at 1 too short, that time may not be long enough for the bit at a latch's D input to stabilize in the latch's feedback circuit, and we might therefore not successfully store the bit, as illustrated in Figure 3.22(c).

Figure 3.22 A problem with level-sensitive latches: (a) while C=1, Q1's new value may propagate to D2, (b) such propagation can cause S2 and R2 to both be 1 for a short time while the latch's enable is 1 (but SR = 11 is never supposed to occur), or can cause an unknown number of latches along a chain to get updated, (c) trying to shorten the clock's high time to avoid propagation to the next latch, but long enough to allow a latch to reach a stable feedback situation, is hard, because making the clock's high time too short prevents proper loading of the latch.

A good solution is to design a more robust block for storing a bit: a block that stores the bit at the D input at the instant that the clock rises from 0 to 1. Note that we didn't say that the block stores the bit instantly. Rather, the bit that will eventually get stored into the block is the bit that was stable at D at the instant that the clock rises from 0 to 1. Such a block is called an edge-triggered D flip-flop. The word "edge" refers to the vertical part of the line representing the clock signal, when the signal transitions from 0 to 1. Figure 3.23 shows three cycles of a clock signal, and indicates the three rising clock edges of those cycles.

Figure 3.23 Rising clock edges.

Edge-Triggered D Flip-Flop Using a Master-Servant Design. One way to design an edge-triggered D flip-flop is to use two D latches, as shown in Figure 3.24. The first D latch, known as the master, is enabled (can store new values on Dm) when Clk is 0 (due to the inverter), while the second D latch, known as the servant, is enabled when Clk is 1. Thus, while Clk is 0, the bit on D is stored into the master latch, and hence Qm and Ds are updated, but the servant latch does not store this new bit because the servant latch is not enabled since Clk is not 1. When Clk becomes 1, the master

latch becomes disabled (retains its stored value), thus holding whatever bit was at the D input just before the clock changed from 0 to 1. Also, when Clk is 1, the servant latch becomes enabled, thus storing the bit that the master is storing, which is the bit that was at the D input just before Clk changed from 0 to 1, hence implementing an edge-triggered storage block.

Figure 3.24 A D flip-flop implementing an edge-triggered bit storage block, internally using two D latches in a master-servant arrangement. The master D latch stores its Dm input while Clk = 0, but the new value appearing at Qm and hence at Ds does not get stored into the servant latch, because the servant latch is disabled when Clk = 0. When Clk becomes 1, the servant D latch becomes enabled and thus gets loaded with whatever value was in the master latch just before Clk changed from 0 to 1.

The edge-triggered block using two internal latches thus prevents the stored bit from propagating through more than one latch when the clock is 1. Consider the chain of flip-flops in Figure 3.25, which is similar to the chain in Figure 3.21 but with D flip-flops in place of D latches. We know that Y will propagate through exactly one flip-flop on each clock cycle.

Figure 3.25 Using D flip-flops, we now know through how many flip-flops Y will propagate for Clk_A and for Clk_B: one flip-flop exactly per pulse, for either clock signal.

The drawback of a master-servant approach is that we now need two D latches to store one bit. So Figure 3.25 shows four flip-flops, but there are two latches inside each flip-flop, for a total of eight latches.

Note: The common name is actually "master-slave." Some choose instead to use the term "servant" due to some people finding the term "slave" offensive. Others use the terms "primary-secondary."

There are many alternative methods other than the master-servant method for designing an edge-triggered flip-flop. In fact, there are hundreds of different designs for latches and flip-flops beyond the designs we showed above, with those designs differing in terms of their size, speed, power, etc. When we use an edge-triggered flip-flop, we usually don't worry about whether the flip-flop achieves edge-triggering using the master-servant method or using some other method. We need only know that the flip-flop is edge-triggered, meaning the data value present when the clock edge is rising is the value that gets loaded into the flip-flop, and that appears at the flip-flop's output some time later.

We've actually been describing what's known as positive or rising edge-triggered flip-flops, which are triggered by the clock signal going from 0 to 1. There are also flip-flops known as negative or falling edge-triggered flip-flops, which are triggered by the signal going from 1 to 0. We can build a negative edge-triggered D flip-flop using a master-servant design where the second latch's clock input is inverted, rather than the first latch's.

Positive edge-triggered flip-flops are drawn using a small triangle at the clock input, and negative edge-triggered flip-flops are drawn using a small triangle along with an inversion bubble, as shown in Figure 3.26.

Figure 3.26 Positive (shown on the left) and negative (right) edge-triggered D flip-flops. The sideways triangle input represents an edge-triggered clock input.

Bear in mind that although our master-servant design doesn't change the output until the falling clock edge, the flip-flop is still positive edge-triggered, because the flip-flop stores the value that was at the D input at the instant that the clock edge was rising.

Latches versus Flip-Flops: Various textbooks define the terms latch and flip-flop differently. We'll use what seems to be the most common convention among designers, namely:

• A latch is level-sensitive, and
• A flip-flop is edge-triggered.

So saying "edge-triggered flip-flop" is redundant, since flip-flops are by definition edge-triggered. Likewise, saying "level-sensitive latch" is redundant, since latches are by definition level-sensitive.

Figure 3.27 uses an example timing diagram to illustrate the difference between level-sensitive and edge-triggered bit storage blocks. The figure provides an example of a clock signal and a value on a signal D. The next signal trace is for the Q output of a D latch, which as we know is level-sensitive. The latch ignores the first pulse on D (labeled as 3 in the figure) because Clk is low. However, when Clk becomes high (1), the latch output follows the D input, so when D changes from 0 to 1 (4), so does the latch output (7). The latch ignores the next changes on D when Clk is low (5), but then follows D again when Clk is high (6, 8).

Figure 3.27 Latch versus flip-flop timing.

Compare this with the next signal trace, showing the behavior of a rising-edge-triggered D flip-flop. The flip-flop samples D at the first rising clock edge (1), finding D to be 0. The flip-flop thus stores and outputs a 0 (9). The flip-flop samples D at the next rising clock edge (2), finding D to be 1, and thus stores and outputs a 1 (10). Notice that the flip-flop ignores all changes to D that occur between the rising clock edges (3, 4, 5, 6), even ignoring changes on D when the clock is high (4, 6).
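The contrast between a level-sensitive latch and an edge-triggered flip-flop can be sketched by building the flip-flop from two latch models, in the master-servant style described earlier. This Python is a simplified illustration, not the book's circuit: time advances in discrete steps, gate delays are ignored, "just before the edge" is taken to mean the previous step, and the waveforms are invented (they are not the ones in Figure 3.27). Class names are hypothetical.

```python
class DLatch:
    """Level-sensitive: follows D while the enable is 1, holds otherwise."""

    def __init__(self):
        self.q = 0

    def update(self, d, c):
        if c:
            self.q = d
        return self.q


class MasterServantDFlipFlop:
    """Two D latches; the master is enabled on Clk = 0, the servant on Clk = 1."""

    def __init__(self):
        self.master = DLatch()
        self.servant = DLatch()

    def update(self, d, clk):
        qm = self.master.update(d, 1 - clk)    # inverter on the master's enable
        return self.servant.update(qm, clk)    # servant copies the master on Clk = 1


# Invented waveforms, one value per time step.
clk_wave = [0, 1, 1, 0, 0, 1, 1, 0]
d_wave   = [0, 0, 1, 1, 1, 0, 0, 0]

latch, flop = DLatch(), MasterServantDFlipFlop()
latch_q = [latch.update(d, c) for d, c in zip(d_wave, clk_wave)]
flop_q = [flop.update(d, c) for d, c in zip(d_wave, clk_wave)]
print("latch:", latch_q)
print("flop: ", flop_q)
```

In this run the latch output follows the change on D at step 2 because the clock is high, whereas the flip-flop output ignores it. At the rising edge at step 5 the flip-flop takes on the value D held just before the edge, even though D changes in that same step.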
Figure 3.29 Increasingly better bit storage blocks, leading to the D flip-flop.

SR latch
  Feature: S=1 sets Q to 1, R=1 resets Q to 0.
  Problem: SR=11 yields undefined Q.

Level-sensitive SR latch
  Feature: S and R only have effect when C=1. We can design the outside circuit so SR=11 never happens when C=1.
  Problem: Avoiding SR=11 can be a burden.

D latch
  Feature: SR can't be 11 if D is stable before and while C=1, and will be 11 for only a brief glitch even if D changes while C=1.
  Problem: C=1 too long propagates new values through too many latches; too short may not enable a store.

D flip-flop
  Feature: Only loads the D value present at the rising clock edge, so values can't propagate to other flip-flops during the same clock cycle.
  Tradeoff: Uses more gates internally than a D latch, and requires more external gates than SR, but gate count is less of an issue today.

EXAMPLE 3.1 Flight attendant call-button using a D flip-flop

Let's design our flight attendant call-button system using a D flip-flop. If Call is pressed, we want to store a 1. If Cancel is pressed, we want to store a 0. If neither is pressed, we want to store whatever is presently stored, meaning Q. We thus need a simple combinational circuit in front of the D input, described by the truth table in Table 3.1. If Call=0 and Cancel=0 (the first two rows), D equals Q's value. If Call=0 and Cancel=1 (the next two rows), D=0. If Call=1 and Cancel=0 (the next two rows), D=1. And if both Call=1 and Cancel=1 (the last two rows), we'll give priority to the Call button, so D=1.

TABLE 3.1 D truth table for call-button system.

Call  Cancel  Q  |  D
 0      0     0  |  0
 0      0     1  |  1
 0      1     0  |  0
 0      1     1  |  0
 1      0     0  |  1
 1      0     1  |  1
 1      1     0  |  1
 1      1     1  |  1

After some algebraic simplification, we obtain the following equation for D:

D = Cancel'Q + Call

The final system is shown in Figure 3.28.
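As a quick check of Example 3.1, the simplified equation D = Cancel'Q + Call can be compared against the stated behavior for all eight input combinations. The Python below is a sketch for that purpose only; the function names and the sample sequence of button presses are hypothetical, and the flip-flop is modeled simply as "Q takes the value of D at each rising clock edge."

```python
from itertools import product


def next_d(call, cancel, q):
    """Combinational logic in front of the flip-flop: D = Cancel'Q + Call."""
    return ((1 - cancel) & q) | call


def intended_d(call, cancel, q):
    """Behavior as stated in words: Call has priority, then Cancel, else hold Q."""
    if call:
        return 1
    if cancel:
        return 0
    return q


# Exhaustively compare the equation with the intended behavior.
table = [(call, cancel, q, next_d(call, cancel, q))
         for call, cancel, q in product((0, 1), repeat=3)]
equation_matches = all(d == intended_d(call, cancel, q)
                       for call, cancel, q, d in table)

# Simulate a few rising clock edges with made-up button values (Call, Cancel).
presses = [(0, 0), (1, 0), (0, 0), (0, 1), (1, 1)]
q, light = 0, []
for call, cancel in presses:
    q = next_d(call, cancel, q)    # the D flip-flop loads D on the rising edge
    light.append(q)

print(equation_matches, light)
```

The simulated sequence also shows why Q must be fed back into D: on the edges where neither button is pressed, the flip-flop still loads a value, and the feedback makes that value the one already stored.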
The D flip-flop-based design uses more gates than the SR latch-based design in Figure 3.9 (which could have just as easily used an SR flip-flop). One reason for the extra gates is that a D flip-flop always stores its D input on every clock cycle, so we must explicitly feed Q back into D to maintain the same value. In contrast, we could just set S=R=0 to maintain the same value with an SR flip-flop. Furthermore, we must convert the button presses to the appropriate D input value, requiring extra logic, rather than just setting either S or R to 1. In the late 1970s and early 1980s, those extra gates were a big deal, because ICs came with just a few gates on them, so extra gates often meant extra ICs, meaning extra size, cost, power, etc. But today, in the era of million-gate ICs, the savings of an SR flip-flop are trivial. In modern design, nearly all designs use D flip-flops, not SR flip-flops.

Figure 3.28 Flight attendant call-button system: (a) block diagram, and (b) implemented using a D flip-flop.

As a point of information, designers commonly refer to flip-flops simply as flops.

We went through several intermediate designs before arriving at our robust D flip-flop design for our desired bit storage block. Figure 3.29 summarizes those designs, including their features and their problems, leading to the robust edge-triggered D flip-flop. In looking over the summary, notice that the D flip-flop relies on an internal SR latch to maintain a stored bit between clock cycles, and relies on the designer to introduce feedback outside the D flip-flop to maintain a stored bit across clock cycles.

Basic Register: Storing Multiple Bits

A register is a sequential component that can store multiple bits. We can build a basic register simply by using multiple flip-flops, as shown in Figure 3.30. That register can hold 4 bits. When the clock rises, all 4 flip-flops get loaded with inputs I0, I1, I2, and I3 simultaneously.

Figure 3.30 A basic 4-bit register internal design (left) and block symbol (right).

This register, made simply from multiple flip-flops, is the most basic form of a register, so basic that some companies refer to such a register simply as a "4-bit D flip-flop." We'll describe more advanced registers, namely, registers with more features and operations, in Chapter 4.

EXAMPLE 3.2 Temperature history display using registers

We want to design a system that records the outside temperature every hour and displays the last three recorded temperatures, so that an observer can see the temperature trend. An architecture of the system is shown in Figure 3.31.

A timer generates a pulse on signal C every hour. A temperature sensor outputs the present temperature as a 5-bit binary number ranging from 0 to 31, corresponding to those temperatures in Celsius. Three displays convert their 5-bit binary inputs into a numerical display.
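One way to picture the storage such a system needs is a chain of three basic 5-bit registers that all load on the same rising edge of C. The Python below is a behavioral sketch under assumptions: the register names Ra, Rb, and Rc follow the labels used for this example's storage component, the helper names and the sample temperatures are invented, and a rising edge is modeled as "compute every register's next value from the values held before the edge, then commit them all at once."

```python
class Register:
    """Basic n-bit register: a group of flip-flops sharing one clock."""

    def __init__(self, width):
        self.width = width
        self.q = 0                              # assumed to start at 0

    def next_value(self, i):
        return i & ((1 << self.width) - 1)      # keep only `width` bits


def rising_edge(registers, inputs):
    """Load all registers simultaneously from their pre-edge inputs."""
    pending = [reg.next_value(i) for reg, i in zip(registers, inputs)]
    for reg, value in zip(registers, pending):
        reg.q = value


ra, rb, rc = Register(5), Register(5), Register(5)
history = []
for temperature in [18, 21, 24, 25]:            # made-up hourly readings
    # Ra takes the sensor value, Rb takes Ra's old value, Rc takes Rb's old value.
    rising_edge([ra, rb, rc], [temperature, ra.q, rb.q])
    history.append((ra.q, rb.q, rc.q))

print(history)
```

Because the inputs are read before any register is updated, each reading moves exactly one register further along per pulse, which is the behavior edge-triggered flip-flops are meant to guarantee.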

Figure 3.31 Temperature history display system. (In practice, we would actually avoid connecting the timer output C to a clock input, instead only connecting an oscillator output to a clock input.)

We can implement the TemperatureHistoryStorage component using three 5-bit registers, as shown in Figure 3.32. Each pulse on signal C loads Ra with the present temperature on inputs x4..x0 (by loading the 5 flip-flops inside Ra with the 5 input bits). At the same time that register Ra gets loaded with that present temperature, register Rb gets loaded with the value that was in Ra. Likewise, Rc gets loaded with Rb's value. All three loads happen at the same time, namely, on the rising edge of C. The effect is that the values that were in Ra and Rb just before the clock edge get shifted into Rb and Rc, respectively.

Figure 3.32 Internal design of the TemperatureHistoryStorage component.

Figure 3.33 shows sample values in the registers for several clock cycles, assuming all the registers initially held 0s, and assuming that as time proceeds the inputs x4..x0 have the values shown at the top of the timing diagram.

Figure 3.33 Example of values in the TemperatureHistoryStorage registers. One particular data item, 18, is shown moving through the registers on each clock cycle.

This example demonstrates one of the great things about synchronous circuits built from edge-triggered flip-flops: many things happen at once, yet we need not be concerned about signals propagating too fast through a register to another register. The reason we need not be concerned is because registers only get loaded on the rising clock edge, which effectively is an infinitely small period of time, so by the time signals propagate through a register to a second register, it's too late; that second register is no longer paying attention to its data inputs.

We should mention that, in practice, designers typically try to avoid connecting any signal other than an oscillator output to the clock input of a flip-flop or register. So in practice, we might try to avoid connecting the signal C to the registers' clock inputs, since C comes from a timer output, not an oscillator. We'll show in Chapter 4, Example 4.3, how to design a similar system using an oscillator for the clock.

3.3 FINITE-STATE MACHINES (FSMS) AND CONTROLLERS

Registers store bits in a digital circuit. Stored bits means the circuit has memory, also known as state, resulting in what are known as sequential circuits. While a register storing bits happens to result in a circuit with state, we can actually use state to design circuits that have a specific behavior over time. For example, we can specifically design a circuit that outputs a 1 for exactly three cycles whenever a button is pressed. Or we could design a circuit that blinks lights in a specific pattern. Or we could design a circuit that detects if three buttons get pushed in a particular sequence and that then unlocks a door. In all these cases, we would be making use of state to create specific time-ordered behavior for our circuit. A sequential circuit that controls Boolean outputs based on Boolean inputs and a specific time-ordered behavior is often referred to as a controller.

EXAMPLE 3.3 Three-cycles-high laser timer: a poorly done first design

Consider the design of a part of a laser surgery system, such as a system for scar removal or corrective vision. Such systems work by turning on a laser for a precise amount of time (see "How does it work? - Laser surgery" on page 112). A general architecture of such a system is shown in Figure 3.34.

Figure 3.34 Laser timer system.

A surgeon activates the laser by pressing the button. Assume the laser should then stay on for exactly 30 ns. Assuming our clock's period is 10 ns, 30 ns means 3 clock cycles. (Assume that b is synchronized with the clock and stays high for only 1 clock cycle.) We need to design a controller component that, once detecting that b=1, holds x high for exactly 3 clock cycles, thus turning on the laser for exactly 30 ns.

This is one example for which a software solution may not work. Using just regular programming statements reading input ports and writing output ports, we may not have a way to hold an output port high for exactly 30 ns, for example, when the microprocessor clock frequency is not fast enough, or when each statement takes 2 cycles to execute.
Let's try to create a sequential circuit implementation for the system. After thinking about the problem for a while, we might come up with the (not so good) implementation in Figure 3.35.

Figure 3.35 First (bad) attempt to implement the laser surgery system.

Knowing we need to hold the output high for three clock cycles, we used three flip-flops, with the idea being that we'll shift a 1 through those three flip-flops, taking three clock cycles for the bit to move through all three flip-flops. We ORed the flip-flop outputs to generate signal x, so that if any flip-flop contains a 1, the laser will be on. We made b the input to the first flip-flop, so when b=1, the first flip-flop stores a 1 on the next clock cycle. One clock cycle later, the second flip-flop will get loaded with 1, and assuming b has now returned to 0, the first flip-flop will get loaded with 0. One clock cycle later, the third flip-flop will get loaded with 1, and the second flip-flop with 0. One clock cycle later, the third flip-flop will get loaded with 0. Thus, the circuit held the output x at 1 for three clock cycles after the button was pressed.

We did not do a very good job implementing this system. First of all, what happens if the surgeon presses the button a second time before the three cycles are completed? Such a situation could cause the laser to stay on too long. Is there a simple way to fix our circuit to account for that behavior? Second, we didn't use any orderly method for designing the circuit. We came up with the ORing of flip-flop outputs, but how did we come up with that? Will that method work for all time-ordered behavior that we might want to design?

Note: The previous example illustrated the need for a way of describing the desired behavior of a sequential circuit.

We need two things to do a better job at designing circuits having time-ordered behavior. First, we need a way to explicitly represent the desired time-ordered behavior; we'll introduce the finite-state machine representation for this purpose. Second, we need an orderly method for implementing such behavior as a sequential circuit; we'll introduce such a standard method.

Finite-State Machines (FSMs)

In the previous chapter, you saw that we could design a combinational circuit by first describing the desired circuit behavior using a mathematical formalism known as a Boolean equation, and then converting the equation to a circuit. For a sequential circuit, a Boolean equation alone is not sufficient to describe behavior; we need a more powerful mathematical formalism that incorporates time.

Finite-state machines (FSMs) are just such a method. The name is a bit awkward, but the concept is straightforward. An FSM consists of several things, the most important of which is a set of states representing every possible state, or mode, of a system.

I like to use my daughter's hamster as an intuitive example. After having a hamster as a family pet, I've learned that hamsters basically have four states: Sleeping, Eating, RunningOnTheWheel, and TryingToEscape. They spend most of their day sleeping (being nocturnal), a bit of time eating or running on the wheel, and the rest of their time desperately trying to escape from their cage.

As a more electronics-oriented example, let's design a system that repeatedly sets an output x to 0 for one clock cycle and to 1 for one clock cycle. The system clearly has only two states, which we'll call Off and On. In state Off, x = 0; in state On, x = 1. We can show those states, and the transitions between them, using the state diagram in Figure 3.36.
~ HOW DOES IT WORK?-LASER SURGERY. slate~
Outputs: I I , I

Laser surgery has become very popular in the pasl


decade, and has been enabled due 10 digilal syslems.
Lasers. invented in Ihe early I960s, generale an
vaporized. The laser can also be used 10 vaporize skin
ceUs !hat fonn bumps on Ihe skin. due 10 scars or moles.
Similarly. lasers can reduce wrinkles by smoothing ille
X --r--1---J!
intense narrow beam of coherenl light with pholOns Figure 3.36 A simple slale diagram (len) and Ihe timing diagram de cribing the state diagram's
skin around the wrinkle to make the crevices more
having a single wavelength and being in phase (like behavior (ri ght). Above the timing diagram. we see the FSM going from one Sl'ate 10 the other in
gradual and hence less obvious, or by stimulating lissue
being in rhythm) wilh one another. [n contraS!, a each clock cycle. "e 1 k A" represenls Ihe rising edge of the clock signal.
under Ihe skin 10 slimulale new collagen growth.
regular light's pholons fly OUI in all directions. with a
Another popular use of lasers for surgery is for
diversily of wavelengths. Think of a laser as a plaloon
cOlTecling vision. [n one popular laser eye surgery Assume we Slarl in Slale Off. The diagram shows thai x is set 10 0 while the y lem
of soldiers marching in synch, while a regular lighl is
method, the surgeon CUIS open a fl ap on the surface of is in Slale Off. The diagram also shows thai on Ihe neXI rising edge of the clock signal .
more like kids running oul of school althe end-of-the-
Ihe comea_ and Ihe la er Ihen reshapes the cornea by C/kA, the syslem Iransilions 10 Slale 011, and the diagram shows thul i el 10 I in Ibal
day belL A laser's lighl can be so inlense as 10 even CUI
thlOnlOg Ihe cornea in a panicular pallem, with such Slale. On the next rising edge of the clock, [he diagram shows lhal the y "lem tran i-
steel. The ability of a digilal circuilto carefully control
IhlOmng accomplished Ihrough vaporizing cells.
the location, intensilY. and duralion of the laser is whal lions 10 slale Off again . A l.iming diagram showing the sy lem' beha,~or i hown in
A digilal syslem conlrols the laser's localion, energy, Figure 3.36.
makes lasers so useful for surgery.
and dural ion, based on programmed informalion of the
One popular use of laser for surgery is for Scar
removal. The laser is focused on the damaged cells d~Slred procedure. The availabilily of lasers, combined Recall in Example 3.3 thai we wan led a syslem Ihal held ils OUrpUI high for three
wuh low-coSi high-speed digilal circui ts. makes such cycles. Toward that end. lel's extend the simple Sime diagram of Figure 3.36 I ha\e on
sljghlly below the surface, causing Ihose cells 10 be
precise and useful surgery now possible. off Siale and three on slales, as shown in Figure 3.37. The OUIPUI will be 0 for one C) -Ie.'
and Ihen 1 for Ihree cycles. as shown in the liming diagmm of the figure.
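The two state diagrams described above can be written down as small lookup tables and stepped one clock cycle at a time. The following is only an illustrative sketch in Python: the dictionary layout and the names TWO_STATE, THREE_HIGH, and run are assumptions made for this example and do not come from the book.

```python
# Hypothetical encoding of the state diagrams in Figures 3.36 and 3.37.
# Each state maps to (value of output x, next state on a rising clock edge).
TWO_STATE = {"Off": (0, "On"), "On": (1, "Off")}
THREE_HIGH = {
    "Off": (0, "On1"),
    "On1": (1, "On2"),
    "On2": (1, "On3"),
    "On3": (1, "Off"),
}


def run(fsm, start, cycles):
    """Return the (state, x) pair observed during each clock cycle."""
    trace = []
    state = start
    for _ in range(cycles):
        x, next_state = fsm[state]
        trace.append((state, x))
        state = next_state  # the transition is taken on the rising clock edge
    return trace


two_state_x = [x for _, x in run(TWO_STATE, "Off", 4)]
three_high_x = [x for _, x in run(THREE_HIGH, "Off", 8)]
print(two_state_x)   # x alternates between 0 and 1
print(three_high_x)  # x is 0 for one cycle, then 1 for three cycles, repeating
```

Listing the x values per cycle plays the same role as the timing diagrams in the figures: each list entry corresponds to one clock cycle.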

Figure 3.37 Three-cycles-high system: state diagram (left), timing diagram (right).

We can introduce input conditions on the transitions to further extend the behavior. We extend the state diagram in Figure 3.38 by changing the condition on the transition from state Off to state On1, such that the new condition requires not just a rising clock, but also that b=1. We also add a transition from Off back to Off, with the condition of a rising clock and b=0. The timing diagram in the figure shows the state and output behavior for the given input values on b.

Figure 3.38 Three-cycles-high system: state diagram (left), timing diagram (right).

From the above examples, we can see that a finite-state machine, or FSM, is a mathematical formalism consisting of several things:

• A set of states. Our example had four states: {On1, On2, On3, Off}.
• A set of inputs, and a set of outputs. Our example had one input: {b}, and one output: {x}.
• An initial state, namely, a state to start in when we power up the system. An FSM's initial state can be indicated graphically by a single directed edge, with no source state, that points to the initial state. An FSM can only have one initial state. Our example's initial state was Off.
• A description of the next state to go to based on the current state and the values of the inputs. Our example used directed edges with associated input conditions to tell us the next state. Those edges with conditions are known as transitions.
• A description of what output values to generate in each state. Our example assigned a value to x in every state. Assigning an output in an FSM is known as an action.

We used a graphical representation of an FSM, known as a state diagram, to show the FSM for our example. We could have represented the FSM textually instead, but state diagrams are very popular for visualizing an FSM's behavior.

EXAMPLE 3.4 FSM for the three-cycles-high laser timer

We can create an FSM to describe the earlier introduced laser timer system. The system might have four states: Off, On1, On2, and On3. In the Off state, the laser should be off (x=0). The On1 state would be the first clock cycle the laser is on (x=1), On2 the second cycle, and On3 the third cycle. The state diagram of the FSM is in fact identical to that shown in Figure 3.38.

Here's how the FSM should be interpreted. We start in our initial state Off. We stay in state Off until one of its two outgoing transitions has a true condition. One of those transitions has the condition of b' AND rising clock (b'*clk^); in that case, we transition right back to state Off. The other of those transitions has the condition of b AND a rising clock (b*clk^); in that case, we transition to state On1. We stay in state On1 until its one outgoing transition's condition, a rising clock, becomes true, in which case we transition to state On2. Likewise, we stay in On2 until the next rising clock, transitioning to On3. We stay in On3 until the next rising clock, causing a transition back to state Off. In state Off, we have associated the action of setting x=0, while in states On1, On2, and On3, we have associated the action of setting x=1.

Thus, we have precisely described the desired time-ordered behavior of the laser timer system using an FSM.

It's interesting to examine the behavior of this FSM if the button is pressed a second time while the laser is on. Notice that the transitions among the On states are independent of the value of b. So this system will always turn the laser on for exactly three cycles and then return to the Off state to await another press of the button.

Simplifying FSM Notation: Making the Rising Clock Implicit

Thus far, we have included the rising clock edge (clk^) as part of the condition of every FSM transition. We included that edge because we are only considering the design of sequential circuits that are synchronous and that use rising edge-triggered flip-flops to store bits. Synchronous sequential circuits with edge-triggered flip-flops make up the vast majority of sequential circuits in modern design practice. As such, most textbooks and designers, to make their state diagrams more readable, follow the convention that every transition in an FSM is implicitly ANDed with a rising clock edge. For example, a transition labeled "a'" actually means "a'*clk^." Henceforth, we will not include the rising clock edge when drawing FSM transitions, and we will follow the convention that every transition is implicitly ANDed with a rising clock edge.

Figure 3.39 Laser timer state diagram, assuming every transition is ANDed with a rising clock.

"STATE" I UNDERSTAND, BUT WHY THE TERMS "FINITE" AND "MACHINE"?

Finite-state machines, or FSMs, have a rather awkward name that sometimes causes confusion. The term "finite" is there to contrast FSMs with a similar representation used in mathematics that can have an infinite number of states; that representation is not very useful in digital design. FSMs, in contrast, have a limited, or finite, number of states. The term "machine" is used in its mathematical or computer science sense, being a conceptual object that can execute an abstract language; specifically, that sense of machine is not hardware. Finite-state machines are also known as finite-state automata. FSMs are used for many things other than just digital design.
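The parts of an FSM listed in this section (states, inputs and outputs, initial state, transitions, and actions) map naturally onto a small data structure. The sketch below writes the laser timer FSM of Example 3.4 in that form and checks the claim that a second button press while the laser is on is ignored. It is a rough illustration only: the dictionary keys and the simulate function are hypothetical names chosen for this example, and the rising clock edge is modeled as one loop iteration.

```python
# Hypothetical textual representation of the laser timer FSM (Example 3.4).
LASER_TIMER = {
    "states": {"Off", "On1", "On2", "On3"},
    "inputs": {"b"},
    "outputs": {"x"},
    "initial": "Off",
    # Transitions: next state as a function of the current state and input b.
    "next": {
        "Off": lambda b: "On1" if b else "Off",
        "On1": lambda b: "On2",  # independent of b
        "On2": lambda b: "On3",  # independent of b
        "On3": lambda b: "Off",  # independent of b
    },
    # Actions: the value assigned to output x in each state.
    "action": {"Off": 0, "On1": 1, "On2": 1, "On3": 1},
}


def simulate(fsm, b_values):
    """Return the value of x in each clock cycle for the given values of b."""
    state = fsm["initial"]
    xs = []
    for b in b_values:
        xs.append(fsm["action"][state])
        state = fsm["next"][state](b)  # taken on the rising clock edge
    return xs


# One press in cycle 1, versus a second press in cycle 3 while the laser is on.
single_press = simulate(LASER_TIMER, [0, 1, 0, 0, 0, 0, 0])
double_press = simulate(LASER_TIMER, [0, 1, 0, 1, 0, 0, 0])
print(single_press)
print(double_press)
```

Both runs produce the same output sequence, because the transitions among the On states do not depend on b.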
Figure 3.39 illustrates the laser timer state diagram from Figure 3.38, redrawn with the rising clock edge implicit on every transition. A transition with no associated condition simply transitions on the next clock cycle, because of the implicit rising clock edge.

Let's consider a few more examples showing how to describe time-ordered behavior using FSMs.

EXAMPLE 3.5 Secure car key

Have you noticed that the keys for many new automobiles have a thicker plastic head than in the past (see Figure 3.40)? The reason is that, believe it or not, there is a computer chip inside the head of the key, implementing a secure car key. In a basic version of such a secure car key, when the driver turns the key in the ignition, the car's computer (which is under the hood and communicates using what's called the basestation) sends out a radio signal asking the car key's chip to respond by sending an identifier via a radio signal. The chip in the key then responds by sending the identifier (ID) using what's known as a transponder (a transponder "transmits" in "response" to a request). If the basestation does not receive a response, or the key's response has an ID different than the ID programmed into the car's computer, the computer shuts down and the car won't start.

Figure 3.40 Why are the heads of keys getting thicker? Note that the key on the right is thicker than the key on the left. The key on the right has a computer chip inside that sends an identifier to the car's computer, thus helping to reduce car thefts.

Let's design the controller for such a key having an ID of 1011 (real IDs are typically 32 bits long or more, not just 4 bits). Assume the controller has an input a that is 1 when the car's computer requests the key's ID. Thus the controller initially waits for the input a to become 1. The key should then send its ID (1011) serially, starting with the rightmost bit, on an output r: the key sends 1 on the first clock cycle, 1 on the second cycle, 0 on the third cycle, and finally 1 on the fourth cycle. The FSM for the controller is shown in Figure 3.41. Note that the FSM sends the bits starting from the bit on the right, which is known as the least significant bit (LSB).

Figure 3.41 Secure car key FSM. Inputs: a; Outputs: r.

Figure 3.42 provides a timing diagram for the FSM for a particular situation. When we set a=1, the FSM enters state K1 and outputs r=1. The FSM then proceeds through K2, K3, and K4, outputting r=1, 0, and 1, respectively, even though we returned input a to 0.

Figure 3.42 Secure car key timing diagram.

Timing diagrams represent a particular situation defined by how we set the inputs. What would have happened if we had held a=1 for many more clock cycles? The timing diagram in Figure 3.43 illustrates that situation. Notice how the FSM, after returning to state Wait, proceeds to state K1 again on the next cycle.

Figure 3.43 Secure car key timing diagram for a different sequence of values on input a.

The computer chip in the car key has circuitry that converts radio signals to bits and vice versa.

"So my car key may someday need its batteries replaced?" you might ask. Actually, no. Those chips in keys draw their power as well as their clock from the magnetic component of the radio-frequency field generated from the computer basestation. The extremely low power requirement makes custom digital circuitry, rather than software on a microprocessor, the preferred implementation method.

Computer chip keys make stealing cars a lot harder: no more "hot-wiring" to start a car, since the car's computer won't work unless it also receives the correct identifier. And the method above is actually an overly simplistic method; many cars have more sophisticated communication between the computer and the key, involving several communications in both directions, even using encrypted communication, making fooling the car's computer even harder. A drawback of secure car keys is that you can't just run down to the local hardware store and copy those keys for $5 any longer; copying keys requires special tools that today can run $50-$100. A common problem while computer chip keys were becoming popular was that low-cost locksmiths didn't realize the keys had chips in them, so copies were made and the car owners went home and later couldn't figure out why their car wouldn't start, even though the key fit in the ignition slot and turned.

EXAMPLE 3.6 Code detector

You've probably seen doors in airports or hospitals that require a person to press a particular sequence of buttons (i.e., a code) to unlock the door. For example, there might be three buttons, colored red, green, and blue, and a fourth button for starting the code. Pressing the start button, then the following button sequence (red, blue, green, red) unlocks the door, while any other sequence does not unlock the door. Such a system may have the general architecture shown in Figure 3.44. An extra output from the buttons component, a, is 1 whenever any button is pressed.

Figure 3.44 Code detector architecture.

We can describe the behavior of the CodeDetector block using an FSM, captured as the state diagram shown in Figure 3.45.

For simplicity, assume that the buttons each have a special circuit that synchronizes the button with the clock signal, and creates a pulse exactly one clock cycle wide for each unique press of the
button. This is necessary to ensure that the current state doesn't inadvertently change to another state if a button press lasts longer than a single clock cycle. (We'll design such a synchronization circuit in Example 3.9.)

Figure 3.45 Code detector FSM. Inputs: s, r, g, b, a; Outputs: u.

The behavior of the FSM is as follows:

• The FSM begins in the Wait state. As long as the start button is not pressed (s'), the FSM stays in Wait; when the start button is pressed (s), the FSM goes to the Start state.
• Being in the Start state means the FSM is now ready to detect the sequence red, blue, green, red. If no button is pressed (a'), the FSM stays in Start. If a button is pressed AND that button is the red button (ar), the FSM goes to state Red1. If a button is pressed AND that button is not the red button (ar'), the FSM returns to the Wait state. Note that when in the Wait state, further presses of the colored buttons would be ignored, until the start button is pressed again.
• The FSM stays in state Red1 as long as no button is pressed (a'). If a button is pressed and that button is blue (ab), the FSM goes to state Blue. If that button is not blue (ab'), the FSM returns to state Wait.
• Likewise, the FSM stays in state Blue as long as no button is pressed (a'), and goes to state Green on condition ag, and state Wait on condition ag'.
• Finally, the FSM stays in Green if no button is pressed, and goes to state Red2 on condition ar, and to state Wait on condition ar'.
• If the FSM makes it to state Red2, that means that the user pressed the buttons in the correct sequence; thus, Red2 sets u=1, unlocking the door. Note that all other states set u=0. The FSM then returns to state Wait.

Recall that every transition's condition is implicitly ANDed with a rising clock edge.

Checking FSM Behavior

Correctly defining the behavior of a system is hard. The earlier we find problems, the easier they are to fix. So after we create the FSM, we might take time to ask questions about how the system behaves under certain input situations and then verify that the FSM responds as we expect. Consider the code detector FSM in Figure 3.45. What happens if the user presses the start button and then presses all three colored buttons simultaneously, four times in a row? Well, the way we defined the FSM, the door would unlock! A solution to this undesired situation is to modify the conditions on the arcs that go back to the Wait state. Rather than the condition ar', we could use the condition a(r'+b+g). Thus, when the FSM expects the red button, then not pressing the red button, or pressing the blue or green button, causes a transition back to the Wait state, and so does not unlock the door. Likewise when we are expecting other specific buttons. An improved FSM is shown in Figure 3.46. Fixing the FSM was easy; trying to fix a circuit derived from the FSM would have been much harder.

It turns out that the FSM in Figure 3.46 still has a problem, a fairly serious one. We'll describe that problem in Example 3.13.

Standard Controller Architecture for Implementing an FSM as a Sequential Circuit

Now that we've seen how to describe sequential behavior using an FSM, we need a structured method to convert the FSM to a sequential circuit. The method is actually very straightforward when we use a standard implementation architecture for the circuit, consisting of a state register and combinational logic, together known as a controller. There are many other ways to implement an FSM, but sticking to the standard architecture results in a straightforward design method. The standard architecture may not yield the minimum number of transistors, but as we've mentioned many times, that's not a drawback these days.

The standard architecture for the laser timer FSM of Figure 3.39, consisting of a state register and combinational logic, is shown in Figure 3.47. The state register is a 2-bit register that holds a binary number representing the present state (in this case, the register is 2 bits wide to represent each of the 4 possible states). The combinational logic's inputs are the inputs of the FSM (in this case, b), as well as the state register's outputs (s1 and s0). The combinational logic's outputs are the outputs of the FSM (x), as well as the next state bits to be loaded into the state register (n1 and n0). The details of the combinational logic determine the behavior of the circuit. The process for creating those details will be covered in the next section.

Figure 3.47 Standard controller architecture for the laser timer.

A more general view of the standard controller architecture appears in Figure 3.48. That figure assumes a state register that is m bits wide.

Figure 3.48 Standard controller architecture: general view.
3.4 CONTROLLER DESIGN

Five-Step Controller Design Process

We can design a controller using a five-step process summarized in Table 3.2. We'll illustrate this process with some examples.

TABLE 3.2 Controller design process.

Step | Description
Step 1: Capture the FSM | Create an FSM that describes the desired behavior of the controller.
Step 2: Create the architecture | Create the standard architecture by using a state register of appropriate width, and combinational logic with inputs being the state register bits and the FSM inputs and outputs being the next state bits and the FSM outputs.
Step 3: Encode the states | Assign a unique binary number to each state. Each binary number representing a state is known as an encoding. Any encoding will do as long as each state has a unique encoding.
Step 4: Create the state table | Create a truth table for the combinational logic such that the logic will generate the correct FSM outputs and next state signals. Ordering the inputs with state bits first makes this truth table describe the state behavior, so the table is a state table.
Step 5: Implement the combinational logic | Implement the combinational logic using any method.

EXAMPLE 3.7 Three-cycles-high laser timer controller (continued)

We can implement the laser timer (see Example 3.4) as a sequential circuit using the five-step process.

Step 1: Capture the FSM. The FSM was created earlier (see Figure 3.39).

Step 2: Create the architecture. The standard controller architecture for the laser timer FSM was shown in Figure 3.47. The state register has two bits to represent each of the four states. The combinational logic has external input b and inputs s1 and s0 coming from the state register, and has external output x and outputs n1 and n0 going to the state register.

Step 3: Encode the states. We can encode the states as follows: Off: 00, On1: 01, On2: 10, On3: 11. Remember, any nonredundant encoding is fine. The state diagram with encoded states is shown in Figure 3.49.

Figure 3.49 Laser timer state diagram with encoded states.

Step 4: Create the state table. Given the implementation architecture and the binary encoding of each state, we can create the state table for the combinational logic, as shown in Table 3.3. Listing the inputs from the state register first in the input columns allows us to easily see which rows correspond to which states. We fill all combinations of inputs on the left, as usual for a truth table. For each row, we look at the state diagram in Figure 3.49 to determine the appropriate outputs. For the two rows starting with s1s0=00 (state Off), x should be 0. If b=0, the controller should stay in state Off, so n1n0 should be 00. If b=1, the controller should go to state On1, so n1n0 should be 01. Likewise, for the two rows starting with s1s0=01 (state On1), x should be 1 and the next state should be On2 (regardless of the value of b), so n1n0 should be 10. We complete the last four rows similarly.

Be careful to note the difference between the FSM inputs and outputs of Figure 3.49, and the combinational logic inputs and outputs of Figure 3.50; the latter includes the bits from and to the state register.

TABLE 3.3 State table for laser timer controller.

        Inputs       Outputs
State   s1 s0 b      x  n1 n0
Off     0  0  0      0  0  0
Off     0  0  1      0  0  1
On1     0  1  0      1  1  0
On1     0  1  1      1  1  0
On2     1  0  0      1  1  1
On2     1  0  1      1  1  1
On3     1  1  0      1  0  0
On3     1  1  1      1  0  0

Step 5: Implement the combinational logic. We can finish the design by using the combinational logic design process from Chapter 2. From the truth table, we obtain the following equations for the three combinational logic outputs:

x = s1 + s0 (note from the table that x=1 if s1=1 or s0=1)
n1 = s1's0b' + s1's0b + s1s0'b' + s1s0'b
n1 = s1's0 + s1s0'
n0 = s1's0'b + s1s0'b' + s1s0'b
n0 = s1's0'b + s1s0'

We then obtain the sequential circuit in Figure 3.50, implementing the FSM.

Figure 3.50 Final implementation of the three-cycles-high laser timer controller.

Many textbooks will organize the state table in different ways than that in Table 3.3. However, we intentionally organize the table so that it serves both as a state table and a truth table that can be used to design the combinational logic of the controller.
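As a sanity check on Steps 4 and 5, the simplified equations can be compared against every row of the state table, and the controller can then be stepped through a few clock cycles. The sketch below does this in Python; the names STATE_TABLE, logic, and run_controller are chosen for this illustration, the complement v' is written as (1 - v), and loading the state register on a rising clock edge is modeled as an assignment at the end of each loop iteration.

```python
from itertools import product

# Table 3.3: (s1, s0, b) -> (x, n1, n0)
STATE_TABLE = {
    (0, 0, 0): (0, 0, 0), (0, 0, 1): (0, 0, 1),  # Off
    (0, 1, 0): (1, 1, 0), (0, 1, 1): (1, 1, 0),  # On1
    (1, 0, 0): (1, 1, 1), (1, 0, 1): (1, 1, 1),  # On2
    (1, 1, 0): (1, 0, 0), (1, 1, 1): (1, 0, 0),  # On3
}


def logic(s1, s0, b):
    """Combinational logic using the simplified equations from Step 5."""
    x = s1 | s0                                       # x  = s1 + s0
    n1 = ((1 - s1) & s0) | (s1 & (1 - s0))            # n1 = s1's0 + s1s0'
    n0 = ((1 - s1) & (1 - s0) & b) | (s1 & (1 - s0))  # n0 = s1's0'b + s1s0'
    return x, n1, n0


# Compare the equations with the table for all eight input combinations.
equations_match = all(
    logic(*row) == STATE_TABLE[row] for row in product((0, 1), repeat=3)
)


def run_controller(b_values):
    """Return x per cycle, starting with the state register holding 00 (Off)."""
    s1, s0 = 0, 0
    xs = []
    for b in b_values:
        x, n1, n0 = logic(s1, s0, b)
        xs.append(x)
        s1, s0 = n1, n0  # rising clock edge loads n1n0 into the state register
    return xs


x_trace = run_controller([0, 1, 0, 0, 0, 0])
print(equations_match)
print(x_trace)
```

With b pulsed high for one cycle, x goes high in the following cycle and stays high for three cycles, matching the behavior captured by the FSM.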
EXAMPLE 3.8 Understanding the laser timer controller's behavior

To better understand how a controller implements an FSM, let's trace through the behavior of the three-cycles-high laser timer controller. Assume we are initially in state 00 (s1s0=00), b is 0, and the clock is currently low. As shown in Figure 3.51 (left side), based on the combinational logic, x will be 0 (the desired output in state 00), n1 will be 0, and n0 will be 0, meaning the value 00 will be waiting at the state register's inputs. Thus, on the next clock edge, 00 will be loaded into the state register, meaning we stay in state 00, which is correct.

Figure 3.51 Tracing the behavior of the three-cycles-high laser timer controller.

Now suppose b becomes 1. As shown in Figure 3.51 (middle), x will still be 0, as desired. n1 will be 0, but n0 will be 1, meaning the value 01 will be waiting at the state register's inputs. Thus, on the next clock edge, 01 will be loaded into the state register, as desired.

As shown in Figure 3.51 (right side), soon after 01 is loaded into the state register, x will become 1 (after the register is loaded, there's a slight delay as the new values for s1 and s0 propagate through the combinational logic gates). That output is correct: we should output x=1 when in state 01. Also, n1 will become 1 and n0 will equal 0, meaning the value 10 will be waiting at the state register inputs. Thus, on the next clock edge, 10 will be loaded into the state register, as desired.

After 10 is loaded into the state register, x will stay 1, and n1n0 will become 11. When another clock edge comes, 11 will be loaded into the register, x will stay 1, and n1n0 will become 00.

When another clock edge comes, 00 will be loaded into the register. Soon after, x will become 0, and if b is 0, n1n0 will stay 00; if b is 1, n1n0 will become 01. Notice we're back where we started.

Understanding how a state register and combinational logic implement a state machine can take a while, since in a particular state (indicated by the value presently in the state register), we generate the external output for that state, and we generate the signals for the next state, but we don't transition to that next state (i.e., we don't load the state register) until the next clock edge.

EXAMPLE 3.9 Button press synchronizer

We want to build a circuit that synchronizes a button press to a clock signal, such that when a user presses the button, the result is a signal that is high for exactly one clock cycle. Such a synchronized signal is useful to prevent a single button press that lasts multiple cycles from being interpreted as multiple button presses. Figure 3.52 uses a timing diagram to illustrate the desired circuit behavior.

Figure 3.52 Desired timing diagram of the button press synchronizer.

The circuit's input will be a signal bi, and the output a signal bo. When bi becomes 1, representing the button being pressed, we want to set bo to 1 for exactly one cycle. We then wait for bi to return to 0 again, and then wait for bi to become 1 again, which would represent the next pressing of the button.

Step 1: Capture the FSM. Figure 3.53(a) shows an FSM describing the circuit's behavior. The FSM waits in state A, outputting bo=0, until bi is 1. The FSM then transitions to state B, outputting bo=1. The FSM will then transition to either state A or C, which both set bo=0 again, so that bo was 1 for just one cycle, as desired. The FSM goes from B to A if bi returned to 0. If bi is still 1, the FSM goes to state C, where the FSM waits for bi to return to 0, causing a transition back to state A.

Figure 3.53 Button press synchronizer design steps: (a) initial FSM, (b) architecture, (c) FSM with encoded states, (d) state table, (e) final circuit with implemented combinational logic. FSM inputs: bi; FSM outputs: bo.

State table of Figure 3.53(d):

         Inputs        Outputs
State    s1 s0 bi      n1 n0 bo
A        0  0  0       0  0  0
A        0  0  1       0  1  0
B        0  1  0       0  0  1
B        0  1  1       1  0  1
C        1  0  0       0  0  0
C        1  0  1       1  0  0
unused   1  1  0       0  0  0
unused   1  1  1       0  0  0

Equations of Figure 3.53(e):

n1 = s1's0bi + s1s0'bi
n0 = s1's0'bi
bo = s1's0bi' + s1's0bi = s1's0
124 Sequential Lo gic Oe sig n~Con trolie rs 3.4 Controller Design 125
Step 2: Create the architecture. Since the FSM has three states, the architecture has a two-bit state register, as shown in Figure 3.53(b).

Step 3: Encode the states. We can straightforwardly encode the three states as 00, 01, and 10, as shown in Figure 3.53(c).

Step 4: Create the state table. We convert the FSM with encoded states to a state table as shown in Figure 3.53(d). For the unused state 11, we have chosen to output bo = 0 and return to state 00.

Step 5: Implement the combinational logic. We derive the equations for each combinational logic output, as shown in Figure 3.53(e), and then create the final circuit as shown.

EXAMPLE 3.10 Sequence generator

We want to design a circuit with four outputs: w, x, y, and z. The circuit should generate the following sequence of output patterns: 0001, 0011, 1100, and 1000. After 1000, the circuit should repeat the sequence, starting from 0001 again. We want the circuit to generate the next pattern only on a rising clock edge.

Sequence generators are common in a wide range of systems. For example, we might want to blink a set of four lights in a particular pattern, such as in a festive lights display. We might instead want to rotate an electric motor a fixed number of degrees on each clock cycle by powering magnets around the motor in a specific sequence to attract the magnetized motor to the next position in the rotation. Such a motor is known as a stepper motor, since the motor rotates in steps.

We can design the sequence generator controller using our five-step process.

Step 1: Capture the FSM. We capture the system's behavior as the FSM shown in Figure 3.54. The FSM has four states, which we've labeled A, B, C, and D (though any other four unique names would do just fine).

Figure 3.54 Sequence generator FSM. Inputs: none; Outputs: w, x, y, z.

Step 2: Create the architecture. The standard controller architecture for the sequence generator will have a 2-bit state register to represent the four possible states, no inputs to the logic, and outputs w, x, y, z from the logic, along with outputs n1 and n0, as shown in Figure 3.55.

Figure 3.55 Sequence generator controller architecture.

Step 3: Encode the states. We can encode the states as follows: A: 00, B: 01, C: 10, D: 11. Any other encoding with a unique code for each state would also do fine.

Step 4: Create the state table. The state table for the FSM with encoded states is shown in Table 3.4.

TABLE 3.4 State table for sequence generator controller.

        Inputs      Outputs
        s1  s0      w   x   y   z   n1  n0
    A   0   0       0   0   0   1   0   1
    B   0   1       0   0   1   1   1   0
    C   1   0       1   1   0   0   1   1
    D   1   1       1   0   0   0   0   0

Step 5: Implement the combinational logic. We derive the equations for each output of the combinational logic from the table. After some algebraic simplification, the equations are as follows:

    w  = s1
    x  = s1s0'
    y  = s1's0
    z  = s1'
    n1 = s1 xor s0
    n0 = s0'

The final circuit is shown in Figure 3.56.

Figure 3.56 Sequence generator controller architecture.

EXAMPLE 3.11 Secure car key controller (continued)

Let's complete the design for the secure car key controller from Example 3.5. We already carried out the Capture the FSM step of the five-step process, with the FSM shown in Figure 3.41. The remaining steps are as follows.

Step 2: Create the architecture. Since the FSM has five states, we'll need a 3-bit state register. A 3-bit state register can represent eight states, so three states will be unused. The input to the logic is signal a, while the outputs are signal r and next state outputs n2, n1, and n0. The architecture is shown in Figure 3.57.

Figure 3.57 Secure car key controller architecture.

Step 3: Encode the states. Let's encode the states using a straightforward binary encoding of 000 through 100. The FSM with state encodings is shown in Figure 3.58.

Figure 3.58 Secure car key FSM with encoded states. Inputs: a; Outputs: r.

Step 4: Create the state table. The FSM converted to a state table is shown in Table 3.5. For the unused states, we have chosen to set r = 0 and the next state to 000.
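Two mechanical parts of the examples above can be checked with a few lines of code. The following is a minimal Python sketch, not part of the book's design flow: `encode_states` mirrors the straightforward binary encoding used in Step 3 (and reports the unused codes), while `seq_gen_logic` evaluates the Example 3.10 equations so that the generated patterns can be compared against Table 3.4. The function names are our own, and starting the sequence generator in state A (00) is an assumption.

```python
def encode_states(names):
    """Assign a straightforward binary code to each state name.

    Returns (register width in bits, {name: code}, list of unused codes).
    """
    bits = max(1, (len(names) - 1).bit_length())
    codes = {name: format(i, f"0{bits}b") for i, name in enumerate(names)}
    unused = [format(i, f"0{bits}b") for i in range(len(names), 2 ** bits)]
    return bits, codes, unused


def seq_gen_logic(s1, s0):
    """Combinational logic of Example 3.10: returns (w, x, y, z, n1, n0)."""
    w = s1
    x = s1 & (1 - s0)        # x = s1 s0'
    y = (1 - s1) & s0        # y = s1' s0
    z = 1 - s1               # z = s1'
    n1 = s1 ^ s0             # n1 = s1 xor s0
    n0 = 1 - s0              # n0 = s0'
    return w, x, y, z, n1, n0


def run_sequence(cycles, s1=0, s0=0):
    """Clock the controller `cycles` times; initial state A (00) is assumed."""
    patterns = []
    for _ in range(cycles):
        w, x, y, z, n1, n0 = seq_gen_logic(s1, s0)
        patterns.append(f"{w}{x}{y}{z}")
        s1, s0 = n1, n0      # the state register loads n1n0 on the clock edge
    return patterns


if __name__ == "__main__":
    print(encode_states(["Wait", "K1", "K2", "K3", "K4"]))
    print(run_sequence(5))
```

For the five-state secure car key FSM, the sketch reports a 3-bit register with codes 101, 110, and 111 unused, matching Step 2 of Example 3.11.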
TABLE 3.5 State table for secure car key controller. Inputs: s2, s1, s0, a. Outputs: r, n2, n1, n0. States: Wait, K1, K2, K3, K4, and the unused encodings.

Step 5: Implement the combinational logic. We can design four circuits, one for each output, to implement the combinational logic. We leave this step as an exercise for the reader.

More on Controller Design

Converting a circuit to an FSM

We showed in Section 2.6 that a circuit, truth table, and equation were all ways of representing the same combinational function. Similarly, a circuit, state table, and FSM are all ways of representing the same sequential function.

We have been converting an FSM to a circuit using a five-step process. We can also convert a circuit to an FSM by applying the five-step process of Table 3.2 in reverse. In general, converting a circuit to an equation or FSM is known as reverse engineering the behavior of the circuit.

EXAMPLE 3.12 Converting a sequential circuit to an FSM

Given the sequential circuit in Figure 3.59, find an equivalent FSM.

Figure 3.59 A sequential circuit with unknown behavior.

We start from step 5 of the 5-step process described in Table 3.2. The combinational circuit has already been implemented, and we can proceed to step 4, where we create a state table.

The combinational logic in the controller architecture has 3 inputs: 2 inputs, s0 and s1, represent the contents of the state register, and 1 input, x, is an external input. Thus our state table will have 8 rows, since there are 2^3 = 8 possible combinations of inputs.

After we set up the state table and enumerate all possible combinations of inputs (e.g., s1s0x = 000, ..., s1s0x = 111), we use the techniques described in Section 2.6 to fill in the values of the outputs. For example, consider the output y. From the combinational circuit, we see that y = s1'. Knowing this, we add a 1 in the y column of the state table in every row where s1 = 0, and we add a 0 to the remaining spaces in the y column. Now consider n0, which we see has the Boolean equation n0 = s1's0'x. Accordingly, we set n0 to 1 when s1 = 0 and s0 = 0 and x = 1. We fill in the columns for z and n1 using a similar analysis and move on to the next step.

In step 3, we must encode the states. Naturally, the states have already been encoded, but we can still name each state. We arbitrarily choose the labels A, B, C, and D, seen in Table 3.6.

TABLE 3.6 State table for sequential circuit. Inputs: s1, s0, x. Outputs: n1, n0, y, z. States: A, B, C, D.

Step 2 calls for the creation of the standard controller architecture. This step requires no work since the controller architecture was already defined.

Finally, in step 1, we capture the FSM. Initially, we can set up an FSM diagram with the four states we've labeled in step 3, shown in Figure 3.60(a). Next, we list the values of the FSM outputs y and z next to each state. For example, in state A (s1s0 = 00), the outputs y and z are 1 and 0, respectively, so we list "yz = 10" with state A in the FSM.

Figure 3.60 Converting a state table to an FSM diagram: (a) initial FSM, (b) FSM with outputs specified, and (c) FSM with outputs and transitions specified. Inputs: x; Outputs: y, z.

After listing the outputs for states B, C, and D, shown in Figure 3.60(b), we turn to the state transitions specified in the state table by n1 and n0. Consider the first row of the state table, which says that n1n0 = 00 when s1s0x = 000. In other words, when in state A (s1s0 = 00), the next state is state A (n1n0 = 00) if x is 0. We can represent this in the FSM diagram by drawing an arrow from state A back to state A and labeling the new transition "x'." Now consider the second row of the state table, which indicates that from state A, we transition to state B when x = 1. We add a transition arrow from state A to B and label it "x." After labeling all the transitions, we are left with the FSM in Figure 3.60(c).

You may notice that state D cannot be reached from any other state and transitions to state A on any input. We can reasonably infer that the original FSM had only three states and state D is an extra, unused state. For completeness, it is preferable to leave state D in the final diagram, however.

Given any synchronous circuit consisting of logic gates and flip-flops, we can always redraw the circuit as consisting of a state register and logic (our standard controller architecture) just by grouping all the flip-flops together. Thus, the approach described above works for any synchronous circuit, not just a circuit already drawn in the form of our standard controller architecture.
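The reverse-engineering procedure of Example 3.12 is mechanical enough to sketch in code. The Python below is a minimal illustration under stated assumptions: the equations y = s1' and n0 = s1's0'x come from the text, but the equations used here for z and n1 are hypothetical stand-ins (the circuit of Figure 3.59 is not reproduced here), chosen only so that the result agrees with the facts the text does state. All function names are our own.

```python
from itertools import product

# State names chosen in step 3 of the reverse process.
NAMES = {(0, 0): "A", (0, 1): "B", (1, 0): "C", (1, 1): "D"}


def logic(s1, s0, x):
    """Combinational logic read off the circuit: returns (y, z, n1, n0)."""
    y = 1 - s1                      # from the text: y = s1'
    n0 = (1 - s1) & (1 - s0) & x    # from the text: n0 = s1's0'x
    z = s1 & (1 - s0)               # ASSUMED for illustration only
    n1 = (1 - s1) & s0 & x          # ASSUMED for illustration only
    return y, z, n1, n0


def state_table():
    """Step 4 in reverse: enumerate all 2^3 = 8 input combinations."""
    return {(s1, s0, x): logic(s1, s0, x)
            for s1, s0, x in product((0, 1), repeat=3)}


def to_fsm(table):
    """Step 1 in reverse: per-state outputs and labeled transitions."""
    outputs, transitions = {}, {}
    for (s1, s0, x), (y, z, n1, n0) in table.items():
        src, dst = NAMES[(s1, s0)], NAMES[(n1, n0)]
        outputs[src] = f"{y}{z}"                        # "yz" label on the state
        transitions[(src, "x" if x else "x'")] = dst    # arrow label
    return outputs, transitions


def unreachable(transitions):
    """States that no *other* state transitions into."""
    targets = {dst for (src, _), dst in transitions.items() if src != dst}
    return sorted(set(NAMES.values()) - targets)


if __name__ == "__main__":
    outs, trans = to_fsm(state_table())
    print(outs)
    print(trans)
    print("unreachable:", unreachable(trans))
```

With these equations the sketch reproduces what the text reports: state A is labeled yz = 10, A loops on x' and goes to B on x, and state D goes to A on any input while being unreachable from every other state.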
Common Pitfalls

Mistakes are commonly made when capturing an FSM, relating to properties regarding the transitions leaving a state. In short, one and only one transition condition should ever evaluate to true during any rising clock edge. The properties are:

1. Only one condition should be true. For a given state, for any rising clock edge, no more than one transition condition should be true. For example, consider an FSM with inputs a and b, and a state State1 with two outgoing transitions, one labeled "a", and the other labeled "b." What happens when a = 1 and b = 1? Which transition should the FSM take? The FSM designer must ensure that the conditions are exclusive: only one could possibly ever be true at one time. In the example, the designer might label the transitions "a" and "a'b" to solve the problem. Actually, a particular type of FSM, known as a nondeterministic FSM, does allow more than one condition to be true and chooses among them in some arbitrary way, but when designing circuits, we usually want deterministic behavior, so we don't consider nondeterministic FSMs further.

2. One condition should be true. For a given state, for any rising clock edge, one of the transitions from that state must be taken. In other words, every input combination should be accounted for in every state. Designers sometimes forget to ensure this. For example, consider an FSM with inputs a and b, and a state State1 with two outgoing transitions, one labeled "a", and the other labeled "a'b." What happens if the FSM is in State1, and a = 0 and b = 0? Neither of the two transitions from State1 has a true condition. The FSM is not fully specified; we need to add a third transition, indicating what state to go to if a'b' is true. With that third condition, we have covered all possible values of a and b. A commonly forgotten transition is a transition pointing from a state back to itself.

We can verify the above two properties using Boolean algebra. For the first property of only one condition being true, we can check that the AND of every pair of conditions on all transitions of a state always results in 0. For example, if a state has two transitions, one with condition a and the other with condition a'b, using transformations of Boolean algebra we obtain:

    a * a'b
    = (a * a') * b
    = 0 * b
    = 0

For the second situation of one condition being true, we can check that the OR of all the conditions on all transitions of a state always results in 1. Considering the same example of a state that has two transitions, one with condition a and the other with condition a'b, using transformations of Boolean algebra we obtain:

    a + a'b
    = a * (1 + b) + a'b
    = a + ab + a'b
    = a + (a + a')b
    = a + b

Clearly, the OR of those two conditions is not 1, but rather a + b. Thus, if a and b were both 0, neither condition would be true, and therefore the next state would not be specified in the FSM. Above, we fixed this problem by adding another transition, a'b'. Checking yields:

    a + a'b + a'b'
    = a + a'(b + b')
    = a + a' * 1
    = a + a'
    = 1

Analyzing the equations made from the conditions of every state and proving they equal either 0 or 1 is a lot of work. Therefore, a good FSM capture tool will detect the above two situations and inform the designer of the situation.

EXAMPLE 3.13 Verifying transition properties for the code detector FSM

Figure 3.46 shows an FSM for a code detector. We want to verify the "only one condition should be true" property for the transitions leaving state Start. There are three conditions: ar, a', and a(r'+b+g). We thus have three pairs of conditions, which we AND and prove each equal 0 as follows:

    ar * a'           a' * a(r'+b+g)          ar * a(r'+b+g)
    = (a*a')r         = (a'*a)*(r'+b+g)       = (a*a)*r*(r'+b+g)
    = 0*r             = 0*(r'+b+g)            = a*r*(r'+b+g)
    = 0               = 0                     = arr' + arb + arg
                                              = 0 + arb + arg
                                              = arb + arg
                                              = ar(b+g)

It appears our FSM is not fully specified, as the AND of the third pair of conditions does not result in 0, which in turn means both conditions could be true at the same time, resulting in a nondeterministic FSM (if both conditions are true, what is the next state?). Recall from the code detector problem description that we want to transition from the Start state to the Red1 state when a button is pressed (a = 1) and that button is the red button, and no other colored button is pressed. The FSM in Figure 3.46 has the condition ar. Our mistake was underspecifying this condition; it should instead be arb'g'. In other words, a button has been pressed (a) and it is the red button (r) and the blue button has not been pressed (b') and the green button has not been pressed (g'). The transition from Start back to the Start state could then be written as a(rb'g')' (which is the same as in Figure 3.46 after applying DeMorgan's Law).

As evidence that this "pitfall" is indeed common, we admit that the mistake we made in Figure 3.46 was genuine, and not just made for educational purposes. A reviewer of the book caught it. We left the mistake in and added this example, to stress the point that the mistake is common.

After this change, we can again try to verify the "only one condition is true" property for all pairs of the three conditions arb'g', a', and a(rb'g')':

    arb'g' * a'       a' * a(rb'g')'          arb'g' * a(rb'g')'
    = aa' * rb'g'     = 0 * (rb'g')'          = a*a*(rb'g')*(rb'g')'
    = 0 * rb'g'       = 0                       (write rb'g' as Y for clarity)
    = 0                                       = a*a*Y*Y'
                                              = a*a*0
                                              = 0

We would need to change the transition conditions of the other states similarly, and then check the pairs of conditions for those states' transitions too.
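Because each state has only a handful of inputs, the two properties from Common Pitfalls can also be checked by brute force rather than by algebra. The Python below is a minimal sketch of such a check, assuming each transition condition is written as a small function of the inputs; the helper name `check_transitions` and the dictionary layout are our own and do not correspond to any particular FSM capture tool.

```python
from itertools import product


def check_transitions(conditions, inputs):
    """Check one state's outgoing transition conditions over all input values.

    Returns (overlaps, gaps):
      overlaps - input combinations where more than one condition is true
      gaps     - input combinations where no condition is true
    """
    overlaps, gaps = [], []
    for values in product((False, True), repeat=len(inputs)):
        env = dict(zip(inputs, values))
        true_names = [name for name, cond in conditions.items() if cond(**env)]
        if len(true_names) > 1:
            overlaps.append((env, true_names))
        elif not true_names:
            gaps.append(env)
    return overlaps, gaps


# Conditions leaving state Start as originally drawn (ar, a', a(r'+b+g)).
START_ORIGINAL = {
    "ar":        lambda a, r, b, g: a and r,
    "a'":        lambda a, r, b, g: not a,
    "a(r'+b+g)": lambda a, r, b, g: a and ((not r) or b or g),
}

# Conditions after the fix described in the example.
START_FIXED = {
    "arb'g'":    lambda a, r, b, g: a and r and (not b) and (not g),
    "a'":        lambda a, r, b, g: not a,
    "a(rb'g')'": lambda a, r, b, g: a and not (r and (not b) and (not g)),
}

if __name__ == "__main__":
    for label, conds in (("original", START_ORIGINAL), ("fixed", START_FIXED)):
        overlaps, gaps = check_transitions(conds, ("a", "r", "b", "g"))
        print(label, "overlaps:", len(overlaps), "gaps:", len(gaps))
```

For the original conditions the sketch finds overlaps exactly where a = 1, r = 1, and b or g is 1, which is the ar(b+g) term left over in the algebra above; for the corrected conditions it finds neither overlaps nor gaps.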
To verify the "one condition is true" property for state Start, we OR the three conditions and prove they equal 1:

    arb'g' + a' + a(rb'g')'
    = a' + arb'g' + a(rb'g')'      (write rb'g' as Y for clarity)
    = a' + aY + aY'
    = a' + a(Y + Y') = a' + a(1)
    = a' + a
    = 1

We would need to check the property for all other states too.

Simplifying Notations

FSM Notations: Unassigned Outputs

We already introduced the simplifying FSM notation of every transition being implicitly ANDed with a rising clock edge. Another commonly used simplification involves assigning outputs. If an FSM has many outputs, listing the assignment of every output in every state can become cumbersome, and make the relevant behavior of the FSM hard to discern. A common simplifying notation is as follows: if an output is not explicitly assigned in a state, the output is implicitly assigned a 0.

Simplifying Circuit Drawings: Implicit Clock Connections

Many if not most sequential circuits have a single clock signal connected to all sequential components. We know a component is sequential because of the small triangle input drawn on the component's block symbol. Many circuit drawings therefore use a simplification wherein the clock signal is assumed to be connected to all sequential components. This simplification leads to less cluttered wiring in the drawing.

Mathematical Formalisms in Combinational and Sequential Circuit Design

We have described two mathematical formalisms, Boolean functions and FSMs, for designing combinational and sequential circuits, respectively. Note that we did not have to use those formalisms to design circuits. Recall that our first attempt at building a three-cycles-high laser timer in Figure 3.35 just had us connecting components together in the hopes of creating a correctly working circuit. However, using those formalisms provides for a structured and sound method of designing circuits. Those formalisms also provide the basis for powerful automated tools to assist us with design, such as a tool that would automatically check for the common pitfalls described earlier in this section, tools that automatically convert Boolean equations or FSMs into circuits, tools that verify that two circuits are equivalent, tools that simulate our systems, etc. And, we have scarcely touched on all the benefits of those mathematical formalisms relating to automating the various aspects of designing circuits, and verifying the circuits behave properly. The importance of using sound mathematical formalisms to guide design cannot be overstated.

3.5 MORE ON FLIP-FLOPS AND CONTROLLERS

Other Flip-Flop Types

Today, designers generally use registers to implement their bit storage needs, and those registers typically are built from D flip-flops. However, in the past, transistors were much more scarce than today. Thus, designers often utilized other types of flip-flops, having more functionality than D flip-flops, to reduce the logic gates required outside of the flip-flops, and hence to reduce the number of ICs necessary to implement a circuit. Those flip-flop types included SR, JK, and T flip-flops.

SR Flip-Flop

The SR flip-flop is similar to the SR latch described earlier, with additional logic to make the circuit triggered by the edge of a clock, rather than just the level of the clock.

JK Flip-Flop

The JK flip-flop is similar to an SR flip-flop, with J corresponding to S, and with K corresponding to R (I remember this by thinking of "K" standing for "Klear" or clear). The JK flip-flop's behavior differs from the SR flip-flop when both inputs are 1. Recall that an SR flip-flop's behavior is undefined when both inputs are 1. A JK flip-flop, in contrast, toggles when both inputs are set to 1 (at the next clock edge, of course). To toggle means to change to the opposite state, meaning if the present stored bit is 1, the next stored bit would be 0. Likewise, if the present stored bit is 0, the next stored bit would be 1.

T Flip-Flop

A T flip-flop acts like a JK flip-flop with the JK inputs tied together to form the T input. In other words, whenever T is 0, the flip-flop maintains its current state, but whenever T is 1, the flip-flop toggles (think of "T" for "Toggle").

Nonideal Flip-Flop Behavior

Generally, when we first learn about digital design, we assume ideal behavior for logic gates and flip-flops, just like when we first learn physics of motion, we assume there's no friction or wind resistance. There is, however, a nonideal behavior of flip-flops, metastability, that is such a common problem in real digital design practice that we feel obliged to discuss the issue briefly here. Digital designers in practice should study metastability and possible solutions quite thoroughly before doing serious designs.

Metastability comes from failing to meet flip-flop setup or hold times, which we now introduce.

Setup Times and Hold Times

Flip-flops are built from wires and logic gates, and wires and logic gates have delays. Thus, a real flip-flop imposes some restrictions on when the flip-flop's inputs can change relative to the clock edge, in order to ensure correct operation despite those delays. Two important restrictions are:

• Setup time: The inputs of a flip-flop (e.g., the D input) must be stable for a minimum amount of time, known as the setup time, before a clock edge arrives. This intuitively makes sense: the input values must have time to propagate through any internal logic and be waiting at the internal gates' inputs before the clock pulse arrives.

• Hold time: The inputs of a flip-flop must remain stable for a minimum amount of time, known as the hold time, after a clock edge arrives. This also makes intuitive sense: the clock signal must have time to propagate through the internal gates to create a stable feedback state.
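The two restrictions above amount to a forbidden window around each clock edge during which the input must not change. The Python below is a minimal sketch of that check; the function name, the use of integer time units (think picoseconds), and the example numbers are all assumptions made for illustration rather than values from any datasheet.

```python
def timing_violations(clock_edges, input_changes, t_setup, t_hold):
    """List (edge, change, kind) for input changes inside a forbidden window.

    An input change violates setup if it occurs less than t_setup before an
    edge (or exactly at the edge), and violates hold if it occurs less than
    t_hold after an edge. All times are integers in the same unit.
    """
    violations = []
    for edge in clock_edges:
        for change in input_changes:
            if edge - t_setup < change <= edge:
                violations.append((edge, change, "setup"))
            elif edge < change < edge + t_hold:
                violations.append((edge, change, "hold"))
    return violations


if __name__ == "__main__":
    edges = [1000, 2000, 3000]       # rising clock edges (hypothetical)
    changes = [400, 1980, 3010]      # times at which D changes (hypothetical)
    for edge, change, kind in timing_violations(edges, changes, t_setup=50, t_hold=20):
        print(f"{kind} violation: D changed at {change}, clock edge at {edge}")
```

In this example the change at time 400 is safe, the change at 1980 arrives only 20 units before the edge at 2000 (a setup violation), and the change at 3010 arrives 10 units after the edge at 3000 (a hold violation).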
A related restriction is on the minimum clock pulse width: the pulse must be wide enough to ensure that the correct values propagate through the internal logic and create a stable feedback state.

A flip-flop typically comes with a datasheet describing setup times, hold times, and minimum clock pulse widths.

Figure 3.61 illustrates an example of a setup time violation. D changed to 0 too close to the rising clock. The result is that R was not 1 long enough to create a stable feedback in the cross-coupled NOR gates with Q being 0. Instead, Q glitches to 0 briefly. That glitch feeds back to the top NOR gate, causing Q' to glitch to 1 briefly. That glitch feeds back to the bottom NOR gate, and so on. The oscillation would likely continue until a race condition caused the circuit to settle into a stable situation of Q = 0 or Q = 1; or, the circuit could enter a metastable state, which we now describe.

Figure 3.61 Setup time violation: D changed to 0 (1) too close to the rising clock, u changed to 1 after the inverter delay (2), and then R changed to 1 after the AND gate delay (3). But then the clock pulse was over, causing R to change back to 0 (4) before a stable feedback situation with Q = 0 occurred in the cross-coupled NOR gates. R's change to 1 did cause Q to change to 0 after the NOR gate delay (5), but R's change back to 0 caused Q to change right back to 1 (6). The glitch of a 0 on Q fed back into the top NOR gate, causing Q' to glitch to 1 (7). That glitch of a 1 fed back to the bottom NOR gate, causing another glitch of a 0 on Q. That glitch runs around the cross-coupled NOR gate circuit (oscillation); a race condition would eventually cause Q to settle to 1 or 0, or possibly enter a metastable state (to be discussed).

Metastability

If a designer fails to ensure that a circuit obeys the setup and hold times of a flip-flop, the result could be that the flip-flop enters a metastable state. A flip-flop in a metastable state is in a state other than a stable 0 or a stable 1. Metastable in general means that a system is only marginally stable; the system has other states that are far more stable. A flip-flop in a metastable state may have an output with a value that is not a 0 or a 1, instead outputting a voltage somewhere between that of a 0 and that of a 1. That voltage may also oscillate somewhat. That's a problem. Since a flip-flop's output is connected to other components like logic gates and other flip-flops, that strange voltage value may cause other components to output strange values, and soon the values throughout our entire circuit can be in bad shape.

Why would we ever violate setup and hold times? After all, within a circuit we design, we can measure the longest possible path from any flip-flop output to any flip-flop input. As long as we make the clock period sufficiently longer than that longest path, we can ensure our circuit obeys setup times. Likewise, we can ensure that hold times are satisfied too.

The problem is that our circuit likely has to interface to external inputs, and we can't control when those inputs change, meaning those inputs may violate setup and hold times when connected to flip-flop inputs. For example, an input may be connected from a button being pressed by a user; the user can't be told to press the button so many nanoseconds before a clock edge and to be sure to hold the button so many nanoseconds after the clock edge so that setup and hold times are satisfied. So metastability is a problem primarily when a flip-flop has inputs that are not synchronized with the circuit's clock; such inputs are said to be asynchronous.

Designers usually try to synchronize a circuit's asynchronous input to the circuit's clock before propagating that input to components in the circuit. A common way to synchronize an asynchronous input is to first feed the asynchronous input into a single D flip-flop, and then use the output of that flip-flop wherever the input is needed, as shown for the asynchronous input ai in Figure 3.62. Using a single flip-flop as shown also eliminates a second problem of different values of the same signal appearing at the various internal flip-flops at a clock edge, due to different path delays.

Figure 3.62 Feeding asynchronous external inputs into a single flip-flop can reduce metastability problems.

"Hold on now!" you might say. Doesn't that synchronizing flip-flop experience the setup and hold time problem, and hence the same metastability issue? Yes, that's true. But at least the asynchronous input directly affects only one flip-flop, rather than perhaps several or dozens of flip-flops and other components. And that synchronizer flip-flop is specifically introduced for synchronization purposes and has no other purpose, whereas other flip-flops are being used to store bits for other purposes. We can therefore choose a flip-flop for the synchronizer that minimizes the metastability problem: we can choose an extremely fast flip-flop, and/or one with very small setup and hold times, and/or one with special circuitry to minimize metastability. That flip-flop may be bigger than normal or consume more power than normal, but there's only one such flip-flop per asynchronous input, so those issues aren't a problem. Bear in mind that no matter what we do, though, the synchronizer flip-flop could still become metastable, but at least we can minimize the odds of a metastable state happening by choosing a good flip-flop.

Another thing to consider is that a flip-flop will typically not stay metastable for long. Eventually, the flip-flop will "topple" over to a stable 0 or a stable 1. It's like how a coin tossed onto the ground may spin for a while (a metastable state) but will eventually topple over to a stable head or tail. What many designers therefore do is introduce two or more flip-flops in series for synchronization purposes, as shown in Figure 3.63, so even if
the first flip-flop becomes metastable, that flip-flop will likely reach a stable state before the next clock cycle, and thus the second flip-flop is even less likely to go metastable. Thus the odds of a metastable signal actually making it to our circuit's normal flip-flops are very low. This approach has the obvious drawback of delaying changes on the input signal by several cycles; in Figure 3.63, the rest of the circuit won't see a change on the input ai for three cycles.

Figure 3.63 Synchronizer flip-flops reduce probability of metastability in our regular flip-flops.

As clock periods become shorter and shorter, the odds of the first flip-flop stabilizing before the next clock cycle decrease, so metastability is becoming a more challenging issue as clock periods shrink. Many advanced methods have been proposed to deal with the issue. Nevertheless, no matter how hard we try, metastability will always be a possibility, meaning our circuit may fail. We can minimize the likelihood of failure, but we can't completely eliminate failures due to metastability. Designers often rate their designs using a measure called mean time between failures, or MTBF. Designers typically aim for MTBFs of many years. Many students find this concept (that we can't design fail-proof circuits) somewhat disconcerting. Yet, that concept is the real situation in design.

Designers of serious high-speed digital circuits should study the problem of metastability, and modern solutions to the problem, thoroughly.

Flip-Flop Reset and Set Inputs

Some D flip-flops (as well as other flip-flop types) come with extra inputs that can force the flip-flop to 0 or 1, independently of the D input. One such input is a clear, or reset, input, which forces the flip-flop to 0. Another such input is a set input, which forces the flip-flop to 1. Reset and set inputs are very useful for initializing flip-flops to an initial value (e.g., initializing all flip-flops to 0s) when powering up or resetting a system. These reset and set inputs should not be confused with the R and S inputs of an RS latch or flip-flop; the reset and set inputs are special control inputs to any type of flip-flop (D, RS, T, JK) that take priority over the normal data inputs of a flip-flop.

Figure 3.64 D flip-flops with: (a) synchronous reset R, (b) asynchronous reset AR, and (c) asynchronous reset and set.

The reset and set inputs of a flip-flop may be either synchronous or asynchronous. A synchronous reset input forces the flip-flop to 0, regardless of the value on the D input, during a rising clock edge. For the flip-flop in Figure 3.64(a), setting R to 1 forces the flip-flop to 0 on the next clock edge. Likewise, a synchronous set input forces the flip-flop to 1 on a rising clock edge. The reset and set inputs thus have priority over the D input. If a flip-flop has both a synchronous reset and a synchronous set input, the flip-flop datasheet must inform the flip-flop user which has priority if both inputs are set to 1.

An asynchronous reset forces the flip-flop to 0 independently of the clock signal; the clock does not need to be rising, or even be 1, for the asynchronous reset to occur, hence the term "asynchronous." Likewise, an asynchronous set, also known as preset, can be used to asynchronously force the flip-flop to 1.

We omit discussion of how synchronous/asynchronous reset/set inputs would be internally designed in a flip-flop.

Sample behavior of a flip-flop's asynchronous reset input is shown in Figure 3.65. We assume the flip-flop initially stores 1. Setting AR to 1 forces the flip-flop to 0, independent of any clock edge. When the next clock edge appears, AR is still 1, so the flip-flop stays 0 even though the input D is 1. When AR returns to 0, the flip-flop follows the D input on successive clock edges, as shown.

Figure 3.65 Asynchronous reset forces the flip-flop to 0, independent of clk or D.

Initial State of a Controller

Particularly observant readers may have come up with a question when we implemented FSMs as controllers in this section: what happened to the indication of the initial state of an FSM when we designed the controller implementing the FSM? The initial state of an FSM is the state that the FSM starts in when the FSM is first activated, or in controller terms, when the controller is first powered on. For example, the laser timer controller FSM in Figure 3.39 has an initial state of Off. When we converted our graphical FSMs to state tables in this section, we ignored the initial state information. Thus, all of our controller circuits start in some random state based on whatever values happen to appear in the state register when we power up the circuit. Not knowing the initial state of a circuit could pose a problem; for example, we don't want our laser timer controller to start in a state that immediately turns on the laser.

One solution is to add an additional input, reset, to every controller. Setting reset to 1 should cause a load of the initial state into the state register. This initial state should be forced into the state register. The reset and set inputs of a flip-flop come in very handy in this situation. We can simply connect the controller's reset input to the reset and set inputs of the state register's flip-flops in a way that sets the flip-flops to the initial state when reset is 1. For example, if the initial state of a 2-bit state register should be 01, then we could connect the controller's reset input to the reset and set inputs of the two flip-flops, as shown in Figure 3.66.
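The difference between synchronous and asynchronous reset, and the use of reset and set inputs to load an initial state, can be illustrated with a small behavioral model. The Python below is a minimal sketch, not a description of any real flip-flop: the class and method names are our own, all control inputs are assumed active-high, reset is assumed to take priority over set (a real datasheet would specify this), and the initial-state example assumes synchronous reset and set inputs.

```python
class DFlipFlop:
    """Behavioral D flip-flop with synchronous R/S and asynchronous AR inputs."""

    def __init__(self, q=0):
        self.q = q      # stored bit
        self.ar = 0     # current level of the asynchronous reset input

    def drive_ar(self, ar):
        """Change AR between clock edges; a 1 forces Q to 0 immediately."""
        self.ar = ar
        if ar:
            self.q = 0
        return self.q

    def rising_edge(self, d, r=0, s=0):
        """Apply one rising clock edge with data d, sync reset r, sync set s."""
        if self.ar or r:    # reset has priority (assumed)
            self.q = 0
        elif s:
            self.q = 1
        else:
            self.q = d
        return self.q


def async_reset_trace():
    """Replay the asynchronous reset scenario described for Figure 3.65."""
    ff = DFlipFlop(q=1)                 # flip-flop initially stores 1
    return [
        ff.drive_ar(1),                 # AR = 1 forces Q to 0 with no clock edge
        ff.rising_edge(d=1),            # AR still 1: Q stays 0 although D is 1
        ff.drive_ar(0),                 # AR released: Q remains 0 until an edge
        ff.rising_edge(d=1),            # Q follows D
        ff.rising_edge(d=0),            # Q follows D
    ]


def load_initial_state_01():
    """Controller reset wired to R of the s1 flip-flop and S of the s0 flip-flop."""
    s1, s0 = DFlipFlop(q=1), DFlipFlop(q=0)   # arbitrary power-up contents
    reset = 1
    s1.rising_edge(d=1, r=reset)              # forced to 0 regardless of D
    s0.rising_edge(d=0, s=reset)              # forced to 1 regardless of D
    return f"{s1.q}{s0.q}"


if __name__ == "__main__":
    print(async_reset_trace())
    print(load_initial_state_01())
```

Wiring reset to R on one flip-flop and to S on the other is what lets a single controller input load the two-bit pattern 01.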
Of course, for this reset functionality to work as desired, the designer must ensure that the controller's reset input is 1 when the system is first powered up. Ensuring the reset input is 1 during power up can be handled using an appropriate electronic circuit connected to the on/off switch, the description of which is beyond our scope.

Figure 3.66 Three-cycles-high laser timer controller with a reset input that loads the state register with the initial state 01.

Note that, if the synchronous reset or set inputs of a flip-flop are used, then the earlier-discussed setup and hold times, and associated metastability issues, apply to those reset and set inputs.

component can instead have an active-low input. An active-low input (also known as a negative logic input) is a control input whose operation is activated by setting the input to 0. Figure 3.67 depicts a D flip-flop with an active-low synchronous reset input; the circle at the R input indicates that the R input is active-low. Thus, to reset the flip-flop to 0, we would set R to 0, whereas for normal D flip-flop operation, we would set R to 1.

Active-low inputs can occur on any component with a control input, not just on flip-flops. For example, the enable control input on a decoder could be active-low: setting that enable to 0 (meaning the decoder is enabled) would cause normal decoder operation, while setting the input to 1 (meaning the decoder is disabled) would result in all outputs being 0.

When discussing the behavior of a component, designers will often use the term assert to mean setting a control input to the value that activates the associated operation. Thus, we might say that one must "assert" the R input of the D flip-flop in Figure 3.67 in order to reset the flip-flop to 0. Using the term assert avoids possible confusion that could occur when some control inputs are active-high and others are active-low.

Active-low inputs typically exist when the internal design of the component requires fewer gates when implemented with an active-low input than with an active-high input

Nonideal Controller Behavior: Output Glitches
3.6 SEQUENTIAL LOGIC OPTI MIZATI ONS AND 'TRADEOFFS
G litching is the presence of temporary values on a wi re. typica ll y caused by differe~t (SEE SECTION 6.3)
delays of different log ic paths leading 10 thm wi re. We saw an example of gluchll1g m
Figure 3. 13. G litchino wi ll also often occur when a controllcr changes states, due to dlf- The earlier secti ons described how to design basic sequential logic. Thi section. "hicb
fe rem path lenOlhs fr~m each of lhe controller's state reg i ter flip- fl op to the controller's phys icall y appears in this book a Section 6.3. describes how to create bmer sequential
OUlpUI~. Consider lhe Ihrec·cycJes-hioh laser timer design in Figure 3.50. The laser logic (smaller. fas ter, etc.) using optimization and tradeoffs. One use of !hi boo '
shou ld be off (output x =O) in Slat: 5150=00 and on (x - I) in . tates 5150- 01. describes sequenti al log ic design optimization and tradeoffs immedialel~ after inrro-
sIs 0 = 1 O. and 5 I 50= II. However. the delay from 5 I 10 x's OR gatc 111 thc fig ure could ducing basic sequential logic design. meaning now. An altemati\'e use describes
be lo nger lhan the delay from 50 to that OR gate. The resu lt could be lhat when the state seq uenti al logic design optimizalions and tradeoff later. after completing the introduc.
regi ster changes Slate from 5150 =01 to 5150-10. the OR gate ' input cou ld momen- lion of basic datapath components and RTL de ign (Chapters -4 and -).
taril y ee a 00. The OR gate wou ld thu output 0 momentarily (a glitch). In the laser
timer example. that glitch could momentarily shut off the lascr-an undeS Ired ituation.
Even "'or~e would be glitche that momentarily tum all a la;cr. . 3.7 SEQUENTIAL LOGIC DESCRIPTION USING HARDWARE
Real deSigner must detenninc whether such glitching would really pose a proble~ 111 DESCRIPTION LANGUAGES (SEE SECTION 9.3)
lheir pani ular ~y tem. and if so. those designer\ should take action to avoid s~ch gluchll1g.
One solution in the laser timer example might be to insen a 0 nip-fl p after x s OR gate 10 This section. which phy icalJy appears in thi book as Se ti n 9.3, lI1trodu . the_ use oi
Figure 3.50. Thi ~ would shift the x output later by I clock cycle (\till resulting 111 three HDLs for describing equenlial logic. One use of this book imrodu uch use ot H~Ls
cycles high. however). bu t shou ld eliminate glitche\ see n at the x output. as only the table immed iately after introducing basi equential logi design. meaning nO\\ . An altemat]\e
value appearing at the output would be loaded into the fl ip-flop on a ri~ing clock edge. use introduces such HDL use later.

Active-Low Inputs (Negative Logic) 3.8 PRODUCT PROFILE-PACEMAKER


D O' ~
mil now, we ha ve a \umcd acti ve·high input' on A pn emaker is nn electronic devi e that pnl\ ides electrical stimulati n t .~ hem to help
flIp-flop, and other componelll'. An actil'e-iligil o regulate 3 hean ' beating. .. teau ing 3 heart \\ hose bod)'~ natural "lOtnn~I" r, ~
input h a comrol Input who\c a"ociated operatIon I< not worki nc properh . perhaps due to di.ease. ImplantJble pa III 'e '_ \\hl~h = ,~,_
aCll~:llcd hy ,cll ll1g the '"put to I For examplc, If an ally placed, under .the '~1Il . F''ll"ni. _'~'
. '''' •' h0 \\ n III ' . . an' \\ l'rn b\. el\ r I :: mill"
Input ca n rc'ct a fl,p-Oop. we '" umed that '"flut mcricUlt'. The) nrc pl.l\\en:d b) J bJllcl) thm t,st tcn ~ af' r nh:n!. Pa.: nl _
figura J 67 f) Olr-O lp Wllh ad;'e
rc'ct ",hen thc Input \ value Wi" I Hnv.c'er, a illlpnl\cd the qt;n lit) (1f hfe II.' \\ ell ,h l'llgth ned the li\c', f mJn\ nil II I,,", .'1
In\\- )n hrnnuwlo rr'Cl IOp"l

A heart has two atria (left and right) and two ventricles (left and right). The ventricles push the blood out to the arteries, while the atria receive the blood from the veins. A very simple pacemaker has one sensor to detect a natural contraction in the heart's right ventricle, and one output wire to deliver electrical stimulation to that right ventricle if the natural contraction doesn't occur within a specified time period, typically just under one second. Such electrical stimulation causes a contraction not only in the right ventricle, but also the left ventricle.

Figure 3.68 Pacemaker with leads (left), and pacemaker's location under the skin (right). Courtesy of Medtronic, Inc.

We can describe the behavior of a simple pacemaker's controller using the FSM in Figure 3.69. The left side of the figure shows the pacemaker consisting of a controller and a timer. The timer has an input t, which resets the timer when t=1. Upon being reset, the timer begins counting down from 0.8 seconds. If the timer counts down to 0, the timer sets its output z to 1. The timer could be reset before reaching 0, in which case the timer does not set z to 1, and instead the timer restarts counting down from 0.8 seconds again. The controller has an input s, which is 1 when a contraction is sensed in the right ventricle. The controller has an output p, which the controller sets to 1 when the controller wants to cause a paced contraction.

Figure 3.69 A basic pacemaker's controller FSM.

The right side of the figure shows the controller's behavior as an FSM. Initially, the controller resets the timer in state ResetTimer by setting t=1. Normally, the controller waits in state Wait, and stays in that state as long as a contraction is not detected (s') and the timer does not reach 0 (z'). If the controller detects a natural contraction (s), then the controller again resets the timer and returns to waiting again. On the other hand, if the controller sees that the timer has reached 0 (z=1), then the controller goes to state Pace, which paces the heart by setting p=1, after which the controller returns to waiting again. Thus, as long as the heart contracts naturally, the pacemaker applies no stimulation to the heart. But if the heart doesn't contract naturally within 0.8 seconds of the last contraction (natural or paced), the pacemaker forces a contraction.

The atria receive blood from the veins, and contract to push the blood into the ventricles. The atrial contractions occur just before the ventricular contractions. Therefore, many pacemakers, known as "atrioventricular" pacemakers, sense and pace not just the ventricular contractions, but also the atrial contractions. Such pacemakers thus have two sensors, and two output wires for electrical stimulation, and may provide better cardiac output, with the desirable result being higher blood pressure (Figure 3.70).

Figure 3.70 An atrioventricular pacemaker's controller FSM (using the convention that FSM outputs not explicitly set in a state are implicitly set to 0). Inputs: sa, za, sv, zv. Outputs: pa, ta, pv, tv.

The pacemaker has two timers, one for the right atrium (TimerA) and one for the right ventricle (TimerV). The controller initially resets TimerA in state ResetTimerA, and then waits for a natural atrial contraction, or for the timer to reach 0. If the controller detects a natural atrial contraction (sa), then the controller skips pacing of the atrium. On the other hand, if TimerA reaches 0 first, then the controller goes to state PaceA, which causes a contraction in the atrium by setting pa=1. After an atrial contraction (either natural or paced), the controller resets TimerV in state ResetTimerV, and then waits for a natural ventricular contraction, or for the timer to reach 0. If a natural ventricular contraction occurs, the controller skips pacing of the ventricle. On the other hand, if TimerV reaches 0 first, then the controller goes to state PaceV, which causes a contraction in the ventricle by setting pv=1. The controller then returns to the atrial states.

Most modern pacemakers can have the timer parameters programmed wirelessly through radio signals, so that doctors can try different treatments without having to surgically remove, program, and reimplant the pacemaker.
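The basic pacemaker controller of Figure 3.69 is small enough to simulate directly. The sketch below is one possible Python rendering of that FSM together with its countdown timer, under several assumptions that are not from the text: the function names `next_state` and `simulate` are hypothetical, the timer is modeled in whole clock cycles (`timer_ticks`) rather than as 0.8 seconds, state Pace is assumed to return to ResetTimer so that the timer restarts after a paced contraction, and a sensed contraction is assumed to take priority if s and z are both 1.

```python
# Moore-style outputs of each state: t resets the timer, p paces the heart.
OUTPUTS = {
    "ResetTimer": {"t": 1, "p": 0},
    "Wait":       {"t": 0, "p": 0},
    "Pace":       {"t": 0, "p": 1},
}


def next_state(state, s, z):
    """Next state of the basic pacemaker controller (assumed transitions)."""
    if state == "ResetTimer":
        return "Wait"
    if state == "Wait":
        if s:                 # natural contraction sensed: restart the timer
            return "ResetTimer"
        if z:                 # timer reached 0 with no contraction: pace
            return "Pace"
        return "Wait"         # s' and z': keep waiting
    if state == "Pace":
        return "ResetTimer"   # assumption: restart the timer after pacing
    raise ValueError(f"unknown state {state!r}")


def simulate(sensed, timer_ticks=8):
    """Run the controller and timer for one clock cycle per entry of sensed.

    sensed is a list of 0/1 values of input s, one per cycle.
    Returns the list of p values, one per cycle.
    """
    state = "ResetTimer"
    count = timer_ticks
    paced = []
    for s in sensed:
        out = OUTPUTS[state]
        paced.append(out["p"])
        z = 1 if count == 0 else 0           # timer output before the clock edge
        state = next_state(state, s, z)      # controller updates on the edge
        if out["t"]:
            count = timer_ticks              # timer reloads when t = 1
        elif count > 0:
            count -= 1                       # otherwise it counts down to 0
    return paced


no_beats = simulate([0] * 8, timer_ticks=3)
regular_beats = simulate([0, 0, 1, 0, 0, 1, 0, 0], timer_ticks=3)
print(no_beats)
print(regular_beats)
```

With no sensed contractions the controller eventually enters Pace and drives p to 1 for one cycle; when contractions arrive before the timer expires, p stays 0 throughout, matching the behavior described for Figure 3.69.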
This example demonstrates the usefulness of FSMs in describing a controller's behavior. Real pacemakers have controllers with tens or even hundreds of states to deal with various details that we left out of the example for simplicity.

With the advent of very low-power microprocessors, a trend in pacemaker design is that of implementing the FSM on a microprocessor rather than with a custom sequential circuit. Microprocessor implementation yields the advantage of easy reprogramming of the FSM, expanding the range of treatments that a doctor can experiment with.

3.9 CHAPTER SUMMARY

Section 3.1 introduced the concept of sequential circuits, namely circuits that store bits, meaning the circuits have memory, known as state. Section 3.2 developed a series of increasingly robust bit storage blocks, including the SR latch, D latch, D flip-flop, and finally a register, which can store multiple bits. The section also introduced the concept of a clock, which synchronizes the loads of registers. Section 3.3 introduced finite-state machines (FSMs) for capturing the desired behavior of a sequential circuit, and a standard architecture able to implement FSMs, with an FSM implemented using the architecture known as a controller. Section 3.4 then described a five-step process for converting an FSM to a controller implementation. Section 3.5 highlighted some types of flip-flops other than the D flip-flop, those other types being popular in the past. That section also described several timing issues related to the use of flip-flops, including setup time, hold time, and metastability. The section introduced asynchronous clear and set inputs to flip-flops, and described their use for initializing an FSM to its initial state. Section 3.8 highlighted a cardiac pacemaker and illustrated the use of an FSM to describe the pacemaker's behavior.

Designing a combinational circuit begins by capturing the desired circuit behavior using either an equation or a truth table, and then following a several step process to convert the behavior to a combinational circuit. Designing a sequential circuit begins by capturing the desired circuit behavior as an FSM, and then following a several step process to convert the behavior to a circuit consisting of a register and a combinational circuit, known as a controller. Conceptually, then, with the knowledge in Chapters 2 and 3, we can build any digital circuit. However, many digital circuits deal with input data many bits wide, such as five 32-bit inputs. Imagine how complex our equations, truth tables, or FSMs would be if they involved 5*32=160 inputs. Fortunately, components have been developed specifically to deal with data inputs and thus simplify the design process: components that will be described in the next chapter.

3.10 EXERCISES

Any problem noted with an asterisk (*) represents an especially challenging problem.

SECTION 3.2: STORING ONE BIT-FLIP-FLOPS

3.1 Compute the clock period for the following clock frequencies.
(a) 50 kHz (early computers)
(b) 300 MHz (Sony Playstation 2 processor)
(c) 3.4 GHz (Intel Pentium 4 processor)
(d) 10 GHz (PCs of the early 2000s)
(e) 1 THz (1 terahertz)

3.2 Compute the clock period for the following clock frequencies.
(a) 32.768 kHz
(b) 100 MHz
(c) 1.5 GHz
(d) 2.4 GHz

3.3 Compute the clock frequency for the following clock periods.
(a) 1 s
(b) 1 ms
(c) 20 ns
(d) 1 ns
(e) 1.5 ps

3.4 Compute the clock frequency for the following clock periods.
(a) 500 ms
(b) 400 ns
(c) 4 ns
(d) 20 ps

3.5 *Assume scientists have developed a chip having perfect transistors and wires with no resistance, meaning signals within this chip can travel at the speed of light, or 3x10^8 meters/second. Assuming our digital circuit has a width of ... mm and a height of ... mm, compute the clock period and clock frequency where the longest distance any signal must travel during a single clock period is:
(a) one-eighth of the width of the circuit
(b) one-half the height of the circuit
(c) the width of the circuit
(d) diagonally across the circuit
(e) the perimeter of the circuit

3.6 Trace the behavior of an SR latch for the following situation: Q, S, and R are 0 and have been for a long time, then S changes to 1 and stays there for a long time, then S changes back to 0. Using a timing diagram, show the values that appear on every wire for every change on a wire. Assume logic gates have a tiny but nonzero delay.

3.7 Repeat Exercise 3.6, but assume that S was changed to 1 just long enough for the signal to propagate through one logic gate, after which S was changed back to 0. In other words, S did not satisfy the hold time of the latch.

3.8 Trace the behavior of a level-sensitive SR latch (see Figure 3.14) for the input pattern in Figure 3.71. Assume S1, R1, and Q are initially 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.71 SR latch input pattern timing diagram for Exercise 3.8 (signals: C, S, R, S1, R1, Q).
3.9 Trace the behavior of a level-sensitive SR latch (see Figure 3.14) for the input pattern in Figure 3.72. Assume S1, R1, and Q are initially 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.72 SR latch input pattern timing diagram for Exercise 3.9 (signals: C, S, R, S1, R1, Q).

3.10 Trace the behavior of a level-sensitive SR latch (see Figure 3.14) for the input pattern in Figure 3.73. Assume S1, R1, and Q are initially 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.73 SR latch input pattern timing diagram for Exercise 3.10 (signals: C, S, R, S1, R1, Q).

3.11 Trace the behavior of a D latch (see Figure 3.18) for the input pattern in Figure 3.74. Assume Q is initially 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.74 D latch input pattern timing diagram for Exercise 3.11 (signals: C, D, S, R, Q).

3.12 Trace the behavior of a D latch (see Figure 3.18) for the input pattern in Figure 3.75. Assume Q is initially 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.75 D latch input pattern timing diagram for Exercise 3.12 (signals: C, D, S, R, Q).

3.13 Trace the behavior of an edge-triggered D flip-flop using a master-servant design (see Figure 3.24) for the input pattern in Figure 3.76. Assume each internal latch initially stores a 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.76 Edge-triggered D flip-flop input pattern timing diagram for Exercise 3.13 (signals: C, D/Dm, Cm, Qm/Ds, Cs, Qs).

3.14 Trace the behavior of an edge-triggered D flip-flop using the master-servant design (see Figure 3.24) for the input pattern in Figure 3.77. Assume each internal latch initially stores a 0. Complete the timing diagram, assuming logic gates have a tiny but nonzero delay.

Figure 3.77 Edge-triggered D flip-flop input pattern timing diagram for Exercise 3.14 (signals: C, D/Dm, Cm, Qm/Ds, Cs, Qs).

3.15 Compare the behavior of D latch and D flip-flop devices by completing the timing diagram in Figure 3.78. Assume each device initially stores a 0. Provide a brief explanation of the behavior of each device.

Figure 3.78 D latch and D flip-flop input pattern timing diagram for Exercise 3.15 (signals: C, D, Q (D latch), Q (D flip-flop)).

3.16 Compare the behavior of D latch and D flip-flop devices by completing the timing diagram in Figure 3.79. Assume each device initially stores a 0. Provide a brief explanation of the behavior of each device.

Figure 3.79 D latch and D flip-flop input pattern timing diagram for Exercise 3.16 (signals: C, D, Q (D latch), Q (D flip-flop)).

3.17 Create a circuit of three level-sensitive D latches connected in series (the output of one is connected to the input of the next). Show how a clock with a long high-time can cause the value at the input of the first D latch to trickle through more than one latch during the same clock cycle.

3.18 Repeat Exercise 17 using edge-triggered D flip-flops, and show how the input of the first D flip-flop does not trickle through to the next flip-flop no matter how long the clock signal is high.

3.19 Using D flip-flops, create a circuit with an input X and an output Y, such that Y always equals X delayed by two clock cycles.

3.20 Using four registers, design a circuit that stores the previous four values seen at an 8-bit input D. The circuit should have a single 8-bit output that can be configured using two inputs s1 and s0 to output any one of the four registers. (Hint: use an 8-bit 4x1 mux.)

3.21 Consider three 4-bit registers connected together as shown in Figure 3.80. Assume the initial values in the registers are unknown. Trace the behavior of the registers by completing the timing diagram of Figure 3.81.

Figure 3.80 Register configuration.

Figure 3.81 4-bit register input pattern timing diagram for Exercise 3.21 (signals: C, a3..a0, b3..b0, c3..c0, d3..d0).

3.22 Consider three 4-bit registers connected together as shown in Figure 3.83. Assume the initial values in the registers are unknown. Trace the behavior of the registers by completing the timing diagram of Figure 3.82.

Figure 3.82 4-bit register input pattern timing diagram for Exercise 3.22 (signals: C, a3..a0, b3..b0, c3..c0, d3..d0).

Figure 3.83 Register configuration.

SECTION 3.3: FINITE-STATE MACHINES (FSMS) AND CONTROLLERS

3.23 Draw a state diagram for an FSM that has an input X and an output Y. Whenever X changes from 0 to 1, Y should become 1 for two clock cycles and then return to 0, even if X is still 1. (Assume for this problem and all other FSM problems that an implicit rising clock is ANDed with every FSM transition condition.)

3.24 Draw a state diagram for an FSM with no inputs and three outputs, x, y, and z. xyz should always follow the following sequence: 000, 001, 010, 100, repeat. The output should change only on a rising clock edge. Make 000 the initial state.

3.25 Do Exercise 24, but add an input I that can stop the sequence when set to 0. When input I returns to 1, the sequence resumes from where it left off.

3.26 Do Exercise 25, except the sequence starts from 000 whenever I returns to 1.

3.27 A wristwatch display can show one of four items: the time, the alarm, the stopwatch, or the date, controlled by two signals s1 and s0 (00 displays the time, 01 the alarm, 10 the stopwatch, and 11 the date; assume s1s0 control an N-bit-wide mux that passes through the appropriate register). Pressing a button B (which sets B=1) sequences the display to the next item (if the presently displayed item is the date, the next item is the current time). Create a state diagram for an FSM describing this sequencing behavior, having an input bit B, and two output bits s1 and s0. Be sure to only sequence forward by one item each time the button is pressed, regardless of how long the button is pressed. In other words, be sure to wait for the button to be released after sequencing forward one item. Use short but descriptive names for each state. Make displaying the time be the initial state.

3.28 Extend the state diagram you created in Exercise 27 by adding an input R. R=1 forces the FSM to return to the state that displays the time.

3.29 Draw a state diagram for an FSM with an input gcnt and three outputs, x, y, and z. The xyz outputs generate a sequence called a Gray code in which exactly one of the three outputs changes from 0 to 1 or from 1 to 0. The Gray code sequence that the FSM should output is 000, 010, 011, 001, 101, 111, 110, 100, repeat. The output should change only on a rising clock edge when the input gcnt=1. Make the initial state 000.

3.30 Trace through the execution of the FSM you created in Exercise 29 by completing the timing diagram in Figure 3.84, where C is the clock input and S is the 3-bit state register. Assume S is initially 000.

Figure 3.84 FSM input pattern timing diagram for Exercise 3.30 (signals: gcnt, C, S).

3.31 Draw a timing diagram for the FSM in Figure 3.85, such that the FSM starts in state Wait, reaches state EN, and returns to Wait. Describe the behavior of the circuit in English.

Figure 3.85 FSM for Exercise 3.31. Inputs: s, r. Outputs: a, en.

3.32 For FSMs with the following numbers of states, indicate the smallest possible number of bits for a state register representing those states:
(a) 4
(b) 8
(c) 9
(d) 23
(e) 900

3.33 How many possible states can be represented by a 16-bit register?

3.34 If an FSM has N states, what is the maximum number of possible transitions that could exist in the FSM (assuming there are a large number of inputs, meaning the number of transitions is not limited by the number of inputs)?

3.35 *Assuming one input and one output, how many possible four-state FSMs exist?

3.36 *Suppose you are given two FSMs that execute concurrently. Describe an approach for merging those two FSMs into a single FSM with identical functionality as the two separate FSMs, and provide an example. If the first FSM has N states and the second has M states, how many states will the merged FSM have?

3.37 *Sometimes dividing a large FSM into two smaller FSMs results in simpler circuitry. Divide the FSM shown in Figure 3.88 into two FSMs, one containing G0-G3, the other containing G4-G7. You may add additional states, transitions, and inputs or outputs between the two FSMs, as required. Hint: you will need to introduce signals between the FSMs for one FSM to tell the other FSM to go to some state.

SECTION 3.4: CONTROLLER DESIGN

3.38 Using the five-step process for designing a controller, convert the FSM of Figure 3.86 to a controller, implementing the controller using a state register and logic gates.

Figure 3.86 FSM for Exercise 3.38.

3.39 Using the five-step process for designing a controller, convert the FSM of Figure 3.87 to a controller, implementing the controller using a state register and logic gates.

Figure 3.87 FSM for Exercise 3.39.

3.40 Using the five-step process for designing a controller, convert the FSM you created for Exercise 24 to a controller, implementing the controller using a state register and logic gates.

3.41 Using the five-step process for designing a controller, convert the FSM you created for Exercise 27 to a controller, implementing the controller using a state register and logic gates.

3.42 Using the five-step process for designing a controller, convert the FSM you created for Exercise 29 to a controller, implementing the controller using a state register and logic gates.

3.43 Using the five-step process for designing a controller, convert the FSM in Figure 3.88 to a controller, stopping once you have created the state table. Note: your state table will be quite large, having 32 rows. You might therefore want to use a computer tool, like a word processor or spreadsheet, to draw the table.

Figure 3.88 FSM for Exercises 3.37 and 3.43. Inputs: g, r. Outputs: x, y, z.

3.44 Create an FSM that has an input X and an output Y. Whenever X changes from 0 to 1, Y should become 1 for five clock cycles and then return to 0, even if X is still 1. Using the five-step process for designing a controller, convert the FSM to a controller, stopping once you have created the state table.

3.45 The FSM in Figure 3.89 has two problems: one state has two transitions whose conditions could simultaneously evaluate to true, and another state has transitions that aren't guaranteed to have at least one of the transition conditions true. By ORing and ANDing the conditions for each state's transitions, prove that these problems exist. Then, fix these problems by refining the FSM, taking your best guess as to what was the FSM creator's intent.
3.46 Reverse engineer the behavior of the sequential circuit shown in Figure 3.90.

Figure 3.90 A sequential circuit to be reverse engineered.

SECTION 3.5: MORE ON FLIP-FLOPS AND CONTROLLERS

3.47 Consider three T flip-flops connected as shown in Figure 3.92. Trace the behavior of the flip-flops by completing the timing diagram of Figure 3.91. Assume all the flip-flops initially contain 0s.

Figure 3.91 T flip-flop input pattern timing diagram for Exercise 3.47 (signals: T, C, Q1, Q2, Q3).

Figure 3.92 Three T flip-flops.

3.48 Show how to connect four T flip-flops together to create a circuit that counts from 0 to 15 in binary and back to 0 again. In other words, the circuit counts 0000, 0001, 0010, ..., 1111, and back to 0000 again. Hint: consider using the Q output of a flip-flop as the clock input of another flip-flop. Assume all the flip-flops initially contain 0s.

3.49 Define metastability.

3.50 Design a controller with a 4-bit state register that gets synchronously initialized to state 1010 when an input reset is set to 1.

3.51 *Design a D flip-flop with asynchronous reset, AR, and asynchronous set, AS, inputs using basic logic gates.

DESIGNER PROFILE

Brian got his bachelors degree in Electrical Engineering and then worked for several years. Realizing the future demand for digital design targeting an increasingly popular type of digital chip known as FPGAs (see Chapter 7), he returned to school to obtain a masters degree in Electrical Engineering with a thesis topic targeting digital design for FPGAs. He has been employed at two different companies, and is now working as an independent digital design consultant.

He has worked on a number of projects, including a system that prevents house fires by tripping a circuit breaker when current running in the circuit indicates arcing is occurring, a microprocessor architecture for speeding up the processing of digitized video, and a mammography machine for precise location detection of tumors in humans.

One of the projects he has found most interesting was a baggage scanner for detecting explosives. "In that system, there is a lot of data being acquired as well as motors running, x-rays being beamed, and other things happening, all at the same time. To be successful, you have to pay attention to detail, and you have to communicate with the other design teams so everyone is on the same page." He found that project particularly interesting because "I was working on a small part of a very large, complex machine. We had to stay focused on our part of the design, while at the same time being mindful of how all the parts were going to fit together in the end." Thus, being able to work alone as well as in large groups was important, requiring good communication and team skills. And being able to understand not only a part of the system, but also important aspects of the other parts was also important, requiring knowledge of diverse topics.

Brian is now an independent digital design consultant, something that many electrical engineers, computer engineers, and computer scientists choose to do after getting experience in their field. "I like the flexibility that being a consultant offers. On the plus side, I get to work on a wide variety of projects. The drawback is that sometimes I only get to work on a small part of a project, rather than seeing a product through from start to finish. And of course being an independent consultant means there's less stability than a regular position at a company, but I don't mind that."

Brian has taken advantage of the flexibility provided by consulting by taking a part-time job teaching an undergraduate digital design course and an embedded systems course at a university. "I really enjoy teaching and I have learned a lot through teaching. And I enjoy introducing students to the field of embedded systems."

Asked what he likes most about the field of digital design, he says, "I like building products that make people's lives easier, or safer, or more fun. That's satisfying."

Asked to give advice to students, he says that one important thing is "to ask questions. Don't be afraid of looking dumb when you ask questions at a new job. People don't expect you to know everything, but they do expect you to ask questions when you are unsure. Besides, asking questions is an important part of learning."
4 Datapath Components

4.1 INTRODUCTION

Chapters 2 and 3 introduced increasingly complex building blocks that can be used to build digital circuits. Those blocks included logic gates, multiplexors, decoders, basic registers, and finally controllers. Controllers are good for implementing systems having some number of control input signals and generating some number of control output signals. For example, if we see a particular control input become 1 (corresponding perhaps to a button being pressed), then we may want to generate a 1 on a control output (corresponding perhaps to a light turning on). In this chapter, we instead focus on creating building blocks that are good for systems having data inputs and outputs. In general, digital systems have two types of inputs (and outputs as well):

Control: A control input is typically one bit, representing a particular event occurring outside the system, like a button being pressed, or representing a particular state of something outside the system, like a door being closed or a car being at an intersection. Control inputs could sometimes be grouped into multiple bits, like 4 bits representing which of 16 buttons is pressed, or 2 bits representing each of 4 possible states of a door (closed, open 1/3rd, open 2/3rd, or fully open). Control inputs are typically used directly to influence a controller's present state.

Data: A data input is typically multiple bits, collectively representing a single entity. For example, a 32-bit input may represent a temperature in binary. A 7-bit input may represent the present floor location of an elevator in a 100-floor building. A data input may be a single bit, differing from a single-bit control input in that we don't directly rely on that bit's value to influence the controller's present state.

Not all inputs can be strictly classified as either control or data; there are some inputs that fall somewhere on the border in between the two types. But most inputs can be classified as one or the other. (And, of course, a digital system also has power inputs, ground inputs, and clock inputs too, in addition to control and data inputs.)

Controllers are a good building block for building systems consisting mainly of control inputs and control outputs. But we also need building blocks for systems consisting of data inputs and outputs. In particular, we need registers to hold the data, and functional units to operate on (e.g., add or divide) the data. Such components are known as register-transfer level (RTL) components, also known as datapath components, and a circuit composed of such components is known as a datapath.

Datapaths can become quite complex, and therefore it is crucial to build datapaths from a set of datapath components that each encapsulate an appropriately high level of functionality. For example, if you were asked what components make up an automobile, you would probably list components like an engine, tires, a chassis, a body, and so on. Each of those components encapsulates a high-level function of the automobile. You thought of a tire, not of the rubber, steel wires, valve stem, valve, sidewalls, and other parts that make up the tire. Those detailed parts make up the design of a tire, not an automobile. A tire is an appropriately high level of component when thinking of a car; a valve stem is not. Likewise, when we design datapaths, we must have a set of datapath components at the appropriately high level; logic gates are too low-level.

This chapter defines such a set of datapath components, and also introduces simple datapaths. In Chapter 5, we'll see how to create more advanced datapaths, and how to combine datapaths and controllers to build an even higher-level component known as a processor.

4.2 REGISTERS

An N-bit register is a sequential component able to store N bits. Typical register widths (the number of bits N) are 8, 16, and 32 bits, though any width is possible. The bits in a register often represent data, such as 8 bits representing a temperature as a binary number. The common name used for storing data into a register is loading, although the words writing and storing are also used. The opposite action of loading a register is known as reading a register's contents. Reading consists merely of connecting to the register's outputs; note that reading therefore is not synchronized with the clock, and furthermore, note that reading does not remove the bits from the register or change them in any way.

Registers come in a variety of styles. We'll introduce some of the most common styles in this section. Registers are perhaps the most fundamental datapath component, so we will provide numerous examples of their design and their use.

WHY THE NAME "REGISTER"?

Historically, the term "register" referred to a sign or chalkboard onto which people could temporarily write out cash transactions, and later perform bookkeeping using those transactions. The term generally refers to a device for storing data. In this context, since a collection of flip-flops stores data, the name register seems quite appropriate.

Parallel Load Register

The most basic type of register, shown in Figure 3.30 in Chapter 3, consists merely of a set of flip-flops that get loaded on every clock cycle. That basic register is useful as the state register in a controller, since the state register is loaded on every clock cycle. However, for most other uses of registers, we want some way to control whether or not a register gets loaded on a particular clock cycle: on some cycles we want to load, whereas on other cycles we just want to keep the previous value.
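The behavior asked for in the Parallel Load Register discussion (load on some clock cycles, keep the previous value on others, and read without disturbing the contents) can be stated in a few lines of Python. This is only a behavioral sketch under assumptions: the class name `LoadableRegister` and its method names are hypothetical and do not come from the text, and the model says nothing about the gate-level implementation.

```python
class LoadableRegister:
    """Behavioral model of an N-bit register with a load control (hypothetical name)."""

    def __init__(self, width, initial=0):
        self.width = width
        self.mask = (1 << width) - 1      # keeps values within N bits
        self.q = initial & self.mask      # the stored bits

    def rising_edge(self, data, load):
        """Model one rising clock edge."""
        if load:
            self.q = data & self.mask     # load: store the N data input bits
        # load == 0: nothing happens, the previous value is kept

    def read(self):
        """Reading only observes the outputs; it never changes the contents."""
        return self.q


reg = LoadableRegister(width=8)
reg.rising_edge(data=0b00001010, load=1)  # this edge loads the register
reg.rising_edge(data=0b11111111, load=0)  # this edge leaves it unchanged
print(format(reg.read(), "08b"))
```

Calling `read()` any number of times returns the same value, which mirrors the point that reading is not tied to the clock and does not alter the stored bits.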
Figure 4.1 4-bit parallel load register: (a) internal design, (b) paths when load=0 and load=1, and (c) register block symbol.

We can achieve control over loading of a register by adding a 2x1 multiplexor in front of each flip-flop, as shown for the 4-bit register in Figure 4.1(a). When the load signal is 0 and the clock signal rises, each flip-flop gets loaded with its own Q value, as shown in Figure 4.1(b). Because Q is the flip-flop's present contents, the register's contents do not change when load is 0. When the load signal is 1 and the clock signal rises, each flip-flop gets loaded with one of the data inputs I0, I1, I2, or I3; thus, the register gets loaded with the data inputs when load is 1.

A register with a load line that controls whether the register is loaded with external inputs, with all those inputs being loaded in parallel, is known as a parallel load register. Figure 4.1(c) provides a block symbol for a 4-bit parallel load register. A block symbol of a component shows a component's inputs and outputs, without showing the component's internal details.

Because registers are such a fundamental component in datapaths, we present a number of examples involving registers, to ensure the reader gains sufficient comfort with registers.

EXAMPLE 4.1 Basic example using registers

Figure 4.2 shows a simple connection of three registers R0, R1, and R2. Suppose we are told that the input values on a3..a0 have the values shown in the timing diagram in Figure 4.3(a). We can then determine the values in registers R0, R1, and R2, as shown in Figure 4.3(b). Before the first clock edge, we do not know the values in the registers, so we show the registers' contents as "????". The contents are actually some combination of four 0 and 1 values, but we don't know what those particular values are.

Figure 4.2 Basic register example.

Before the first clock edge, we are given that a3..a0 become 1111. Thus, on the first clock edge, R0 will be loaded with 1111. At the same moment, R1 and R2 will be loaded with the value in R0, which is "????", so R1 and R2 will still have contents of "????".

Figure 4.3 Basic register example: (a) timing diagram, and (b) the contents of each register.

Before clock edge 2, we are given that a3..a0 change to 0001. Thus, on the second clock edge, R0 will be loaded with 0001. Simultaneously, R1 will be loaded with the value of R0, which was 1111, and R2 will be loaded with the value of R0 inverted, meaning 0000.

Before the third clock edge, we are given that a3..a0 change to 1010. On the third clock edge, R0 will be loaded with 1010, while simultaneously R1 gets 0001, and R2 gets 1110.

We are given that a3..a0 stay at 1010 before the fourth clock edge. On the fourth edge, R0 again will be loaded with 1010, while simultaneously R1 gets 1010 and R2 gets 0101.

As a3..a0 stay at 1010 before the fifth clock edge, on the fifth edge R0 again will be loaded with 1010, while R1 again gets 1010 and R2 again gets 0101.

The important feature to notice in this example is that the R0, R1, and R2 registers all get loaded simultaneously. Thus, even though R0 gets loaded with a new value on a clock edge, R1 and R2 get the previous value, not the new value, on that same clock edge.

EXAMPLE 4.2 Weight sampler

Consider a scale at a grocery store used to weigh fruit. The scale may have a display that shows the present weight. We want to add a second display, and a button that the user can press to remember the present weight (sometimes called "sampling"), so that when the fruit is removed, the remembered weight continues to be displayed on the second display. A diagram of the system is shown in Figure 4.4.
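Returning briefly to Example 4.1, the simultaneous loading of R0, R1, and R2 can be double-checked with a short Python simulation. This is a sketch under assumptions: it takes the connections to be those described in the walk-through (R1 receives R0's value and R2 receives R0's value inverted), and the names `simulate`, `invert4`, and `UNKNOWN` are hypothetical helpers introduced only for illustration.

```python
UNKNOWN = None  # stands for the "????" contents before a register is first loaded


def invert4(value):
    """Bitwise inverse of a 4-bit value; an unknown value stays unknown."""
    return UNKNOWN if value is UNKNOWN else (~value) & 0b1111


def simulate(inputs):
    """Apply one rising clock edge per entry of `inputs` (the values on a3..a0)."""
    r0 = r1 = r2 = UNKNOWN
    history = []
    for a in inputs:
        # All three registers load at the same instant, so every new value
        # is computed from the contents held *before* the edge.
        r0, r1, r2 = a, r0, invert4(r0)
        history.append((r0, r1, r2))
    return history


def show(value):
    return "????" if value is UNKNOWN else format(value, "04b")


trace = simulate([0b1111, 0b0001, 0b1010, 0b1010, 0b1010])
for edge, regs in enumerate(trace, start=1):
    print("edge", edge, *(show(v) for v in regs))
```

The tuple assignment is the key design choice: the right-hand side is evaluated entirely from the old contents, which is exactly why R1 and R2 see the previous value of R0 rather than the new one.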
Assume the scale outputs the present weight as a 4-bit binary number, and the "Present weight" and "Saved weight" displays automatically convert their input binary number to the proper displayed value. We can design the WeightSampler block using a 4-bit parallel load register. We connect the button signal b to the load input of the register. The output connects to the "Saved weight" display. Whenever b is 1, the weight value gets loaded into the register, and thus appears on the second display. When b returns to 0, the register keeps its value, so the second display continues to show the same weight, even if other items are placed on the scale and the first display changes.

Figure 4.4 Weight sampler implemented using a 4-bit parallel load register.

EXAMPLE 4.3 Temperature history display using registers (again)

Recall Example 3.2 of Chapter 3, in which a timer generated a pulse on an input C every hour. We connected that input C to the clock inputs of three registers, and those registers were connected such that the first register would be loaded with the present temperature, the second register would get the previous temperature, and the third register would get the temperature before that one, on the rising edge of C. However, in practice, we typically do not connect any input other than a clock signal (from an oscillator) to a register's clock input. We can therefore redesign the system to use a clock signal as the register clock input, by using a parallel load register. We could then connect the input C to the load inputs of the registers, as shown in Figure 4.5.

The oscillator frequency can be faster than 1 pulse per hour. In fact, due to the nature of how oscillators are made (see "How Does It Work? Quartz Oscillators" on page 102 in Chapter 3), oscillator frequencies are usually at least in the kilohertz range.

Figure 4.5 Internal design of the TemperatureHistoryStorage component, using parallel load registers.

We must ensure that when the timer generates its hourly pulse on C, the pulse is 1 for only one clock cycle. Otherwise, the registers would get loaded more than once during a single pulse (because during that pulse, multiple rising clock edges would occur, and registers get loaded on each rising clock edge), and so the present temperature would get loaded into two or even all three registers. We can accomplish a single-cycle high output by using the same clock as input to the timer, and then designing the timer's internal state machine to only set C=1 for one state, similar to how we set an output to 1 for exactly three states in Example 3.7 in Chapter 3.

EXAMPLE 4.4 Automobile above-mirror display using parallel-load registers

In Chapter 2, we described an example of a system above a rearview mirror that could display one of four 8-bit inputs, T, A, I, and M. In that example, we assumed the car's central computer was connected to the above-mirror system using 32 lines (4*8). Thirty-two wires is a lot of wires to have to connect from the computer to above the rearview mirror. Instead, assume that the computer connects to the above-mirror system using 8 data lines (C), 2 control lines a1a0 that specify which data item presently appears on C (being T when a1a0=00, A when a1a0=01, I when a1a0=10, and M when a1a0=11), and a load control line load, for a total of 11 lines, rather than 32 lines. The computer can send the data items in any order, at any time. The above-mirror system should simply store data items in the appropriate register (according to a1a0) when the data items arrive, and thus the system needs four parallel-load registers in which to store each data item. The control lines a1a0 will therefore serve as the "address" that tells us which register to load. As in the earlier example, inputs xy determine which value to pass through to the 8-bit display output D (with xy sequenced by the user pressing the mode button).

We can design the system as shown in Figure 4.6. The figure uses a popular "shorthand" notation that replaces a group of wires by a single thicker wire having a slanted line and number indicating the number of wires in the group.

Figure 4.6 Above-mirror display design. a1a0, set by the car's central computer, determines which register to load with C, while load=1 enables such loading. xy, which are independent of a1a0 and are set by the user pressing the mode button, determine which register to output to the display D.

The decoder decodes a1a0 to enable exactly one of the four registers. The load line enables the decoder: if load is 0, no decoder output is 1 and so no register gets loaded. The multiplexor part of the system is the same as in the earlier example.
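The structure of Example 4.4 (a 2x4 decoder with enable steering the load, four 8-bit parallel-load registers, and a 4x1 multiplexor selecting what to display) can be sketched behaviorally in Python. This is an illustrative model under assumptions, not the book's circuit: the names `decoder_2x4` and `AboveMirrorDisplay` are hypothetical, and the data values written below are arbitrary samples.

```python
def decoder_2x4(a1, a0, enable):
    """2x4 decoder with enable: at most one of the four outputs is 1."""
    outputs = [0, 0, 0, 0]
    if enable:
        outputs[(a1 << 1) | a0] = 1
    return outputs


class AboveMirrorDisplay:
    """Four 8-bit parallel-load registers written by address, read through a mux."""

    def __init__(self):
        self.regs = [0, 0, 0, 0]

    def rising_edge(self, c, a1, a0, load):
        # The load line enables the decoder; each decoder output is the
        # load input of one register. Registers with load=0 keep their value.
        loads = decoder_2x4(a1, a0, load)
        self.regs = [c & 0xFF if ld else q for ld, q in zip(loads, self.regs)]

    def display(self, x, y):
        """4x1 mux: xy selects which register drives the display output D."""
        return self.regs[(x << 1) | y]


system = AboveMirrorDisplay()
system.rising_edge(c=0b00001010, a1=0, a0=1, load=1)  # write register 1
system.rising_edge(c=0b11110000, a1=1, a0=0, load=0)  # load=0: nothing is stored
print(system.display(0, 0), system.display(0, 1), system.display(1, 0))
```

Note that `display` never touches `rising_edge` state: the write path (a1a0, load, C) and the read path (xy) are independent, as in the design.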
Let's see how this system works for a sample sequence of inputs. Suppose initially that all registers store 0s and xy=00. Thus, the display will show 0. If the user presses the mode button four times, the inputs xy will sequence through 01, 10, 11, and back to 00, for each press still displaying 0 (since all registers are 0s). Now suppose that during some clock cycle, the car's computer sets a1a0=01, load=1, and C=00001010. Then register 1 will be loaded with 00001010. Since xy=00, the display will still show the contents of register 0, and thus the display will show 0. Now, if the user presses the mode button, xy will become 01, and the display will show the decimal value of register 1's 00001010 value, which is ten in decimal. Pressing mode again will change xy to 10, so the display will show the contents of register 2, which is 0. At any time in the future, the car's computer can load the other registers, or reload register 1, with new values, in any order. Note that the loading of the registers is independent from the display of those registers.

HOW DOES IT WORK? COMPUTERIZED BOARD GAMES.

Many of you have played a computerized board game, like checkers, backgammon, or chess, either using boards with small displays to represent pieces, or perhaps using a graphics program on a personal computer or website. The main method the computer uses for choosing among possible next moves is called lookahead. For the current configuration of pieces on the board, the computer considers all possible single moves that it might make. For each such move, it might also consider all possible single moves by the opponent. For each new configuration resulting from possible moves, the computer evaluates the configuration's goodness, or quality, and picks a move that may lead to the best configuration. The number of moves that the computer looks ahead (one computer move, one opponent move, another computer move, another opponent move) is called the lookahead amount. Good programs might look ahead three, four, five moves, or more. Looking ahead is costly in terms of compute time and memory: if each player has 10 possible moves per turn, then looking ahead two moves results in 10*10 = 100 configurations to evaluate; three moves in 10*10*10 = 1000 configurations; four moves in 10,000 configurations; and so on. Good game-playing programs will "prune" configurations that appear to be very bad and thus unlikely to be chosen by an opponent, just as humans do, to reduce the configurations to be considered. Computers can examine millions of configurations, whereas humans can only mentally examine perhaps a few dozen. Chess, being perhaps the most complex of popular board games, has attracted extensive attention since the early days of computing. Alan Turing, considered one of the fathers of Computer Science, wrote much about using computers for chess, and is credited as having written the first computer chess program in 1950. However, humans proved better than computer chess programs until 1997, when IBM's Deep Blue computer defeated the reigning world champion in a classic chess match. Deep Blue had 30 IBM RS-6000 SP processors connected to 480 special purpose chess chips, and could evaluate 200 million moves per second, and hence many billions of moves in a few minutes. Today, chess tournaments not only match humans against computer programs, but also programs against programs, many hosted by the International Computer Games Association. (Source: Computer Chess History, by Bill Wall.)

EXAMPLE 4.5 Computerized checkerboard

Checkers (known in some countries as "draughts") is one of the world's most popular board games. A checkerboard consists of 64 squares, formed from 8 columns and 8 rows. Each player starts with 12 checkers (pieces) on the board. A computerized checkerboard may replace the checkers by using an LED (light-emitting diode) in each square. An on LED represents a checker in a square; an off LED represents no checker. For simplicity of the example, ignore the issue of each player having his own color of checkers. An example board is shown in Figure 4.7(a).

Figure 4.7 An electronic checkerboard: (a) eight 8-bit registers (R7 through R0) can be used to drive the 64 LEDs, using one register per column, and (b) detail of how one register connects to a column's LEDs and how the value 10100010 stored in that register would light three LEDs.

A computerized checkerboard typically has a microprocessor that keeps track of where each piece is located, moves pieces according to user commands or according to a checker-playing program (when playing against the computer), keeps score, etc.

Notice that the microprocessor must set values for 64 bits, one bit for each square. However, the inexpensive type of microprocessor used in such a device probably does not have 64 pins. The microprocessor needs external registers to store those bits that drive the LEDs, and will write to those registers one at a time. The microprocessor writes to the registers so fast, though, that an observer would probably see all the LEDs change at the same time, not noticing that some LEDs are changing microseconds earlier than others.

Let's use one register per column, meaning we'll need eight 8-bit registers total, as shown below the checkerboard in Figure 4.7(a), with those registers named R7 through R0. Each of a register's 8 data bits corresponds to a particular row in the register's column, indicating whether the respective LED is on or off, as shown in Figure 4.7(b). The eight registers are connected to the microprocessor. The microprocessor uses eight pins (D) for data, three pins (i2, i1, i0) for addressing the appropriate register (which is decoded into a load line for each of the 8 registers), and one pin (e) for the register load line (implemented using the decoder's enable), for a total of 12 pins, a number much more feasible than 64 pins. To configure the checkerboard for the beginning of a game, the microprocessor would create the sequence of register writes shown in Figure 4.8.

Figure 4.8 Timing diagram indicating an input sequence that can be used to initialize.
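The register-write scheme of Example 4.5 can be sketched in Python to make the pin count and the addressing concrete. The sketch is illustrative only: the class name `Checkerboard` and its methods are hypothetical, and the single data value used, 10100010, is the one cited in the Figure 4.7 caption.

```python
DATA_PINS, ADDRESS_PINS, ENABLE_PINS = 8, 3, 1  # D, i2 i1 i0, and e


class Checkerboard:
    """Eight 8-bit column registers written one at a time by address."""

    def __init__(self):
        self.columns = [0] * 8  # index k holds register Rk

    def write(self, i2, i1, i0, e, d):
        """One rising clock edge with address i2i1i0, enable e, and data D."""
        if e:
            # Decoder enabled: exactly one register's load line is 1.
            self.columns[(i2 << 2) | (i1 << 1) | i0] = d & 0xFF
        # e == 0: no load line is 1, so every register keeps its value.

    def lit_leds(self):
        """Each 1 bit in a column register lights one LED."""
        return sum(bin(column).count("1") for column in self.columns)


board = Checkerboard()
board.write(0, 0, 0, 1, 0b10100010)  # load R0
board.write(0, 0, 1, 0, 0b11111111)  # e=0, so R1 is not loaded
print(DATA_PINS + ADDRESS_PINS + ENABLE_PINS)  # pins used by the microprocessor
print(board.lit_leds())
```

Eight such writes, one per column register, would configure the whole board, which is the trade the example makes: 12 pins and eight clock cycles instead of 64 pins and one cycle.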
On the first rising clock edge, R0 gets loaded with 10100010. On the second rising clock edge, R1 gets loaded with 01000101. And so on. After eight clock cycles, the registers would contain the desired values, and the board's LEDs would be lit, as shown in Figure 4.9.

Figure 4.9 Checkerboard after loading registers for initial checker positions.

Shift Register

One thing we might want to do with a register is shift the register's contents to the left or to the right. Shifting to the right means to move each stored bit one flip-flop to the right. If a 4-bit register originally stores 1101, shifting right would result in 0110, as shown in Figure 4.10(a). We dropped the rightmost bit (in this case a 1), and we shifted a 0 into the leftmost bit. To build a register capable of shifting to the right, we conceptually need to connect the register's flip-flops in the manner similar to that shown in Figure 4.10(b).

Figure 4.10 Right shift example: (a) sample contents before and after a right shift and (b) bit-by-bit view of the shift.

We can create a register able to shift to the right as shown in Figure 4.11. The register includes two control inputs, shr and shr_in. shr=1 causes a right shift on a rising clock edge, while shr=0 causes the register to maintain its present value. shr_in is the bit that we want to shift into the leftmost register bit during a shift operation.

Figure 4.11 Shift register: (a) implementation, (b) paths when shr=1, and (c) block symbol.

Rotate Register

A rotate register is a slight variation of a shift register in which the outgoing bit gets shifted back in as the incoming bit. So on a right rotate, the rightmost bit gets shifted into the leftmost bit, as seen in Figure 4.12.

Figure 4.12 Right rotate example: (a) register contents before and after the rotate, and (b) bit-by-bit view of the rotate operation.

Implementing a rotate register is achieved by modifying the design of Figure 4.11, feeding the rightmost flip-flop output, rather than the shr_in input, into the leftmost mux's i1 input. A rotate register needs some way to get values into the register, either via a shift, or via parallel load.

EXAMPLE 4.6 Above-mirror display using shift registers

In Example 4.4, we redesigned the connection between a car's central computer and an above-mirror display system to reduce the number of wires from 32 down to 8+2+1=11. However, even 11 wires is a lot of wires to have to run from the computer to above the mirror. Let's reduce the wires even further by using shift registers in the above-mirror system. The inputs to the above-mirror system from the car's computer will be one data bit C, two address lines a1a0, and a shift line shift, for a total of only 4 wires. When the computer wants to write to one of the above-mirror system's registers, the computer will set a1a0 appropriately and will then set shift to 1 for exactly eight clock cycles. For each of those eight clock cycles, the computer will set C to one bit of the 8-bit data to be loaded, starting with the least-significant bit on the first clock cycle, and ending with the most-significant bit on the eighth clock cycle. We can thus design the above-mirror system as shown in Figure 4.13.

Margin note: This bundle should be few wires, not eleven wires.

Figure 4.13 Above-mirror display design using shift registers to reduce the number of lines coming from the car's computer. The computer sets a1a0 to the desired register to load, and then holds shift=1 for eight clock cycles, with C equaling the register contents bit-by-bit, one bit per clock cycle, resulting in the desired register being loaded with the sent 8-bit value. (Figure annotation: this line is 1 bit, rather than 8 bits like before.)
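The shift, rotate, and serial-load ideas of this section can be checked with a small Python sketch. It is a behavioral model under assumptions: the function names are hypothetical, registers are modeled as integers with the leftmost bit as the most-significant bit, and the serial load assumes the right-shifting register of Figure 4.11 with the data bit C entering at the leftmost position.

```python
WIDTH = 8


def shift_right(q, shr_in, width=WIDTH):
    """One right shift: drop the rightmost bit, bring shr_in into the leftmost bit."""
    return (q >> 1) | (shr_in << (width - 1))


def rotate_right(q, width=WIDTH):
    """Right rotate: the outgoing rightmost bit re-enters as the leftmost bit."""
    return shift_right(q, q & 1, width)


def send_serially(value, width=WIDTH):
    """Yield the bits of `value` one per clock cycle, least-significant bit first."""
    for k in range(width):
        yield (value >> k) & 1


register = 0
for bit in send_serially(0b00001010):      # shift is held at 1 for eight cycles
    register = shift_right(register, bit)  # C enters at the leftmost bit

print(format(register, "08b"))                          # serially loaded value
print(format(shift_right(0b1101, 0, width=4), "04b"))   # 1101 shifted right
print(format(rotate_right(0b1101, width=4), "04b"))     # 1101 rotated right
```

Sending the least-significant bit first pairs naturally with right shifts: the first bit sent is pushed rightward on each later cycle and ends up in the rightmost position after the eighth shift.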
When shift=1, the appropriate register gets a new value shifted in during the next eight clock cycles. This method achieves the same as a parallel load from eight separate inputs, but utilizes fewer wires.

This example demonstrates a form of communication between digital circuits known as serial communication, in which the circuits communicate data by sending the data one bit at a time.

HOW DOES IT WORK? COMPUTER COMMUNICATIONS IN AN AUTOMOBILE USING SERIAL DATA TRANSFER.

Modern automobiles contain dozens of computers distributed throughout the car: some under the hood, some in the dash, some above the mirror, some in the door, some in the trunk, etc. Running wires throughout the car so those computers can communicate is a challenge. Thus, most automobile computers communicate serially, meaning one bit at a time, like the communication in Example 4.6, to reduce the number of wires. A particularly popular serial communication scheme in automobiles is known as the "CAN bus," short for Controller Area Network, which is now an international standard defined by ISO (International Organization for Standardization) standard number 11898.

HOW DOES IT WORK? WIRELESS AND USB COMMUNICATION BETWEEN DIGITAL DEVICES.

Serial communication between digital devices, such as between personal computers, laptops, printers, cameras, etc., is ubiquitous. The popular USB interface is a serial communication scheme (USB is short for Universal Serial Bus) used to connect personal computers and other devices together by wire. Furthermore, nearly all wireless communication schemes, such as WiFi and Bluetooth, use serial communication, sending one bit at a time over a radio frequency. While data communication between devices may be serial, computations inside devices are typically done in parallel. Thus, shift registers are commonly used inside circuits to convert internal parallel data into serial data to be sent to another device, and to receive serial data and convert that data into parallel data for internal device use.

Multifunction Registers

Many registers can perform a variety of operations (also called functions), like load, shift right, shift left, rotate right, rotate left, etc. The register user selects the presently desired operation by setting the register's control inputs. We'll now introduce some multifunction registers.

Register with Parallel Load and Shift Right

A popular combination of operations on a register is that of both parallel load and shift. We can design a 4-bit register capable of parallel load and shift right, the details of which are shown in Figure 4.14(a). Figure 4.14(b) shows a block symbol of the register.

Figure 4.14 4-bit register with parallel load and shift right operations: (a) internal design, and (b) block symbol.

Notice that we used a 4x1 mux, rather than a 2x1 mux, in front of each flip-flop, because each flip-flop can now receive its next bit from one of three locations (the fourth mux input is unused). The register has two control inputs, with the control behavior shown in Figure 4.15.

s1  s0  Operation
0   0   Maintain present value
0   1   Parallel load
1   0   Shift right
1   1   (unused - let's load 0s)

Figure 4.15 Operation table of a 4-bit register with parallel load and shift right operations.

Let's examine the mux and flip-flop of the rightmost bit. When s1s0=00, the mux passes the present flip-flop value back to the flip-flop, causing the flip-flop to get reloaded with its present value on the next rising clock, thus maintaining the present value. When s1s0=01, the mux passes the external I0 input to the flip-flop, causing the flip-flop to get loaded. When s1s0=10, the mux passes the present value of the flip-flop output from the left, Q1, thus causing a right shift. s1s0=11 is not a legal input to the register and thus should never occur; the mux passes 0s in this case.

UNUSED INPUTS.

The example in Figure 4.14 included a mux with 4 inputs of which we only used 3 inputs. Notice that we actually set the unused input to a particular value, rather than simply leaving the input unconnected. Remember that the input is controlling transistors inside the component; if we don't assign a value to the input, will the internal transistors conduct or not conduct? We don't really know, and so we could get undesired behavior from the mux. Leaving inputs unconnected should not be done. On the other hand, leaving outputs unconnected is no problem; an unconnected output may have a 1 or 0 that simply doesn't control anything else.

Register with Parallel Load, Shift Left, and Shift Right

Adding a shift left operation to the above 4-bit register is straightforward, and is shown in Figure 4.16. Instead of connecting 0s to the I3 input of each 4x1 mux, we instead connect the output from the flip-flop to the right. The rightmost mux's I3 input would be connected to an additional input shl_in.

Figure 4.16 4-bit register with parallel load, shift left, and shift right operations: (a) internal design, (b) block symbol.
162 4 Datapath Components
4.2 Registers 163
The register has the operations shown in Figure 4.17.

s1  s0  Operation
0   0   Maintain present value
0   1   Parallel load
1   0   Shift right
1   1   Shift left

Figure 4.17 Operation table of a 4-bit register with parallel load, shift left, and shift right operations.

Load/Shift Register with Separate Control Inputs for Each Operation

Registers typically don't come with control inputs that encode the operation into the minimum number of bits, like the control inputs on the registers we designed above. Instead, each operation usually has its own control input. So a register with the operations of load, shift left, and shift right might have the inputs and operation table shown in Figure 4.18. The four possible operations (maintain, shift left, shift right, and load) really only require two control inputs, but the figure shows that the register has three control inputs: ld, shr, and shl.

ld  shr  shl  Operation
0   0    0    Maintain present value
0   0    1    Shift left
0   1    0    Shift right
0   1    1    Shift right - shr has priority over shl
1   0    0    Parallel load
1   0    1    Parallel load - ld has priority
1   1    0    Parallel load - ld has priority
1   1    1    Parallel load - ld has priority

Figure 4.18 Operation table of a 4-bit register with separate control inputs for parallel load, shift left, and shift right.

Notice that if the user sets more than one control input to 1, we must decide what operation to perform. If the user sets both shr and shl, we'll give priority to shr. If the user asserts ld and either or both of shr and shl, we'll give priority to ld.

The internal design of such a register is similar to the load/shift register designed above, except that the three control inputs ld, shl, and shr need to be mapped to the two control inputs s1 and s0 of the earlier register, using a simple combinational circuit, as shown in Figure 4.19.

Figure 4.19 A small combinational circuit maps the control inputs ld, shr, and shl to the mux select inputs s1 and s0.

We can design that combinational circuit starting from a simple truth table, shown in Figure 4.20(a).

(a)
Inputs          Outputs
ld  shr  shl    s1  s0   Operation
0   0    0      0   0    Maintain value
0   0    1      1   1    Shift left
0   1    0      1   0    Shift right
0   1    1      1   0    Shift right
1   0    0      0   1    Parallel load
1   0    1      0   1    Parallel load
1   1    0      0   1    Parallel load
1   1    1      0   1    Parallel load

(b)
ld  shr  shl  Operation
0   0    0    Maintain value
0   0    1    Shift left
0   1    X    Shift right
1   X    X    Parallel load

Figure 4.20 Truth tables describing operations of a register with left/right shift and parallel load, along with the mapping of the register control inputs to the internal 4x1 mux select lines: (a) complete operation table defining the mapping of ld, shr, and shl to s1 and s0, and (b) a compact version of the operation table.

We thus obtain the following equations for the register's combinational circuit:

s1 = ld'*shr'*shl + ld'*shr*shl' + ld'*shr*shl
s0 = ld'*shr'*shl + ld

Replacing the combinational circuit box in Figure 4.19 by the gates described by the above equations would complete the register's design.

Register datasheets typically show the register operation table in a compact form, taking advantage of the priorities among the control inputs, as shown in Figure 4.20(b). A single X in a row means that row is actually two rows in the complete table, with one row having 0 in the position of the X, the other row having 1. Two Xs in a row means that row is actually four rows in the complete table, one row having 00 in the positions of those Xs, another row having 01, another 10, and another 11. And so on for three Xs, representing 8 rows. Note that putting higher priority control inputs to the left in the table keeps the table's operations nicely organized.

Register Design Process

Table 4.1 describes a general process for designing a register with any number of functions.

TABLE 4.1 Four-step process for designing a multifunction register.

Step                            Description
1. Determine mux size           Count the number of operations (don't forget the maintain present value operation!) and add in front of each flip-flop a mux with at least that number of inputs.
2. Create mux operation table   Create an operation table defining the desired operation for each possible value of the mux select lines.
3. Connect mux inputs           For each operation, connect the corresponding mux data input to the appropriate external input or flip-flop output (possibly passing through some logic) to achieve the desired operation.
4. Map control lines            Create a truth table that maps external control lines to the internal mux select lines, with appropriate priorities, and then design the logic to achieve that mapping.

We'll illustrate the register design process with another example.
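Before moving on, the control-line mapping of step 4 can be checked mechanically for the ld/shr/shl register of Figure 4.20. The sketch below evaluates the s1 and s0 equations for all eight input combinations and compares them with the stated priority rule (ld over shr over shl). The function names map_controls and expected_select are hypothetical, chosen only for this illustration.

```python
from itertools import product


def map_controls(ld, shr, shl):
    """Evaluate the s1 and s0 equations from the text; returns (s1, s0)."""
    def n(x):
        return 1 - x                # complement of a single bit
    s1 = (n(ld) & n(shr) & shl) | (n(ld) & shr & n(shl)) | (n(ld) & shr & shl)
    s0 = (n(ld) & n(shr) & shl) | ld
    return s1, s0


def expected_select(ld, shr, shl):
    """Select-line values implied by the priority rule: ld, then shr, then shl."""
    if ld:
        return (0, 1)               # parallel load
    if shr:
        return (1, 0)               # shift right
    if shl:
        return (1, 1)               # shift left
    return (0, 0)                   # maintain present value


if __name__ == "__main__":
    for ld, shr, shl in product((0, 1), repeat=3):
        s1, s0 = map_controls(ld, shr, shl)
        assert (s1, s0) == expected_select(ld, shr, shl)
        print(ld, shr, shl, "->", s1, s0)
```

The same kind of exhaustive check scales to any register designed with the four-step process, as long as the number of control inputs stays small.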

EXAMPLE 4.7 Register with load, shift, and synchronous clear and set

We want to design a register with the following operations: load, shift left, synchronous clear, and synchronous set, with unique control inputs for each operation (ld, shl, clr, set). The synchronous clear operation on a register means to load all 0s into the register on the next rising clock edge. The synchronous set operation means to load all 1s into the register on the next rising clock edge. The term synchronous is included because some registers come with asynchronous clear or asynchronous set operations. Following the register design method of Table 4.1, we perform the following steps:

Step 1: Determine mux size. There are 5 operations: load, shift left, synchronous clear, synchronous set, and maintain present value. Don't forget the maintain present value operation, as that operation is implicit.

Step 2: Create mux operation table. We'll use the first 5 inputs of an 8x1 mux for the desired 5 operations. For the remaining 3 mux inputs, we'll choose to maintain the present value, though those mux inputs should never be utilized. The table is shown in Figure 4.21.

s2  s1  s0  Operation
0   0   0   Maintain present value
0   0   1   Parallel load
0   1   0   Shift left
0   1   1   Synchronous clear
1   0   0   Synchronous set
1   0   1   Maintain present value
1   1   0   Maintain present value
1   1   1   Maintain present value

Figure 4.21 Operation table for a register with load, shift, and synchronous clear and set.

Step 3: Connect mux inputs. We connect the mux inputs as shown in Figure 4.22, which for simplicity shows only the nth flip-flop and mux of the register.

Figure 4.22 Nth bit-slice of a register with the following operations: maintain present value, parallel load, shift left, synchronous clear, and synchronous set.

Step 4: Map control lines. We'll give clr highest priority, followed by set, ld, and shl, so the register control inputs would be mapped to the 8x1 mux select lines as shown in Figure 4.23.

Inputs               Outputs
clr  set  ld  shl    s2  s1  s0   Operation
0    0    0   0      0   0   0    Maintain present value
0    0    0   1      0   1   0    Shift left
0    0    1   X      0   0   1    Parallel load
0    1    X   X      1   0   0    Set to all 1s
1    X    X   X      0   1   1    Clear to all 0s

Figure 4.23 Truth table for the control lines of a register with the Nth bit-slice shown in Figure 4.22.

Looking at each output in Figure 4.23, we derive the equations describing the circuit that maps the external control inputs to the mux select lines as follows:

s2 = clr'*set
s1 = clr'*set'*ld'*shl + clr
s0 = clr'*set'*ld + clr

We could then create a combinational circuit implementing those equations, to map the external register control inputs to the mux select lines, thus completing the register's design.

Some registers come with asynchronous clear and/or asynchronous set control inputs. Those inputs could be implemented by connecting them to asynchronous clear or asynchronous set inputs that exist on the flip-flops themselves.

4.3 ADDERS

Adding two binary numbers is perhaps the most common operation performed on data in a digital system. An N-bit adder is a datapath component that adds two N-bit binary numbers A and B, and generates an N-bit sum S and a 1-bit carry C. For instance, a 4-bit adder adds two 4-bit numbers, like 0111 and 0001, resulting in a 4-bit sum, like 1000, with a carry of 0. 1111 + 0001 would result in a carry of 1 and a sum of 0000 (or 10000 if you treat the carry bit and sum bits as one 5-bit result). N is often referred to as the width of the adder. Designing fast yet size-efficient adders is a subject that has received considerable attention for many decades.

Although it appears that we could design an N-bit adder by following the combinational logic design process of Table 2.5, it turns out that building an N-bit adder following that process is not very practical when N is much larger than 4. A 4-bit adder has two 4-bit inputs, meaning eight inputs total, and has four sum outputs and a carry output. So we could design the adder using the standard combinational logic design process of Table 2.5. For example, a 2-bit adder, which adds two 2-bit numbers, could be designed by starting with the truth table depicted in Figure 4.24. We could then implement the logic using a two-level logic gate based implementation for each output.

Inputs            Outputs
a1  a0  b1  b0    c  s1  s0
0   0   0   0     0  0   0
0   0   0   1     0  0   1
0   0   1   0     0  1   0
0   0   1   1     0  1   1
0   1   0   0     0  0   1
0   1   0   1     0  1   0
0   1   1   0     0  1   1
0   1   1   1     1  0   0
1   0   0   0     0  1   0
1   0   0   1     0  1   1
1   0   1   0     1  0   0
1   0   1   1     1  0   1
1   1   0   0     0  1   1
1   1   0   1     1  0   0
1   1   1   0     1  0   1
1   1   1   1     1  1   0

Figure 4.24 Truth table for a 2-bit adder.
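A truth table like the one for the 2-bit adder can be generated rather than written by hand. The short Python sketch below enumerates every input combination of an N-bit adder and computes the carry and sum bits; the function name adder_truth_table is an assumption made for illustration. It also makes the row count visible: the table has 2^(2N) rows, so it grows very quickly with N.

```python
from itertools import product


def adder_truth_table(n):
    """Return a list of (inputs, outputs) rows for an n-bit adder.

    inputs  -- tuple (a[n-1], ..., a0, b[n-1], ..., b0)
    outputs -- tuple (c, s[n-1], ..., s0)
    """
    rows = []
    for bits in product((0, 1), repeat=2 * n):
        a = int("".join(str(bit) for bit in bits[:n]), 2)
        b = int("".join(str(bit) for bit in bits[n:]), 2)
        total = a + b
        # Bit n of the total is the carry; bits n-1..0 are the sum.
        outputs = tuple((total >> i) & 1 for i in range(n, -1, -1))
        rows.append((bits, outputs))
    return rows


if __name__ == "__main__":
    for inputs, outputs in adder_truth_table(2):
        print(*inputs, "|", *outputs)
    for n in (2, 4, 8):
        print(n, "bit adder:", 2 ** (2 * n), "rows")
```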



The problem with such an approach is that for wider adders, the approach results in truth tables that are too large and in too many gates. A 16-bit adder has 16 + 16 = 32 inputs, meaning the truth table would have over four billion rows. A two-level logic gate based implementation of that table would likely require millions of gates. To illustrate this point, we performed an experiment in which we used the standard combinational logic design process to create adders of increasing width, starting with 1-bit adders on up. We used the most advanced commercial logic design tool available, and asked the tool to create a design using two levels of logic (one level of AND gates feeding into an OR gate for each output) and using the minimum number of gates (actually, transistors).

The plot in Figure 4.25 summarizes results. Notice how fast the number of transistors grows as the adder width is increased. This fast growth is an effect of exponential growth: for an adder width of N, the number of truth table rows is proportional to 2^N (more precisely, to 2^(N+N), since the adder has 2N inputs). Clearly, this exponential growth prohibits us from using the standard design process for adders wider than perhaps 8 to 10 bits. We could not complete our experiments for adders larger than 8 bits; the tool simply could not complete the design in a reasonable amount of time. The tool needed 3 seconds to build the 6-bit adder, 40 seconds to build the 7-bit adder, and 30 minutes for the 8-bit adder. The 9-bit adder didn't finish after one full day.

Figure 4.25 (plot of transistors, up to about 8000, versus adder width N, up to 8) Why large adders aren't built using standard two-level combinational logic: notice the exponential growth. How many transistors would a 32-bit adder require?

Looking at this data, can you predict the number of transistors required by a 16-bit adder or a 32-bit adder using two levels of gates? From the figure, it looks like the number of transistors is doubling for each increase in N, with about 1000 transistors for N=5, 2000 transistors for N=6, 4000 transistors for N=7, and 8000 transistors for N=8. Assuming that trend continues for larger adders, a 16-bit adder would have 8 more doublings beyond the 8-bit adder, meaning multiplying the size of the 8-bit adder by 2^8 = 256. So a 16-bit adder would require 8000 * 256 = about two million transistors. A 32-bit adder would require an additional 16 doublings, meaning multiplying by 2^16 = 64K, so 2 million * 64K = over 100 billion transistors. That's an outrageous number of transistors. We clearly need another approach for designing larger adders.

Adder-Carry-Ripple Style

An alternative approach to the standard combinational logic design process for adding two binary numbers is to instead create a circuit that mimics how we add binary numbers by hand, which is one column at a time. Consider the addition of a binary number A=1111 (15 in base 10) and B=0110 (6 in base 10), column by column, shown in Figure 4.26.

Figure 4.26 Adding two binary numbers by hand, column by column.

For each column, we add three bits together, and we generate a sum bit for the present column and a carry bit for the next column. The first column is an exception in that we only add two bits together, but still generate a sum and a carry bit. The carry of the last column becomes the fifth bit of the sum. The sum is 10101 (21 in base 10).

We can create a combinational component to perform the required addition for a single column. The inputs and outputs of such components are shown in Figure 4.27. Thus, all we need to do is design those components that perform the addition in each column, and connect them together as shown in Figure 4.27 to create a 4-bit adder. Bear in mind, though, that this method of creating an adder is intended to enable efficient design of wider adders, like those with 8 bits and above. We are illustrating the method using only a 4-bit adder because that size adder keeps our figures small and readable, but if all we really needed was a 4-bit adder, the standard combinational logic design process for two-level logic would probably work just fine.

Figure 4.27 Using combinational components to add two binary numbers column by column.

We'll now design the components in each column of Figure 4.27.

Half-Adder

A half-adder is a combinational component that adds two bits (a and b), and generates a sum (s) and carry out (co) bit. (Note that we did not say that a half-adder adds two 2-bit numbers; a half-adder merely adds two bits.) The component on the right in Figure 4.27 that adds the rightmost column's two bits (a and b) and generates the sum (s) and carry-out (co) bit is a half-adder.

Inputs    Outputs
a   b     co  s
0   0     0   0
0   1     0   1
1   0     0   1
1   1     1   0

Figure 4.28 Truth table for a half-adder.

We can design a half-adder using the straightforward combinational logic design process from Chapter 2, as follows:
Step 1: Capture the function. We'll use a truth table to capture the function. The appropriate truth table is shown in Figure 4.28.

Step 2: Convert to equations. We can clearly see that co = ab and that s = a'b + ab'. Note that the equation s = a'b + ab' is the same as s = a xor b.

Step 3: Create the circuit. The circuit for a half-adder, implementing the above equations, is shown in Figure 4.29(a). Figure 4.29(b) shows a block symbol for a half-adder (HA).

Figure 4.29 Half-adder: (a) circuit, and (b) block symbol.

Full-Adder

A full-adder is a combinational component that adds three bits (a, b, and ci) and generates a sum (s) and a carry-out (co) bit. (Note that we did not say that a full-adder adds two 3-bit numbers; it merely adds three bits.) The three components in Figure 4.27 that add the two bits of a column (a and b) along with the carry from the column on the right (ci) and generate the sum (s) and carry-out (co) bits are full-adders. We can design a full-adder using the straightforward combinational logic design process, as follows:

Step 1: Capture the function. We'll use a truth table to capture the function, shown in Figure 4.30.

Inputs        Outputs
a   b   ci    co  s
0   0   0     0   0
0   0   1     0   1
0   1   0     0   1
0   1   1     1   0
1   0   0     0   1
1   0   1     1   0
1   1   0     1   0
1   1   1     1   1

Figure 4.30 Truth table for a full-adder.

Step 2: Convert to equations. We obtain the following equations for co and s. For simplicity, let's write ci as c. We'll use algebraic methods to simplify the equations.

co = a'bc + ab'c + abc' + abc
co = a'bc + abc + ab'c + abc + abc' + abc
co = (a'+a)bc + (b'+b)ac + (c'+c)ab
co = bc + ac + ab

s = a'b'c + a'bc' + ab'c' + abc
s = a'(b'c + bc') + a(b'c' + bc)
s = a'(b xor c) + a(b xor c)'
s = a xor b xor c

During algebraic simplification, for co, we noted that each of the first three terms could be combined with the last term abc, as each of the first three terms differed from the last term in just one literal. We thus created three instances of the last term abc (which doesn't change the function) and combined them with each of the first three terms. Don't worry if you aren't able to come up with that simplification on your own right now; Section 6.2 introduces methods to make such simplification more straightforward. If you have read that section, you might try using a K-map (introduced in that section) to simplify the equations.

Step 3: Create the circuit. The circuit for a full-adder is shown in Figure 4.31(a), and the full-adder's block symbol is shown in Figure 4.31(b).

Figure 4.31 Full-adder: (a) circuit, and (b) block symbol.

4-Bit Carry-Ripple Adder

Using three full-adders and one half-adder, we can design a 4-bit carry-ripple adder, which adds two 4-bit numbers and generates a 4-bit sum, shown in Figure 4.32. The 4-bit carry-ripple adder also generates a carry-out bit.

Figure 4.32 4-bit adder: (a) carry-ripple implementation with 3 full-adders and 1 half-adder, and (b) block symbol.

We can include a carry-in bit with the 4-bit adder, which enables us to connect 4-bit adders together to build larger adders. We include the carry-in bit by replacing the half-adder (which was in the rightmost bit position) by a full-adder, as shown in Figure 4.33.

Figure 4.33 4-bit adder: (a) carry-ripple implementation with 4 full-adders, with a carry-in input, and (b) block symbol.
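The half-adder and full-adder equations, and the way full-adders chain into a carry-ripple adder with a carry-in, can be sketched functionally in Python. This models only the logic, not gate delays, and the function names (half_adder, full_adder, carry_ripple_add) are assumptions made for illustration. The full-adder uses the simplified forms co = bc + ac + ab and s = a xor b xor c.

```python
def half_adder(a, b):
    """Add two bits; return (co, s) with co = ab and s = a xor b."""
    return a & b, a ^ b


def full_adder(a, b, ci):
    """Add three bits; return (co, s)."""
    co = (b & ci) | (a & ci) | (a & b)
    s = a ^ b ^ ci
    return co, s


def carry_ripple_add(a, b, width=4, ci=0):
    """Add two width-bit integers one column at a time; return (co, s)."""
    carry, s = ci, 0
    for i in range(width):              # column 0 is the rightmost bit
        carry, s_bit = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        s |= s_bit << i
    return carry, s


if __name__ == "__main__":
    # Exhaustive check of the 4-bit adder against ordinary integer addition.
    for a in range(16):
        for b in range(16):
            co, s = carry_ripple_add(a, b)
            assert (co << 4) | s == a + b
    # 1111 + 0110: carry-out 1 and sum 0101, i.e., 10101 (21 in base 10).
    print(carry_ripple_add(0b1111, 0b0110))
```

Setting ci to 1 on the rightmost column is what lets two such adders be cascaded into a wider one, with the carry-out of the lower adder feeding the carry-in of the upper adder.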
Let's analyze the behavior of this adder. Suppose that all inputs have been 0s for a long time, meaning that S will be 0000, co will be 0, and all ci values of the full-adders will also be 0. Now suppose that A becomes 0111 and B becomes 0001 at the same time (whose sum we know should be 01000). Those new values of A and B will propagate through the full-adders. Suppose the delay of a full-adder is 2 ns. So 2 ns after A and B change, the sum outputs of the full-adders will change, as shown in Figure 4.34(a). So s3 will become 0+0+0=0 (with co3=0), s2 will become 1+0+0=1 (with co2=0), s1 will become 1+0+0=1 (with co1=0), and s0 will become 1+1=0 (with co0=1). But 0111 + 0001 should not be 00110; instead, the sum should be 01000. What went wrong?

Nothing went wrong. The carry-ripple adder simply isn't done yet after just 2 ns. After 2 ns, co0 changed from 0 to 1. Now, we must allow time for that new value of co0 to proceed through the next full-adder. Thus, after another 2 ns, s1 will equal 1+0+1=0, and co1 will become 1. So after 4 ns (two full-adder delays), the output will be 00100, as shown in Figure 4.34(b).

Keep waiting. After a third full-adder delay, the new value of co1 will have propagated through the next full-adder, resulting in s2 becoming 1+0+1=0, with co2 becoming 1. So after three full-adder delays, the output will be 00000, as shown in Figure 4.34(c).

Just a little more patience. After a fourth full-adder delay, co2 has had time to propagate through the last full-adder, resulting in s3 becoming 0+0+1=1, with co3 staying 0. Thus, after four full-adder delays, the output will be 01000, as shown in Figure 4.34(d), and 01000 is the correct result.

Figure 4.34 Example of adding 0111+0001 using a 4-bit carry-ripple adder: (a) output after 2 ns (1 FA delay), (b) output after 4 ns (2 FA delays), (c) output after 6 ns (3 FA delays), (d) output after 8 ns (4 FA delays). The output will exhibit temporarily incorrect (spurious) results until the carry bit from the rightmost bit has had a chance to propagate (ripple) all the way through to the leftmost bit.

To recap, until the carry bits have had time to ripple through all the adders, from right to left, the output was not correct. The intermediate output values are known as spurious values. The delay of the 4-bit adder, meaning the time we must wait until the output is the stable correct value, is equal to the delay of four full-adders, or 8 ns in this case, which is the time for the carry bits to ripple through all the adders; hence the term carry-ripple adder.

Note: The term "ripple-carry" adder is actually more common. I prefer the term "carry-ripple" for consistent naming with other adder types, like carry-select and carry-lookahead, which we describe in Chapter 6.

Students often initially confuse full-adders and N-bit adders. A full-adder adds 3 bits. In contrast, a 3-bit adder adds two 3-bit numbers. A full-adder produces one sum bit and one carry bit. In contrast, a 3-bit adder produces three sum bits and one carry bit. A full-adder is usually used to add only one column of two binary numbers, whereas an N-bit adder is used to add two N-bit numbers.

An N-bit adder often comes with a carry-in bit, so that the adder can be cascaded with other N-bit adders to form larger adders. Figure 4.35(a) shows an 8-bit adder built from two 4-bit adders. We would set the carry-in bit (ci) on the right to 0 when adding two 8-bit numbers. Figure 4.35(b) shows a block symbol of that 8-bit adder.

Figure 4.35 8-bit adder: (a) carry-ripple implementation built from two 4-bit carry-ripple adders, and (b) block symbol.

EXAMPLE 4.8 DIP-switch-based adding calculator

Let's design a very simple calculator that can add two 8-bit binary numbers and produce an 8-bit result. The input binary numbers will come from two 8-switch DIP switches, and the output will be displayed using 8 LEDs, as illustrated in Figure 4.36. An 8-bit DIP (Dual Inline Package) switch is a simple digital component having switches that a user can move up or down by hand, with up outputting a 1 on the corresponding pin, and down outputting a 0. An LED (light-emitting diode) is just a small light that illuminates when the LED's input is 1, and is dark when the input is 0.

We can implement this calculator by utilizing an 8-bit carry-ripple adder for the CALC block, as shown in Figure 4.36. When a user moves the switches on a DIP switch, the new binary values propagate through the carry-ripple adder's gates, generating intermittent outputs and hence causing
rapid blinking of some of the LEDs, until the values have finally propagated through the entire circuit, at which point the output stabilizes and the LEDs display the correct new sum.

Figure 4.36 8-bit DIP-switch-based adding calculator. The addition 2+3=5 is shown.

If we want to avoid the blinking of the LEDs due to the intermittent values, we can introduce a button e (for "equals") to the system, which indicates when the new value should be displayed. We press e only after having configured both DIP switches to represent the new inputs to be summed. We can utilize the e input with a register, as in Figure 4.37. We connect the e input to the load input of a parallel load register. When a user moves switches on the DIP switches, new intermittent values appear at the adder outputs, but are blocked at the register's inputs, as the register holds its previous value and hence the LEDs display that previous value. When the e button is pressed, then on the next clock edge the register will be loaded, and the LEDs will then display the new value.

Notice that the displayed value will be correct only if the sum is 255 or less. We could connect co to a ninth LED to display sums between 256 and 511.

Figure 4.37 8-bit DIP switch-based adding calculator, using a register to block spurious LED outputs. The LEDs only get updated after the button is pressed, which loads the output register.

Delay and Size of a 32-Bit Carry-Ripple Adder

Assuming full-adders are implemented using two levels of gates (ANDs followed by an OR), and that every gate has a delay of 1 ns, let's compute the total delay of a 32-bit carry-ripple adder. Let's also compute the size of such an adder.

To determine the delay, note first that the carry must ripple from the first full-adder to the 32nd full-adder. The delay of the first full-adder is 2 gates * 1 ns/gate = 2 ns. The new carry must now ripple through the second full-adder, resulting in another 2 ns. And so on. Thus, the total delay of the 32-bit carry-ripple adder is 2 ns/full-adder * 32 full-adders = 64 ns.

To determine the size, note that a full-adder requires approximately five gates (we say approximately because the 3-input OR gate in a full-adder requires more transistors than each 2-input AND gate, and the 3-input XOR gate requires even more transistors). Since the 32-bit adder has 32 full-adders, the total size of the 32-bit carry-ripple adder is 5 gates/full-adder * 32 full-adders = 160 gates.

The 32-bit carry-ripple adder has a long delay, but a reasonable number of gates. In Section 6.4, we'll see how to build faster adders, at the expense of using more gates, but still using a reasonable number of gates.

EXAMPLE 4.9 Compensating weight scale using an adder

A scale, such as a bathroom scale, uses a sensor to determine the weight of an object (e.g., a person) on the scale. The sensor's readings for the same object may change over time, due to wear and tear on the sensing system (such as a spring losing elasticity), resulting perhaps in reporting a weight that is a few pounds too low. Thus, the scale may have a knob that the user can turn to compensate for the low reported weight. The knob indicates the amount to add to a given weight before displaying the weight. Suppose that a knob can be set to change an input compensation amount by a value of 0, 1, 2, ..., 7, as shown in Figure 4.38.

We can implement the system using an 8-bit carry-ripple adder, as shown in the figure. On every rising clock edge, the display register will be loaded with the sum of the currently sensed weight plus the compensation amount.

Figure 4.38 Compensating scale: the dial outputs a number from 0 to 7 (000 to 111), which gets added to the sensed weight and then displayed.

4.4 SHIFTERS

Shifting is a common operation applied to data. Shifting can be used to manipulate bits, like when we want to reverse the bits of a number. Shifting is useful for communicating data serially, as was done in Example 4.6.
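As one illustration of manipulating bits by shifting, the sketch below reverses the bits of a number by repeatedly shifting the source right by one position and the result left by one position, moving one bit across per step. It is a software analogy, not a hardware design from this chapter, and the function name reverse_bits and the default width of 8 are assumptions.

```python
def reverse_bits(value, width=8):
    """Return `value` with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        # Shift the result left and bring in the source's rightmost bit.
        result = (result << 1) | (value & 1)
        # Shift the source right so its next bit becomes the rightmost bit.
        value >>= 1
    return result


if __name__ == "__main__":
    print(format(reverse_bits(0b00011110), "08b"))
```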
.. d' 'd' " by a factor of 2. In base I 0, you are
Shift ing is also useful for multlpl ymg or IVI In" d b s· Ilply appendin a 0 to a
o EXAMPLE 4.10 Approximate Celsius to Fahrenheit convener using a shifter
. I . I ' b 10 can be one y II 0
fami li ar wit h the Idea that mu li p yJJ1g Y . O· I '111e as shiftin o left one We arc given a digital thermometer that digiti zes a tempermure in
-' O· -0 ApP ' ndJJ1 o a IS tIe s, I 0
num ber. For exa mple. ) times I IS). " 0 d b pend in o a 0 meaning by Celsius inlO on 8-bit binary num ber C. So 30 degrees Celsius would
.. . . . b ? I ' I . 1" by ? can be one y ap 0'
pOSI tiOn. LIkeWIse. In ase -. mu tip yll " - h . base 10 multiplying be digilized as 0001111 D. We wan! to Conve n Ihal lem perat ure '0
.. 01 . . ?' . 1010 Furt ermore, 111 .
Shiftin g left one pOSlll on. So 01 times - IS . . . I f ' So '11 base 2 multiplying Fahrenheit . aga in using 8 bits. The equmion for converting is:
. O ' 11Iftlno e t tWIce. I , ,
by 100 can be done by ap pcndJJ1g two s. 01 S Of I . . base 2 1's 1.I·ke multi- F = C*9/5 + 32
by 4 can be do ne by shI. ftIng. Ie ft tWIce.
. SI11' f(II10
"
Ie t t"ee lII11es In . I, ' b ? h' f(
Plyino by 8. And so on. And since shifting left is the same as multl p yJJ1g y _, S I 109 Let's assume that we are nOI concerned abou t accuracy. so we' ll
o ... Od" ddb ? ls 0101. replace th e equation by a simpler oll e;
ri ght is the same as dl vld JJ1g by 2. So 101 IVI e . y - . fi nd the need to
Althou" h slll. ftlng
. can be done uSIng . a Sh·ft I reo" lster' so metimes
.. d Iwe h'ft b d'f
F = C*2 + 32
use a separale " combinati. .onal component that ~e r fa. n11S the ShIft , an t lat can S l Y 1-
We can design the converter stra ightforwardl y using a left shifter
fere nt nu mbers of positi ons and in di fferent directions. (wilh a shin in value of 0) 10 compule C*2. and Ihen an adder to add Figure 4.40 Celsius to
32 (00 100000). as in Figure 4.40. Fahrenheit conve ner.
Simple Shifters
An N-bit shifter is a combinat ional component that can shift an N- bit input by some .. FAHRENHEIT VERSUS CELSIUS_
amount to generate an N- bit output. ... .. 5 we wa nt a s hifter that The U.S. represents temperature using Fahrenheit.
costing seveml hundred million dollars. was destroyed
The simplest shi fter shi fts one pOSItIOn 111 one directIon. .ay . . 0 • whereas most of the world uses the metri c system's
when enteri ng the Mars atmosphere too quickly_ The
shi fts left by I positi on. That simpl e shifter's deSIgn IS strai ghtforward , COnSIStln o of Just Celsius. Presidents and other U.S. leaders have desired
reason: "a navigati on error resu hed from some
wires as show n fo r the 4-bit left shifter Figure 4.39(a) . Note that the shIfter has an addI- lO switch to the melfic system for almost as long as the
spacecraft commands being ent in English units
tional' input thaLis the va lue La shifLinto the ri ghLmost bit. U.S. has existed, and several aC lS have been passed over
instead of being converted to metric units." (Source:
the centuries, the most recent being the Metric Conversion Act of 1975 (amended several times since). The Act designates the metric system as the preferred system of weights and measures for U.S. trade and commerce. Yet switching to metric has been slow, and few Americans today are comfortable with metric. The problem with such a slow transition was poignantly demonstrated in 1999 when the Mars Climate Orbiter,

www.nasa.gov). Perhaps if all readers of this book in the U.S. use Celsius when they talk, we'll help speed up the transition? So instead of saying "It's a warm ninety degrees outside today," say "It's a warm thirty-two degrees outside today." Actually, we might say "It's a warm three ten and two degrees outside today" (remember correct counting in Chapter 1?).

Figure 4.39 Combinational shifters: (a) left shifter with block symbol shown at bottom, (b) left shift or pass component, (c) left/right shift or pass component.

A more advanced shifter can either shift one position when an additional input sh is 1, or can pass the inputs through to the outputs unshifted when sh is 0. We can design that shifter using 2x1 muxes, shown in Figure 4.39(b).

An even more advanced shifter can shift left or right one position, shown in Figure 4.39(c). When both shift control inputs are 0, the inputs pass through unchanged. When shL=1, the shifter shifts left, and when shR=1, the shifter shifts right. When both those control inputs are 1, the shifter could be designed to output 0s by connecting 0s to the I3 inputs of the muxes (not shown). Further extensions of the simple shifter are possible, such as allowing shifts of one position or two positions. Such multifunction shifters' internal designs require larger muxes, and mapping of the control signals to the mux select lines, just as was necessary in designing multifunction registers.

EXAMPLE 4.11 Temperature averager

Recall Example 4.3, in which registers were used to save a history of temperature values over the last three clock periods. We want to extend this system to save the last four values instead of three. We also want the system to compute the average of the last four values and output that average on an output Tavg. The average of four values Ra, Rb, Rc, and Rd is (Ra+Rb+Rc+Rd)/4. Note that dividing by 4 is the same as shifting right by two. Thus, we can design the system using a right shifter that shifts by two places (with a shift-in value of 0), as shown in Figure 4.41.

Figure 4.41 Temperature averager using a right-shift-by-2 to divide by 4.
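The divide-by-4 step of the temperature averager can be sketched in a few lines of Python. This is only an illustrative model, not the circuit of Figure 4.41: the function names and the `sum_width` parameter (the assumed bit width of the adder output) are hypothetical and do not come from the text.

```python
def right_shift_by_two(value, width=8):
    """Model a combinational right-shift-by-2 with 0s shifted in.

    `width` is an assumed bit width for the shifter input; in hardware the
    shift is just wiring, with the two low-order bits dropped.
    """
    mask = (1 << width) - 1
    return (value & mask) >> 2


def temperature_average(ra, rb, rc, rd, sum_width=8):
    """Average four register values: add them, then shift right by two."""
    total = ra + rb + rc + rd          # the adders
    return right_shift_by_two(total, sum_width)


# Example: four stored temperatures whose sum is 88.
print(temperature_average(20, 21, 22, 25))
```

Because the two low-order bits are simply dropped, the result is truncated rather than rounded, which is what a shifter-based divider does.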
Barrel Shifter

An N-bit barrel shifter is a general-purpose N-bit shifter that can shift or rotate any number of positions. For simplicity, let's consider only left shifts for the moment. An 8-bit barrel shifter can shift left by 1 position, 2 positions, 3 positions, 4 positions, 5 positions, 6 positions, or 7 positions (and of course 0 positions, meaning no shift is done). An 8-bit barrel shifter therefore requires 3 control inputs, say x, y, and z, to specify the distance of the shift. xyz=000 may mean no shift, xyz=001 shift by 1 position, xyz=010 shift by 2 positions, etc.

We could design such a barrel shifter by placing an 8x1 mux in front of each of the 8 shifter outputs, connecting xyz to each of the eight muxes' select inputs, and then connecting the mux inputs with the appropriate shifter inputs for each configuration of x, y, and z. So I0 (corresponding to xyz=000, meaning no shift) of each mux would just get the present bit's shifter input, I1 (corresponding to xyz=001, meaning left shift by one position) would get the shifter input one position to the right, I2 (xyz=010, meaning left shift by two positions) would get the shifter input two positions to the right, and so on.

Such a design, while conceptually straightforward, has too many wires being routed about. And the design does not scale well to larger bit-widths, such as a 32-bit barrel shifter: a 32x1 multiplexor cannot be built with two levels of gates (AND/OR), because gates with 32 inputs are too big to be implemented efficiently, and must instead be implemented using multiple levels of smaller gates.

A more elegant design for an 8-bit barrel shifter consists of 3 cascaded simple shifters, as shown in Figure 4.42. The first simple shifter can shift left four positions (or none), the second can shift left by two positions (or none), and the third by one position (or none). Notice that the shifts "add" to one another: shifting left by two, then by one, results in a total shift of three positions. Thus, by configuring each shifter appropriately, we can obtain a total shift of any amount between zero and seven. Connecting the control inputs xyz to the shifters is easy: just think of xyz as a binary number representing the amount of the shift, where x represents shifting by four, y shifting by two, and z shifting by one. So we just connect x to the left-by-four shifter, y to the left-by-two shifter, and z to the left-by-one shifter.

Figure 4.42 8-bit barrel shifter (left shift only).

The above design considered a barrel shifter that could only shift left. We can easily extend the barrel shifter to support both left and right shifts. We would replace the internal left shifters by shifters that could shift left or right, thus each having a control input indicating the direction. The barrel shifter would then have a direction control input also, connected to each internal shifter's direction control input.

Finally, we can easily extend the barrel shifter to support shifts and rotates. We would replace the internal shifters by rotators that could either shift or rotate, thus each having a control input indicating whether to shift or rotate. The barrel shifter would then have a shift-or-rotate control input also, connected to each internal shifter's shift-or-rotate control input.

4.5 COMPARATORS

We often want to compare two binary numbers to see if they are equal, or if one is greater than the other. For example, we might want to sound an alarm if a thermometer measuring human body temperature reports a temperature greater than 103 degrees Fahrenheit (39.4 degrees Celsius). Comparator components perform such comparison of binary numbers.

Equality (Identity) Comparator

An N-bit equality comparator (sometimes called an identity comparator) is a datapath component that compares two N-bit inputs A and B, setting an output control signal to 1 if the two inputs are equal. Two N-bit inputs, say two 4-bit inputs A=a3a2a1a0 and B=b3b2b1b0, are equal if each of their corresponding bit pairs are equal. So A=B if a3=b3, a2=b2, a1=b1, and a0=b0.

Following the combinational logic design process of Table 2.5, we can start by capturing the function of a 4-bit equality comparator as an equation:

    eq = (a3b3+a3'b3') * (a2b2+a2'b2') * (a1b1+a1'b1') * (a0b0+a0'b0')

Each term detects if the corresponding bits are equal, namely, if both bits are 1 or both bits are 0. The expressions inside each of the parentheses represent the behavior of an XNOR gate (recall from Chapter 2 that an XNOR gate outputs 1 if the gate's two input bits are equal), so we can replace the above equation by the equivalent equation:

    eq = (a3 xnor b3) * (a2 xnor b2) * (a1 xnor b1) * (a0 xnor b0)

We convert the equation to the circuit in Figure 4.43.

Figure 4.43 Equality comparator: (a) internal design, and (b) block symbol.

Of course, we could have built the comparator starting with a truth table, but that would be cumbersome for a large comparator, with too many rows in the truth table to easily work with by hand. A truth table approach enumerates all the possible situations for which all the bits are equal, since only those situations would have a 1 in the column for the output eq. For two 4-bit numbers, one such situation will be 0000=0000.
Another will be 0001=0001. Clearly, there will be as many situations as there are 4-bit binary numbers, meaning there will be 2^4 = 16 situations where both numbers are equal. For two 8-bit numbers, there will be 256 equal situations. For two 32-bit numbers, there will be four billion equal situations. A comparator built with such an approach will be large if we don't minimize the equation, and that minimization will be hard with such large numbers of terms. Our XNOR-based design looks to be much simpler and scales to wide inputs wonderfully: widening the inputs by one more bit involves merely adding one more XNOR gate.

Magnitude Comparator: Carry-Ripple Style

An N-bit magnitude comparator is a datapath component that compares two N-bit binary numbers A and B, and indicates whether A>B, A=B, or A<B.

We have already seen several times that designing certain datapath components by starting with a truth table involves too large of a truth table. Let's instead design a magnitude comparator by considering how we compare numbers by hand. Consider comparing two 4-bit numbers A=a3a2a1a0=1011 and B=b3b2b1b0=1001. We start by looking at the high-order bits of A and B, namely, a3 and b3. Since they are equal (both are 1), we look at the next pair of bits, a2 and b2. Again, since they are equal (both are 0), we look at the next pair of bits, a1 and b1. Since a1>b1 (1>0), we conclude that A>B.

Thus, comparing two binary numbers takes place by comparing from the high bit-pairs down to the low bit-pairs. As long as bit-pairs are equal, we need to compare the next lower bit-pair. As soon as a bit-pair is different, we conclude that A>B if ai=1 and bi=0, or that A<B if bi=1 and ai=0. We can thus design a magnitude comparator using the structure shown in Figure 4.44.

Figure 4.44 4-bit magnitude comparator: (a) internal design using identical components in each stage, and (b) block symbol.

Each stage works as follows. If in_gt=1 (meaning a higher stage determined A>B), this stage need not compare bits, and just sets out_gt=1. Likewise, if in_lt=1 (meaning a higher stage determined A<B), this stage just sets out_lt=1. If in_eq=1 (meaning higher stages were all equal), this stage must compare bits, setting out_gt=1 if a=1 and b=0, setting out_lt=1 if a=0 and b=1, and setting out_eq=1 if a and b both equal 1 or both equal 0.

We could capture the function of a stage's block using a truth table with 5 inputs. For brevity though, we simply use the following equations derived from the earlier explanation of how each stage works; the circuit for each stage would follow directly from these equations:

    out_gt = in_gt + (in_eq * a * b')
    out_lt = in_lt + (in_eq * a' * b)
    out_eq = in_eq * (a XNOR b)

Figure 4.45 The "rippling" within a magnitude comparator.
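The XNOR-based equality comparator and the per-stage equations of the carry-ripple magnitude comparator can be checked with a small Python model. This is a sketch under stated assumptions: bits are passed as lists with the high-order bit first, and the function names are illustrative rather than taken from the text.

```python
def xnor(a, b):
    """XNOR gate: outputs 1 when the two input bits are equal."""
    return 1 if a == b else 0


def equality_comparator(a_bits, b_bits):
    """eq = (a3 xnor b3) * (a2 xnor b2) * ... for equal-length bit lists."""
    eq = 1
    for a, b in zip(a_bits, b_bits):
        eq &= xnor(a, b)
    return eq


def magnitude_stage(in_gt, in_eq, in_lt, a, b):
    """One comparator stage, following the three stage equations."""
    out_gt = in_gt | (in_eq & a & (1 - b))   # in_gt + in_eq * a * b'
    out_lt = in_lt | (in_eq & (1 - a) & b)   # in_lt + in_eq * a' * b
    out_eq = in_eq & xnor(a, b)              # in_eq * (a XNOR b)
    return out_gt, out_eq, out_lt


def magnitude_comparator(a_bits, b_bits):
    """Ripple from the high-order stage down; returns (AgtB, AeqB, AltB)."""
    gt, eq, lt = 0, 1, 0                     # external inputs Igt=0, Ieq=1, Ilt=0
    for a, b in zip(a_bits, b_bits):         # lists are high-order bit first
        gt, eq, lt = magnitude_stage(gt, eq, lt, a, b)
    return gt, eq, lt


# The worked example from the text: A=1011, B=1001.
print(magnitude_comparator([1, 0, 1, 1], [1, 0, 0, 1]))
```

Widening the comparator only lengthens the input lists, which mirrors how the hardware design scales by adding one more identical stage per bit.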
Figure 4.45 shows how this comparator works for an input of A=1011 and B=1001. We can view the comparator's behavior as consisting of four stages.

In Stage3 shown in Figure 4.45(a), we start by setting the external input Ieq=1, to force the comparator to actually do the comparison. Stage3 has in_eq=1, and since a3=1 and b3=1, then out_eq will become 1, while out_gt and out_lt will become 0.

In Stage2 shown in Figure 4.45(b), we see that since out_eq of Stage3 connects to in_eq of Stage2, then Stage2's in_eq will be 1. Since a2=0 and b2=0, then out_eq will become 1, while out_gt and out_lt will be 0.

In Stage1 shown in Figure 4.45(c), we see that since Stage2's out_eq is connected to Stage1's in_eq, Stage1's in_eq will be 1. Since a1=1 and b1=0, out_gt will become 1, while out_eq and out_lt will be 0.

In Stage0 shown in Figure 4.45(d), we see that the outputs of Stage1 cause Stage0's in_gt to become 1, which directly causes Stage0's out_gt to become 1 and causes out_eq and out_lt to be 0. Notice that the values of a0 and b0 are irrelevant. Since Stage0's outputs connect to the comparator's external outputs, AgtB will be 1, while AeqB and AltB will be 0.

Because of the way the result ripples through the stages in a manner similar to a carry-ripple adder, a magnitude comparator built this way is often referred to as having a carry-ripple style implementation, even though what is rippling is not really a "carry" bit.

The 4-bit magnitude comparator can be connected straightforwardly with another 4-bit magnitude comparator to build an 8-bit magnitude comparator, and likewise to build any size comparator, simply by connecting the comparison outputs of one comparator (AgtB, AeqB, AltB) with the comparison inputs of the next comparator (Igt, Ieq, Ilt).

If each stage is built from two levels of logic, and a gate has a 1 ns delay, then each stage will have a 2 ns delay. So the delay of a carry-ripple style 4-bit magnitude comparator is 4 stages * 2 ns/stage = 8 ns. A 32-bit comparator built with this style will have a delay of 32 stages * 2 ns/stage = 64 ns.

EXAMPLE 4.12 Computing the minimum of two numbers using a comparator

We want to design a combinational component that takes two 8-bit inputs A and B, and outputs an 8-bit output C that is the minimum of A and B. We can use a magnitude comparator and an 8-bit 2x1 multiplexor to implement this component, as shown in Figure 4.46.

Figure 4.46 A combinational component to compute the minimum of two numbers: (a) internal design using a magnitude comparator, and (b) block symbol.

If A<B, the comparator's AltB output will be 1. In this case, we want to pass A through the mux, so we connect AltB to the 8-bit 2x1 mux select input, and A to the mux's I1 input. If AltB is 0, then either AgtB=1 or AeqB=1. If AgtB=1, we want to pass B. If AeqB=1, we can pass either A or B (since they are identical), and so let's pass B. We thus simply connect B to the I0 input of the 8-bit 2x1 mux. In other words, if A<B, we'll pass A, and if A is not less than B, we'll pass B.

Notice that we set the comparator's Ieq control input to 1, and the Igt and Ilt inputs to 0. These values force the comparator to compare its data inputs.

4.6 COUNTERS

An N-bit counter is an extended N-bit register component that can increment or decrement its own value on each clock cycle, when a count enable control input is 1. Increment means to add 1, while decrement means to subtract 1. A counter that can increment is known as an up-counter, a counter that can decrement is known as a down-counter, and a counter that can increment and decrement is known as an up/down-counter. A 4-bit up-counter would thus count the following sequence: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, 0000, 0001, etc. Notice that a counter wraps around (also known as rolling over) from the highest value (1111) to 0. Likewise, a down-counter would wrap around from 0 to the highest value. A control output on the counter, often called terminal count, or tc, becomes 1 during the clock cycle that the counter has reached its last (terminal) count value, after which the counter will roll over.

Figure 4.47 shows the block symbol of a 4-bit up-counter. When cnt=1, the counter increments its own value on every clock cycle. When cnt=0, the counter maintains its present value. On the cycle that the counter rolls over from 1111 to 0000, the counter sets tc=1 for that cycle, returning tc to 0 on the next cycle.

Figure 4.47 4-bit up-counter block symbol.

Up-Counter

We can design an N-bit up-counter using the register design process described in Table 4.1: the incremented value of the register would be fed into a mux input, and the counter's control lines would be mapped to the mux select lines. A simpler view of an up-counter design is shown in Figure 4.48, assuming an incrementer component exists to add 1 to the present value. When cnt=0, the register should maintain its present value. When cnt=1, the register should be loaded with its present value plus 1. Note that the 4-input AND gate causes terminal count tc to become 1 when the counter reaches 1111.

Figure 4.48 4-bit up-counter internal design.
Incrementer

We need to design a combinational circuit for the incrementer. We could simply use an N-bit adder, by setting the B input to 0001 and the carry-in to 0. But using an N-bit adder is overkill: we don't need all the logic involved in an N-bit adder, because B is always just 0001. Instead, observe in Figure 4.49 that adding 1 to a binary number involves only two bits per column, not three bits per column like when adding two general binary numbers. Recall that a half-adder adds two bits (see Section 4.3). Thus, a simple incrementer could be built using half-adders, as shown in Figure 4.50.

    carries:   0 1 1
               0 0 1 1
    (unused)   +     1
               -------
             0 0 1 0 0

Figure 4.49 Adding 1 to a binary number requires only 2 bits per column.

Figure 4.50 4-bit incrementer: (a) internal design, and (b) block symbol.

We could instead design an incrementer using the combinational logic design process from Chapter 2. We would start with a truth table, shown in Figure 4.51. We obtain each output row simply by adding 1 to the corresponding input row binary number. We would then derive an equation for each output. For example, we can easily see that the equation for c0 is c0=a3a2a1a0. We can also easily see that s0=a0'. We would derive equations for the remaining outputs, and then implement the circuit for each output. The resulting incrementer would have a total delay of only two gate levels, which is less delay than the incrementer in Figure 4.50 built from half-adders.

         Inputs      |     Outputs
    a3  a2  a1  a0   |  c0  s3  s2  s1  s0
     0   0   0   0   |   0   0   0   0   1
     0   0   0   1   |   0   0   0   1   0
     0   0   1   0   |   0   0   0   1   1
     0   0   1   1   |   0   0   1   0   0
     0   1   0   0   |   0   0   1   0   1
     0   1   0   1   |   0   0   1   1   0
     0   1   1   0   |   0   0   1   1   1
     0   1   1   1   |   0   1   0   0   0
     1   0   0   0   |   0   1   0   0   1
     1   0   0   1   |   0   1   0   1   0
     1   0   1   0   |   0   1   0   1   1
     1   0   1   1   |   0   1   1   0   0
     1   1   0   0   |   0   1   1   0   1
     1   1   0   1   |   0   1   1   1   0
     1   1   1   0   |   0   1   1   1   1
     1   1   1   1   |   1   0   0   0   0

Figure 4.51 Truth table for four-bit incrementer.

We could use the same combinational logic design process to build larger incrementers. Recall that we said in Section 4.3 that building adders using the combinational logic design process was not very practical. Yet here we built an incrementer using the combinational logic design process. A key difference to note is that a 4-bit adder has 8 inputs, whereas a 4-bit incrementer has only 4 inputs. Thus, we can build wider incrementers as two-level logic implementations using the combinational logic design process. Of course, at some point, even the number of inputs for an incrementer gets too large, in which case we might chain smaller incrementers together to make a wider incrementer.

EXAMPLE 4.13 Up-counter used in the above-mirror display

In Example 4.4 and Example 4.6, we assumed pressing a mode button would cause inputs xy to sequence from 00, 01, 10, 11, and back to 00 again. A simple design to achieve such sequencing, assuming the mode input is 1 for exactly one clock cycle per button press (see Example 3.9), utilizes an up-counter, as shown in Figure 4.52.

Figure 4.52 Sequencer for xy inputs of above-mirror display.

EXAMPLE 4.14 1 Hz pulse generator using a 256 Hz oscillator

Suppose we have a 256 Hz oscillator, but we want a 1 Hz pulse signal. We can convert the 256 Hz signal to a 1 Hz signal p using an 8-bit counter. The 8-bit counter wraps around every 256 cycles, so we can simply connect the oscillator signal to the counter's clock input, set the counter's load input to 1, and then use the counter's tc output as the pulse signal, as shown in Figure 4.53. A 1 Hz signal may be useful for driving a clock or a watch, for example, since 1 Hz means 1 pulse per second.

Figure 4.53 Clock divider.

Down-Counter

A down-counter can be designed similarly to an up-counter, replacing the incrementer by a decrementer, as shown for the 4-bit down-counter in Figure 4.54. A decrementer could be designed in a similar manner as an incrementer, starting from a truth table like that in Figure 4.51. Note that the terminal count tc becomes 1 when the down-counter reaches 0000, implemented using a NOR gate; recall that NOR outputs 1 when all its inputs are 0. The reason the down-counter detects 0000 for tc, rather than 1111 like the up-counter, is because a down-counter wraps around after 0000, as in the following count sequence: 0100, 0011, 0010, 0001, 0000, 1111, 1110, ....

Figure 4.54 4-bit down-counter design.
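The half-adder incrementer and the down-counter's terminal-count behavior described above can be sketched in Python. This is a behavioral illustration only: the function names are hypothetical, bit lists are assumed to be high-order bit first, and the down-counter is modeled with its count enable held at 1.

```python
def half_adder(a, b):
    """Return (carry, sum) for two input bits."""
    return a & b, a ^ b


def increment(bits):
    """Add 1 to a binary number given high-order bit first.

    Returns (carry_out, sum_bits), built from a chain of half-adders.
    """
    carry = 1  # the constant 1 being added enters the lowest-order half-adder
    sum_bits = []
    for a in reversed(bits):            # ripple from a0 up to the high-order bit
        carry, s = half_adder(a, carry)
        sum_bits.append(s)
    return carry, sum_bits[::-1]


def down_counter_trace(start, cycles, width=4):
    """Return a list of (value, tc) pairs, one per clock cycle."""
    mask = (1 << width) - 1
    value = start & mask
    trace = []
    for _ in range(cycles):
        tc = 1 if value == 0 else 0     # NOR of all register bits
        trace.append((value, tc))
        value = (value - 1) & mask      # decrementer output loaded on the clock edge
    return trace


print(increment([0, 0, 1, 1]))          # 0011 + 1, as in Figure 4.49
print(down_counter_trace(4, 7))         # 0100 down through 0000, then wrapping
```

Each half-adder handles one column of the addition, so the carry still ripples through all stages; that ripple is the delay the two-level truth-table design avoids.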
Up/Down-Counter

An up/down-counter can count either up or down. It requires an input signal dir to indicate the count direction, in addition to the count enable signal cnt. We'll let dir=0 mean to count up and dir=1 mean to count down. Figure 4.55 shows the design of such a 4-bit up/down-counter with synchronous clear. A 2x1 mux passes either the decremented or incremented value, with dir selecting among the two: dir=0 (count up) passes the incremented value and dir=1 (count down) passes the decremented value. The passed value gets loaded into the 4-bit register if cnt=1. dir also selects whether to pass the NOR or AND output to the terminal count tc external output: dir=0 (count up) selects the AND, while dir=1 (count down) selects the NOR.

Figure 4.55 4-bit up/down-counter design.

Alternatively, we could design an up/down-counter using the register design process of Section 4.2, by directly connecting the incrementer and decrementer outputs to muxes in front of each flip-flop, and mapping the clr, cnt, and dir control signals to the mux select lines.

Notice that we also added a control input clr, which we could have added to the up-counter and down-counter too, that when 1 synchronously clears the register, meaning resetting the register to all 0s on a rising clock edge. We used a 4-bit register with clear to support the clear operation.

EXAMPLE 4.15 Light sequencer

We want to design a strip of 8 light bulbs, such that the bulbs illuminate one at a time, right to left, and then repeat illuminating in that sequence. The sequence should proceed at the rate of one bulb per second. Such a lighting display might be attractive outside a restaurant or movie theater, for example.

For simplicity, assume we have an oscillator that generates a 1 Hz clock signal (meaning one rising clock edge per second). We'll connect this clock to a 3-bit up-counter, and connect the counter's three outputs to a 3x8 decoder, as shown in Figure 4.56.

Figure 4.56 Light sequencer.

When the power is on, the system counts up (we don't know what the initial value of the counter was, but it doesn't really matter), wrapping around from 111 to 000. We don't need the tc output in this example.

Notice that we used a 3-bit counter with a decoder, and not an 8-bit counter, even though there were 8 outputs. An 8-bit counter would generate the sequence 00000000, 00000001, 00000010, ..., 11111110, 11111111. That sequence is not the desired sequence.

Counter with Parallel Load

Counters often come with the ability to initialize the count value, achieved by loading the counter's register with parallel data. Figure 4.57 shows the design of a 4-bit up-counter with parallel load. When control input ld is 1, the 2x1 mux passes load data input L to the register; when ld is 0, the mux passes the incremented value. Furthermore, we OR the counter's ld and cnt signals to generate the load signal for the register. When cnt is 1, the incremented value will be loaded. When ld is 1, the parallel load data will be loaded. Even if cnt is 0, ld=1 causes the register to be loaded. A down-counter or up/down-counter could similarly be extended to have a parallel load.

Figure 4.57 Internal design of a 4-bit up-counter with load.

Parallel load is useful when we want to generate a pulse signal that is not directly obtainable from letting a counter wrap around and pulse its tc output naturally. An N-bit counter naturally wraps around every 2^N cycles. What if we want a pulse every X cycles, where X is not a power of 2? For example, say we have a 4-bit down-counter, which normally pulses the tc output and wraps around every 16 cycles, and suppose we want to pulse every 9 cycles. We can achieve a pulse every 9 cycles by setting the load data input L to 9-1, or 8 (1000), and by connecting the tc output to the load control input ld, as shown in Figure 4.58. When the counter reaches its lowest value (0000), tc will become 1, causing the ld input to become 1. Thus, on the next clock cycle, the counter will load 1000, rather than wrapping around to 1111. (Note: the load occurs on the next cycle, not the present cycle, because tc changes to 1 after the rising clock edge, so the new value for ld doesn't get seen until the next clock edge.) The counter would thus count in the sequence 8, 7, 6, 5, 4, 3, 2, 1, 0, pulsing tc and then returning to 8. The reason we load 9-1, rather than 9, even though we want a pulse every 9 cycles, is because we must remember that 0 is included in the count sequence, just as the count from 15 down to 0 takes 16 cycles.

Figure 4.58 A counter setup that pulses tc every 9 cycles.

We could instead use an up-counter for the same purpose, but we must make the load value equal to the total cycles minus the desired cycles. So for the above example, we would use a load value of 16 - 9 = 7 (0111). The counter would count the sequence 7, 8, 9, 10, 11, 12, 13, 14, 15, pulsing tc and then returning to 7.
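The pulse-every-9-cycles setup, with tc wired back to the load input, can be checked with a small Python simulation. The function name `pulse_cycles` and its parameters are hypothetical; the sketch assumes the counter starts out already holding the load value and that count enable is held at 1.

```python
def pulse_cycles(load_value, cycles, direction, width=4):
    """Simulate a counter whose tc output drives its own ld input.

    Returns the cycle numbers during which tc is 1.
    """
    mask = (1 << width) - 1
    terminal = 0 if direction == "down" else mask   # 0000 for down, 1111 for up
    value = load_value & mask
    pulses = []
    for cycle in range(cycles):
        tc = 1 if value == terminal else 0
        if tc:
            pulses.append(cycle)
        # Next rising clock edge: ld = tc, so tc=1 loads L instead of counting.
        if tc:
            value = load_value & mask
        elif direction == "down":
            value = (value - 1) & mask
        else:
            value = (value + 1) & mask
    return pulses


down = pulse_cycles(8, 30, "down")   # down-counter loaded with 9 - 1 = 8
up = pulse_cycles(7, 30, "up")       # up-counter loaded with 16 - 9 = 7
print(down, up)
```

In both cases consecutive pulses are 9 cycles apart, because the count sequence between reloads contains exactly nine values.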
EXAMPLE 4.16 New Year's Eve countdown display

In Example 2.30, we utilized a microprocessor to output the numbers 59 down to 0, and a decoder to illuminate one of 60 lights based on that output. In this example, we'll replace the microprocessor by a down-counter with parallel load to output 59 down to 0. Suppose we have an 8-bit down-counter available, which can count from 255 down to 0. We need to load 59 and then count down. Assume the user can press a button called reset to load 59 into the counter, and then the user can move a switch countdown from the 0 position (don't count) to the 1 position (count) to begin the countdown. The system implementation is shown in Figure 4.59.

Figure 4.59 Happy New Year countdown system using a down-counter.

Notice that the tc signal is our "Happy New Year" indication. We've connected that signal to an output called fireworks, which we'll assume activates a device that ignites fireworks. Happy New Year!

EXAMPLE 4.17 1 Hz pulse generator using a 60 Hz oscillator

In the U.S., electricity to the home operates as an alternating current with a frequency of 60 Hz. Many appliances convert this signal to a 60 Hz digital signal, and then convert the 60 Hz digital signal to a 1 Hz signal, to drive a clock or other device needing to keep track of time at the granularity of seconds. Unlike Example 4.2, we can't simply use a counter of a particular bitwidth, since no basic up-counter wraps around after 60 cycles: a 5-bit counter wraps around every 32 cycles, while a 6-bit counter wraps every 64 cycles. Let's start with a 6-bit counter, which counts from 0 to 63 and then wraps around to 0. We'll add some extra logic, as shown in Figure 4.60. The extra logic should detect when the counter has counted up to 59, and should clear the counter back to 0 on the next rising clock edge rather than letting the counter continue counting to 60 and beyond. Fifty-nine as a 6-bit binary number is 111011. Thus the AND gate in Figure 4.60 detects 111011, in which case the AND gate's output sets the counter's clear input to 1. We assume the counter's clear input clr takes precedence over the counter's count input cnt. Since the AND gate's output will pulse every 60 cycles and the input clock frequency is 60 Hz, this circuit converts a 60 Hz input clock into a 1 Hz output clock. A circuit that converts an input clock into a new clock with a lower frequency is known as a clock divider.

Figure 4.60 Clock divider.

Timer

A common use of a counter is as the central component within another device called a timer. A timer is a special type of counter that measures time. Measuring time is a very common task in a digital system.

One type of timer is based on a down-counter. We store a value into the counter, and wait for the terminal count (0) to be reached. If we know the counter's oscillator frequency, then we can load a value corresponding to a desired time interval. For example, suppose we want to know when one second has passed, using a counter having a clock frequency of 1 kHz. We would thus load 999 (in binary, meaning 1111100111) into the counter and enable counting. After 1 second, the counter would reach 0 and assert its terminal count output, notifying us that 1 second has passed. A timer may repeat this process automatically, using the terminal count to automatically reload the desired time value (999 in our example) into the counter. Such a timer might be used in any type of watch or clock. Our earlier three-cycles-high laser timer (from Chapter 3) could have been built using a timer component, especially if instead of wanting the laser high for three cycles, we wanted the laser high for a period of time like 1.5 seconds.

We load 999, rather than 1000, because we must remember that 0 is part of the count. Try counting from 9 down to 0, raising a finger each time you say a number. Notice that when you reach 0, ten fingers are up.

Another type of timer is based on an up-counter. We reset that counter to 0, and then enable counting when some event occurs that we want to time. When the event ends, we disable the counter, after which the counter contains the number of cycles that occurred during the event. Knowing the time of one clock cycle, we multiply the number of cycles by the time of one clock cycle to obtain the total time for the event. For example, if we time an event as lasting 500 clock cycles, and the timer's oscillator frequency is 1 kHz, then the time for the event was 500 cycles * 0.001 s/cycle = 0.5 s. We illustrate this type of timer using an example.

EXAMPLE 4.18 Highway speed measuring system

Many highways and freeways have systems that measure the speed of cars at various parts of the highway and upload that speed information to a central computer. Such information is used by law enforcement, traffic planners, and radio and Internet traffic reports.

One technique for measuring the speed of a car uses two sensors embedded under the road, as illustrated in Figure 4.61. When a car is over a sensor, the sensor outputs a 1; otherwise, the sensor outputs a 0. A sensor's output travels on underground wires to a speed-measuring computer box, some of which are above the ground and others of which are underground. The speed measurer determines speed by dividing the distance between the sensors (which is fixed and known) by the time required for a vehicle to travel from the first sensor to the second sensor. If the distance between the sensors is 0.01 miles, and a vehicle takes 0.5 seconds to travel from the first to the second sensor, then the vehicle's speed is 0.01 miles / (0.5 seconds * (1 hour / 3600 seconds)) = 72 miles per hour.

To measure the time between the sensors, we can construct a simple FSM that controls a 16-bit timer, as shown in Figure 4.61. State S0 clears the timer to 0. The FSM transitions to state S1 when a car passes over the first sensor. S1 starts the timer counting up. The FSM stays in S1 until the car
passes over the second sensor, causing a transition to state S2. S2 stops the counting and computes the time using the timer's output C. Assuming a 1 kHz clock input to the timer, meaning each cycle is 0.001 seconds, then the time would be C * 0.001 s. That time would then be converted to hours (dividing by 3600) and divided into the known sensor distance to obtain the speed. We omit the implementation details of the speed computation, which would most likely be implemented as software on a microprocessor.

Figure 4.61 Measuring vehicle speeds in a highway speed measuring system.

HOW DOES IT WORK? CAR SENSORS IN ROADS.

How does a highway speed sensor or a traffic light car sensor know that a car is present in a particular lane? The main method today uses what's called an inductive loop. A loop of wire is placed just under the pavement; you can usually see the cuts, as in Figure 4.62(a). That loop of wire has a particular "inductance," which is an electronics term describing the wire's opposition to a change in electric current: higher inductance means the wire has higher opposition to changes in current. It turns out that placing a big hunk of metal (like a car) near

4.7 MULTIPLIER: ARRAY STYLE

An NxN multiplier is a datapath component that multiplies two N-bit input binary numbers A (the multiplicand) and B (the multiplier), and outputs an (N+N)-bit result. For example, an 8x8 multiplier multiplies two 8-bit binary numbers and outputs a 16-bit result. Designing an NxN multiplier in two levels of logic using the standard combinational design process will result in too complex of a design, as we've already seen for previous operations like addition and comparison. For multipliers with N greater than 4 or so, we need a more efficient method.

We can create a reasonably sized multiplier by mimicking how we perform multiplication by hand. Consider multiplying two 4-bit binary numbers 0110 and 0011 by hand:

        0110    (the top number is called the multiplicand)
        0011    (the bottom number is called the multiplier)
        ----    (each row below is called a partial product)
        0110    (because the rightmost bit of the multiplier is 1, and 0110*1=0110)
       0110     (because the second bit of the multiplier is 1, and 0110*1=0110)
      0000      (because the third bit of the multiplier is 0, and 0110*0=0000)
    +0000       (because the leftmost bit of the multiplier is 0, and 0110*0=0000)
    --------
    00010010    (the product is the sum of all the partial products: 18, which is 6*3)

Each partial product is easily obtained by ANDing the present multiplier bit with the multiplicand. Thus, multiplication of two 4-bit numbers A (a3a2a1a0) and B (b3b2b1b0) can be represented as follows:

                    a3  a2  a1  a0
                x   b3  b2  b1  b0
    ------------------------------
the loop of wire changes the wire's inductance. (Why? Becau ~e the bOa3 bOa2 bOal bOaO (ppl)
metal disrupts the magnetic field created by a changing current In the bla3 bla2 bl al bl aO 0 (pp2)
wire-bul that's getljng beyond our scope.) The traffic light c,antral b2a3 b2a2 b2al b2aO 0 0 (pp3)
ci rcuil keep checking Ihe wire's induclance (perhaps by Irylng 10 + b3a3 b3a2 b3al b3aO 0 0 0 (pp4)
- - - - --- - ------ -- --- - ------- - - - - - - - --
change the current and seeing how much the current reaJly changes In
a certain time period). and if inductance is more than nonnal, the p7 p6 p5 p4 p3 p2 pI pO
circuit aSSumes a car is above the loop of wire.
Many people lhink Ihal Ihe loops seen in Ihe pavemenl are scales Afler generaling Ihe partial produclS (pp l. pp2. pp3. and pp4) by ANDing the preselll
that measure weight-I've seen bicyclists jumping up and down o.n
mu lli plier bil wilh each mullipl icand bit. wc me re ly need 10 sum those partial products
the loops trying 10 gel a lighl 10 change. ThaI does n'l work, bUI II
together. We can use Ih ree adders of varying widths for compuling Ihat sum. The resulting
sure is entenain ing to watch.
design is shown in Figure 4.63.
Many others believe Ihal small cylinders a((ached 10 a Lraffic lighl 's
suppon anns, like Ihal in Figure 4.62(b), delecl vehicles. Those inslead (b)
Th is design has a reasonable size. abo ut Ihree times bigger than a carry-ripple adder.
are Iypically devices illal delecl a special encoded radio or infrared-lighl The design has reasonable speed. The delay consists of I gate-delay for generating the
signal from emergency vehicles, causing the traffic light 10 tum green Figure 4.62 (a) Inductive loop for partial producls. plus Ihe delay of Ihe adders. If each adder is a carry-ripple adder. then
for the emergency vehicle (e.g .. 3M's "Oplicom" syslem). Such systems delecling a vehicle on a road, (b) the 5-bil adder delay wi ll be 5*2 = 10 gate-delays, Ihe 6-bi l adder delay will be 6*2 = 12
are anolher example of digilal syslems, reducing the lime needed by emergency vehicle signal sensor for gale-delays, and Ihe 7-bil adder delay will be 7*2 = 14 gate-delays. If we a sume lhat lhe
emergency vehicles 10 reach the scene of an emergency as well as changing an intersecti on's traffic
10la l delay of Ihe adders is simply Ihe sum of lhe adder delays. Ihen the lotal delay would
reducing accidents involving the emergency vehicle ilself proceeding lighl 10 green for Ihe approaching
Ihus be I + 10+ 12 + 14 = 37 gale-delays. However. Ihe 100ai delay of carr -ripple adders
Ihrough a traffic light, thus often saving lives. emergency veh icle.
when chained logelher is aClually less Ihan Iheir sum-see Exercise 4. 15.
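The array-style multiplication described above can be modeled in software as a rough sketch. The function name array_multiply and the default width n=4 are illustrative assumptions, not part of the book's design; the code only mirrors the two steps in the text (AND each multiplier bit with the multiplicand, then sum the shifted partial products) and says nothing about gate delays.

```python
def array_multiply(a: int, b: int, n: int = 4) -> int:
    """Model an n-bit by n-bit array-style multiplier (hypothetical helper)."""
    mask = (1 << n) - 1          # n ones, e.g. 1111 for n = 4
    a &= mask                    # keep only the low n bits of each input
    b &= mask
    product = 0
    for i in range(n):
        b_i = (b >> i) & 1       # present multiplier bit
        # AND the multiplier bit with every multiplicand bit: the result is
        # either the multiplicand itself (b_i = 1) or all zeros (b_i = 0).
        partial_product = a & (mask * b_i)
        # Shifting left by i supplies the trailing zeros of pp2, pp3, pp4;
        # the running sum plays the role of the chain of adders.
        product += partial_product << i
    return product


print(format(array_multiply(0b0110, 0b0011), "08b"))
```

Calling array_multiply(0b0110, 0b0011) reproduces the by-hand example, 6*3.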
Figure 4.63 Internal design of a 4-bit by 4-bit array-style multiplier.

Delays for larger multipliers, which will have an even longer chain of adders, will be even slower. Faster multiplier designs are possible, at the expense of more gates.

4.8 SUBTRACTORS

An N-bit subtractor is a datapath component that takes two N-bit binary inputs A and B, and outputs an N-bit result S equaling A-B.

Subtractor for Positive Numbers Only

Subtraction gets slightly more complex when we consider negative results, like 5 - 7 = -2, because thus far we haven't discussed representation of negative numbers. For now, let's assume we are only dealing with positive numbers, so the subtractor's inputs are positive, and the result is always positive. This could be the case, for example, when we are designing a system that only subtracts smaller numbers from larger numbers, such as when compensating a sampled temperature that will always be greater than 80 using a small compensation value that will always be less than 10.

Designing an N-bit subtractor using the standard combinational logic design process suffers from the same exponential size growth problem as an N-bit adder. (See Section 4.3.) Instead, we can again try to mimic subtraction by hand in hardware.

Figure 4.64 shows subtraction of 4-bit binary numbers "by hand." Starting with the first column, we see that a is less than b (0 < 1), necessitating a borrow from the second column. The first column result is then 10 - 1 = 1 (in base ten, two minus one equals one). The second column has a 0 for a because of the borrow by the first column, making a < b (0 < 1), generating a borrow from the third column, which must itself borrow from the fourth column. The result of the second column is then 10 - 1 = 1. The third column, because of the borrow generated by the second column, has an a of 1, which is not less than b, so the result of the third column is 1-1=0. The fourth column has a=0 due to the borrow from the third column, and since b is also 0, the result is 0-0=0.

Figure 4.64 Design of a 4-bit subtractor: (a) subtraction "by hand", (b) borrow-ripple implementation with four full-subtractors with a borrow-in input wi, and (c) block symbol.

Based on the above-described behavior, we could create the internal design of a full-subtractor combinational component to implement the behavior of each column, with a full-subtractor having an input wi representing a borrow by the previous column, and an output wo representing a borrow from the next column, in addition to the inputs a and b and the output s. (We use w's for the borrows rather than b's because b is already used for the input; the w comes from the end of the word borrow.) We leave the design of a full-subtractor as an exercise for the reader.

EXAMPLE 4.19 DIP-switch-based adding/subtracting calculator

In Example 4.8, we designed a simple calculator that could add two 8-bit binary numbers and produce an 8-bit result, using DIP switches for inputs, and a register plus LEDs for output. Let's extend that calculator to allow the user to choose among addition and subtraction operations. We'll introduce a single-switch DIP switch that sets a signal f (for "function") as another system input. When f=0, the calculator should add; when f=1, the calculator should subtract.

One implementation of this calculator would use an adder, a subtractor, and a multiplexor, as in Figure 4.65. The f input chooses which component, the adder or subtractor, to pass through the mux to the register inputs. When the user presses e, either the addition or subtraction result gets loaded into the register and displayed at the LEDs.

This example assumes the result of a subtraction is always a positive number, never negative. It also assumes that the result is always between 0 and 255.
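The borrow-ripple subtraction described earlier in this section can be sketched in software. The text leaves the full-subtractor design as an exercise, so the equations in full_subtractor below are one common formulation and should be read as an assumption rather than the book's answer; the names full_subtractor and borrow_ripple_subtract are likewise hypothetical.

```python
def full_subtractor(a: int, b: int, wi: int) -> tuple[int, int]:
    """One column of subtraction: returns (s, wo) for bits a, b and borrow-in wi.

    Assumed equations (a standard formulation, not taken from the text):
      s  = a XOR b XOR wi
      wo = a'b + a'wi + b*wi
    """
    s = a ^ b ^ wi
    not_a = 1 - a
    wo = (not_a & b) | (not_a & wi) | (b & wi)
    return s, wo


def borrow_ripple_subtract(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Chain n full-subtractors; returns (n-bit result, final borrow-out)."""
    wi = 0                       # the first column has no incoming borrow
    result = 0
    for i in range(n):           # column 0 is the rightmost column
        s, wi = full_subtractor((a >> i) & 1, (b >> i) & 1, wi)
        result |= s << i
    return result, wi


result, borrow = borrow_ripple_subtract(0b1010, 0b0111)
print(format(result, "04b"), borrow)
```

A final borrow-out of 1 indicates that B was larger than A, which is the negative-result case this section deliberately sets aside.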
Figure 4.65 8-bit DIP-switch-based adding/subtracting calculator. Input f selects between addition and subtraction.

EXAMPLE 4.20 Color space converter - RGB to CMYK

Computer monitors, digital cameras, scanners, printers, and other electronic devices deal with color images. Those devices treat an image as millions of tiny pixels (short for "picture elements"), which are indivisible dots representing a tiny part of the image. Each pixel has a color, so an image is just a collection of colored pixels. A good computer monitor may support over 10 million unique colors for each pixel. How does a monitor create each unique color for a pixel? In a common method used in what are known as RGB monitors, the monitor has three light sources inside: red, green, and blue. Any color of light can be created by adding specific intensities of each of the three colors. Thus, for each pixel, the monitor shines a specific intensity of red, of green, and of blue at that pixel's location on the monitor's screen, so that the three colors add together to create the desired pixel color. Each subcolor (red, green, or blue) is typically represented as an 8-bit binary number (thus each ranging from 0 to 255), meaning a color is represented by 8+8+8=24 bits. An (R, G, B) value of (0, 0, 0) represents black. (10, 10, 10) represents a very dark gray, while (200, 200, 200) represents a light gray. (255, 0, 0) represents red, while (100, 0, 0) represents a darker (nonintense) red. (255, 255, 255) represents white. (109, 35, 201) represents some mixture of the three base colors. Representing color using intensity values for red, green, and blue is known as an RGB color space.

RGB color space is great for computer monitors and certain other devices, but not the best for some other devices, like printers. Mixing red, green, and blue ink on paper will not result in white, but rather in black. Why? Because ink is not light; rather, ink reflects light. So red ink reflects red light, absorbing green and blue light. Likewise, green ink absorbs red and blue light. Blue ink absorbs red and green light. Mix all three inks together on paper, and the mixture absorbs all light, reflecting none, thus yielding black. Printers therefore use a different color space based on the complementary colors of red/green/blue, namely, cyan/magenta/yellow, known as a CMY color space. Cyan ink absorbs red, reflecting green and blue (the mixture of which is cyan). Magenta ink absorbs green light, reflecting red and blue (which is magenta). Yellow ink absorbs blue, reflecting red and green (which is yellow).

Notice that a color printer may have three color ink cartridges, one cyan, one magenta, and one yellow. Figure 4.66 shows the ink cartridges for a particular color printer. Some printers have a single cartridge for color instead of three, with that single cartridge internally containing separated fluid compartments for the three colors.

A printer must convert a received RGB image into CMY. Let's design a fast circuit to perform that conversion. Given three 8-bit values for R, G, and B for a particular pixel, the equations for C, M, and Y are simply:

   C = 255 - R
   M = 255 - G
   Y = 255 - B

(255 is the maximum value of an 8-bit number). A circuit to perform such conversion can be built using subtractors, as shown in Figure 4.67.

Figure 4.66 A color printer mixes cyan, magenta, and yellow inks to create any color. The picture shows inside a color printer having those three colors' cartridges on the right, labeled C, M, and Y. Such printers may use black ink directly (the big cartridge on the left), rather than mixing the three colors, to make grays and blacks, in order to create a better-looking black and to conserve the more expensive color inks.

Figure 4.67 RGB to CMY converter.

Actually, the conversion needs to be slightly more complex. Ink isn't perfect, meaning that mixing cyan, magenta, and yellow yields a black that doesn't look as black as you might expect. Furthermore, colored inks are expensive compared to black ink. Therefore, color printers use black ink whenever possible. One way to maximize use of black ink is to factor out the black from the C, M, and Y values. In other words, a (C, M, Y) value of (250, 200, 200) can be thought of as (200, 200, 200) plus (50, 0, 0).

Figure 4.68 RGB to CMYK converter.
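The conversion equations and the idea of factoring out black can be checked with a short software sketch. The function names rgb_to_cmy and factor_out_black are illustrative, and taking the black component as the smallest of the three values is an assumption chosen to match the (250, 200, 200) example in the text; a hardware version would use subtractors rather than Python arithmetic.

```python
def rgb_to_cmy(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert 8-bit R, G, B values to C, M, Y using C = 255 - R, and so on."""
    return 255 - r, 255 - g, 255 - b


def factor_out_black(c: int, m: int, y: int) -> tuple[int, int, int, int]:
    """Split (C, M, Y) into a common black part and the leftover color part.

    Assumption: the black part is the smallest of the three values, since
    that much of every ink is shared. Returns (c2, m2, y2, k).
    """
    k = min(c, m, y)
    return c - k, m - k, y - k, k


c, m, y = rgb_to_cmy(5, 55, 55)
print((c, m, y), factor_out_black(c, m, y))
```

For the text's example, (250, 200, 200) splits into a black part of 200 and a remaining color part of (50, 0, 0), so only a little cyan ink is needed.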
The (200, 200, 200), which is a light gray, can be generated using black ink. The remaining (50, 0, 0) can be generated using a small amount of cyan, and using no magenta or yellow ink at all, thus saving precious color ink. A CMY color space extended with black is known as a CMYK color space (the "K" comes from the last letter in the word "black"; "K" is used instead of "B" to avoid confusion with the "B" from "blue").

An RGB to CMYK converter can thus be described as:

   K  = Minimum(C, M, Y)
   C2 = C - K
   M2 = M - K
   Y2 = Y - K

where C, M, and Y are defined as earlier. We thus create the circuit in Figure 4.68 for converting an RGB color space to a CMYK color space. We've used the RGBtoCMY component from Figure 4.67. We've also used two instances of the MIN component that we created in Example 4.12 to compute the minimum of two numbers; using two such components computes the minimum of three numbers. Finally, we use three more subtractors to remove the K value from the C, M, and Y values. In a real printer, the imperfections of ink and paper require even more adjustments. A more realistic color space converter multiplies the R, G, and B values by a series of constants, which can be described using matrices:

   | C |   | m00 m01 m02 |   | R |
   | M | = | m10 m11 m12 | * | G |
   | Y |   | m20 m21 m22 |   | B |

Further discussion of such a matrix-based converter is beyond the scope of this example.

Representing Negative Numbers: Two's Complement

The subtractor design in the previous section assumed we only dealt with positive input numbers and positive results. But in many systems, we may obtain results that are negative, and in fact, our input values may even be negative numbers. We thus need a way to represent negative numbers using bits.

One obvious but not very effective representation is known as signed-magnitude. In this representation, the highest-order bit is used only to represent the number's sign, with 0 meaning positive and 1 meaning negative. The remaining low-order bits represent the magnitude of the number. In this representation, and using 4-bit numbers, 0111 would represent +7, while 1111 would represent -7. Thus, four bits could represent -7 to 7. (Notice, by the way, that both 0000 and 1000 would represent 0, the former representing 0, the latter -0.) Signed-magnitude is easy for humans to understand, but doesn't lend itself easily to the design of simple arithmetic components like adders and subtractors. For example, if an adder's inputs use signed-magnitude representation, the adder would have to look at the highest-order bit, and then internally perform either an addition or a subtraction, using different circuits for each.

Instead, the most common method of representing negative numbers and performing subtraction in a digital system actually uses a trick that allows us to use an adder to perform subtraction. Using an adder to perform subtraction would enable us to keep our simple adder, and to use the same component for both addition and subtraction.

The key to performing subtraction using addition lies in what are known as complements. We'll first introduce complements in the base ten number system just so you can familiarize yourself with the concept, but bear in mind that the intention is to use complements in base two, not base ten.

Margin note: We are introducing ten's complement just for intuition purposes; we'll actually be using two's complement.

Consider subtraction involving two single-digit base ten numbers, say 7 - 4. The result should be 3. Let's define the complement of a single-digit base ten number A as the number that when added to A results in a sum of ten. So the complement of 1 is 9, of 2 is 8, and so on. Figure 4.69 provides the complements for the numbers 1 through 9.

   Number       1  2  3  4  5  6  7  8  9
   Complement   9  8  7  6  5  4  3  2  1

Figure 4.69 Complements in base ten.

The wonderful thing about a complement is that you can use it to perform subtraction using addition, by replacing the number being subtracted with its complement, then by adding, and then by finally throwing away the carry. For example:

   7 - 4  ->  7 + 6 = 13  ->  3 (after throwing away the carry)

We replaced 4 by its complement, 6, and then added 6 to 7 to obtain 13. Finally, we then threw away the carry, leaving 3, which is the correct result. Thus, we performed subtraction using addition.

A number line helps us visualize why complements work, as shown in Figure 4.70.

Figure 4.70 Subtracting by adding: subtracting a number (4) is the same as adding the number's complement (6) and then dropping the carry, since by definition of the complement, the result will be exactly 10 too much. After all, that's how the complement was defined: the number plus its complement equals 10.

Complements work for any number of digits. Say we want to perform subtraction using two two-digit base ten numbers, perhaps 55 - 30. The complement of 30 would be the number that when added to 30 results in 100, so the complement of 30 is 70. 55 + 70 is 125. Throwing away the carry yields 25, which is the correct result for 55 - 30. So using complements achieves subtraction using addition.

"Not so fast!" you might say. In order to determine the complement, don't we have to perform subtraction? We know that 6 is the complement of 4 by computing 10 - 4 = 6. We know that 70 is the complement of 30 by computing 100 - 30 = 70. So haven't we just moved the subtraction to another step, the step of computing the complement?
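The base ten complement trick can be tried out with a small sketch. The helper names tens_complement and subtract_by_adding are hypothetical, and the sketch assumes the number being subtracted is no larger than the number it is subtracted from, as in the examples in the text.

```python
def tens_complement(x: int, digits: int) -> int:
    """Number that, added to x, gives 10**digits (e.g. 4 -> 6, 30 -> 70).

    Note that this helper itself uses a subtraction, which is exactly the
    objection raised in the text for base ten.
    """
    return 10 ** digits - x


def subtract_by_adding(a: int, b: int, digits: int) -> int:
    """Compute a - b by adding b's complement and throwing away the carry."""
    total = a + tens_complement(b, digits)   # exactly 10**digits too much
    return total % 10 ** digits              # drop the carry digit


print(subtract_by_adding(7, 4, 1), subtract_by_adding(55, 30, 2))
```

The two calls correspond to the examples 7 - 4 and 55 - 30.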
Yes. Except, it turns out that in base two, we can compute the complement in a much simpler way: just by inverting all the bits and adding 1. For example, consider computing the complement of the 3-bit base-two number 001. The complement would be the number that when added to 001 yields 1000; you can probably see that the complement should be 111. Using the same method for computing the complement as we did in base ten, we compute the two's complement of 001 as: 1000 - 001 = 111, so 111 is the complement of 001. However, it just so happens that if we invert all the bits of 001 and add 1, we get the same result! Inverting the bits of 001 yields 110; adding 1 yields 110 + 1 = 111, the correct complement.

Margin note: Two's complement can be computed simply by inverting the bits and adding 1, thus avoiding the need for subtraction when computing a complement.

Thus, to perform a subtraction, say 011 - 001, we would perform the following:

   011 - 001
   -> 011 + ((001)' + 1)
    = 011 + (110 + 1)
    = 011 + 111
    = 1010 (throw away the carry)
   -> 010

That's the correct answer, and didn't involve any subtractions, only an invert and additions.

We omit discussion as to why one can compute the complement in base two by inverting the bits and adding 1; for our purposes, we just need to know that that trick works for binary numbers.

There are actually two types of complements of a binary number. The type we've been using above is known as the two's complement, obtained by inverting all the bits of the binary number and adding 1. Another type is known as the one's complement, which is obtained simply by inverting all the bits, without adding a 1. The two's complement is much more commonly used in digital circuits and results in simpler logic.

Two's complement leads to a simple way to represent negative numbers. Say we have four bits to represent numbers, and we want to represent both positive and negative numbers. We can choose to represent positive numbers as 0000 to 0111 (0 to 7). Negative numbers would be obtained by taking the two's complement of the positive numbers, because a - b is the same as a + (-b). So -1 would be represented by taking the two's complement of 0001, or (0001)'+1 = 1110+1 = 1111. Likewise, -2 would be (0010)'+1 = 1101+1 = 1110. -3 would be (0011)'+1 = 1100+1 = 1101. And so on. -7 would be (0111)'+1 = 1000+1 = 1001. Notice that the two's complement of 0000 is 1111+1 = 0000 (after throwing away the carry). Two's complement representation has only one representation of 0, namely, 0000 (unlike signed-magnitude representation, which had two representations of 0). Also notice that we can represent -8 as 1000. So two's complement is slightly asymmetric, representing one more negative number than positive numbers. A 4-bit two's-complement number can represent any number from -8 to +7.

Say you have 4-bit numbers and you want to store -5. -5 would be (0101)'+1 = 1010+1 = 1011. Now you want to add -5 to 4 (or 0100). So we simply add: 1011 + 0100 = 1111, which is -1, the correct answer.

Note that negative numbers all have a 1 in the highest-order bit; thus, the highest-order bit in two's complement is often referred to as the sign bit, 0 indicating a positive number, 1 a negative number.

Margin note: The highest-order bit in two's complement acts as a sign bit: 0 means positive, 1 means negative.

If you want to know the magnitude of a two's complement negative number, you can obtain the magnitude by taking the two's complement again. So to determine what number 1111 represents, we can take the two's complement of 1111: (1111)' + 1 = 0000+1 = 0001. We put a negative sign in front to yield -0001, or -1.

A quick way for humans to mentally figure out the magnitude of a negative number in 4-bit two's complement (having a 1 in the high-order bit) is to subtract the magnitude of the three lower bits from 8. So for 1111, the low three bits are 111 or 7, so the magnitude is 8 - 7 = 1, which in turn means that 1111 represents -1. For an 8-bit two's complement number, we would subtract the magnitude of the lower 7 bits from 128. So 10000111 would be -(128-7) = -121.

To summarize, we can represent negative numbers using two's complement representation. Addition of two's complement numbers proceeds unmodified: we just add the numbers. Even if one or both numbers are negative, we simply add the numbers. We perform subtraction of A - B by taking the two's complement of B and then adding that two's complement to A, resulting in A + (-B). We compute the two's complement of B by simply inverting the bits of B and then adding 1.

Building a Subtractor Using an Adder and Two's Complement

With knowledge of the two's complement representation, we can now see how to subtract using an adder. To compute A - B, we compute A + (-B), which is the same as A + B' + 1 because -B can be computed as B' + 1 in two's complement. Thus, to perform subtraction, we invert B, and input a 1 to the carry-in of an adder, as shown in Figure 4.71.

Figure 4.71 Two's complement subtractor built with an adder.

Adder/Subtractor

We can straightforwardly design an adder/subtractor component, having an input sub, such that when sub=1, the component subtracts, but when sub=0, the component adds, as shown in Figure 4.72(a). The N-bit 2x1 multiplexor passes B when sub=0, and passes B' when sub=1. sub is connected to cin also, so that cin is 1 when subtracting. Actually, XORs can be used instead of the inverters and mux, as shown in Figure 4.72(b). When sub=0, the output of the XOR equals the other input's value. When sub=1, the output of the XOR is the inverse of the other input's value.

Figure 4.72 (a) Two's complement adder/subtractor using a mux, and (b) alternative circuit for B using XOR gates.
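The two's complement operations and the XOR-based adder/subtractor described in this section can be modeled with a brief sketch. The names twos_complement, to_signed, and add_sub are hypothetical helpers introduced here for illustration; the model works on whole integers rather than individual gates, and assumes inputs already fit in n bits.

```python
def twos_complement(x: int, n: int) -> int:
    """Invert all n bits of x and add 1, keeping only n bits."""
    mask = (1 << n) - 1
    return ((x ^ mask) + 1) & mask


def to_signed(x: int, n: int) -> int:
    """Interpret an n-bit pattern as a two's complement number."""
    sign_bit = (x >> (n - 1)) & 1
    return x - (1 << n) if sign_bit else x


def add_sub(a: int, b: int, sub: int, n: int = 4) -> int:
    """Adder/subtractor: adds when sub=0, subtracts when sub=1."""
    mask = (1 << n) - 1
    # XOR every bit of B with sub: B passes unchanged when sub=0,
    # and is inverted when sub=1.
    b_in = b ^ (mask * sub)
    # sub also drives the adder's carry-in, supplying the "+1".
    total = a + b_in + sub
    return total & mask          # throw away the carry-out


print(format(add_sub(0b011, 0b001, 1, n=3), "03b"))
print(to_signed(add_sub(0b1011, 0b0100, 0), 4))
```

The first call repeats the 011 - 001 example; the second repeats adding -5 (1011) to 4 (0100).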
EXAMPLE 4.21 DIP-switch-based adding/subtracting calculator (continued)

Let's revisit our DIP-switch-based adding/subtracting calculator of Example 4.19. Observe that at any given time the output displays the results of either the adder or subtractor, but never both simultaneously. Thus, we really don't need both an adder and a subtractor operating in parallel; instead, we can use a single adder/subtractor component. Assuming DIP switches have been set, setting f=0 (add) versus f=1 (subtract) should result in the following computations:

   00001111 + 00000001 (f=0) = 00010000
   00001111 - 00000001 (f=1) = 00001111 + 11111110 + 1 = 00001110

We achieve this simply by connecting f to the sub input of the adder/subtractor, as shown in Figure 4.73.

Figure 4.73 8-bit DIP-switch-based adding/subtracting calculator, using an adder/subtractor and two's complement number representation.

Let's consider signed numbers using two's complement. If the user is unaware that two's complement representation is being used and the user will only be inputting positive numbers using the DIP switches, then the user should only use the low-order 7 switches of the 8-switch DIP inputs, leaving the eighth switch in the 0 position, meaning the user can only input numbers ranging from 0 (00000000) to 127 (01111111). The reason the user can't use the eighth bit is that in two's complement representation, making the highest-order bit a 1 causes the number to represent a negative number.

If the user is aware of two's complement, then the user could use the DIP switches to represent negative numbers too, from -1 (11111111) down to -128 (10000000). Of course, the user will need to check the leftmost LED to determine whether the output represents a positive number or a negative number in two's complement form.

Detecting Overflow

When we perform arithmetic using binary numbers of a fixed bitwidth, sometimes the result is wider than the fixed bitwidth, a situation known as overflow. For example, consider adding two 4-bit binary numbers (just regular binary numbers for now, not two's complement numbers) and storing the result as another 4-bit number. Adding 1111 + 0001 yields a result of 10000, a 5-bit number, which is bigger than the 4 bits we have to store the result. In other words, 15 + 1 = 16, and 16 requires 5 bits to represent in binary. We can easily detect overflow when adding two binary numbers simply by looking at the carry-out bit of the adder: if the carry-out bit is 1, overflow has occurred. So a 4-bit adder adding 1111 + 0001 would output 1 + 0000, where the 1 is the carry-out, indicating overflow.

When using two's complement numbers, detecting overflow is somewhat more complicated. Suppose again we have 4-bit numbers, but now those numbers are in two's complement form. Consider the addition of two positive numbers, such as 0111 and 0001 in Figure 4.74(a). A 4-bit adder would output 1000, but that is incorrect: the result of 7 + 1 should be 8, but 1000 represents -8 in two's complement. The problem is that the largest positive number we can represent in 4-bit two's complement form is 7. Thus, when adding two positive numbers, we can detect overflow by checking whether the most significant bit is a 1 in the result.

Likewise, consider the addition of two negative numbers, such as 1111 and 1000 in Figure 4.74(b). An adder would output a sum of 0111 (and a carry-out of 1). 0111 is incorrect: -1 + -8 should be -9, but 0111 is +7. The problem is that the most negative number we can represent with 4-bit two's complement is -8. Thus, when adding two negative numbers, we can detect overflow by checking whether the most significant bit is a 0 in the result.

Figure 4.74 Two's complement overflow detection comparing sign bits: (a) when adding two positive numbers, (b) when adding two negative numbers, (c) no overflow. If the numbers' sign bits have the same value, which differs from the result's sign bit, overflow has occurred.

Notice that adding a positive with a negative, or a negative with a positive, can never result in overflow. The result will always be less negative than the most negative number, or less positive than the most positive number. For example, the extreme is the addition of -8 + 7, which is -1. Increasing -8 or decreasing 7 in that addition still results in a number between -8 and 7.

So detecting overflow in two's complement involves detecting that both input numbers were positive but yielded a negative result, or that both input numbers were negative but yielded a positive result. Restated, detecting overflow in two's complement involves detecting that the sign bits of both inputs are the same as one another but differ from the result's sign bit. If we call the sign bit of one input a and the sign bit of the other input b, and the sign bit of the result r, then the following equation outputs 1 when there is overflow:

   overflow = abr' + a'b'r

Although the circuit implementing the above overflow detection equation is quite simple and intuitive, we can create an even simpler circuit if our adder generates a carry-out. The simpler method merely compares the carry into the sign bit column with the carry out of the sign bit column: if the carry-in and carry-out differ, overflow has occurred.
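Both overflow detection methods described above can be compared in a short sketch. The function name add_with_overflow and its return layout are assumptions made for illustration; the sign-bit check follows the equation overflow = abr' + a'b'r, and the second check XORs the carry into the sign bit column with the carry out of it.

```python
def add_with_overflow(a: int, b: int, n: int = 4) -> tuple[int, int, int]:
    """Add two n-bit two's complement patterns.

    Returns (n-bit result, overflow by sign-bit method, overflow by carry method).
    """
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    total = a + b
    result = total & mask

    # Method 1: compare sign bits, overflow = a b r' + a' b' r.
    sa = a >> (n - 1)
    sb = b >> (n - 1)
    sr = result >> (n - 1)
    by_sign = (sa & sb & (1 - sr)) | ((1 - sa) & (1 - sb) & sr)

    # Method 2: carry into the sign bit column XOR carry out of it.
    low_mask = mask >> 1                       # all bits below the sign bit
    carry_in = ((a & low_mask) + (b & low_mask)) >> (n - 1)
    carry_out = total >> n
    by_carry = carry_in ^ carry_out

    return result, by_sign, by_carry


for x, y in [(0b0111, 0b0001), (0b1111, 0b1000), (0b1000, 0b0111)]:
    print(format(x, "04b"), format(y, "04b"), add_with_overflow(x, y))
```

The three pairs correspond to adding two positive numbers, adding two negative numbers, and adding a negative to a positive number.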
Figure 4.75 illustrates this method for several cases. In Figure 4.75(a), the carry into the sign bit is 1, whereas the carry out is 0. Because the carry in and carry out differ, overflow has occurred. A circuit detecting whether two bits differ is just an XOR gate, which is slightly simpler than the circuit of the previous method. We omit discussion as to why this method works, but looking at the cases in Figure 4.75 should help provide the intuition.

Figure 4.75 Two's complement overflow detection comparing carry into and out of the sign bit column: (a) when adding two positive numbers, (b) when adding two negative numbers, (c) no overflow. If the carry into the sign bit column differs from the carry out of that column, overflow has occurred.

4.9 ARITHMETIC-LOGIC UNITS - ALUS

An N-bit arithmetic-logic unit (ALU) is a datapath component able to perform a variety of arithmetic and logic operations on two N-bit wide data inputs, generating an N-bit data output. Example arithmetic operations include addition and subtraction. Example logic operations include AND, OR, XOR, etc. Control inputs to the ALU indicate which particular operation to perform.

To understand the need for an ALU component, consider Example 4.22.

EXAMPLE 4.22 Multi-function calculator without using an ALU

Let's extend our earlier DIP-switch-based calculator to support eight operations, determined by a three-switch DIP switch that provides three inputs x, y, and z to our system, as shown in Figure 4.76. For each combination of the three switches, we want to perform the operations shown in Table 4.2, on the 8-bit data inputs A and B, generating the 8-bit output on S.

TABLE 4.2 Desired calculator operations


~ WHY SUCH CHEAP CALCULATORS? Inpu ts
Sample output ir
Se\'eral earl ier examples dealt with designing simplethen you need to add $1.000,000 to the se lling price or Operation A=0000 Illl,
X Y Z
ca1culators. Cheap caJcularors. costing less than a thai chip if you wanl to break even (meaning to B-OOOOO 10 I
dollar. are easy (Q find . Calculators are even given recove r your design and setup COSlS) when you sell the 0 0 0 S-A+B 5=00010100
away for free by many companies selling something chip. Ir you plan to produce and sell 10 such chips,
else. But a calculator internally contains a chip then you need to add S 1.000.00011 0 = $ 100.000 to the 0 0 5=A-B S=OOOOIOIO
implementi ng a digital cireu i!. and chips nomlally selling price of each chip. Ir you plan to produce and
0 0 S=A+ I 5=00010000
arcn '{ cheap_ Why are some cnlcu l:uors such a sell 1.000.000 such chips, then you need to add only
bargain? S1.000.00011.000.000 = $ 1 to the selling price or each 0 S=A 5=000011 II
The reason is known as economy of scale. which chi p. And if you plan to produce and sell 10.000.000,
means that products are often cheaper when produced you need to add a mere $1.000.00011 0.000,000 = 0 0 S = A AND B (bitwise A D) S=OOOOO 10 I
in large vol umes. Why? Because the design and setup 50. 10 = 10 cenlS to the selling price or each chip. Ir
costs can be amonized over larger numbers. Suppose it the actual raw materials only co t 20 cenlS per chip,
0 5 = A OR B (bitwise OR) 5=00001111
cOSIS S 1.000.000 to design a CUSlom calculator chip and you add another 10 cents per chip for profit. then I 0 S = A XOR B (bitwise XOR) S=OOOOIOIO
and to setup the chip's manuracturing (not so can buy the chip from you ror a mere 40 cents. And [
unreasonable a number}----design and setup costs are S= OT A (bitwise complement) S=I II 10000
can Lhen give away such a calculator for free, as many
often caJJed nonrecurring engineering. or NRE. companies do. as an incentive ror people to buy
coSIS. If you plan to produce and sell one such ch ip. somethi ng else. The table includes several bitwise operations (AND. OR. XOR. :Illd complement). A biI><is.
operation applies to each corresponding pair or bits or A :Illd B separatel).
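The desired behavior in Table 4.2 can be written down as a small reference model. The following is a minimal Python sketch; the function name calculator and the dictionary-based selection are illustrative assumptions, not part of the original design. Note that it computes all eight results and then selects one, which mirrors the "compute everything, then mux" structure of the circuit in this example.

```python
def calculator(x, y, z, a, b, width=8):
    """Behavioral model of Table 4.2 for width-bit inputs a and b."""
    mask = (1 << width) - 1
    select = (x << 2) | (y << 1) | z

    # Every operation is computed; `select` then picks one, like a mux.
    results = {
        0b000: a + b,   # S = A + B
        0b001: a - b,   # S = A - B
        0b010: a + 1,   # S = A + 1
        0b011: a,       # S = A
        0b100: a & b,   # bitwise AND
        0b101: a | b,   # bitwise OR
        0b110: a ^ b,   # bitwise XOR
        0b111: ~a,      # bitwise complement
    }
    # Masking keeps only the low `width` bits, as an 8-bit output S would.
    return results[select] & mask
```

With A=00001111 and B=00000101, calculator(0, 0, 1, 0b00001111, 0b00000101) returns 0b00001010, matching the S = A - B row of the table.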
We can design a circuit for our calculator as shown in Figure 4.76, using a separate datapath component to compute each operation: we use an adder to compute the addition, a subtractor to compute the subtraction, an incrementer to compute the increment, and so on. However, that circuit is very inefficient with respect to the number of wires, power consumption, and delay. There are too many wires that must be routed to all those components, and especially to the mux, which will have 8*8 = 64 inputs. Furthermore, every operation is computed all the time, which wastes power. Imagine instead that we were dealing not with 8-bit numbers, but instead with 32-bit numbers, and we wanted to support not just 8 operations but 32 operations. Then we would have even more wires (32*32 = 1024 wires at the mux inputs), and even more power consumption. Furthermore, a 32x1 mux will require several levels of gates, because, for practical reasons, a 32-input logic gate (inside the mux) will likely need to be implemented using several levels of smaller logic gates.

Figure 4.76 8-bit DIP-switch-based multifunction calculator, using separate components for each function.

We saw in the above example that using separate components for each operation is not efficient. To solve the problem, we observe that the calculator can only be configured to do one operation at a time, so there is no need to compute all the operations in parallel as was done in the example. Instead, we can create a single component (an ALU) that can compute any of the eight operations. Such a component would be more area and power efficient, and would have less delay because a large mux would not be needed.
Let's start with an adder as our base internal ALU design. To avoid confusion, we'll call the inputs to the internal adder IA and IB, short for "internal A" and "internal B," to distinguish those inputs from the external ALU inputs A and B. We start with the design shown in Figure 4.77(a). The ALU consists of an adder, and some logic in front of the adder's inputs. We'll call that logic an arithmetic/logic extender, or AL-extender. The purpose of the AL-extender is to set the adder inputs based on the values of the ALU's control inputs x, y, and z, such that the desired arithmetic or logic result appears at the adder's output. The AL-extender actually consists of eight identical components labeled abext, one for each pair of bits ai and bi, as shown in Figure 4.77(b). It also has a component cinext to compute the cin bit.
Thus, we need to design the abext and cinext components to complete the ALU design. Consider the first four calculator operations from Table 4.2, which are all arithmetic operations:
When xyz=000, S=A+B. So in that case, we want IA=A, IB=B, and cin=0.
When xyz=001, S=A-B. So we want IA=A, IB=B', and cin=1.
When xyz=010, S=A+1. So we want IA=A, IB=0, and cin=1.
When xyz=011, S=A. So we want IA=A, IB=0, and cin=0. Notice that A will pass through the adder, because A+0+0=A.
The last four ALU operations are all logical operations. We can compute the desired operation in the abext component, and input the result to IA. We then set IB to 0 and cin to 0, so that the value on IA passes through the adder unchanged.
One possible design of abext places an 8x1 mux in front of each output of the abext and cinext components, with x, y, and z as the select inputs, in which case we would set each mux data input as described above. A more efficient and faster design would create a custom circuit for each component output. We leave the completion of the internal design of the abext and cinext components as an exercise for the reader.
Example 4.23 redesigns the multifunction calculator of Example 4.22, this time utilizing an ALU.

EXAMPLE 4.23 Multi-function calculator using an ALU
Example 4.22 built an eight-function calculator without an ALU. The result was wasted area and power, complex wiring, and long delay. Using the above-designed ALU, the calculator could instead be built as shown in Figure 4.78. Notice the simple and efficient design.

Figure 4.77 Arithmetic-logic unit: (a) ALU design based on a single adder with an arithmetic/logic extender, and (b) arithmetic/logic extender detail.

Figure 4.78 8-bit DIP-switch-based multi-function calculator, using an ALU.
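The adder-plus-extender structure of the ALU can be sketched in a few lines. This is a minimal Python sketch under stated assumptions: the function names al_extender and alu are hypothetical, the model works on whole 8-bit values rather than on per-bit abext components, and the encoding of the four logical operations is taken from Table 4.2. It is one possible behavioral reading of the design, not the internal circuit the text leaves as an exercise.

```python
MASK = 0xFF  # 8-bit datapath, as in the calculator examples


def al_extender(x, y, z, a, b):
    """Return (IA, IB, cin), the values fed to the internal adder."""
    select = (x << 2) | (y << 1) | z
    if select == 0b000:            # S = A + B
        return a, b, 0
    if select == 0b001:            # S = A - B, computed as A + B' + 1
        return a, ~b & MASK, 1
    if select == 0b010:            # S = A + 1
        return a, 0, 1
    if select == 0b011:            # S = A
        return a, 0, 0
    # Logical operations: compute the result in the extender and pass it
    # through the adder unchanged by setting IB = 0 and cin = 0.
    logic = {
        0b100: a & b,
        0b101: a | b,
        0b110: a ^ b,
        0b111: ~a & MASK,
    }[select]
    return logic, 0, 0


def alu(x, y, z, a, b):
    """ALU output S: a single adder behind the AL-extender."""
    ia, ib, cin = al_extender(x, y, z, a, b)
    return (ia + ib + cin) & MASK
```

Only one addition is performed per call, whichever operation is selected; for instance alu(0, 0, 1, 0b00001111, 0b00000101) adds 00001111 + 11111010 + 1 and returns 0b00001010.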
4.10 REGISTER FILES
An MxN register file is a datapath memory component that provides efficient access to a collection of M registers, where each register is N bits wide. To understand the need for a register file component in building good datapaths, rather than just using M separate registers, consider Example 4.24.

EXAMPLE 4.24 Above-mirror display system using 16 32-bit registers
Recall the above-mirror display system from Example 4.4. Four 8-bit registers were multiplexed to an 8-bit output. Suppose instead that the system required sixteen 32-bit registers, to display more values, each of more precision. We would therefore need a 32-bit-wide 16x1 multiplexor, as shown in Figure 4.79. From a purely digital logic perspective, the design is just fine. But in practice, that multiplexor is very inefficient. Count the number of wires that would be fed into that multiplexor: 16x32 = 512 wires. That's a lot of wires to try to route from the registers to the muxes; try plugging 512 wires into the back of one stereo system for a hands-on demonstration. Having too many wires in a small area is known as congestion.

Figure 4.79 Above-mirror display design, assuming sixteen 32-bit registers. The mux has too many input wires, resulting in congestion. Also, the data lines C are fanned out to too many registers, resulting in weak current.

Likewise, consider routing the data input to all sixteen registers. Each data input wire is being branched into sixteen subwires. Imagine electric current being like a river of water: branching a main river into sixteen smaller rivers will yield much less water flow in each smaller river than in the main river. Likewise, branching a wire, known as fanout, can only be done so many times before the branched wires' currents are too small to sufficiently control transistors. Furthermore, low-current wires may be very slow also, so fanout can create long delays over wires too.

The fanout and congestion problems illustrated in the previous example can be solved by observing that we never need to load more than one register at a time, and that we never need to read more than one register at a time either. An MxN register file solves the fanout and congestion problems by grouping the M registers into a single component, with that component having a single N-bit wide data input, and a single N-bit wide data output. The wiring inside the component is done carefully to handle fanout and congestion. Figure 4.80 shows a block symbol of a 16x32 register file (16 registers, each 32 bits wide).
Consider writing a value to a register in a register file. We would place the data to be written on the input W_data. We then need a way to indicate which register we actually want to write. Since there are 16 registers, we need four bits to specify a particular register. Those four bits are called the register's address. We would thus write the desired register's address on the input W_addr. For example, if we wanted to write to register 7, we would set W_addr=0111. To indicate that we actually want to write on a particular clock cycle (we won't want to write on every cycle), we would set the input W_en to 1. The collection of inputs W_data, W_addr, and W_en is known as a register file's write port.
Reading is similar. We would specify the register to read on input R_addr, and set R_en=1. Those values would cause the register file to output the addressed register's contents onto output R_data. R_data, R_addr, and R_en are known as a register file's read port. The read port and write port are independent of one another. Thus, during the same clock cycle, we can write to one register, and read from another (or the same) register.
Let's consider how to internally design a register file. For simplicity, consider a 4x32 register file, rather than the 16x32 register file described above. One internal design of a 4x32 register file is shown in Figure 4.81. Let's consider the circuitry for writing to this register file, found in the left half of the figure. If W_en=0, the register file won't write to any register, because the write decoder's outputs will be all 0s. If W_en=1, then the write decoder decodes W_addr and sets to 1 the load input of exactly one register. That register will be written on the next clock cycle with the value on W_data.
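The write circuitry just described, a decoder with an enable driving the registers' load inputs, can be modeled in a few lines. This is a minimal Python sketch; the function names write_decoder and clock_edge are hypothetical, and registers are modeled as a plain list of integers rather than 32-bit hardware registers.

```python
def write_decoder(w_addr, w_en, num_registers=4):
    """Return one load signal per register.

    With w_en=0 every output is 0; with w_en=1 exactly the addressed
    register's load input is 1.
    """
    return [1 if (w_en == 1 and index == w_addr) else 0
            for index in range(num_registers)]


def clock_edge(registers, w_data, w_addr, w_en):
    """Return the register contents after one rising clock edge."""
    loads = write_decoder(w_addr, w_en, len(registers))
    # A register takes W_data only if its load input is 1; otherwise it holds.
    return [w_data if load == 1 else old
            for old, load in zip(registers, loads)]
```

For a 4-register file, write_decoder(2, 1) gives [0, 0, 1, 0], while write_decoder(2, 0) gives [0, 0, 0, 0], so no register is written when W_en is 0.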
Notice the circled triangular one-input one-output component placed on the W_data line (there would actually be 32 such components since W_data is 32 bits wide). That component is known as a driver, sometimes called a buffer, illustrated in Figure 4.82(a). A driver's output equals its input, but the output is a stronger (higher current) signal. Remember the fanout problem we described in Example 4.24? A driver reduces the fanout problem. In Figure 4.81, the W_data lines only fan out to two registers before they go through the driver. The driver's output then fans out to only two more registers. Thus, instead of a fanout of four, the W_data lines have a fanout of only two (actually three if you count the driver itself). The insertion of drivers is beyond the scope of this book, and is instead a subject for a VLSI design book or an advanced digital design book. But seeing at least one example of the use of a driver hopefully gives you an idea of one reason why a register file is a useful component: the component hides the complexity of fanout from a designer.

Figure 4.82 (a) driver: q=d, (b) three-state driver: when c=1, q=d; when c=0, q='Z' (like no connection).

To understand the read circuitry, you must first understand the behavior of another new component that we've introduced: the triangular component having two inputs and one output. That component is known as a three-state driver or three-state buffer, illustrated in Figure 4.82(b). When the control input c is 1, the component acts like a regular driver: the component's output equals its input. However, when the control input c is 0, the driver's output is neither 0 nor 1, but instead what is known as high-impedance, written as 'Z'. High-impedance can be thought of as no connection at all between the driver's input and output. "Three-state" means the driver has three possible output states: 0, 1, and Z.

Such components are more commonly known as "tri-state" drivers rather than "three-state." But "tri-state" is a registered trademark of National Semiconductor Corp., so rather than putting the required trademark symbol after every use of the term "tri-state," many documents use the term "three-state."

Let's now consider the circuitry for reading from the register file, found in the right half of Figure 4.81. If R_en=0, the register file won't read from any register, since the read decoder's outputs will be all 0s, meaning all the three-state drivers will output Z's, and thus the output R_data will be high-impedance. If R_en=1, then the read decoder decodes R_addr and sets to 1 the control input of exactly one three-state driver, which will pass its register value through to the R_data output.
Be aware that each shown three-state driver actually represents a set of 32 three-state drivers, one for each of the 32 wires coming from the 32-bit registers and going to the 32-bit R_data output. All 32 drivers in a set are controlled by the same control input.
The wires fed by the various three-state drivers are known as a bus, as indicated in Figure 4.81. A bus is a popular alternative to a multiplexor when each mux data input is many bits wide and/or when there are many mux data inputs, because a bus results in less congestion.
Notice that the register file design scales well to larger numbers of registers. The write data lines can be driven by more drivers if necessary. The read data lines are fed from three-state drivers and thus there is no congestion at a single multiplexor. The reader may wish to compare the register file design in Figure 4.81 with the design in Figure 4.6, which was essentially a poor design of a register file.

Figure 4.83 provides example timing diagrams describing writing and reading of a register file. During cycle1, we do not know the contents of the register file, so the register file's contents are shown as "?". During cycle1, we set W_data=9 (in binary, of course), W_addr=3, and W_en=1. Those values cause a write of 9 to register file location 3 on the first clock edge. Notice that we had set R_en=0, so the register file outputs nothing ("Z"), and the value we put on R_addr does not matter (the value is a "don't care", written as "X").

Figure 4.83 Writing and reading a register file.

During cycle2, we set W_data=22, W_addr=1, and W_en=1. These values cause a write of 22 to register file location 1 on clock edge 2.
During cycle3, we set W_en=0, so then it doesn't matter to what values we set W_data and W_addr. We also set R_addr=3 and R_en=1. Those values cause the register file to read out the contents of register file location 3 onto R_data, causing R_data to output 9. Notice that the reading is not synchronized to the clock edge; R_data changes soon after R_en becomes 1. Examining the design of Figure 4.81 should make clear why reading is not synchronous: setting R_en to 1 simply enables the output decoder to turn on one set of the three-state buffers.
During cycle4, we return R_en to 0. Note that this causes R_data to become "Z" again.
During cycle5, we want to simultaneously write and read the register file. We read location 1 (which causes R_data to become 22) while simultaneously writing location 2 with the value 177.
Finally, during cycle6, we want to simultaneously read and write the same register file location. We set R_addr=3 and R_en=1, causing location 3's contents of 9 to appear on R_data shortly after setting those values. We also set W_addr=3, W_data=555, and W_en=1. On clock edge 6, 555 thus gets stored into location 3. Notice that soon after that clock edge, R_data also changes to 555.
The ability to simultaneously read and write locations of a register file, even the same location, is a widely used feature of register files. The next example makes use of that feature.
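The six-cycle walk-through of Figure 4.83 can be replayed with a small behavioral model. This is a minimal Python sketch under stated assumptions: the class name RegisterFile and its methods are hypothetical, unknown contents ("?") are modeled as None, and high-impedance is modeled as the string "Z". Reading is combinational (not tied to a clock edge), while writing happens only when clock_edge is called, as in the text.

```python
class RegisterFile:
    """Behavioral model of a register file with one read and one write port."""

    def __init__(self, num_registers=4):
        # None models the unknown "?" contents before the first write.
        self.regs = [None] * num_registers

    def read(self, r_addr, r_en):
        """Asynchronous read: R_data follows R_addr whenever R_en is 1."""
        return self.regs[r_addr] if r_en == 1 else "Z"

    def clock_edge(self, w_data, w_addr, w_en):
        """Synchronous write: takes effect only on a clock edge with W_en=1."""
        if w_en == 1:
            self.regs[w_addr] = w_data


rf = RegisterFile()
rf.clock_edge(w_data=9, w_addr=3, w_en=1)      # cycle1: write 9 to location 3
rf.clock_edge(w_data=22, w_addr=1, w_en=1)     # cycle2: write 22 to location 1
cycle3 = rf.read(r_addr=3, r_en=1)             # cycle3: read location 3
cycle4 = rf.read(r_addr=3, r_en=0)             # cycle4: read disabled
cycle5 = rf.read(r_addr=1, r_en=1)             # cycle5: read location 1 ...
rf.clock_edge(w_data=177, w_addr=2, w_en=1)    # ... while writing location 2
before = rf.read(r_addr=3, r_en=1)             # cycle6: old value before edge
rf.clock_edge(w_data=555, w_addr=3, w_en=1)    # clock edge 6 stores 555
after = rf.read(r_addr=3, r_en=1)              # R_data follows the new value
```

After the replay, cycle3 is 9, cycle4 is "Z", cycle5 is 22, and the cycle6 read changes from 9 to 555 across the clock edge, matching the description of the timing diagram.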


EXAMPLE 4.25 Above-mirror display system using a 16x32 register file
Example 4.4 used four 8-bit registers for an above-mirror display system. Example 4.24 extended the system to use sixteen 32-bit registers, resulting in fanout and congestion problems. We can redo that example using a register file. The design is shown in Figure 4.84. Since the system always outputs one of the register values to the display, we tied the R_en input to 1. Notice that the writing and reading of particular registers are independent of one another.

Figure 4.84 Above-mirror display design, using a 16x32 register file.

A register file having one read port and one write port is sometimes referred to as a dual-ported register file. To make clear that the two ports consist of one read port and one write port, such a register file may be referred to as follows: dual-ported (1 read, 1 write) register file.
A register file may actually have just one port, which would be used for both reading and writing. Such a register file has only one set of data lines that can serve as inputs or outputs, one set of address inputs, an enable input, and one more input indicating whether we wish to write or read the register file. Such a register file is known as a single-ported register file.

Multiported (2 Read, 1 Write) Register File. Many register files have three ports: one write port, and two read ports. Thus, in the same clock cycle, two registers can be read simultaneously, and another register written. Such a register file is especially useful in a microprocessor, since a typical microprocessor instruction operates on two registers and stores the result in a third register, like in the instruction "R0 <- R1 + R2."
We can create a second read port in a register file by adding another set of lines, Rb_data, Rb_addr, and Rb_en. We would introduce a second read decoder with inputs Rb_addr and enable input Rb_en, a second set of three-state drivers, and a second bus connected to the Rb_data output.

Other Register File Variations. Register files come in all sorts of configurations. Typical numbers of registers in a register file range from 4 to 1024, and typical register widths range from 8 bits to 64 bits per register, but sizes may vary beyond those ranges. Register files may have one port, two ports, three ports, or even more, but increasing to many more than three ports can slow down the register file's performance and increase its size significantly, due to the difficulty of routing all those wires around inside the register file. Nevertheless, you'll occasionally run across register files with perhaps 3 write ports and 3 read ports, when concurrent access is critical.

The most ports I've seen on a register file in a product was 10 read ports and 5 write ports.

4.11 DATAPATH COMPONENT TRADEOFFS (SEE SECTION 6.4)
For each datapath component that we introduced in previous sections, we created the most basic and easy-to-understand implementation. In this section, which physically appears in the book as Section 6.4, we describe alternative implementations of several datapath components. Each alternative trades off one design criterion for another; most of those alternatives trade off larger size in exchange for less delay. One use of this book covers those alternative implementations immediately after introducing the basic implementations (meaning now). Another use of the book covers those alternative implementations later, after showing how to use datapath components during register-transfer level design.

4.12 DATAPATH COMPONENT DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES (SEE SECTION 9.4)
This section, which physically appears in the book as Section 9.4, shows how to use HDLs to describe several datapath components. One use of the book describes such HDL use now, while another use describes such HDL use later.

4.13 PRODUCT PROFILE: AN ULTRASOUND MACHINE
If you or someone you know has ever had a baby, then you may have seen ultrasound images of that baby before he/she was born, like the images of a fetus' head in Figure 4.85(a).

Figure 4.85 (a) Ultrasound image of a fetus, created using an ultrasound device that is simply placed on the mom's abdomen (b) and that forms the image by generating sound waves and listening to the echoes. Photos courtesy of Philips Medical Systems.

That image wasn't taken by a camera somehow inserted into the uterus, but rather by an ultrasound machine pressed against the mom's skin and pointed toward the fetus. Ultrasound imaging is now common practice in obstetrics, mainly helping doctors to track the fetus' progress and correct potential problems early, but also giving parents a huge thrill when they get their first glimpse of their baby's head, hands, and tiny feet.
Functional Overview
This section briefly describes the key functional ideas of how ultrasound imaging works. Digital designers don't typically work in a vacuum; instead, they apply their skills to particular applications, and thus designers typically learn the key functional ideas underlying those applications. We therefore introduce you to the basic ideas of ultrasound applications.

Real designers must often learn about the domain for which they will design. Many designers consider such learning about domains, like ultrasound, as one of the fascinating features of the job.

Ultrasound imaging works by sending sound waves into the body and listening to the echoes that return. Objects like bones yield different echoes than objects like skin or fluid, so an ultrasound machine processes the different echoes to generate images like those in Figure 4.85(a): strong echoes might be displayed as white, weak ones as black. Today's ultrasound machines rely heavily on fast digital circuits to generate the sound waves, listen to the echoes, and process the echo data to generate good quality images in real time.

Figure 4.86 Basic components of an ultrasound machine.

Figure 4.86 illustrates the basic parts of an ultrasound machine. Let's discuss each part individually.

Transducer
A transducer converts energy from one form to another. You're certainly familiar with one type of transducer, a stereo speaker, which converts electrical energy into sound by changing the current in a wire, which causes a nearby magnet to move back and forth, which pushes the air and hence creates sound. Another familiar transducer is a dynamic microphone, which converts sound into electrical energy by letting sound waves move a magnet, which induces current changes in a nearby wire. In an ultrasound machine, the transducer converts electrical pulses into sound pulses, and sound pulses (the echoes) into electrical pulses, but the transducer uses piezoelectric crystals instead of magnets. Applying electric current to such a crystal causes the crystal to change shape rapidly, or vibrate, thus generating sound waves, typically in the 1 to 30 Megahertz frequency range. Humans can't hear much above 30 kilohertz; the term "ultrasound" refers to the fact that the frequency is beyond human hearing. Inversely, sound waves (echoes) hitting the crystal create electric current. An ultrasound machine's transducer component may contain hundreds of such crystals, which we can think of as hundreds of transducers. Each such transducer is considered to form a channel.

Beamformer
A beamformer electronically "focuses" and "steers" the sound beam of an array of transducers to or from particular focal points, without actually moving any hardware like a dish to obtain such focusing and steering.
To understand the idea of beamforming, we must first understand the idea of additive sound. Consider two loud fireworks exploding at the same time, one 1 mile away from you, and the other 2 miles away. You'll hear the closer firework after about 5 seconds, assuming sound travels 0.2 miles/second (or 1 mile every 5 seconds), a reasonable approximation. You'll hear the farther firework after about 10 seconds. So you'll hear "boom ... (five seconds pass) ... boom." However, suppose instead that the closer firework exploded 5 seconds later than the farther one. Then you'll hear both at the same time: one big "BOOM!" That's because the two sounds add together. Now suppose there are 100 fireworks spread throughout a city, and you want all the sound from those fireworks to reach one particular house (perhaps somebody you don't like very much) at the same time. You can do this by exploding the closer fireworks later than the farther fireworks. If you time everything just right, that particular house will hear a tremendously loud single "BOOOOOM!!!!", probably rattling the house's walls pretty good, as if one huge firework had exploded. Other houses throughout the city will instead hear a series of quieter booms, since the timing of the explosions doesn't result in all the sounds adding at those other houses.
Now you understand a basic principle of beamforming: If you have multiple sound sources (fireworks in our example, transducers in an ultrasound machine) in different locations, you can cause the sound to add together at any desired point in space, by carefully timing the generation of sound from each source such that all the sound waves arrive at the desired point at the same time. In other words, you can electronically focus and steer the sound beam by introducing appropriate delays. Focusing and steering the sound to a particular point is useful because then that point will produce a much louder echo than all other points, so we can easily hear the echo from that point over all the echoes from other points.
Figure 4.87 illustrates the concept of electronic focusing and steering, using two sound sources to focus and steer a beam to a desired point X.

Figure 4.87 Focusing sound at a particular point using beamforming: (a) first time step: only the bottom transducer generates sound, (b) second time step: the top transducer now generates sound too, (c) third time step: the two sound waves add at the focal point (both waves reach the focal point at the same time), (d) an illustration showing that the top transducer is two time steps away from the focal point, while the bottom transducer is three time steps away, meaning the top transducer should generate sound one time step later than the bottom transducer.
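The timing rule behind beamforming, fire the closer sources later so that every wave arrives at the same moment, reduces to a short calculation. This is a minimal Python sketch; the function name firing_delays is hypothetical, and it assumes a single constant propagation speed and known distances from each source to the focal point.

```python
def firing_delays(distances, speed):
    """Return how long each source should wait before generating sound.

    The farthest source fires immediately (delay 0); each closer source
    waits by the difference in travel time, so all waves reach the focal
    point together.
    """
    travel_times = [distance / speed for distance in distances]
    longest = max(travel_times)
    return [longest - t for t in travel_times]


# Fireworks example: 1 mile and 2 miles away, sound at 0.2 miles/second.
firework_delays = firing_delays([1, 2], speed=0.2)

# Figure 4.87(d): top transducer 2 time steps away, bottom 3 time steps away,
# measured in units where sound travels one step of distance per time step.
transducer_delays = firing_delays([2, 3], speed=1)
```

In the fireworks case the closer firework should wait about 5 seconds; in the two-transducer case the top transducer should fire one time step after the bottom one, as the figure caption states.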
At the first time step (Figure 4.87(a)), the bottom source has begun transmitting its sound wave. After two time steps (Figure 4.87(b)), the top source has begun transmitting its sound wave. After three time steps (Figure 4.87(c)), the waves from both sources reach the focal point, adding together. They'll continue adding as long as the waves from both sources are in phase with one another. We can simplify the drawing by showing only the lines from the sources to the focal point, as shown in Figure 4.87(d).
An ultrasound machine uses this ability to electronically focus and steer sound, in order to scan, point by point, the entire region in front of the transducers. The machine does such scanning perhaps tens or hundreds of times per second.
For each focal point, the machine needs to listen to the echo that comes back from whatever object is located at the focal point, to determine if that object is bone, skin, blood, etc., utilizing the fact that each such object generates a different echo. Remember, the echo from the focal point will be louder than echoes from other points, because the sound adds at that point. We can use beamforming to also focus in on a particular point in space that we want to listen to. In the same way that we generated sound pulses with particular delays to focus the sound on a particular point, likewise, to "listen" to the sounds from a particular point, we also want to introduce delays to the signals received by the transducers. That's because the sounds will arrive at the closer transducers sooner than at the farther transducers, so by using appropriate delays, we can "line up" the signals from each transducer so that the sounds coming from the focal point all add together. This concept is shown in Figure 4.88.

Figure 4.88 Listening to sound from a particular point using beamforming: (a) first time step, (b) second time step: the top transducer has heard the sound first, (c) third time step: the bottom transducer hears the sound at this time, (d) delaying the top transducer by one time step results in the waves from the focal point adding, amplifying the sound.

Beamforming is tremendously common in a wide variety of sonar applications, such as observing a fetus, observing a human heart, searching for oil underground, monitoring the surroundings of a submarine, spying, etc. Beamforming is used in some hearing aids having multiple microphones, to focus in on the source of detected speech; in that case, the beamforming must be adaptive. Beamforming can be used in multimicrophone cell phones to focus in on the user's voice, and can even be used in cellular telephone base stations (using radio signals though, not sound waves) to focus a signal going to or coming from a cell phone.

Signal Processor, Scan Converter, and Monitor
The signal processor analyzes the echo data of every point in the scanned region, by filtering out noise (see Section 5.11 for a discussion on filtering), interpolating between points, assigning a level of gray to each point depending on the echoes heard (echoes corresponding to bones might be shaded as white, liquid as black, and skin as gray, for example), and other tasks. The result is a gray-scale image of the region. The scan converter steps through this image to generate the necessary signals for a black-and-white monitor, and the monitor displays the image.

Digital Circuits in an Ultrasound Machine's Beamformer
Much of the control and signal processing tasks in an ultrasound machine are carried out using software running on one or more microprocessors, typically special microprocessors specifically designed for digital signal processing, known as digital signal processors, or DSPs. But certain tasks are much more amenable to custom digital circuitry, such as those in the beamformer.

Sound Generation and Echo Delay Circuits
Beamforming during the sound generation step consists of providing appropriate delays to hundreds of transducers. Those delays vary depending on the focal point, so they can't be built into the transducers themselves. Instead, we can place a delay circuit in front of each transducer, as shown in Figure 4.89. For a given focal point, the DSP writes the appropriate delay value into each delay circuit, by writing the delay
val ue on the bus labeled de lay_out. Figure U9 Transducer OUtpul dcll~ in-uilS for
NOle that lhere wi ll cenninly be echoes from olher poinL' in Ihe region, but those writi ng the "address" on the lines Iwo channels,
coming from the foca l poilll will be much slronger- hence, the weaker echoes can be fil· fabel ed add r. and enabling the decoder,
lered OU!.
The decoder will lhu et the load line
NOIice lhal beamforming can be u' cd to listen to a panicu lar po int even if the ounds of one of the OllrDeia . component ,
com ing from lhat poim are not echoe' coming bac ~ from o ur 0\\ n ,ound pu lses-the fter wri ting to every ueh c mpollcnt. the 0 P stJJ'lS all of them simullJ.lleQ\l>· ~ b)
,ound cou ld be coming from the objeCt at the point IL,clf. ,uch u\ a cur cngi ne or a person selli ng s ta rCou t to 1. Each OIlIDelll), c mponeOl \\ill. after the _pe<-ilied deJa), put
talking. 8eamformi ng b Ihe electronic equivafelll to poi llltng n hlg flambolic dish in a its 0 output, which we'll assume cau es the lransdul'Cr to generate s undoTIte D P \\ auld
panlcu lar direction, bU I beamforming require\ no rnovlIIg PUrt, then sel S ta rt _ out to O. and then Ii -ten for th -ho,
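The transmit sequence described above can be modeled behaviorally in a few lines of Python. This is only a sketch under stated assumptions: the class name OutDelay mirrors the component named in the text, but the method names (load, clock), the helper fire_cycles, and the example delay values are illustrative choices and are not taken from the book's figures. Each component is modeled as a counter that the DSP loads, that starts counting down when start_out is 1, and that pulses its output for one cycle when the count reaches zero. The exact cycle on which real hardware would pulse may differ by one from this model.

```python
class OutDelay:
    """Behavioral model of one output delay circuit (hypothetical interface)."""

    def __init__(self):
        self.count = 0        # contents of the down-counter
        self.running = False  # True while the counter is counting down
        self.fired = False    # True once the output pulse has been produced

    def load(self, delay):
        """Model the DSP writing a delay value while this component's load line is set."""
        self.count = delay
        self.running = False
        self.fired = False

    def clock(self, start_out):
        """Advance one clock cycle; return 1 on the cycle the transducer should fire."""
        if start_out and not self.running and not self.fired:
            self.running = True
        if not self.running:
            return 0
        if self.count == 0:
            # Terminal count reached: pulse the output exactly once.
            self.running = False
            self.fired = True
            return 1
        self.count -= 1
        return 0


def fire_cycles(delays, n_cycles=16):
    """Load one OutDelay per channel, start them together, report when each fires."""
    channels = [OutDelay() for _ in delays]
    for channel, delay in zip(channels, delays):
        channel.load(delay)
    fired_at = [None] * len(delays)
    for cycle in range(n_cycles):
        for i, channel in enumerate(channels):
            if channel.clock(start_out=1):
                fired_at[i] = cycle
    return fired_at


if __name__ == "__main__":
    # Two channels with assumed delays of 3 and 1 cycles.
    print(fire_cycles([3, 1]))
```

In this model every channel is started on the same cycle, and the only thing that differs between channels is the loaded delay value, which is the property the beamformer relies on.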

We can implement the OutDelay component using a down-counter with parallel load, as shown in Figure 4.90. The parallel load inputs L and ld load the down-counter with its count value. The cnt input commences the down-counting; when the counter reaches zero, the counter pulses tc. The data output of the counter is unused in this implementation.

Figure 4.90 OutDelay circuit.

After the ultrasound machine sends out sound waves focused on a particular focal point, the machine must listen to the echo coming back from that focal point. This listening requires appropriate delays for each transducer to account for the differing distances of each transducer from the focal point. Thus, each transducer needs another delay circuit for delaying the received echo signal, as shown in Figure 4.91. The EchoDelay component receives on input t the signal from the transducer, which we'll assume has been digitized into a stream of N-bit values. The component should output that signal on output t_delayed, delayed by the appropriate amount. The delay amount can be written by the DSP using the component's d and ld inputs.

Figure 4.91 Transducer output and echo delay circuits for one channel.

We can implement the EchoDelay component using a series of registers, as shown in Figure 4.92. That implementation can delay the output signal by 0, 1, 2, or 3 clock cycles, simply using the appropriate select line values for the 4x1 mux. A longer register chain, along with a larger mux, would support longer delays. The DSP configures the delay amount by writing to the top register, which sets the 4x1 mux select lines. A more flexible implementation of the EchoDelay component would instead use a timer component.

Figure 4.92 EchoDelay circuit.

Summation Circuits: Adder Tree

The output of each transducer, appropriately delayed, should be summed to create a single echo signal from the focal point, as was illustrated in Figure 4.88. That illustration had only two transducers, and thus only one adder. What if we have 256 transducers, as would be more likely in a real ultrasound machine? How do we add 256 values? We could add the values in a linear way, as illustrated in Figure 4.93(a) for eight values. The delay of that circuit is roughly equal to the delay of seven adders. For 256 values, the delay would roughly be that of 255 adders. That's a very long delay.

We can do better by reorganizing how we compute the sum, using a configuration of adders known as an adder tree. In other words, rather than computing ((((((A+B)+C)+D)+E)+F)+G)+H, depicted in Figure 4.93(a), we could instead compute ((A+B)+(C+D)) + ((E+F)+(G+H)), as shown in Figure 4.93(b). The answer comes out the same, and uses the same number of adders, but the latter method computes four additions in parallel, then two additions in parallel, and then performs a last addition. The delay is thus only that of three adders. For 256 values, the tree's first level would compute 128 additions in parallel, the second level would compute 64 additions, then 32, then 16, then 8, then 4, then 2, and finally 1 last addition. Thus, that adder tree would have eight levels, meaning a total delay equal to eight adder delays. That's a lot faster than 255 adder delays: roughly 32 times faster, in fact.

Figure 4.93 Adding many numbers: (a) linearly, (b) using an adder tree. Note that both methods use seven adders.

The output of the adder tree can be fed into a memory to keep track of the results for the DSP, which may access the results sometime after they are generated.

Multipliers

We presented a greatly simplified version of beamforming above. In reality, many other factors must be considered during beamforming. Several of those considerations can be accounted for by multiplying each channel with specific constant values, which the DSP again sets individually for each channel. For example, focusing on a point close to the handheld device may require us to more heavily weigh the incoming signals of transducers near the center of the device. A channel may therefore actually include a multiplier, as shown in Figure 4.94. The DSP could write to the register shown, which would represent a constant by which the transducer signal would be multiplied.

Figure 4.94 Channel extended with a multiplier.

Our introduction of the ultrasound machine is greatly simplified from real machines, yet even in this simplified introduction, you can see many of this chapter's datapath components in use. We used a down-counter to implement the OutDelay component, and several registers along with muxes for the EchoDelay component. We used many adders to sum the incoming transducer signals, and we used a multiplier to weigh the incoming signals.
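To make the receive path concrete, the following Python sketch models one possible reading of it: a register chain with a mux for each echo delay, a per-channel constant multiplier, and an adder tree that sums the channels. It is a behavioral illustration only. The names EchoDelay and adder_tree follow the text, while beamform_receive, the attribute select, and the sample values are assumptions introduced here for the example.

```python
from collections import deque


class EchoDelay:
    """Register chain plus mux: delays a sample stream by `select` cycles (0..depth)."""

    def __init__(self, depth=3):
        # regs[0] holds the most recently stored sample, regs[-1] the oldest.
        self.regs = deque([0] * depth, maxlen=depth)
        self.select = 0  # mux select value, assumed to be written by the DSP

    def clock(self, t):
        taps = [t] + list(self.regs)  # tap 0 is the undelayed input
        t_delayed = taps[self.select]
        self.regs.appendleft(t)       # shift the chain by one register
        return t_delayed


def adder_tree(values):
    """Sum values pairwise, level by level; return (sum, number_of_levels)."""
    level, levels = list(values), 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)           # pad an odd level with a zero input
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        levels += 1
    return level[0], levels


def beamform_receive(streams, delays, weights):
    """Delay, weigh, and sum the channel streams one clock cycle at a time."""
    channels = [EchoDelay() for _ in streams]
    for channel, delay in zip(channels, delays):
        channel.select = delay
    output = []
    for samples in zip(*streams):
        weighted = [w * ch.clock(s) for ch, s, w in zip(channels, samples, weights)]
        total, _ = adder_tree(weighted)
        output.append(total)
    return output


if __name__ == "__main__":
    # Two channels, in the spirit of Figure 4.88: the top transducer hears the
    # pulse one time step before the bottom one, so the top is delayed by one.
    top, bottom = [0, 5, 0, 0], [0, 0, 5, 0]
    print(beamform_receive([top, bottom], delays=[1, 0], weights=[1, 1]))
    print(adder_tree(range(1, 9)))   # eight values need three levels
    print(adder_tree([1] * 256)[1])  # 256 values need eight levels
```

The level count returned by adder_tree corresponds to the adder-delay figures quoted above: three levels for eight values and eight levels for 256 values, compared with seven and 255 adder delays for the linear arrangement.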

Future Challenges in Ultrasound

Over the past two decades, ultrasound machines have moved from mostly analog machines to mostly digital machines. The digital systems consist of both custom digital circuits and software on DSPs and microprocessors, working together to create real-time images.

One of the main trends in ultrasound machines involves creating three-dimensional (3-D) images in real time. Most ultrasound machines of the 1990s and 2000s generated two-dimensional images, with the quality of those images (e.g., more focal points per image) improving during those decades. In contrast to two-dimensional ultrasound, generating 3-D images requires viewing the region of interest from different perspectives, just like people view things from their two eyes. Such generation also requires extensive computations to create a 3-D image from the two (or more) perspectives. The result is a picture like that in Figure 4.95.

That's a fetus' face. Impressive, isn't it? Keep in mind that image is made solely from sound waves bouncing into a woman's womb. Color can also be added to distinguish among different fluids and tissues. Those computations take time, but faster processors, coupled with clever custom digital circuits, are bringing real-time 3-D ultrasound closer to reality.

Figure 4.95 3-D ultrasound image of a fetus's face. Photo courtesy of Philips Medical Systems.

Another trend is toward making ultrasound machines smaller and lighter, so that they can be used in a wider variety of health care situations. Early machines were big and heavy, with more recent ones coming on rollable carts. Some recent versions are handheld. A related trend is making ultrasound machines cheaper, so that perhaps every doctor could have a machine in every examination room, every ambulance could carry a machine to help emergency personnel ascertain the extent of certain wounds, and so on.

Ultrasound is used for numerous other medical applications, such as imaging of the heart to detect artery or valve problems. Ultrasound is also used in various other applications, like submarine region monitoring.

4.14 CHAPTER SUMMARY

In this chapter, we began (Section 4.1) by introducing the idea of new building blocks intended for common operations on multibit data, with those blocks being known as datapath components, or register-transfer-level components. We then introduced a number of datapath components, including registers, shifters, adders, comparators, counters, multipliers, subtractors, arithmetic-logic units, and register files. For each component, we examined two aspects: the internal design of the component, and the use of the component in applications.

We ended (Section 4.13) by describing some basic principles underlying the operation of an ultrasound machine, and showing how several of the datapath components might be used to implement parts of such a machine. One thing you might notice is how designing a real ultrasound machine would require some knowledge of the domain of ultrasound. The requirement that a software programmer or digital designer have some understanding of an application domain is quite common.

In the coming chapters, you'll apply your knowledge of combinational logic design, sequential logic design (controller design), and datapath components, to build digital circuits that can implement very general and powerful computations.

4.15 EXERCISES

Exercises marked with an asterisk (*) represent especially challenging problems. For exercises relating to datapath components, each problem indicates whether the problem emphasizes the component's internal design or the component's use.

SECTION 4.2: REGISTERS

4.1 Trace the behavior of an 8-bit parallel load register with input I, output Q, and load control input ld by completing the following timing diagram.

[Timing diagram: ld, Q]

4.2 Trace the behavior of an 8-bit parallel load register with input I, output Q, load control input ld, and synchronous clear input clr by completing the following timing diagram.

[Timing diagram: clr, clk, Q]

4.3 Design a 4-bit register with 2 control inputs s1 and s0, 4 data inputs I3, I2, I1, and I0, and 4 data outputs Q3, Q2, Q1, and Q0. When s1s0=00, the register maintains its value. When s1s0=01, the register loads I3..I0. When s1s0=10, the register clears itself to 0000. When s1s0=11, the register complements itself, so, for example, 0000 would become 1111, and 1010 would become 0101. (Component design problem.)

4.4 Repeat the previous problem, but when s1s0=11, the register reverses its bits, so 1110 would become 0111, and 1010 would become 0101. (Component design problem.)

4.5 Design an 8-bit register with 2 control inputs s1 and s0, 8 data inputs I7..I0, and 8 data outputs Q7..Q0. s1s0=00 means maintain the present value, s1s0=01 means load, and s1s0=10 means clear. s1s0=11 means to swap the high nibble with the low nibble (a nibble is 4 bits), so 11110000 would become 00001111, and 11000101 would become 01011100. (Component design problem.)
4.6 The radar gun used by a police officer is always outputting a radar signal and measuring the speed of the cars as they pass. However, when the officer wants to ticket an individual for speeding, the officer must save the measured speed of the car on the radar unit. Build a system to implement a speed save feature for the radar gun. The system has an 8-bit speed input S, an input B from the save button on the radar gun, and an 8-bit output D that will be sent to the radar gun's speed display. (Component use problem.)

SECTION 4.3: ADDERS

4.7 Trace the values appearing at the outputs of a 3-bit carry-ripple adder for every one full-adder-delay time period, when adding 111 with 011. Assume all inputs were previously zero for a long time.

4.8 Assuming all gates have a delay of 1 time unit, compute the longest time required to add two numbers using an 8-bit carry-ripple adder.

4.9 Assuming AND gates have a delay of 2 time units, OR gates have a delay of 1 time unit, and XOR gates have a delay of 3 time units, compute the longest time required to add two numbers using an 8-bit carry-ripple adder.

4.10 Design a 10-bit carry-ripple adder using 4-bit carry-ripple adders. (Component use problem.)

4.11 Design an adder that computes the sum of three 8-bit numbers, using 8-bit carry-ripple adders. (Component use problem.)

4.12 Design an adder that computes the sum of four 8-bit numbers, using 8-bit carry-ripple adders. (Component use problem.)

4.13 Design a digital thermometer that can compensate for errors in the temperature sensing device's output T, which is an 8-bit input to our system. The compensation amount can be positive only, and comes to our system via inputs a, b, and c, from a 3-pin DIP switch. Our system should output the compensated temperature on an 8-bit output U. (Component use problem.)

4.14 Repeat the previous problem, except that the compensation amount can be positive or negative, coming to our system via four inputs a, b, c, and d from a 4-pin DIP switch. The compensation amount is in two's complement form (so the person setting the DIP switch better know that!). Design the circuit. What is the range by which the input temperature can be compensated? (Component use problem.)

4.15 We can add three 8-bit numbers by chaining one 8-bit carry-ripple adder to the output of another 8-bit carry-ripple adder. Assuming every gate has a delay of 1 time-unit, compute the longest delay of this three 8-bit number adder. Hint: you may have to look carefully inside the carry-ripple adders, even inside the full-adders, to correctly compute the longest delay from any input to any output. (Component use problem.)

SECTION 4.4: SHIFTERS

4.16 Design an 8-bit shifter that shifts its inputs two bits to the right (shifting in 0s) when the shifter's shift control input is 1. (Component design problem.)

4.17 Design a circuit that outputs the average of four 8-bit inputs representing binary numbers (not in two's complement form). (Component use problem.)

4.18 Design a circuit that takes an 8-bit input D representing binary numbers (not in two's complement form), and outputs two times that value. (Component use problem.)

4.19 Design a circuit that outputs nine times its 8-bit input D representing binary numbers (not in two's complement form). Hint: use a shifter and an adder. (Component use problem.)

4.20 Design a special multiplier circuit that can multiply its 16-bit input by 1, 2, 4, 8, 16, or 32, specified by three inputs a, b, c (abc=000 means no multiply, abc=001 means multiply by 2, abc=010 means by 4, abc=011 means by 8, abc=100 means by 16, abc=101 means by 32). Hint: Use a predefined component described in this chapter. (Component use problem.)

4.21 Trace through the execution of the barrel shifter shown in Figure 4.42, when I=01100101, x = 1, y = 0, z = 1. Be sure to show how the input I is shifted after each internal shifter stage.

4.22 Trace through the execution of the barrel shifter shown in Figure 4.42, when I=10011011, x = 0, y = 1, z = 0. Be sure to show how the input I is shifted after each internal shifter stage.

4.23 Using the barrel shifter shown in Figure 4.42, what settings of the inputs x, y, and z are required to shift the input I left by six positions?

SECTION 4.5: COMPARATORS

4.24 Trace through the execution of the 4-bit magnitude comparator shown in Figure 4.45 when a=15 and b=12. Be sure to show how the comparisons propagate through the individual comparators.

4.25 Design a comparator that determines if three 4-bit numbers are equal, by connecting 4-bit magnitude comparators together and using additional components if necessary. (Component use problem.)

4.26 Design a 4-bit carry-ripple style magnitude comparator that has two outputs, a greater-than or equal-to output gte, and a less-than or equal-to output lte. Be sure to clearly show the equations used in developing the individual 1-bit comparators and how they are connected to form the 4-bit circuit. (Component design problem.)

4.27 Design a 5-bit magnitude comparator. (Component design problem.)

4.28 Design a circuit that outputs 1 if the circuit's 8-bit input equals 99:
(a) using an equality comparator,
(b) using gates only.
Hint: In the case of (b), you need only 1 AND gate and some inverters. (Component use problem.)

4.29 Use magnitude comparators and logic to design a circuit that computes the minimum of three 8-bit numbers. (Component use problem.)

4.30 Use magnitude comparators and logic to design a circuit that computes the maximum of two 16-bit numbers. (Component use problem.)

4.31 Use magnitude comparators and logic to design a circuit that outputs 1 when an 8-bit input is between 75 and 100, inclusive. (Component use problem.)

4.32 You are to design a human body temperature alarm system for a hospital. Your system has an 8-bit input representing the temperature, which can range from 0 to 255. If the measured temperature is 95 or less, you should set output A to 1. If the temperature is 96 to 104, you should set output B to 1. If the temperature is 105 or above, you should set output C to 1. (Component use problem.)

4.33 You are working as a weight guesser in an amusement park. Your job is to try to guess the weight of an individual before they step on the scale. If your guess is not within ten pounds of the individual's actual weight (higher or lower), the individual wins a prize. Build a weight guess analyzer system that outputs whether the guess was within ten pounds. The weight guess analyzer has an 8-bit guess input G, an 8-bit input from the scale W with the correct weight, and a single output C that is 1 if the guessed weight was within the defined limits of the game. (Component use problem.)
SECTION 4.6: COUNTERS

4.34 Design a 4-bit up-counter that has two control inputs: cnt enables counting up, while clear synchronously resets the counter to all 0s:
(a) using a parallel load register as a building block,
(b) using flip-flops and muxes directly by following the register design process of Section 4.2.
(Component design problem.)

4.35 Design a 4-bit down-counter that has three control inputs: cnt enables counting down, clear synchronously resets the counter to all 0s, and set synchronously sets the counter to all 1s:
(a) using a parallel load register as a building block,
(b) using flip-flops and muxes directly by following the register design process of Section 4.2.
(Component design problem.)

4.36 Design a 4-bit up-counter with an additional output upper. upper outputs a 1 whenever the counter is within the upper half of the counter's range, 8 to 15. Use a basic 4-bit up-counter as a building block. (Component design problem.)

4.37 Design a 4-bit up/down-counter that has four control inputs: cnt_up enables counting up, cnt_down enables counting down, clear synchronously resets the counter to all 0s, and set synchronously sets the counter to all 1s. If both count control inputs cnt_up and cnt_down are 1, the counter will retain its current count value. Use a parallel load register as a building block. (Component design problem.)

4.38 Design a circuit for a 4-bit decrementer. (Component design problem.)

4.39 Design an electronic turnstile system using a 64-bit counter. The input is a bit A, which is 1 for exactly one clock cycle whenever a person walks through the turnstile. The output is a 64-bit binary number. A second input B is 1 whenever a reset button is pressed, and should reset the output to 0s. Knowing that California's Disneyland attracts about 15,000 visitors per day, and assuming they all pass through your one turnstile, how many days would pass before your counter would roll over? (Component use problem.)

4.40 (a) Using an up-counter with a synchronous clear control input, and extra logic, design a circuit that outputs a 1 every 99 clock cycles.
(b) Design the counter from part (a), but use a down-counter with parallel load.
(c) What are the tradeoffs between the two designs from parts (a) and (b)?
(Component use problem.)

4.41 (a) Give the count range for the following sized up-counters: 8-bits, 12-bits, 16-bits, 20-bits, 32-bits, 40-bits, 64-bits, and 128-bits.
(b) For each size of counter in part (a), assuming a 1 Hz clock, indicate how many minutes, hours, days, etc., the counter would count before wrapping around.

SECTION 4.7: MULTIPLIER-ARRAY STYLE

4.42 Assuming all gates have a delay of 1 time-unit, which of the following designs will compute the 8-bit multiplication A*9 faster:
(a) a circuit as designed in Exercise 4.19, or
(b) an 8-bit array style multiplier with one of its inputs connected to a constant value of nine.

4.43 Design an 8-bit array-style multiplier. (Component design problem.)

4.44 Design a more accurate version of the Celsius to Fahrenheit converter from Example 4.10. The new conversion circuit receives a digitized temperature in Celsius as a 16-bit binary number C and outputs the temperature in Fahrenheit as a 16-bit output F. Our more accurate equation for calculating an approximate conversion from Celsius to Fahrenheit is: F = C*30/16 + 32. (Component use problem.)

SECTION 4.8: SUBTRACTORS

4.45 Create the internal design of a full-subtractor. (Component design problem.)

4.46 Convert the following two's complement binary numbers to decimal numbers:
(a) 00001111
(b) 10000000
(c) 10000001
(d) 11111111
(e) 10010101

4.47 Convert the following two's complement binary numbers to decimal numbers:
(a) 01001101
(b) 00011010
(c) 11101001
(d) 10101010
(e) 11111100

4.48 Convert the following two's complement binary numbers to decimal numbers:
(a) 11100000
(b) 01111111
(c) 11110000
(d) 11000000
(e) 11100000

4.49 Convert the following 9-bit two's complement binary numbers to decimal numbers:
(a) 011111111
(b) 111111111
(c) 100000000
(d) 110000000
(e) 111111110

4.50 Convert the following decimal numbers to 8-bit two's complement binary form:
(a) 2
(b) -1
(c) -23
(d) -128
(e) 126
(f) 127
(g) 0

4.51 Convert the following decimal numbers to 8-bit two's complement binary form:
(a) 29
(b) 100
(c) 125
(d) -29
(e) -100
(f) -125
(g) -2

4.52 Convert the following decimal numbers to 8-bit two's complement binary form:
(a) 6
(b) 26
(c) -8
(d) -30
(e) -60
(f) -90
(g) -120

4.53 Convert the following decimal numbers to 9-bit two's complement binary form:
(a) 1
(b) -1
(c) -256
(d) -255
(e) 255
(f) -8
(g) -128

4.54 Using 4-bit subtractors, build a subtractor that has three 8-bit inputs, A, B, and C, and a single 8-bit output F, where F=(A-B)-C. (Component use problem.)

4.55 You are given a digital thermometer that digitizes a temperature into a 16-bit binary number K in Kelvin. Build a system to convert that temperature to a 16-bit Fahrenheit value. Use the following equation to provide an approximate conversion: F = (K-273)*2 + 32. (Component use problem.)

SECTION 4.9: ARITHMETIC-LOGIC UNITS (ALUs)

4.56 Design an ALU with two 8-bit inputs A and B, and control signals x, y, and z. The ALU should support the operations described in Table 4.3. Use an 8-bit adder and an arithmetic/logic extender. (Component design problem.)

TABLE 4.3 Desired ALU operations.

Inputs
x  y  z   Operation
0  0  0   S = A - B
0  0  1   S = A + B
0  1  0   S = A * 8
0  1  1   S = A / 8
1  0  0   S = A NAND B (bitwise NAND)
1  0  1   S = A XOR B (bitwise XOR)
1  1  0   S = Reverse A (bit reversal)
1  1  1   S = NOT A (bitwise complement)

4.57 Design an ALU with two 8-bit inputs A and B, and control signals x, y, and z. The ALU should support the operations described in Table 4.4. Use an 8-bit adder and an arithmetic/logic extender. (Component design problem.)

TABLE 4.4 Desired ALU operations.

Inputs
x  y  z   Operation
0  0  0   S = A + B
0  0  1   S = A AND B (bitwise AND)
0  1  0   S = A NAND B (bitwise NAND)
0  1  1   S = A OR B (bitwise OR)
1  0  0   S = A NOR B (bitwise NOR)
1  0  1   S = A XOR B (bitwise XOR)
1  1  0   S = A XNOR B (bitwise XNOR)
1  1  1   S = NOT A (bitwise complement)

4.58 An instructor teaching Boolean algebra wants to help her students learn and understand basic Boolean operators by providing the students with a calculator capable of performing bitwise AND, NAND, OR, NOR, XOR, XNOR, and NOT operations. Using the ALU specified in Exercise 4.57, build a simple logic calculator using DIP switches for input and LEDs for output. The logic calculator should have three DIP switch inputs to select which logic operation to perform. (Component use problem.)

SECTION 4.10: REGISTER FILES

4.59 Design an 8x32 two port (1 read, 1 write) register file. (Component design problem.)

4.60 Design a 4x4 three port (2 read, 1 write) register file. (Component design problem.)

4.61 Design a 10x14 register file (one read port, one write port). (Component design problem.)

4.62 *Create a speed-dial system for a telephone. Eight special buttons b0-b7 access each stored number. The most recently dialed number exists as nine digits stored in nine 8-bit registers R0-R8. When the phone user presses another button S simultaneously with any button b0-b7, the most recently dialed number gets stored in the button's corresponding storage. When the user presses a button b0-b7 by itself, the number in that button's storage gets read out and placed on nine 8-bit outputs P0-P8. Hint: use nine register files and some extra logic. (Component use problem.)

DESIGNER PROFILE

Roman began studying Computer Science in college due to his interest in software development. During his undergraduate studies, his interests expanded to include digital design and embedded systems and eventually led him to become involved in research developing new methods to help designers quickly build large integrated circuits (IC). Roman continued his education through graduate studies and received his M.S. in Computer Science, after which Roman worked for both a large company designing integrated circuits (IC) for consumer electronics as well as a start-up company focusing on high-performance processing.

Roman enjoys working as both a software developer and hardware engineer and believes that "fundamentally software and hardware design are very similar, both relying on efficiently solving difficult problems. While good problem solving skills are important, good learning skills are also important." Contrary to what many students may believe, he points out that "learning is a fundamental activity and skill that does not end when you receive your degree. In order to solve problems, you often are required to learn new skills, adopt new programming languages and tools, and determine if existing solutions will help you solve the problems you face as an engineer." Roman points out that digital design has changed at a rapid pace over the last few decades, requiring engineers to learn new design techniques, learn new programming languages, such as VHDL or SystemC, and be able to adopt new technologies to stay successful. "As the industry continues to advance at such a rapid pace, companies do not only hire engineers for what they already know, but more so on how well those engineers can continue to expand their knowledge and learn new skills." He points out that "college provides students with an excellent opportunity to not only learn the essential information and skills from their course work but also to learn additional information on their own, possibly by learning different programming languages, getting involved in research, or working on larger design projects."

Roman is motivated by his enjoyment of the work he does as well as being able to work with other engineers who share his interests. "Motivation is one of the keys to success in an engineering career. While motivation can come from many different sources, finding a career that you are truly interested in and enjoy really helps. Co-workers are also a great source of motivation as well as knowledge and technical advice. Working as a member of a team that communicates well is very rewarding. You are able to motivate each other and use your strengths along with the strengths of your co-workers to achieve goals far beyond that which you could achieve on your own."

5

Register-Transfer Level (RTL) Design

5.1 INTRODUCTION

In the previous chapters, we've defined the combinational and sequential components needed to build digital systems. In this chapter, we'll learn to build interesting and useful digital systems from those components. In particular, we'll put together datapath components to build datapaths, and we'll use controllers to control those datapaths. The combination of a controller and datapath is known as a processor. Some processors, like those in personal computers, are programmable; those processors are the focus of Chapter 8. Other processors are custom-designed for a particular task, and are not programmable; design of such custom processors is the focus of this chapter.

Digital designers today focus largely on designing custom processors, as opposed to designing lower-level digital components. We can define a custom processor as a digital circuit that implements a computer algorithm: a sequence of instructions that carry out a particular task. For example, we can define an algorithm to filter out noise from a digitized stream of audio, and we can then create a processor to implement that algorithm. Another algorithm might encrypt data for secure electronic commerce purposes. An algorithm might compare a fingerprint to a set of 10,000 fingerprints to quickly enable a police officer to determine if someone is a wanted criminal. An image processing algorithm might detect a tank in a large video image. Beamforming, part of the ultrasound machine example in the previous chapter, can be thought of as another algorithm, implemented using the processor design described in that chapter. In fact, several of our examples in the previous chapter, like the above-mirror display, DIP-switch-based calculator, and color space converter, can actually be thought of as very simple processors implementing simple algorithms.
Processors can be designed using different design methods. The most common method in practice today is known as register-transfer level design. Register-transfer level design, or RTL design, actually consists of a wide variety of approaches, but in general, a designer specifies the registers of a design, describes the possible transfers and operations performed on input, output, or register data, and defines the control that specifies when to transfer and operate on data.
Recall the design processes we defined for combinational logic design in Chapter 2, and for sequential logic (controller) design in Chapter 3:
226 Register-Transfer Level (RTL) Design 5.2 RTL Design Method 227

In the combinational logic design process outlined in Table 2.5, ...
1. The first step was to capture the desired behavior of the combinational logic, with either a truth table or an equation.
2. The remaining steps were to convert the behavior to a circuit.
In the sequential logic (controller) design process in Table 3.2, ...
1. The first step was to capture the desired behavior of the sequential logic, using a finite-state machine.
2. The remaining steps were to convert the behavior to a circuit.
It should therefore come as no surprise that:
1. The first step of an RTL design method will be to capture the desired behavior of the processor. We'll introduce the concept of a high-level state machine for capturing RTL behavior.
2. The remaining steps will be to convert the behavior to a circuit.
Figure 5.1 illustrates the idea that the design process can be viewed as first capturing behavior and then converting the behavior to structure. That process applies regardless of whether we are performing combinational logic design, sequential logic design, or RTL design.
Figure 5.1 The design process: capture behavior, then convert to circuit.
In this chapter, we will introduce the RTL design process, also known as the RTL design method. As the process is largely creative, we will utilize numerous examples to illustrate the process. We will also introduce several high-level components that are useful during RTL design, including memory components and queue components.

5.2 RTL DESIGN METHOD
RTL design is carried out using a wide variety of methods in practice, but it may be useful to define a general method as in Table 5.1.
TABLE 5.1 RTL design method.
Step    Description
Step 1  Capture a high-level state machine. Describe the system's desired behavior as a high-level state machine. The state machine consists of states and transitions. The state machine is "high-level" because the transition conditions and the state actions are more than just Boolean operations on bit inputs and outputs.
Step 2  Create a datapath. Create a datapath to carry out the data operations of the high-level state machine.
Step 3  Connect the datapath to a controller. Connect the datapath to a controller block. Connect external Boolean inputs and outputs to the controller block.
Step 4  Derive the controller's FSM. Convert the high-level state machine to a finite-state machine (FSM) for the controller, by replacing data operations with setting and reading of control signals to and from the datapath.
A fifth step may be necessary, in which one selects a clock frequency. Designers seeking high performance may choose a clock frequency that is the fastest possible based on the longest register-to-register delay in the final circuit.
Implementing the controller's FSM as a sequential circuit, as we learned in Chapter 3, would then complete the design.
Notice that the first step captures the desired behavior, while the remaining steps convert that behavior to a circuit.
We'll first provide a small and simple example as a "preview" of the RTL design method's steps, before we define each step in more detail.

EXAMPLE 5.1 Soda machine dispenser
We are to design a processor for a soda dispenser. A coin detector provides our processor with a 1-bit input c that becomes 1 for one clock cycle when a coin is detected, and an 8-bit input a indicating the coin's value in cents. Another 8-bit input s indicates the cost of a soda (this cost can be set by the machine owner). Once the processor has seen coins whose value equals or exceeds the cost of a soda, the processor should set an output bit d to 1 for one clock cycle, causing a soda to be dispensed (this machine has only one type of soda). The system does not give change; any excess money is kept. Figure 5.2 provides a block symbol of the system.
Figure 5.2 Soda dispenser block symbol.
Step 1 of our RTL design method is to capture the desired behavior of the system. Figure 5.3 shows a high-level state machine describing the desired behavior. The first state, Init, sets the output d to 0 and initializes a local register tot to 0. tot will keep track of how many cents the system has seen so far. The state machine then enters state Wait. (Recall from Chapter 3 that a transition with no condition has an implicit "true" condition, and thus transitions on the next rising clock edge.) The FSM stays there as long as no coin is detected and the total cents seen so far is less than the cost of a soda. When a coin is detected, the state machine goes to state Add, which adds the coin's value to tot, and then returns to state Wait. Once tot is greater than or equal to (in other words, not less than) the cost of a soda, the state machine goes to state Disp, which dispenses a soda by setting d to 1. The state machine then returns to state Init.
Figure 5.3 Soda dispenser high-level state machine. Inputs: c (bit), a (8 bits), s (8 bits). Outputs: d (bit). Local registers: tot (8 bits).
Step 2 is to create a datapath. We'll need a local register for tot, an adder connected to tot and a to compute tot + a, and a comparator connected to tot and s to compute tot < s. The resulting datapath appears in Figure 5.4.
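The high-level state machine of Figure 5.3 can be exercised before any hardware exists. The following is a minimal behavioral sketch in Python, not part of the original example. It assumes that a coin (c = 1) takes priority over the tot < s comparison in state Wait, that the input a still holds the coin's value during state Add, and that tot wraps at 8 bits; the names SodaHLSM and run_soda_hlsm are hypothetical.

```python
from dataclasses import dataclass

MASK8 = 0xFF  # tot, a, and s are 8 bits wide


@dataclass
class SodaHLSM:
    """Behavioral model of the soda dispenser high-level state machine."""
    state: str = "Init"
    tot: int = 0  # local register: cents seen so far

    def output_d(self) -> int:
        # d is 1 only in Disp; an unassigned bit output is implicitly 0.
        return 1 if self.state == "Disp" else 0

    def clock(self, c: int, a: int, s: int) -> None:
        """Apply one rising clock edge with the current inputs c, a, s."""
        if self.state == "Init":
            self.tot = 0                      # tot = 0
            self.state = "Wait"
        elif self.state == "Wait":
            if c:                             # assumption: coin has priority
                self.state = "Add"
            elif self.tot < s:
                self.state = "Wait"
            else:                             # tot >= s
                self.state = "Disp"
        elif self.state == "Add":
            self.tot = (self.tot + a) & MASK8  # tot = tot + a
            self.state = "Wait"
        else:                                 # Disp
            self.state = "Init"


def run_soda_hlsm(stimulus, s):
    """Clock once per (c, a) pair; return the d seen in each cycle and the machine."""
    machine = SodaHLSM()
    d_trace = []
    for c, a in stimulus:
        d_trace.append(machine.output_d())
        machine.clock(c, a, s)
    return d_trace, machine
```

For instance, with a soda cost of 60 cents and three 25-cent coins, each coin occupying one cycle in Wait (c = 1) and one cycle in Add, the machine reaches Disp only after the third coin has been added, and d is 1 for exactly one cycle.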

Step 3 is to connect the datapath to a controller. Figure 5.5 shows the connections. Notice that the controller's inputs and outputs are all just one-bit signals.
Figure 5.5 Soda dispenser controller and datapath connections.
Step 4 is to derive the controller's FSM. The FSM has the same states and transitions as the high-level state machine, but utilizes the datapath to perform any data operations. Figure 5.6 shows the FSM for the controller. In the high-level state machine, state Init had a data operation of tot = 0 (tot is 8 bits wide, so that assignment of 0 is not a single-bit operation). We replace that assignment by setting tot_clr = 1, which clears the tot register to 0. State Wait's transitions had data operations comparing tot < s. Now we have a comparator computing that comparison for the controller, so the controller need only look at the result of that comparison in the signal tot_lt_s. State Add had a data operation of tot = tot + a. The datapath computes that addition for the controller using the adder, so the controller merely needs to set tot_ld = 1 to cause the addition result to be loaded into the tot register.
Figure 5.6 Soda dispenser controller FSM. Inputs: c, tot_lt_s (bit). Outputs: d, tot_ld, tot_clr (bit).
To complete the design, we would implement the controller's FSM as a state register and combinational logic. Figure 5.7 shows a partial state table for the controller, with the states encoded as Init: 00, Wait: 01, Add: 10, and Disp: 11. To complete the controller design, we would complete the state table, create a 2-bit state register, and create a circuit for each of the five outputs from the table, as discussed in Chapter 3. Appendix C provides details of completing the controller's design. That appendix also traces through the functioning of the controller and datapath with one another.
Figure 5.7 Soda dispenser controller's state table (partial).
State  s1 s0  c  tot_lt_s | n1 n0  d  tot_ld  tot_clr
Init   0  0   0  0        | 0  1   0  0       1
Init   0  0   0  1        | 0  1   0  0       1
Init   0  0   1  0        | 0  1   0  0       1
Init   0  0   1  1        | 0  1   0  0       1
Wait   0  1   0  0        | 1  1   0  0       0
Wait   0  1   0  1        | 0  1   0  0       0
Wait   0  1   1  0        | 1  0   0  0       0
Wait   0  1   1  1        | 1  0   0  0       0
Add    1  0   0  0        | 0  1   0  1       0
Add    ...                | ...
Disp   1  1   0  0        | 0  0   1  0       0
Disp   ...                | ...

The previous example gave a preview of the RTL design method. Notice that we started with a high-level state machine, which wasn't just an FSM because there were local registers declared, and because there were data operations (rather than just Boolean operations) in the states and on the transitions. We then created a datapath to implement those local registers and to carry out the data operations. We further needed a controller to control that datapath. We defined the behavior of that controller to be the same as the behavior of the high-level state machine, except the controller's FSM used datapath control signals to carry out and evaluate the datapath operations. Finally, we could design the controller using Chapter 3's controller design process.
We now discuss each RTL design method step in more detail, while illustrating each step with another example.

Step 1 - Creating a High-Level State Machine
A high-level state machine is a computation model similar to a finite-state machine, but with additional features that enable the description of computations involving more than just Boolean data.
Recall that a finite-state machine (FSM) consists of inputs, outputs, states, state actions (a mapping of states to output values), and state transitions (a mapping of states and inputs to next states). However, the inputs and outputs of an FSM are limited to Boolean types, actions are limited to Boolean equations, and transition conditions are limited to Boolean expressions. These limitations make specifying computations involving data cumbersome, other than for just single-bit data.
Figure 5.3 showed a high-level state machine describing the behavior of a soda dispenser processor. Notice that the state machine is not an FSM, for the several reasons highlighted in Figure 5.8. One reason is that the state machine has inputs that are 8-bit types, whereas FSMs only allow inputs and outputs of Boolean types (a single bit each). Another reason is that the state machine declares a local register tot to store intermediate data, whereas FSMs don't allow local data storage; the only "stored" item in an FSM is the state itself. A third reason is that the state actions and transition conditions involve data operations, like tot = 0 (remember that tot is 8 bits wide), tot < s (there's no "<" Boolean operator), and tot = tot + a (where the "+" is addition, not OR, and there's no addition Boolean operator), whereas an FSM allows only Boolean equations and expressions.
Figure 5.8 Soda dispenser high-level state machine with non-FSM constructs highlighted. Inputs: c (bit), a (8 bits), s (8 bits). Outputs: d (bit). Local registers: tot (8 bits).
Therefore, a useful form of high-level state machine is an extension of an FSM in which:
• inputs and outputs may involve data types beyond just single bits,
• local registers may be declared (of various data types), and
• actions and conditions may involve general arithmetic equations and expressions, rather than just Boolean equations and expressions.
Such a high-level state machine is not the only possible extension to an FSM. Dozens of varieties of extended FSMs exist. However, we will be utilizing the above-described extended FSM variety throughout this chapter. That particular variety of high-level state machine is sometimes called an FSM with data, or FSMD.
We will continue to use the following conventions for high-level state machines, which we also used for FSMs:
• Each transition is implicitly ANDed with a rising clock edge.
• Any bit output not explicitly assigned a value in a state is implicitly assigned a 0. Note: this convention does not apply for multibit outputs.
We now provide another example of describing a system using a high-level state machine.

EXAMPLE 5.2 Laser-based distance measurer - High-level state machine
There are countless applications that require one to accurately measure the distance of an object from a known point. For example, road builders need to accurately determine the length of a stretch of road. Map makers need to accurately determine the locations and heights of hills and mountains and the sizes of lakes. A giant crane for constructing skyrise buildings needs to accurately determine the distance of the sliding crane arm from the base. In all of these applications, stringing out a tape measure to measure the distance is not very practical. A better method involves laser-based distance measurement.
In laser-based distance measurement, a laser is pointed at the object of interest. The laser is briefly turned on, and a timer is started. The laser light, traveling at the speed of light, travels to the object and reflects back. A sensor detects the reflection of the laser light, causing the timer to stop. Knowing the time T taken by the light to travel to the object and back, and knowing that the speed of light is 3x10^8 meters/second, we can compute the distance D easily by the equation: 2D = T seconds * 3x10^8 meters/second. Laser-based distance measurement is illustrated in Figure 5.9.
Figure 5.9 Laser-based distance measurement: 2D = T sec * 3x10^8 m/sec, where D is the distance to the object of interest.
Let's design a processor to control the laser and the timer and to compute distances up to 2000 meters. A block diagram of the system is shown in Figure 5.10. The system has a bit input B, which equals 1 when the user presses a button to start the measurement. Another bit input S comes from the sensor, and is 1 when the reflected laser is detected. A bit output L controls the laser, turning the laser on when L is 1. Finally, an N-bit output D indicates the distance in binary, in units of meters. We'll assume a display converts that binary number into a decimal number and displays the result on an LCD for the user to read. D will have to be at least 11 bits, since 11 bits can represent the numbers 0 to 2047, and we want to measure distances up to 2000 meters. Let's make D 16 bits.
Figure 5.10 Block diagram of the laser-based distance measurement system: B from button, S from sensor, L to laser, D to display.

Step 1 - Create a high-level state machine.
We can describe the overall control of the system using a high-level state machine. To facilitate the creation of the state machine, we enumerate the sequence of events underlying the measurement system:
• The system powers on. Initially, the system's laser is off and the system outputs a distance of 0 meters.
• The system should then wait for the user to initiate measurement by pressing a button, B.
• After the button is pressed, the system should turn the laser on. We'll choose to leave the laser on for one clock cycle.
• After the laser is pulsed, the system should wait for the sensor to detect the laser's reflection. Meanwhile, the system should count how much time passes from the time the laser was pulsed until the reflection is sensed.
• After the reflection is detected, the system should use the amount of time passed since the laser was pulsed to compute the distance to the object of interest. The system should then return to waiting for the user to press the button so that a new measurement can be taken.
The above sequence guides our construction of a high-level state machine. We begin with an initial state, which we call S0. S0's task is to ensure that when our system powers on, it does not output an incorrect distance, and it does not turn the laser on (possibly injuring the unsuspecting user). Specifying this behavior as a high-level state machine is straightforward and seen in Figure 5.11. Notice that the high-level state machine differs from an FSM in that the state's actions use a data type that is larger than one bit (namely, D is 16 bits). However, the high-level state machine itself follows the convention that every transition is implicitly ANDed with a rising clock edge, so the state machine only transitions during clock edges (just like for an FSM). Note that even though the assignments L = 0 and D = 0 look the same, the assignment L = 0 assigns a 0 bit to the one-bit output L, whereas the assignment D = 0 assigns the 16-bit binary number 0 (which is actually 0000000000000000) to the 16-bit output D. Some other notations distinguish bit assignments from data assignments, such as by enclosing a bit in single quotes. For example, the bit assignment L = 0 could be written instead as L = '0'.
Figure 5.11 Partial high-level state machine for measurement system: initialization. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). S0: L = 0 (laser off), D = 0 (distance = 0).
After initialization, the measurement system waits for the user to press the button B, which initiates the measurement process. When the user presses the button, B will equal 1, and the measurement system should proceed to activate the laser. To perform the waiting, we add a state after S0, which we call S1, shown in Figure 5.12. The shown transitions cause the state machine to remain in state S1 while B = 0 (meaning B' is true).
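The two numeric facts this example has relied on so far, the round-trip equation 2D = T * 3x10^8 and the 11-bit minimum width for D, can be checked with a short Python sketch. The function names are hypothetical, and exact fractions are used only to avoid floating-point rounding in the illustration.

```python
from fractions import Fraction

SPEED_OF_LIGHT = 3 * 10**8  # meters/second, the rounded value used in the text


def distance_m(round_trip_seconds):
    """Solve 2D = T * 3x10^8 for D, in meters."""
    return Fraction(round_trip_seconds) * SPEED_OF_LIGHT / 2


def min_bits(max_value):
    """Smallest unsigned width, in bits, that can represent max_value."""
    return max(1, max_value.bit_length())


# A 2-microsecond round trip corresponds to an object 300 meters away.
example_distance = distance_m(Fraction(2, 10**6))

# 2000 meters needs 11 bits, whose largest value is 2047.
width_for_2000 = min_bits(2000)
largest_11_bit = 2**width_for_2000 - 1
```

Choosing 16 bits for D, as the example does, leaves headroom beyond the 11-bit minimum.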
When B=1, the laser should stay on for one cycle. In other words, when B=1, the state machine should transition to a state that turns the laser on, followed by a state that turns the laser off. We'll call the laser-on state S2 and the laser-off state S3. Figure 5.13 shows how S2 and S3 are connected in the high-level state machine.
Figure 5.12 Partial high-level state machine for measurement system: waiting for a button press. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). S0: L = 0, D = 0. S1: remains on B' (button not pressed), leaves on B (button pressed).
Figure 5.13 Partial high-level state machine for measurement system: pulsing the laser for one cycle. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). S0: L = 0, D = 0. S2: L = 1 (laser on). S3: L = 0 (laser off).
In state S3, the state machine should wait until the sensor detects the laser's reflection (S=1). The state machine remains in S3 while S=0. As mentioned in the earlier sequence of events, the state machine should meanwhile count the duration between the laser being pulsed and the laser's reflection being sensed. From the discussion of timers in Chapter 4, we know that with a given clock period, we can measure time by counting the number of clock cycles and multiplying that number by the clock period (time = cycles * (1/clock frequency)). Thus, we use a local register, which we'll call Dctr, to count clock cycles. The state machine increments Dctr as long as the state machine is waiting for the laser's reflection. (For simplicity, we ignore the possibility that no reflection is ever detected.) We must also initialize Dctr to 0, which we choose to do in state S1. With these modifications, our high-level state machine is seen in Figure 5.14.
Figure 5.14 Partial high-level state machine for measurement system: waiting for the laser reflection and counting clock cycles. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). Local registers: Dctr (16 bits). S0: L = 0, D = 0. S1: Dctr = 0 (reset cycle count); remains on B'. S2: L = 1. S3: L = 0, Dctr = Dctr + 1 (count cycles); remains on S' (no reflection), leaves on S (reflection).
Once the reflection is detected (S=1), our high-level state machine should compute the distance D that is being measured. From Figure 5.9, we know that 2*D = T sec * 3x10^8 m/sec. We also know that the time T in seconds is Dctr * (1/clock frequency). To simplify the system's design, let's assume the clock frequency is 3x10^8 Hz, or 300 MHz. Since light travels 3x10^8 meters per second, each clock cycle would thus correspond to one meter. Thus with a 300 MHz clock, Dctr counts the number of meters that the laser beam traveled from the measurer to the object and back to the measurer. To count just the distance from the measurer to the object, we divide Dctr by 2 (algebraic simplification of the equations in this paragraph verifies that D = Dctr/2). We'll perform this calculation in a state we will call S4. Our final high-level state machine is shown in Figure 5.15.
Figure 5.15 High-level state machine for measurement system: calculating the value of D. Inputs: B, S (1 bit each). Outputs: L (bit), D (16 bits). Local registers: Dctr (16 bits). S0: L = 0, D = 0. S1: Dctr = 0. S2: L = 1. S3: L = 0, Dctr = Dctr + 1. S4: D = Dctr/2 (calculate D).
We can summarize the behavior of the high-level state machine in Figure 5.15 as follows:
• S0 is the initial state. In state S0, the state machine initializes the laser to off by setting L=0 and sets the output D=0 too. The machine then transitions to S1.
• S1 clears Dctr to 0 and then waits until the button is pressed. When the button is pressed, the machine transitions to state S2.
• S2 turns on the laser. The machine then transitions to S3.
• S3 turns off the laser and increments Dctr every clock cycle (with a 300 MHz clock, every cycle corresponds to one meter). The machine stays in S3, incrementing Dctr during each clock cycle, until the reflection is sensed, at which time the machine transitions to state S4.
• S4 sets the output D to the counted number of cycles divided by two, which corresponds to the measured distance in meters. The machine then returns to state S1, which waits for the button to be pressed again.
A real laser-based distance measurer might use a faster clock frequency in order to measure distance with a greater precision than just 1 meter.

The high-level state machine described above is just one type of FSM variation. A different state machine variation that was previously quite popular was called Algorithmic State Machines, or ASMs. ASMs are similar to flowcharts, except that ASMs include a notion of a clock that enables transitions from one state to another (a traditional flowchart does not have an explicit clock concept). ASMs, like flowcharts, contain more "structure" than a state machine. A state machine can transition from any state to any other state, whereas an ASM restricts transitions in a way that causes the computation to look more like an algorithm, that is, an ordered sequence of instructions. An ASM uses several types of boxes, including state boxes, condition boxes, and output boxes. ASMs typically also allowed local data storage and data operations.
The advent of hardware description languages (see Chapter 9) seems to have largely replaced the use of ASMs, as hardware description languages contain the constructs supporting algorithmic structure, and much more. Thus, we do not describe ASMs further.
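Returning to Example 5.2, the high-level state machine of Figure 5.15 can be sketched behaviorally in Python to check that D comes out as Dctr/2. This is a minimal sketch, not the book's implementation. It assumes that the 16-bit output D holds its value between assignments, that Dctr wraps at 16 bits, and that the divide by 2 truncates; the names LaserHLSM and measure are hypothetical.

```python
from dataclasses import dataclass

MASK16 = 0xFFFF  # Dctr and D are 16 bits wide


@dataclass
class LaserHLSM:
    """Behavioral model of the laser-based distance measurer (states S0..S4)."""
    state: str = "S0"
    dctr: int = 0  # local register: clock cycles counted in S3
    d: int = 0     # 16-bit output D, assumed to hold its last assigned value

    def output_l(self) -> int:
        # L is 1 only in S2; an unassigned bit output is implicitly 0.
        return 1 if self.state == "S2" else 0

    def clock(self, b: int, s: int) -> None:
        """Apply one rising clock edge with inputs B and S."""
        if self.state == "S0":
            self.d = 0                              # D = 0
            self.state = "S1"
        elif self.state == "S1":
            self.dctr = 0                           # Dctr = 0
            self.state = "S2" if b else "S1"
        elif self.state == "S2":
            self.state = "S3"                       # laser on for one cycle
        elif self.state == "S3":
            self.dctr = (self.dctr + 1) & MASK16    # Dctr = Dctr + 1
            self.state = "S4" if s else "S3"
        else:                                       # S4
            self.d = self.dctr >> 1                 # D = Dctr / 2
            self.state = "S1"


def measure(cycles_in_s3: int) -> int:
    """Press the button once, see the reflection in the last of cycles_in_s3 (>= 1)."""
    m = LaserHLSM()
    m.clock(b=0, s=0)   # S0 -> S1
    m.clock(b=1, s=0)   # S1 -> S2 (button pressed)
    m.clock(b=0, s=0)   # S2 -> S3
    for i in range(cycles_in_s3):
        m.clock(b=0, s=1 if i == cycles_in_s3 - 1 else 0)
    m.clock(b=0, s=0)   # S4 -> S1, D is written
    return m.d
```

With the 300 MHz clock assumed in the example, each cycle spent in S3 stands for one meter of round-trip travel, so 600 cycles in S3 yield D = 300 meters.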
Step 2 - Creating a Datapath
Given a high-level state machine, we want to create a datapath that can implement all the data storage and computations on non-Boolean data types present in the high-level state machine. Doing so will enable us to then replace the high-level state machine by an FSM that merely controls the datapath. We can decompose the create-a-datapath step into several substeps:
Step 2: Create a datapath
(a) Make all data inputs and outputs to be datapath inputs and outputs.
(b) Implement the data storage by adding a register component into the datapath for every declared register in the high-level state machine. Furthermore, we typically want to add a register component for every data output.
(c) Methodically examine each state and each transition, adding and connecting new datapath components to implement new data computations. We add multiplexors in front of component inputs as they become necessary in order to share a component among multiple signals that use the same component in different states. Sometimes we find that a component already exists (e.g., a register) but that we need to add a new control input to that component (e.g., a clear input on a register to set the register to 0).
A common term used to describe the adding of a component into a design is instantiation. Thus, we say that we "instantiate a new register" rather than we "add a new register." Using the term "instantiate" rather than "add" helps avoid possible confusion with the use of the term "add" to mean arithmetic addition (e.g., saying "we add two registers" could otherwise be confusing). When we instantiate a new component, we should give that component a name that is unique from any other datapath component name. So if we instantiate a register, we might call it "Register1." If we instantiate another register, we might call it "Register2." Actually, we should give meaningful names whenever possible. So we might call one register "TemperatureReg," and another register "HumidityReg."
When we instantiate a new component, we may create additional datapath inputs corresponding to the control inputs of the component. For example, instantiating a register will create new datapath inputs corresponding to the register's load and clear control inputs. We should give unique names to each new datapath control input, ideally describing which component the input controls and the control operation performed. For example, if we instantiate a register named Register1, we might then create two new datapath inputs named Register1_load and Register1_clear. Likewise, we may need to utilize control outputs of a component, like the output of a comparator, in which case we should give those outputs unique names too.

EXAMPLE 5.3 Laser-based distance measurer - Creating a datapath
We now continue Example 5.2 by proceeding to the second step of the RTL design method.
Step 2 - Create a datapath
We can follow the substeps of this step to create the datapath shown in Figure 5.17:
(a) Output D is a data output (16 bits), so we make D an output of the datapath, as shown in Figure 5.16(i).
(b) We need a register to implement the 16-bit local register Dctr. Noting that the operations on Dctr are clear (in state S1) and increment (in state S3), we can implement that register by instantiating a 16-bit up-counter, as shown in Figure 5.16(ii). Furthermore, as we want to control when the output D changes (notice that we only change D in state S4), we instantiate a 16-bit register Dreg at the output D, as shown in Figure 5.16(iii). We extend the Dctr counter and Dreg register control signals to be inputs to the datapath, with each signal having a unique name, as in Figure 5.16(iv).
Figure 5.16 Partial datapath for the laser-based distance measurer: (i) 16-bit output D, (ii) Dctr, a 16-bit up-counter, (iii) Dreg, a 16-bit register, (iv) datapath control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt.
(c) Noting that S4 writes D with Dctr divided by 2, we insert a right shifter between Dctr and Dreg to implement the divide by 2, as shown in Figure 5.17.
Figure 5.17 The datapath for the laser-based distance measurement system.
The resulting datapath in Figure 5.17 is a very simple datapath, but a datapath nonetheless.
The previous example did not require any multiplexors, so we'll illustrate separately why sometimes multiplexors must be instantiated. Consider the sample high-level state machine portion shown in Figure 5.18(a). Figure 5.18(b) shows the datapath after implementing the actions of state T0. Those actions require an adder, with the E and F registers connected to the A and B inputs of that adder. Figure 5.18(c) shows that datapath after implementing the actions of state T1. That state also requires an adder, but because one already exists in the datapath, we need not instantiate another adder. However, the R and G registers must
connect to the A and B inputs of that adder, but those inputs of the adder already have connections from E and F. We therefore need to instantiate multiplexors, as shown in Figure 5.18(d). Notice that we create unique names for each mux's control input.
Figure 5.18 Instantiating datapath muxes: (a) sample high-level state machine portion, (b) datapath after implementing T0's actions, (c) datapath after implementing T1's actions, resulting in two sources for each adder input, (d) datapath after instantiating muxes to handle the multiple sources. Local registers: E, F, G, R (16 bits). Mux control inputs: add_A_s0, add_B_s0.

Step 3 - Connecting the Datapath to a Controller
Step 3 of the RTL design method is actually quite straightforward. We simply create a controller block having the system's Boolean inputs and outputs, and we connect the controller block with the datapath control inputs and outputs.

EXAMPLE 5.4 Laser-based distance measurer - Connecting the datapath to a controller
Continuing the previous example, we proceed to step 3 of the RTL design method:
Step 3 - Connect the datapath to a controller.
We connect the datapath to a controller as shown in Figure 5.19. We connect the control inputs and outputs (B, L, and S) to the controller, and the data output (D) to the datapath. We also connect the controller to the datapath control inputs (Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt). Normally we don't draw the clock generator block, but we've explicitly shown the clock generator in the figure to make clear that the generator must be exactly 300 MHz.
[Figure 5.19 (block diagram): the controller receives B (from button) and S (from sensor), drives L (to laser), and drives the datapath control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt; the datapath drives D (to display).]

Step 4 - Deriving the Controller's FSM
If we created our datapath correctly, deriving an FSM for the controller is straightforward. The FSM will have the same states and transitions as the high-level state machine. We merely define the FSM's inputs and outputs (all will now be single bits), and replace any data computations in the actions and conditions by the appropriate datapath control signal values. Remember, we created the datapath specifically to carry out those computations, and therefore we should only need to appropriately configure the datapath control signals to implement each particular computation at the right time.

EXAMPLE 5.5 Laser-based distance measurer - Deriving the controller's FSM
We continue the previous example by going to step 4 of the RTL design method.
Step 4 - Derive the controller's FSM.
The last step is to design the controller's internals. We can describe the controller's behavior by refining our high-level state machine from Figure 5.15 into an FSM, replacing the "high-level" actions and conditions, like Dctr=0, by actual controller input and output signal assignments and conditions, like Dctr_clr=1, as shown in Figure 5.20. Notice that the FSM does not directly indicate the computations that are happening in the datapath. For example, S4 loads Dreg with Dctr/2, but the FSM itself only shows Dreg's load signal being activated. Thus, the overall system behavior can be determined from the FSM only by looking also at the datapath.
Inputs: B, S. Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt.
Signal    S0  S1  S2  S3  S4
L         0   0   1   0   0
Dreg_clr  1   0   0   0   0
Dreg_ld   0   0   0   0   1
Dctr_clr  0   1   0   0   0
Dctr_cnt  0   0   0   1   0
Actions: S0 (laser off, clear Dreg); S1 (clear count); S2 (laser on); S3 (laser off, count up); S4 (load Dreg with Dctr/2, stop counting).
Figure 5.20 FSM description of the controller for the laser-based distance measurer. The desired action in each state is shown in italics in the bottom row; the corresponding bit signal assignment that achieves that action is shown in bold.
~ HOW DOES IT WORK?-AUTOMOTIVE ADAPTIVE CRUISE CONTROL
Dreg_Id
The earl y 2000s saw the advenl of automobi le cruise front . One way to me,:uure th:n db-lance u ~ a l~er­
r- 1> Detr elr Datapath control sys te ms that not o nl y maintained n paniculnr based distance mea urer. "ith the I~~r and :: n: r
Dctr_eOl speed, but also mainlained a particular dislQrlce from placed in the front grill of the C"'. ""nil( -led to •
0 the car in front-thus slowing the automobile down circuit and/or mkroproce sor that rolnputt!'~ th
10 d'Splay
when necessary. Such "adaptive" cruise control thus
f6 -{ 300 MHz Clock r- ..> adapls to changi ng highway Irnflie. Adaptiv. erui e
distan e. The distance is then mput to the :rui~
control s~stem. "hich dett!'nlline, \\h n lO irK're~ or
Ftgure 519 COOlrollcr/dnWp"th (proce,~or) de~ign for the laser-based d"tnncc measurer. controllers must measure the dislonce to the car in decre~e th~ automobilt", ~[k"t'd .
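To see how the controller's single-bit outputs steer the datapath, the FSM of Figure 5.20 can be exercised in software. The Python sketch below is only an illustration under stated assumptions: the state names S0 to S4, the transition conditions (wait in S1 for button B, wait in S3 for sensor S), and the 16-bit register widths are assumed here rather than taken from the figure itself, and the function names are hypothetical.

```python
# Outputs that are 1 in each state (per the Figure 5.20 description); all others are 0.
OUTPUTS = {
    "S0": {"Dreg_clr"},   # laser off, clear Dreg
    "S1": {"Dctr_clr"},   # clear count
    "S2": {"L"},          # laser on
    "S3": {"Dctr_cnt"},   # laser off, count up
    "S4": {"Dreg_ld"},    # load Dreg with Dctr/2, stop counting
}

def next_state(state, B, S):
    """Assumed transitions: S1 waits for button B, S3 waits for sensor S."""
    if state == "S0":
        return "S1"
    if state == "S1":
        return "S2" if B else "S1"
    if state == "S2":
        return "S3"
    if state == "S3":
        return "S4" if S else "S3"
    return "S1"  # from S4, go back to waiting for the button

def simulate(inputs):
    """inputs: list of (B, S) pairs, one per clock cycle. Returns (Dreg, state trace)."""
    state, dctr, dreg, trace = "S0", 0, 0, []
    for B, S in inputs:
        trace.append(state)
        out = OUTPUTS[state]
        # Registers load on the clock edge; right-hand sides use the old values.
        new_dreg = 0 if "Dreg_clr" in out else (dctr // 2 if "Dreg_ld" in out else dreg)
        new_dctr = 0 if "Dctr_clr" in out else (dctr + 1 if "Dctr_cnt" in out else dctr)
        dreg, dctr = new_dreg & 0xFFFF, new_dctr & 0xFFFF  # assumed 16-bit registers
        state = next_state(state, B, S)
    return dreg, trace

# Button pressed in cycle 1; the reflection is sensed on the fourth cycle spent in S3.
dreg, trace = simulate([(0, 0), (1, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 1), (0, 0)])
```

Note that the FSM table only names control signals; the count and the division by two happen in the datapath portion of the simulation, mirroring the split between controller and datapath.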
Recall from Chapter 3 that we typically follow the convention that FSM output signals not explicitly assigned in a state are implicitly assigned 0. Following that convention, the FSM would look as in Figure 5.21. We may still choose to explicitly show the assignment of 0 (e.g., L = 0 in state S3) when that assignment is a key action of a state. The key actions of each state were bolded in Figure 5.20.

Inputs: B, S    Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt
S0: L = 0, Dreg_clr = 1 (laser off) (clear Dreg)
S1: Dctr_clr = 1 (clear count)
S2: L = 1 (laser on)
S3: L = 0, Dctr_cnt = 1 (laser off) (count up)
S4: Dreg_ld = 1, Dctr_cnt = 0 (load Dreg with Dctr/2) (stop counting)

Figure 5.21 FSM description of the controller for the laser-based distance measurer, using the convention that FSM outputs not explicitly assigned a value in a state are implicitly assigned 0.

We would complete the design by implementing this FSM, using a 3-bit state register and combinational logic to describe the next state and output logic, as was described in Chapter 3.

5.3 RTL DESIGN EXAMPLES AND ISSUES
RTL design involves a certain amount of creativity and insight. Thus, a good way to begin to learn RTL design is perhaps through seeing several examples. We now provide additional examples of the RTL design method, through which we also explain some detailed issues.

Simple Bus Interface Design Example

EXAMPLE 5.6 Simple bus interface

Processors typically need to transfer data to and from other processors. They typically communicate such data over a bus, to reduce wire congestion problems that might otherwise occur (see Section 4.10). Suppose 16 different processors each have a 32-bit output connected to a single 32-bit bus named D. Suppose another processor, a master processor, may want to read the output of any of those 16 processors. (Let's call those 16 processors peripherals, which is a common term for a processor that is auxiliary to a master processor.) The master processor outputs a 4-bit address, A, that all the 16 peripherals can read, with each peripheral having its own unique address (0000, or 0001, or 0010, etc.). Because the master processor must always set the address lines to a value, but might not always want to read, the master processor has another output, rd, that the master processor sets to 1 when reading, and 0 when not reading. So if the master processor wants to read the value in peripheral number five, the master processor would set the address lines A to 0101, then set rd to 1. The master processor would then read the data lines D (perhaps storing the read data into a local register), and then return rd to 0. Additionally, the value on D should not change while the master processor is reading.

A block diagram of the system is shown in Figure 5.22. Such an arrangement is very similar to the arrangement in a desktop computer, where a master processor can read peripheral processor registers; peripherals might include a disk drive, a CD-ROM drive, a keyboard, a modem, etc.

Figure 5.22 Bus interface example.

We have just described what is known as a bus protocol. A bus protocol defines a sequence of actions over a set of data, address, and control lines, to carry out a data transfer over those lines from one processor to another.

A bus interface implements a bus protocol for a processor. Let's implement the bus interface for one of the peripheral processors. Figure 5.23 provides a block diagram for a peripheral divided into a main part and a bus interface part. The main part's output Q is an input to the bus interface. Let's assume the peripheral's own address is another input, called Faddr, to the bus interface. Faddr might come from a DIP switch, or perhaps another register. The bus interface also has inputs and outputs that connect to the bus signals rd, D, and A.

[Figure 5.23 labels: to/from processor bus; rd, D, A; Peripheral]
Figure 5.23 Bus interface block diagram.

Step 1 of our RTL design method is to create a high-level state machine. Based on the bus protocol we defined, a peripheral's bus interface part sends data only if the address on input A matches the address on input Faddr AND the processor requests a read by setting rd to 1. While the bus interface waits for an instruction from the master processor to send data, the bus interface should not interfere with what another processor may be writing to the shared data lines, D. Thus, while waiting for a matching address and rd=1, the bus interface should drive D with no value (known as high impedance, represented as "Z").

When the bus interface detects a matching address and rd=1, the bus interface should output data from the input Q (from the main part) to the output D. However, we must also ensure that D does not change while the master processor is reading from the bus interface. We can keep the value on D stable by storing Q into a local register Q1. As long as the bus interface is not sending data, the bus interface updates Q1 with the current value of Q. When the bus interface is sending data, the bus interface does not update Q1 and outputs Q1 on D, causing D to not change during a send.

We can see that the bus interface's implementation of the bus protocol can be described by a high-level state machine using two states, shown in Figure 5.24: a state in which the bus interface waits to be able to send data (WaitMyAddress) and a state in which the bus interface sends data (SendData).

Inputs: rd (bit); Q (32 bits); A, Faddr (4 bits)
Outputs: D (32 bits)
Local register: Q1 (32 bits)
WaitMyAddress: D = "Z", Q1 = Q
SendData: D = Q1

Figure 5.24 High-level state machine of the sending half of a simple bus interface.
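The two-state behavior just described can be tried out with a small behavioral model. The following Python sketch is a hedged illustration, not the book's implementation: the class and method names are hypothetical, each call to `step` stands for one rising clock edge, and the return from SendData to WaitMyAddress when rd goes back to 0 is assumed from the protocol description (the master returns rd to 0 when it finishes reading).

```python
WAIT_MY_ADDRESS, SEND_DATA = "WaitMyAddress", "SendData"

class BusInterface:
    """Behavioral model of the sending half of the simple bus interface."""

    def __init__(self, faddr):
        self.faddr = faddr            # this peripheral's own 4-bit address
        self.state = WAIT_MY_ADDRESS
        self.q1 = 0                   # local register Q1

    def output_d(self):
        """Value driven on D in the current state ("Z" means high impedance)."""
        return self.q1 if self.state == SEND_DATA else "Z"

    def step(self, rd, a, q):
        """Apply one rising clock edge using the inputs seen in the current state."""
        if self.state == WAIT_MY_ADDRESS:
            self.q1 = q & 0xFFFFFFFF  # Q1 = Q: keep tracking the main part's output
            if rd == 1 and a == self.faddr:
                self.state = SEND_DATA
        else:
            # SendData: Q1 is deliberately not updated, so D stays stable.
            if rd == 0:
                self.state = WAIT_MY_ADDRESS

# Peripheral number five is read by the master, then released.
bi = BusInterface(0b0101)
d_idle = bi.output_d()
bi.step(rd=1, a=0b0101, q=42)   # matching address and rd=1
d_first = bi.output_d()
bi.step(rd=1, a=0b0101, q=99)   # Q changes during the send, D must not
d_second = bi.output_d()
bi.step(rd=0, a=0b0101, q=99)   # master returns rd to 0
d_after = bi.output_d()
```

Holding the data in Q1 rather than passing Q straight through is what keeps D from changing while the master is still reading.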
Figure 5.25 provides a sample timing diagram of the state machine's behavior (W stands for state WaitMyAddress, SD for SendData). As long as the system is in the W state, the system outputs Z on D. When rd=1 and A=Faddr, the system outputs the contents of Q1 beginning at the next clock cycle's rising edge. The system continues to output Q1 as long as rd=1. When rd returns to 0, the system returns to the WaitMyAddress state at the next rising clock edge and hence outputs Z again.

[Timing diagram: clk; inputs; state sequence W, W, SD, W, W, SD, SD, W; output D alternating between Z and Q1]
Figure 5.25 Bus interface timing diagram.

Step 2 is to create a datapath, as shown on the right in Figure 5.26. The datapath contains a 4-bit equality comparator to compare A and Faddr, a 32-bit register Q1, and a 32-bit wide three-state driver to enable driving of D by nothing or by Q1. A, Faddr, and Q are the datapath's data inputs, and D is the only data output.

Inputs: rd, A_eq_Faddr (bit)
Outputs: Q1_ld, D_en (bit)
WaitMyAddress: D_en = 0, Q1_ld = 1
SendData: D_en = 1, Q1_ld = 0

Figure 5.26 Datapath (right) and controller FSM description (left) for the simple bus interface.

Step 3 is to connect the datapath to a controller, as shown in Figure 5.26. The controller has one external control input, rd, and also gets a control input from the datapath, A_eq_Faddr, indicating whether A equals Faddr. The controller has two control outputs to the datapath, with Q1_ld causing Q1 to be loaded with Q, and D_en controlling the three-state driver.

Step 4 is to derive the controller's FSM. We simply replace the data operations in the high-level state machine of Figure 5.24 by the appropriate control signals, as shown on the left side of Figure 5.26. We replace A=Faddr by the signal A_eq_Faddr, the actions of D="Z" and of D=Q1 by D_en=0 and D_en=1, and the action of Q1=Q by Q1_ld=1. We would then implement the FSM using a state register (in this case only 1 bit) and combinational logic.

You may have heard of several popular buses, like the PCI (Peripheral Component Interconnect) bus in a personal computer. Those are the buses that a PC "card" plugs into in a PC, like the card shown in Figure 5.27. You can see on the card the metal pads of the bus; each pad corresponds to one wire of the bus. The bus protocol for PCI is far more complex than the protocol in the above example. Hundreds of other "standard" bus protocols exist. Designers not needing to interface to other chips often define their own bus protocol for communication.

Figure 5.27 PCI card plugged into a PC's PCI slot.

ALL =S ARE NOT EQUAL.
Figure 5.24 showed two distinct uses of the "=" symbol. In a state's actions, "=" meant "assign the value of the right side to the left side," e.g., D=Q1. On a transition, "=" meant "the left and right sides are the same," e.g., A=Faddr. Be careful not to confuse those two meanings of the "=" symbol. Some languages use different symbols to distinguish the two meanings. For example, the C language uses "=" for "assign" and "==" for "the same." VHDL uses ":=" (or "<=") for "assign", and "=" for "the same."

Video Compression: Sum-of-Absolute-Differences (SAD) Design Example

EXAMPLE 5.7 Video compression: sum-of-absolute-differences

Digitized video is becoming increasingly commonplace, like in the case of the increasingly popular DVD (see Section 6.7 for further information on DVDs). A straightforward digitized video consists of a sequence of digitized pictures, where each picture is known as a frame. However, such digitized video results in huge data files. Each pixel of a frame is stored as several bytes, and let's say a frame contains about a million pixels. Let's assume then that we require about 1 Mbyte per frame, and we play approximately 30 frames per second (a normal rate for a TV), so that's 1 Mbyte/frame * 30 frames/sec = 30 Mbytes/sec. One minute of video would require 60 sec * 30 Mbytes/sec = 1.8 Gbytes, and 60 minutes would require 108 Gbytes. A 2-hour movie would require over 200 Gbytes. That's a lot of data, more than can be downloaded quickly over the Internet, or stored on a DVD, which can only hold between 5 Gbytes and 15 Gbytes. In order to make practical use of digitized video with web pages, digital camcorders, cellular telephones, or even with DVDs, we need to compress those files into much smaller files. A key technique in compressing video is to recognize that successive frames often have much similarity, so instead of sending a sequence of digitized pictures, we can send one digitized picture frame (a "base" frame), followed by data describing just the difference between the base frame and the next frame. We can send just the difference data for numerous frames, before sending another base frame. Such a method results in some loss of quality, but as long as we send base frames frequently enough, the quality may be acceptable.

After the 2004 natural disaster in Indonesia, a TV news reporter reported from the scene by "camera phone." The video was smooth as long as the scene wasn't changing significantly. When the scene changed (like panning across the landscape), the video became very jerky, because the camera phone had to transmit complete pictures rather than just differences, resulting in fewer frames transmitted over the limited bandwidth of the camera phone.

Of course, if there's a major change from one frame to the next (like for a change of scene, or lots of activity), we can't use the difference method. Video compression devices therefore need to quickly estimate the similarity between two successive digitized frames to determine whether frames can be sent using the difference method. A common way to determine the similarity of two frames is to compute what is known as the sum-of-absolute-differences (SAD). For each pixel in frame 1, we compute the difference between that pixel and the corresponding pixel in frame 2. Each pixel is represented by a number, so difference means the difference in numbers. Suppose we represent a pixel with a byte (real pixels are usually represented by at least three bytes), and we are comparing the pixel in the upper left of frames 1 and 2 in Figure 5.28(a). Frame 1's upper-left pixel has a value of 255. Frame 2's pixel is clearly the same, so would have a value of 255 also.
Thus, the difference of these two pixels is 255 - 255 = 0. We might compare the next pixels of both frames in that row, finding the difference to be 0 again. And so on for all the pixels in that row for both frames, as well as the next several rows. However, when we compute the difference of the leftmost pixel of the middle row, where that black circle is located, we see that frame 1's pixel will be black, say with a value of 0. On the other hand, frame 2's corresponding pixel will be white, say with a value of 255. So the difference is 255 - 0 = 255. Likewise, somewhere in the middle of that row, we'll find another difference, this time with frame 1's pixel white (255) and frame 2's pixel black (0); the difference is again 255 - 0 = 255. Note that we only care about the difference, not which is bigger or smaller, so we are actually looking at the absolute value of the difference between frame 1 and frame 2 pixels. By summing the absolute value of the differences for every pair of pixels, we get a number representing the similarity of the two frames: 0 means identical, and bigger numbers mean less similar. If the resulting sum is below some threshold (e.g., below 1,000), we might then apply the method of sending the difference data, as in Figure 5.28(b); we don't explain how to compute the difference data here, as that is beyond the scope of this example. If the sum is above the threshold, then the difference between the blocks is too great, so we might instead send the full digitized frame for frame 2. Thus, video with similarity among frames will achieve a higher compression than video with plenty of differences.

[Figure 5.28 panels: (a) Digitized frame 1 (1 Mbyte), Digitized frame 2 (1 Mbyte); (b) Digitized frame 1 (1 Mbyte), Difference of 2 from 1 (0.01 Mbyte)]
Figure 5.28 A key principle of video compression recognizes that successive frames have much similarity: (a) sending every frame as a distinct digitized picture, (b) instead, sending a base frame and then difference data, from which the original frames can later be reconstructed. If we could do this for 10 frames, (a) would require 1 Mbyte * 10 = 10 Mbytes, while (b) (compressed) would require only 1 Mbyte + 9 * 0.01 Mbyte = 1.09 Mbytes, an almost 10x size reduction.

Actually, most video compression methods compute similarity not between two entire frames, but rather between corresponding 16x16 pixel blocks, yet the idea is the same.

Computing the sum of absolute differences is slow in software, so that task may be done using a custom digital circuit, while other tasks may remain in software. For example, you might find a SAD circuit inside a digital camcorder, or inside a cellular telephone that supports video. Let's design such a circuit. A block diagram is shown in Figure 5.29. The circuit's inputs will be a 256-byte memory A, holding the contents of a 16x16 block of pixels of frame 1, and another 256-byte memory B, holding the corresponding block of frame 2. Memories will be discussed in Section 5.6; for now, consider the memory as a register file, and ignore details of the interface to the memories. Another circuit input go tells the circuit when to begin computing. An output sad will present the result after some number of clock cycles.

Inputs: A, B (256-byte memory); go (bit)
Output: sad (32 bits)
Local registers: sum, sad_reg (32 bits); i (9 bits)
[State machine labels: !go; i<256; sum=sum+abs(A[i]-B[i]); i=i+1]

Figure 5.29 Sum-of-absolute-differences (SAD) component: (a) block diagram, and (b) high-level state machine.

Step 1 of our RTL design method is to create a high-level state machine. We can describe the behavior of the SAD component using the high-level state machine shown in Figure 5.29(b). We declare the inputs, outputs, and local registers sum, i, and sad_reg. The sum register will hold the running sum of differences; we make this register 32 bits wide. The i register will be used to index into the current pixel in the block memories; i will range from 0 to 256, and therefore we'll make it 9 bits wide. sad_reg will be connected to the output sad (it's good practice to register your data outputs), so will be 32 bits wide, like the sad output. The state machine initially waits for the input go to become 1. The state machine then initializes registers sum and i to 0. The state machine then enters a loop: if i is less than 256, the state machine computes the absolute value of the difference of the two blocks' pixels indexed by i (the notation A[i] refers to the data in word i of memory A), updates the running sum, increments i, and repeats. Otherwise, the state machine loads sad_reg with the sum, which now represents the final sum, and returns to the first state to wait for the go signal to become 1 again.

One point to re-emphasize is that the order of actions in a state does not impact the results, because all those actions occur simultaneously. Thus, for the state inside the loop, arranging the actions as "sum = sum + abs(A[i]-B[i]); i = i + 1" or as "i = i + 1; sum = sum + abs(A[i]-B[i])" does not impact the results. Either arrangement uses the old value of i.

Step 2 of our RTL design method is to create a datapath. We see from the high-level state machine that we'll need a subtractor, an absolute-value component (which we have not designed earlier, but is straightforward to design), an adder, and a comparison of i to 256. We build the datapath shown in Figure 5.30. The adder will be 32 bits wide, so the 8-bit input coming from the abs component will need to have 0s appended for its high 24 bits.

Step 3 is to connect the datapath to a controller block, as shown in Figure 5.30. Note that we've defined the interface to the A and B memories, consisting of a read line, address lines, and data lines. Also note that we haven't explicitly listed the inputs and outputs of the controller's FSM, as they can be seen at the periphery of the controller's block.

Step 4 is to convert the high-level state machine to an FSM. We show the FSM on the left side of Figure 5.30. For convenience, we've shown the original high-level actions (crossed out), and we've shown their replacement by the FSM actions.
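The SAD high-level state machine described in Example 5.7 can be stepped through in software to check one's understanding of it. The Python sketch below is an illustration under assumptions: the state names S0 to S4 and the function name are chosen here for convenience, the memories are modeled as plain lists of 256 byte values, and the tuple assignment stands in for the simultaneous register updates of a clock edge.

```python
def sad_state_machine(A, B):
    """Run the SAD state machine once after go becomes 1 and return sad_reg."""
    assert len(A) == 256 and len(B) == 256
    MASK32 = 0xFFFFFFFF
    total, i, sad_reg = 0, 0, 0     # registers sum, i, and sad_reg
    state = "S1"                    # S0 (waiting for go) has just been left
    while state != "S0":
        if state == "S1":           # initialize sum and i
            total, i = 0, 0
            state = "S2"
        elif state == "S2":         # loop test on the current value of i
            state = "S3" if i < 256 else "S4"
        elif state == "S3":
            # Both right-hand sides read the old i, so the order of the
            # two actions does not matter.
            total, i = (total + abs(A[i] - B[i])) & MASK32, i + 1
            state = "S2"
        elif state == "S4":         # register the data output
            sad_reg = total
            state = "S0"
    return sad_reg

identical = sad_state_machine([255] * 256, [255] * 256)
one_pixel = sad_state_machine([0] + [255] * 255, [255] * 256)
opposite = sad_state_machine([0] * 256, [255] * 256)
```

The largest possible result, 256 * 255 = 65280, fits easily in the 32-bit sum register, and a result of 0 corresponds to identical blocks.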
[Figure 5.30 label: output sad]
Figure 5.30 SAD datapath and controller FSM.

To complete the design, we would convert the FSM to a controller implementation (a state register and combinational logic), as described in Chapter 3.

Comparing Software and Custom Circuit Implementations
In Example 5.7, we said that the output appears after some number of clock cycles. Let's determine exactly how many cycles. After go becomes 1, our state machine will spend one cycle initializing registers in S1, then will spend two cycles in each of the 256 loop iterations (states S2 and S3), and finally one more cycle to update the output register in state S4, for a total of 1 + 2*256 + 1 = 514 cycles.

If we executed SAD in software, we would likely need more than two clock cycles per loop iteration. We would need perhaps two cycles to load internal registers, then a cycle for subtract, perhaps two cycles for absolute value, and a cycle for sum, for a total of six cycles per iteration. The custom circuit we built, at two cycles per iteration, is thus about three times faster for computing SAD, assuming equal clock frequencies. We'll see in Section 6.5 that we could actually build a SAD circuit that is much faster.

DIGITAL VIDEO-IMAGINING THE FUTURE.
People seem to have an insatiable appetite for good quality video, and thus much attention is placed on developing fast and/or power-efficient encoders and decoders for digital video devices, like DVD players and recorders, digital video cameras, cell phones supporting digital video, video conferencing units, TVs, TV set-top boxes, etc. It's interesting to think toward the future: assuming video encoding/decoding becomes even more powerful and digital communication speeds increase, we might imagine video displays (with audio) on our walls at home or work that continually display what's happening at another home (perhaps our mom's house) or at a partner office on the other side of the country, like a virtual window to another place. Or we might imagine portable devices that enable us to continually see what someone else wearing a tiny camera (perhaps our child or spouse) sees. Those developments could significantly change our living patterns.

RTL Design Pitfalls and Good Practice

Pitfall: Assuming a Register Is Updated in the State in Which the Register Is Written
Perhaps the most common mistake in creating a high-level state machine is assuming that a register is updated in the state in which the register is written. Such an assumption is incorrect and can lead to unexpected behavior when the state machine reads the register in the same state, and likewise when the state machine reads the register in a transition condition leaving that state. For example, Figure 5.31(a) shows a simple high-level state machine. Examine the state machine, and then answer the following two questions:

• What will be the value of Q after state A?

• What will be the final state: C or D?

The answers may surprise you.

The value of Q will not be 99; Q's value will actually be unknown. The reason is illustrated by the timing diagram in Figure 5.31(b). State A configures the datapath to load a 99 into R on the next clock edge, and configures the datapath to load the value of register R into register Q on the next clock edge. When the next clock edge occurs, both those loads occur simultaneously. Q therefore gets whatever value was in R just before the next clock edge, which is unknown.

Furthermore, the final state will not be D, but will rather be C. The reason is illustrated by the timing diagram in Figure 5.31(b). State B configures the datapath to load 100 into R on the next clock cycle, and configures the controller to load the next state based on the transition condition. R is 99, and therefore the transition condition R<100 is true, meaning the controller will be configured to load state C into the state register, not state D. On the next clock edge, R becomes 100, and the next state becomes C.

[Figure 5.31 label: Local registers: R, Q (8 bits)]
Figure 5.31 High-level state machine that behaves differently than some people may expect, due to reads of a register in the same state as writes to that register: (a) state machine, (b) timing diagram.

The key is to always remember that a state's actions configure the datapath and controller such that the next clock edge will load the desired values, but those values don't actually get loaded until that next clock edge. Thus, any expressions in a state's actions or outgoing transition conditions will be using the previous values of registers, not the values being assigned in that state itself. By the same reasoning, all the actions of a state occur simultaneously on the next clock edge, and thus could be written in any order.
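The register-update pitfall above is easy to reproduce in a few lines of software. The following Python sketch is a hedged model rather than anything from the text: the helper `clock_edge` and the use of `None` to stand for an unknown register value are assumptions made for illustration. Each state's actions are given as functions of the old register values, and all of them are loaded together on the clock edge.

```python
def clock_edge(regs, assignments):
    """Evaluate every right-hand side on the OLD values, then load all registers at once."""
    new_values = {name: rhs(regs) for name, rhs in assignments.items()}
    return {**regs, **new_values}

def run_pitfall_example():
    regs = {"R": None, "Q": None}   # None models an unknown power-up value

    # State A: R = 99 and Q = R are loaded on the same clock edge.
    regs = clock_edge(regs, {"R": lambda r: 99, "Q": lambda r: r["R"]})
    q_after_a = regs["Q"]           # Q received the old, unknown R

    # State B: R = 100; the condition R<100 is evaluated on the old R (99).
    condition_true = regs["R"] < 100
    regs = clock_edge(regs, {"R": lambda r: 100})
    final_state = "C" if condition_true else "D"
    return q_after_a, final_state, regs["R"]

q_after_a, final_state, r_final = run_pitfall_example()
```

In this model Q never sees 99 and the machine ends in C, because every read in a state observes the values the registers held before that state's clock edge.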
Assuming that the designer actually wants Q to equal 99 and the final state to be D, then a solution is to add an extra state before reading the value of a register that we assign. Figure 5.32(a) shows a new state machine in which the assignment of Q=R has been moved to state B, after R=99 has taken effect. Furthermore, the state machine has a new state, B2, that simply allows R to be updated with the new value before we read that value in the transition conditions. The timing diagram in Figure 5.32(b) shows the behavior that the designer expected.

[Figure 5.32 label: Local registers: R, Q (8 bits)]
Figure 5.32 High-level state machine that avoids reading just-assigned registers: (a) state machine, (b) timing diagram.

An alternative solution for the transition issue in this case would be to utilize comparison values that take into account that the old value is being used. So instead of comparing R to 100, the comparisons might instead compare to 99.

Avoiding this pitfall is the reason that we included state S2 in Example 5.7.

Pitfall: Reading Outputs
Another common mistake is to create a high-level state machine in which an external output is read in the state machine. Outputs can only be written and cannot be read. For example, Figure 5.33(a) shows an invalid high-level state machine: the read of P in state T is not allowed. If you wish to read an output, then create and use a local register. Figure 5.33(b) shows use of a local register R to avoid reading output P.

[Figure 5.33 labels: (a) Inputs: A, B (8 bits); Outputs: P (8 bits). (b) Inputs: A, B (8 bits); Outputs: P (8 bits); Local register: R (8 bits)]
Figure 5.33 (a) Reading an output is not allowed, (b) using a local register.

Good Design Practice: Registered Data Outputs
It's a good idea to always ensure your design has a register at every data output. Doing so prevents those outputs from displaying spurious values. For example, the state machine of Figure 5.33(b) could be implemented as a datapath in which output P is directly connected to the output of an adder, as shown in Figure 5.34. P will therefore output spurious values for some time after R is loaded with A, while the addition is being computed. Furthermore, if B or A changes in some other states, P will also change, but such change is likely not the intended behavior of the state machine; P should only change when we explicitly assign P in a state. Another problem is that any processor using the P output must take into account the adder when computing longest register-to-register delays to determine a circuit's critical path (see Section 5.4).

Therefore, we will follow the design practice of always putting a register directly before the data output, as shown in Figure 5.34(b). Even if we don't explicitly declare the register as a local register, we always assume it is there in interpreting the high-level state machine, and we always add that register when creating the datapath. Alternatively, we can explicitly declare that register, and then assume that the output is directly connected to that register; this is the approach we took in Example 5.7, in which we declared the register sad_reg. It's good practice to not read this register: the register's only purpose is to connect to the output port.

Figure 5.34 (a) P will exhibit spurious values, (b) registering P solves the problem.

Registering data outputs does have the potential disadvantage of delaying writes to the output port by one cycle, depending on the example.

Data-Dominated RTL Design

We can consider RTL designs as falling into one of two categories: control-dominated designs and data-dominated designs.

A control-dominated design is a design whose controller contains most of the complexity of the design. When creating such a design, a designer focuses mostly on the design of the controller, meaning design effort goes mostly into defining the state behavior of the system. Once the designer has defined that state behavior, he/she can derive the datapath straightforwardly from that state behavior. A control-dominated design typically responds to external inputs in a precise amount of time, and typically has a simple datapath.

A data-dominated design is a design whose datapath contains most of the complexity of the design. When creating such a design, a designer focuses mostly on the design of the datapath, meaning design effort goes mostly into instantiating and interconnecting datapath components. Once the designer has defined the datapath, he/she can define the controller's state behavior straightforwardly. A data-dominated design typically has a lot of parallelism in its datapath, and the datapath may be large. For a data-dominated design, designers often skip the first step of our RTL design method of Table 5.1.

The laser-based distance measurer example in the previous section was an example of control-dominated design, since the complexity of the design was really in the controller, not the datapath.

The terms "control-dominated" and "data-dominated" are merely descriptive, and can't be used to strictly categorize designs. Some designs will exhibit properties of both types of designs. It's like the terms "introvert" and "extrovert" for describing people: while the terms are useful, people can't be strictly categorized as either introverts or extroverts, since many people are somewhere in between, or exhibit features of both
categories. The example of the simple bus interface was an example of a design that has a similar amount of control and data design. The video compression SAD circuit, at least the way we designed it, was also a mix of control and data.

RTL design is very much a creative process. Two designers may come up with very different designs for the same system, following perhaps different design methods, with those designs differing in terms of performance, size, and other metrics.

FIR Filter Design Example
As our previous examples were either control-dominated or a mix of control and data, we now provide an example of a data-dominated design.

EXAMPLE 5.8 FIR filter

Notice how the data moves to the right on each clock cycle, so that register xt0 holds the current input sample, xt1 holds the previous input sample, and xt2 holds the sample before the previous one. For the example, we'll assume data is 12 bits wide.

[Figure 5.36 labels: 3-tap FIR filter; input X (12 bits); registers xt0, xt1, xt2 holding x(t), x(t-1), x(t-2); output Y (12 bits)]
Figure 5.36 Beginning to build the datapath for the FIR filter: inserting and connecting the x(t), x(t-1), and x(t-2) registers.
A. digita l fi lter takes a SLream of digita l inputs and ge nerales a stream of digita.1 ou tpu ts with some
feature of the input stream removed or modified. Figure 5.35 shows a block diagra m of a popular Now we need another register for each tap to hold the constant value cO . c1. or c2-we·U
digital filter known as an FIR filter. X and Yare N-b its wide each. such as 12 bits each. As a fi ltering worry later abo ut how those registers will be loaded. We' ll also need a multiplier for each tap. to
mul tiply the tap·s X value by the Constant C val ue. The datapath with the constant registers and mul-
y
ti pliers is shown in Figure 5.37.
x
digital fil ter 12
elk

Figure 5.35 General block di ag ram of an FIR fi lter.


x
example. consider the following stream of digital tempera ture values on X comi ng from a car
engine temperature sensor sampled every second : 180. 180. 18 1. 240, 180, 18 1. That 240 is prob- clk
ably nOf an acc urate measuremenl. as a car engine's temperatu re can nOI j um p 60 degrees in one
seco nd. A digital filter wou ld remove such "noise" from the inpu t stream. ge nerat ing perhaps an
output stream on Y like: 180. 180. 181. 181. 180, 181.
An FIR filter (usuaUy pronounced by saying the leiters ··F· ."f"" ··R"). short for ·'Fi nite Impulse
Re ponse·· filter. is a popular general digi tal fi lter design that can be used for a wide variety of fi l-
Figure 5.37 Ex tendi ng the datapath for the FrR filter-inserung and connecting the cO_ c L and c2
tering goals. Figure 5.35 shows a block diagram of an FIR fi lter. The basic idea of an FIR fi lter is
regis ters, alo ng with the multipliers. for each tap. For simplicity. clock connections are DOl sho""Il.
simple: the present output is obtained by mu lti plying the present input value by a constant, and and all data lines are assumed to be 12 bits wide.
addi ng that resuh to the previous input value limes a con lant, and adding that resull to the next
earlier input val ue limes a constan t. and so on. In a se nse. adding 10 previous va lues in th is manner
The out put Y is the sum of each tap·s prod uct. We can thus insert adders to compute !be sum_
results in a weighted ave rage. We describe digital fi ltering and FIR fi lters in more detai l in Section and we can connect th al sum to the aUip Ul Y. as shown in Figure -.38.
5.11 . For the purpose of this example. we merely need to know lhat an FIR fi lter can be described
by the following equation: We have completed the heart of the FIR filter datapath design. We DOW need to pro\"idc: a
method for a user to load values into the constant registers cO. c1. and c2. LeCs rn:ate!lIlOlbet-
y( r) = cOxx( r) + c l xx(t- I ) + c2xx(r-2) in put C to the fi lter. a load line Cl. and a 2-bit address Cal and CaO_ that the filter user an use to
load a panicular constant reg ister. Ca I Ca 0-00 indicates that register cO should be loaded I
An FIR filte r with three term. as in the above eq uation. is known as a J-tap FIR fi lter. Real indica tes th at c I should be loaded. and 10 indi ates that c2 hould be loaded_ L ding of the ,
FIR filter; typicall y have many tens of taps-we u,e only three taps for the purpose of illustration.
on input C into th. appropriate register occurs on a lock edge only when CL-l. We <= trnigbt-
A filter de. igner using an FIR filter achieves a particular filtering goal s;mply IJY c1r00s;/l8 lire FIR
forwardly design the circuit for such loading using a decoder. as shown in Figure _19. ~ address
filter 's con.'ilGl/tr.
lines Ca I and Ca 0 feed into a 2x4 decoder. thus enabling the appropriate register (JlO{e that address
We wi h to de~ ign a ci rcu it to implement an FIR filter. Becau." the FIR filter eq uation is ju t
II is unused). The load input CL is connected to the decoder" enable input_ 'Ole that.. -\ 3IS<'
data tran<formation and no control. let·s sk ip Step I of the RTL de.ign method and go straight to
added a register at the Y output. \ hich is genernlly good design practi _ i~ such l ~"'ter
tep 2--<lesigning the datapa th . We' ll need a regi~ ter for each tap to hold X(I). x(I- I ). nnd x(I-2). On
ensures tlle output doesn· t Ruc luute a intemlediate products and sums are mputed. and rectu.;, -
each clock cycle. we ll want to move x(I- I ) to x(I-2). to move x(1) 10 x(I- I ). and to load .I"(r) wi th th.
the likelihood of the user accidentally extending the riti al path b~ nne<:ting tttrough. \0( of
pre.,.,m Input. We lhus <tart the datapath wilh three reglSlers. connected 0' . hown in Figure 5.36. combinational I gic before loadi ng Y il1lO a register.


Figure 5.38 Computing the output Y of the FIR filter as the sum of the tap products (all data lines are assumed to be 12 bits wide).

Figure 5.39 Finalizing the FIR filter datapath with circuitry for loading the constant registers. We've also added a register on the Y output, which is good design practice. The critical path (the longest register-to-register delay) is shown as a dotted line.

Our RTL design method involves two steps after designing the datapath to complete the controller. However, this particular design does not require a controller, not even a simple one! This example is indeed an extreme example of a data-dominated design.

HOW DOES IT WORK? VOICE QUALITY ON CELL PHONES
Cellular telephones have become commonplace over the past decade. Cell phones operate in environments far noisier than regular "landline" telephones, including noise from automobiles, wind, crowds of talking people, etc. Thus, filtering out such noise is especially important in cell phones. Your cell phone contains at least one, and probably more like several, microprocessors and custom digital circuits. After converting the analog audio signal from the microphone into a digital audio stream of bits, part of the job of those digital systems is to filter out the background noise from the audio signal. Pay attention next time you talk to someone using a cell phone in a noisy environment, and notice how much less noise you hear than is probably actually heard by the microphone. As circuits continue to improve in speed, size, and power, filtering will likely improve further. Some state-of-the-art phones may even use two microphones, coupled with beamforming techniques (see Section 4.13), to focus in on a user's voice.

Comparing Software and Custom Circuit Implementations
It is interesting to compare the performance of the hardware implementation of a 3-tap FIR filter with a software implementation. The critical path goes from the xt and c registers, through one multiplier, and through two adders before reaching the Y register yreg.
For the hardware implementation, let's assume that the adder has a 2 ns delay. Let's also assume that chaining the adders together results in the delays adding, so that two adders chained together have a delay of 4 ns (detailed analysis of the internal gates of the adders could show the delay to actually be slightly less). Let's assume the multiplier has a 20 ns delay. Then the critical path, or longest register-to-register delay (to be discussed further in Section 5.4), would be from c0 to yreg, going through the multiplier and two adders as shown in Figure 5.39. That path's length would be 20 + 4 = 24 ns. Note that the path from c1 to yreg would be equally long, but not longer. A critical path of 24 ns means the datapath could be clocked at a frequency of 1 / 24 ns = 42 MHz. In other words, a new sample could appear at X every 24 ns, and a new output would appear at Y every 24 ns.
Now let's consider the hardware performance of a larger filter: a 100-tap FIR filter rather than a 3-tap filter. The main performance difference is that we'll need to add 100 values rather than just three. Recall from Section 4.13 that an adder tree is a fast way to add many values. One hundred values will require a tree with seven levels: 50 additions, then 25, then 13 (roughly), then 7, then 4, then 2, then 1. So the total delay would be 20 ns (for the multiplier) plus seven adder delays (7 * 2 ns = 14 ns), for a total delay of 34 ns.
For a software implementation, we'll assume 10 ns per instruction. Assume each multiplication or addition would require two instructions. A 100-tap filter would need approximately 100 multiplications and 100 additions, so the total time would be (100 multiplications * 2 instr/mult + 100 additions * 2 instr/add) * 10 ns per instruction = 4000 ns. In other words, the hardware implementation would be over 100 times faster (4000 ns / 34 ns) than the software implementation. A hardware implementation could therefore process 100 times more data than a software implementation, resulting in much better filtering.

5.4 DETERMINING CLOCK FREQUENCY
RTL design produces a processor, consisting of a datapath and a controller. Inside the datapath and controller are registers, and registers require a clock signal. A clock signal must have a particular frequency. The frequency will determine how fast the system will execute its specified task. Obviously, a lower frequency will result in slower execution, while a higher frequency will result in faster execution. Conversely, a larger clock period is slower, while a smaller period is faster.

Designers of digital circuits often (but not always) want their systems to execute as fast as possible. However, a designer cannot choose an arbitrarily high clock frequency (meaning an arbitrarily small period). Consider, for example, the simple circuit in Figure 5.40, in which registers a and b feed through an adder into register c. The adder has a delay of 2 ns, meaning that when the adder's inputs change, the adder's outputs will not be stable until after 2 ns; before 2 ns, the adder's outputs will have spurious values (see Section 4.3 for a description of spurious values appearing at an adder's outputs). If the designer chooses a clock period of 10 ns, the circuit should work fine. Shortening the period to 5 ns will speed the execution. But shortening the period to 1 ns will result in incorrect circuit behavior. One clock cycle might load new values into registers a and b. The next clock cycle will load register c 1 ns later (as well as a and b), but the output of the adder won't be stable until 2 ns have passed. The value loaded into register c will thus be some spurious value that has no useful meaning, and will not be the sum of a and b.

Figure 5.40 Longest path is 2 ns.

Thus, a designer must be careful not to set the clock frequency too high. To determine the highest possible frequency, a designer must analyze the entire circuit, and find the longest path delay from any register to any other register, or from any circuit input to any register. The longest register-to-register or input-to-register delay in a circuit is known as the circuit's critical path. Designers then choose a clock whose period is longer than the circuit's critical path.
Figure 5.41 illustrates a circuit with at least four possible paths from any register to any other register:
• One path starts at register a, goes through the adder, and ends at register c. This path's delay is 2 ns.
• Another path starts at register a, goes through the adder, through the multiplier, and ends at register d. This path's delay is 2 ns + 5 ns = 7 ns.
• Another path starts at register b, goes through the adder, through the multiplier, and ends at register d. This path's delay is also 2 ns + 5 ns = 7 ns.
• The last path starts at register b, goes through the multiplier, and ends at register d. This path's delay is 5 ns.

Figure 5.41 Determining the critical path: Max(2, 7, 7, 5) = 7 ns.

The longest path is thus 7 ns (there are actually two such paths). Thus, the clock period must be at least 7 ns.
The above analysis assumes that the only delay between registers is caused by logic delays. In reality, wires also have a delay. In the 1980s and 1990s, the delay of logic dominated over the delay of wires; wire delays were often negligible. But in modern chip technologies, the delay of wires may equal or even exceed the delay of logic, and thus wire delays cannot be ignored. Wire delays add to a path's length just as logic delays do. Figure 5.42 illustrates a path length calculation with wire delays included.

Figure 5.42 Longest path is 3 ns considering wire delays.

Furthermore, the above analysis does not consider setup times for the registers. Recall from Section 3.5 that flip-flop inputs (and hence register inputs) must be stable for a specified amount of time before a clock edge. The setup time adds to the path length.
Even considering wire delays and setup times, designers typically choose a clock period that is still longer than the critical path by an amount depending on how conservative the designer wants to be with respect to ensuring the circuit works under a variety of operating conditions. Certain conditions can change the delay of circuit components, conditions like very high temperature, very low temperature, vibration, age, etc. Generally, the longer the period beyond the critical path, the more conservative the design. For example, we might determine that the critical path is 7 ns, but we might choose a clock period of 10 ns, or even 15 ns, the latter being quite conservative.
If low power is a design goal, then a designer might choose an even longer period, such as 100 ns, to reduce circuit power. Why reducing the clock frequency reduces power will be discussed in Section 6.6.

CONSERVATIVE CHIP MAKERS, AND PC OVERCLOCKING
Chip makers usually publish their chips' maximum clocking frequency somewhat lower than the real maximum, perhaps 10%, 20%, or even 30% lower. Such conservatism reduces the chances that the chip will fail in unanticipated situations, such as extremes of hot or cold weather, or slight variations in the chip manufacturing process. Many personal computer enthusiasts have taken advantage of such conservatism by "overclocking" their PCs, meaning to set the clock frequency higher than a chip's published maximum, by changing the PC's BIOS (basic input/output system) settings. Numerous web sites post statistics on the successes and failures of people trying to overclock nearly every PC processor; it seems the norm is about 10% higher than the published maximum. Now, I don't recommend overclocking (for one, you may damage the microprocessor due to overheating), but it's interesting to see the presence of conservative design.

When analyzing a processor (controller and datapath) to find the critical path, a designer must be aware that register-to-register paths exist not just within the datapath (Figure 5.43(a)), but also within the controller (Figure 5.43(b)), between the controller and datapath (Figure 5.43(c)), and even between the processor and external components.
The number of possible paths in a circuit can be quite large. Consider a circuit with N registers that has paths from every register to every other register. Then there are N*N, or N^2, possible register-to-register paths. For example, if N is 3 and the three registers are named A,


B, and C, then the possible paths are: A->A, A->B, A->C, B->A, B->B, B->C, C->A, C->B, C->C, for 3*3 = 9 possible paths. For N=50, there may be up to 2500 possible paths. Because of the large number of possible paths, automated tools can be of great assistance. Timing analysis tools automatically analyze all paths to determine the longest path, and may also ensure that setup and hold times are satisfied throughout the circuit.

Figure 5.43 Critical paths throughout a circuit: (a) within a datapath, (b) within a controller, (c) between a controller and datapath.

5.5 BEHAVIORAL-LEVEL DESIGN: C TO GATES (OPTIONAL)
As transistors per chip continue to increase and hence designers build more complex digital systems that use those additional transistors, digital system behavior becomes increasingly difficult to understand. Frequently, a designer building a new digital system finds it useful to first describe the desired system behavior using a programming language, like C, C++, or Java, in order to first get that desired behavior correct. (Alternatively, the designer may use the high-level programming constructs in a hardware description language, like VHDL or Verilog, to first get the desired behavior correct.) Then, the designer converts that programming language description to an RTL design, by following the RTL design method that usually starts with a high-level state machine RTL description. Converting a system's programming language description to an RTL description is known as behavioral-level design. We'll introduce behavioral-level design using an example.

EXAMPLE 5.9 Sum-of-absolute-differences in C for video compression
Recall Example 5.7, in which we created a sum-of-absolute-differences component. In that example, we started with a high-level state machine, but that state machine wasn't very easy to understand. We can more easily describe the computation of the sum of absolute differences using C code, as shown in Figure 5.44.

   int SAD (byte A[256], byte B[256]) // not quite C syntax
   {
      uint sum; short uint i;
      sum = 0;
      i = 0;
      while (i < 256) {
         sum = sum + abs(A[i] - B[i]);
         i = i + 1;
      }
      return (sum);
   }

Figure 5.44 C program description of a sum-of-absolute-differences computation. The C program may be easier to develop and easier to understand than a state machine.

That code is much easier to understand for most people than the high-level state machine in Figure 5.29. Thus, for some designs, C code (or something similar) is the most natural starting point. To begin the RTL design method, we could convert this code to a high-level state machine, like that in Figure 5.29, and then proceed to complete the RTL design method and hence design the circuit.

It is instructive to define a structured method for converting C code to a high-level state machine. Defining such a method makes clear to us that C code can be automatically compiled to either software on a programmable processor, or to a custom digital circuit. We point out that most designers who start with C code and then continue with RTL design do not necessarily follow a particular method in performing such conversion. However, automated tools do follow a method having some similarities to the one we now describe. We also point out that the conversion method will sometimes result in "extra" states that you might notice could be combined with other states. These extra states would be combined by a later optimization step, though we'll combine some of them as we follow the method.
We consider three types of statements in C code (assignment statements, while loops, and condition statements (if-then-else)) and provide high-level state machine templates for each such statement.
An assignment statement in C translates simply into a state in a state machine, with the state's actions carrying out the assignment, as shown in Figure 5.45.

Figure 5.45 Template for assignment statement: target = expression;

An if-then statement in C translates into a state that checks the condition of the if statement, and branches to the state for the then part if the condition is true, otherwise branching past those states to an end state, as shown in Figure 5.46.

Figure 5.46 Template for if-then statement: if (cond) { // then stmts }

We can translate an if-then-else statement in C into a similar state machine with a state that checks the condition of the if statement, but
this time branching to states for the else part if the if condition is false, as shown in Figure 5.47.

Figure 5.47 Template for if-then-else statement: if (cond) { // then stmts } else { // else stmts }

The else part commonly contains another if statement, as C programmers may have multiple else if parts in a region of code.
Finally, a while loop statement in C translates into states similar to an if-then statement, except that after executing the while's statements, if the while condition is true, the state machine branches back to the condition check, rather than to the end state, as shown in Figure 5.48. Only when the condition is false can we reach the end state.

Figure 5.48 Template for while loop statement: while (cond) { // while stmts }

Given these simple templates, we can convert a wide variety of C programs to high-level state machines, from which we already know how to create circuit designs following our RTL design method.

EXAMPLE 5.10 Converting an if-then-else statement to a state machine
We are given the C-like code shown in Figure 5.49(a), which computes the maximum of two data inputs X and Y. We can translate that code to a state machine by first translating the if-then-else statement to states using the method of Figure 5.47, as shown in Figure 5.49(b). We then translate the then statements to states, and then the else statements, yielding the final state machine in Figure 5.49(c).

   Inputs: uint X, Y
   Outputs: uint Max

   if (X > Y) {
      Max = X;
   }
   else {
      Max = Y;
   }

Figure 5.49 Behavioral-level design starting from C code: (a) C code for computing the max of two numbers, (b) translating the if-then-else statement to a high-level state machine, (c) translating the then and else statements to states. From the state machine in (c), we could use our RTL design method to complete the design. Note: max can be implemented more efficiently; we use max here to provide an easy-to-understand example.

EXAMPLE 5.11 SAD C code to high-level state machine conversion

   Inputs: byte A[256], B[256];
           bit go;
   Output: int sad;
   main()
   {
      uint sum; short uint i;
      while (1) {
         while (!go);
         sum = 0;
         i = 0;
         while (i < 256) {
            sum = sum + abs(A[i] - B[i]);
            i = i + 1;
         }
         sad = sum;
      }
   }

Figure 5.50 Behavioral-level design of the sum-of-absolute-differences code: (a) original C code, written as an infinite loop, (b) translating the statement "while (!go);" to a state machine, (c) simplified states for "while (!go);" and states for the assignment statements that follow, (d) merging the two assignment states into one, (e) inserting the template for the next while loop, (f) inserting the states for that while loop, merging two assignment statements into one, (g) the final high-level state machine, with the "while (1)" included by transitioning from the last state back to the first state, and with obviously unnecessary states removed.

We wish to convert the C program description of the sum-of-absolute-differences example of Example 5.9 to a high-level state machine. The code is shown in Figure 5.50(a), written as an infinite loop rather than a procedure call, and using an input "go" to indicate when the system should compute the SAD. The "while (1)" statement, after some optimization, translates just to a transition from the last state back to the first state, so we'll hold off on adding that transition until we have formed the rest of the state machine. We begin with the statement "while (!go);" which based on the template approach translates to the states shown in Figure 5.50(b). Since the loop has no statements

III the loop bod). \\c;! can simplify the loop 's Slates a.s shown in Figure S.50(c), Fi g l~re 5.50(c) also Random Access Memory (RAM )
~ho\\.s thl! ~ Iah:: .. for the next IWO S t ~ICI11CI1IS. which are assig nment MalCl11c nt s. SInce I~OS~ two
a.. ,i!!nmenb could be done si multanl!ollsly. we merge the IWO Sl~iles illlo Olle, as show n 111 Flgu~e A RAM is logicall y the same as a register file (see Section 4.IO)--both components are
5.56<d). We then Irtlll~lale the next lI·hi/e loop. using the Il'hile loop 1~l1lp ~ a le. to the SUites sh,own In memories whose words (each of which can be thought of as a register) can be individually
Figure :i.SO(e), We fill in the SH\ les for the wllile loop's statements III Figure 5.50(0. merging the read and written using address inputs. The differe nces between a RAM and a regi ter file are:
I\\~ J"si2l1mclll :-talemenl :-.131es into one stale since the assig nments can be done simultaneously.
The size of M- We typicall y refer to smaller memories (from 4 to 512 or perhaps
Fhwrc 5~50(f) :11500 shows the state for the last statement of the C code. whi ch ass igns sat/=sum.
even 1024 words or so) as regi ster fi les. and larger memories as RAM .
Fi~allv. \\ e eliminate obvious. ly unnecessary cmpty swtes. and add ;1 tra nsit ion from the last slate 10
the fir~l state to account fo r the entire code being encloscd in a "while ( 1)" loop. The bit storage implementation-For large numbers of word. a compact imple.
NOl ice the similarity between our final hi gh- leve l state machine in Figure 5.50(g) and the high- mentation becomes increaSingly imponant. Thus. a RAM typically uses a very
le"el stiJle 1113chine we des igned from sc ratc h in Figu re 5,29. compact implementation for bit storage. rather than u ing a Rip-Rop.
\\'e will need to map the C data Iypes 10 bits at some point. For exa mple: the C code, de~lares
The memory's physical shape-For large numbers of words, the phy ical shape of
i to be a shan unsigned integer. whi ch means 16 bits. So we could dec lare 1 to be 16 bits In the
the memory·s implementation becomes imponant. A tall rectangular hape will
a
high-level s t3le machine. Or. knowing the range of i to be to 256. we cou ld instead de fine i to be
9 bib (C doesn't have a 9-bil wide data lype). have some shon wires and some long wi res, whereas a square shape will have all
\Ve could then proceed to des ign a contro ller and dalapalh from this stale machi ne. as was medium length wires. A RAM therefore typicall y ha a square hape. to reduce
done in Figure 5.30. Thu~. we can translate C code to g~ltcs. using a straightforward automatable the memory's critical path . Reads are perfonned by first readi ng out an entire row
method. of words, and then selecti ng the appropriate word (column) out of that row.
There's no c1ear· cut border be tween what defi nes a regi ster file and whal defines a
Through the previous exa mples. yo u have seen howe code can be convened to a
RAM . Smaller memories (typicall y) tend to be called regis~er files , and larger memorie
custom dig ital circuit using methods that are full y automatab le.
tend to be called RAMs. But you ' ll often see the tenns used quite interchangeably.
General e code can conta in additional types of statements. some of which can be
A typical RAM is single-ported. Some RAMs are dual·poned. Adding more pons 10
eas il y translated to states. For examp le. afar loop can be tra nslated to states by first trans·
RAMs is much less common than to register files, because a RAM ·s larger size makes the
fonning the for loop into a IIhile loop. A 51vitch statement can be tra nslated by tirst
de lay and size overhead of extra pons much more costly. 'everthele . conceptuall~. a
translating the 511·itch statement to if·the,,·eI5e state ments.
RAM can have an arbitrary number of read pons and wri te pons. ju t like a register file.
Some e constructs pose problems for converting to a circu it. though. For example,
Figu re 5.52 shows a block diagram for a ID24x32
pointers and recurs ion are not easy to translate. Thus. too l that automate behavioral
sing le-pon RAM (M = 1024. N =32). data is a 32-bit wide
design from e code typica ll y impose re tric tions on the a ll owable e code that can be
set of data lines that can serve either as input lines during
handled by the tool. Such res tric ti ons are know n as suhsellillg the lang uage.
writes or as output lines during reads. add r is a JD·bit input
While we have emphasized C code in this section, obviously any similar language, such as C++, Java, VHDL, Verilog, etc., can be converted to custom digital circuits, with appropriate language subsetting.

5.6 MEMORY COMPONENTS
Register-transfer level design involves instantiating and connecting datapath components to form datapaths, controlled by controllers. RTL design often utilizes some additional components outside the datapath and controller.
One such component is a memory. An MxN memory is a memory component able to store M data items of N bits each. Each data item in a memory is known as a word. Figure 5.51 depicts the storage available in an MxN memory.

Figure 5.51 Logical view of a memory (M words, N bits wide each).

We can generally categorize memories into two groups: RAM memory, which can be written to and read from, and ROM memory, which can only be read from. However, as we shall see, the distinction between the two categories is blurring due to new technologies.

WHY IS IT CALLED "RANDOM ACCESS" MEMORY?
In the early days of digital design, RAMs did not exist. If you had information you wanted your digital circuit to store, you stored it on a magnetic drum, or a magnetic tape. Tape drives (and drum drives too) had to spin the tape to get the head, which could read or write onto the tape, above the desired memory location. If the head was currently above location 900, and you wanted to write to location 999, the tape would have to spin past 901, 902, ... 998, until location 999 was under the head. In other words, the tape was accessed sequentially. When RAM was first released, its most appealing feature was that any "random" address could be accessed in the same amount of time as any other address, regardless of the previously read address. That's because there is no "head" used to access a RAM, and no spinning of tapes or drums. Thus, the term "random access" memory was used, and has stuck to this day.

... serving as the address lines during reads or writes. rw is a 1-bit control input that indicates whether the present operation should be a read or a write (e.g., rw = 0 means read, rw = 1 means write). en is a 1-bit control input that enables the RAM for reading or writing: if we don't want to read or write during a particular clock cycle, we set en to 0 to prevent a read or write (regardless of the value of rw).

Figure 5.52 1024x32 RAM block symbol.
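The roles of the rw and en control inputs described for the RAM block symbol can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name RAM, the method name access, and the convention of returning None when the RAM drives nothing onto its data lines are hypothetical choices made for illustration, and clocking and timing are ignored.

```python
class RAM:
    """Behavioral sketch of an M x N single-port RAM (defaults: 1024 x 32)."""

    def __init__(self, words=1024, width=32):
        self.words = words
        self.mask = (1 << width) - 1      # keeps stored values N bits wide
        self.mem = [0] * words

    def access(self, addr, rw, en, data=None):
        """One access: rw=0 reads, rw=1 writes, en=0 blocks both.

        Returns the word read, or None when nothing is driven onto data.
        """
        if not 0 <= addr < self.words:
            raise ValueError("address out of range")
        if en == 0:
            return None                   # disabled: contents untouched
        if rw == 1:
            self.mem[addr] = data & self.mask
            return None
        return self.mem[addr]


ram = RAM()
ram.access(addr=9, rw=1, en=1, data=500)   # write 500 to word 9
ram.access(addr=9, rw=1, en=0, data=123)   # en=0, so this write is ignored
print(ram.access(addr=9, rw=0, en=1))      # 500
```

The second call shows why en exists: with en = 0 the value of rw does not matter, and the stored word is left alone.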

Figure 5.53 shows the logical internal structure of an MxN RAM. By "logical" structure, we mean that we can think of the structure being implemented in that way, although a real physical implementation may possess a different actual structure. (As an analogy, a logical structure of a telephone includes a microphone and a speaker connected to a phone line, although real physical telephones vary tremendously in their implementations, including handheld devices, headsets, wireless connections, built-in answering machines, etc.) The main part of the RAM structure is the grid of bit storage blocks, also known as cells. A collection of N cells forms a word, and there are M words. The address inputs feed into a decoder, each output of which enables all the cells in one word corresponding to the present address values. The enable input en can disable the decoder and prevent any word from being enabled. The read/write control input rw also connects to every cell to control whether the cell will be written with wdata, or read out to rdata. The data lines are connected through one word's cell to the next word's cell, so each cell must be designed to only output its contents when enabled and thus output nothing when disabled, to avoid interfering with another cell's output.

Figure 5.53 Logical internal structure of a RAM. (Let A = log2 M; the address inputs are addr0 through addr(A-1), and the read data outputs are rdata0 through rdata(N-1).)

Notice that the RAM in Figure 5.53 has the same inputs and outputs as the RAM block diagram in Figure 5.52, except that the RAM in Figure 5.53 has separate write and read data lines whereas Figure 5.52 has a single set of data lines (a single port). Figure 5.54 shows how the separate lines might be combined inside a RAM having just a single set of data lines.

Figure 5.54 RAM data input/output for a single port.

Bit Storage in a RAM
Compared to a register file, the key feature of RAM is its compactness. Recall from Chapter 3 that we implemented a bit storage block using a D flip-flop. Because RAMs store large numbers of bits, RAMs utilize a bit storage block that is more compact than a flip-flop. We thus discuss briefly the internal design of the bit storage blocks inside two popular types of RAM: static RAM and dynamic RAM. However, be forewarned that the internal design of those blocks involves electronics issues beyond the scope of this book, and instead is within the scope of textbooks on VLSI or advanced digital design. Fortunately, a RAM component hides the complexity of its internal electronics by using a memory controller, and thus a digital designer's interaction with a RAM remains as discussed in the previous section.

Static RAM
Static RAM uses a bit storage block involving two inverters connected in a loop, as shown in Figure 5.55. A bit d will go through the bottom inverter to become d', then back through the top inverter to become d again; thus, the bit is stored in the inverter loop. Notice that this bit storage block has an extra line, data', passing through it, compared with the "logical" RAM structure in Figure 5.53.

Figure 5.55 SRAM cell.

To write a bit into this inverter loop, we set the data line to the value of the desired bit, and data' to the complement. So to store a 1, the memory controller sets data=1 and data'=0, as shown in Figure 5.56. (To store a 0, the controller would have set data=0 and data'=1.) The controller then sets enable=1, which turns on both shown transistors. The data and data' values thus appear in the inverter loop as shown (overwriting whatever value was there before). Fully understanding why this circuit works involves electrical details beyond the scope of this discussion.

Figure 5.56 Writing a 1 to an SRAM cell.

Reading the stored bit can be done by first setting the data and data' lines both to 1 (an act known as precharging), and then by setting enable to 1. One of the enabled transistors will have a 0 at one end, causing the precharged 1 on the data or data' line to drop to a voltage slightly less than a regular logic 1. Both the data and data' lines connect to a special circuit called a sense amplifier that detects whether the voltage on data is slightly higher than on data', meaning logic 1 is stored, or whether the voltage on data' is slightly higher than on data, meaning logic 0 is stored. Again, details of the electronics are beyond the scope of this discussion.
Notice that the bit storage block of Figure 5.57 utilizes six transistors: two inside each of the two inverters, and two transistors outside the inverters. Six transistors are fewer than needed inside a D flip-flop. A tradeoff is that special circuitry must be used to read a bit stored in this bit storage block, whereas a D flip-flop outputs regular logic values directly. Such special circuitry slows the access time of the stored bit.
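The write and read procedures for the inverter-loop cell can be summarized in a logic-level sketch. It is not an electrical model: the class name SRAMCell, the voltage numbers, and the size of the droop on a precharged line are assumptions chosen only to mirror the steps in the text (complementary data lines for a write; precharge, enable, and sense-amplifier comparison for a read).

```python
class SRAMCell:
    """Logic-level sketch of the inverter-loop bit storage block."""

    def __init__(self):
        self.d = 0                       # value held in the inverter loop

    def write(self, data, data_n):
        """Drive data and data' with complementary values, then enable."""
        if data_n != 1 - data:
            raise ValueError("data' must be the complement of data")
        self.d = data                    # overwrites whatever was stored

    def read(self):
        """Precharge both lines to 1, enable, and let a sense amplifier decide."""
        data, data_n = 1.0, 1.0          # precharged levels (arbitrary units)
        droop = 0.1                      # assumed small drop on the side seeing a 0
        if self.d == 1:
            data_n -= droop              # d' = 0 pulls data' slightly low
        else:
            data -= droop                # d = 0 pulls data slightly low
        return 1 if data > data_n else 0 # sense amplifier comparison


cell = SRAMCell()
cell.write(1, 0)
print(cell.read(), cell.read())          # 1 1
```

Reading twice returns the same value because, unlike the DRAM cell discussed next, the read in this sketch leaves the stored bit untouched.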

RAM based on a six-transistor bit storage block, or similar such block, is known as a static RAM, or SRAM. A static RAM maintains the stored bit as long as power is supplied to the transistors. Except, of course, when the block is being written, the stored bit does not change; it is static (not changing).

Figure 5.57 Reading an SRAM.

Dynamic RAM
An alternative popular bit storage block used in RAM has only a single transistor per block. Such a block utilizes a (relatively large) capacitor at the output of the transistor, as shown in Figure 5.58(a).
Writing can occur when enable is 1: data=1 will charge the top plate of the capacitor to a 1, while data=0 will make it 0. When enable is returned to 0, a 1 on the top plate will begin to discharge across to the bottom plate of the capacitor and on to ground. (Why? Because that's what a capacitor does.) However, the capacitor is intentionally designed to be relatively large, so that the discharge takes a long time, during which time the bit d is effectively stored in the capacitor. Figure 5.58(b) provides a timing diagram illustrating the charge and discharge of the capacitor.

Figure 5.58 DRAM bit storage: (a) bit storage block, (b) discharge.

Reading can be done by first setting data to a voltage midway between 0 and 1, and then setting enable to 1. The value stored in the capacitor will alter the voltage on the data line, and that altered voltage can be sensed by special circuits connected to the data line that amplify the sensed value to either a logic 1 or a logic 0.
It turns out that reading the charge stored in the capacitor discharges the capacitor. Thus, the RAM must immediately write the read bit back to the bit storage block after reading the block. The RAM must contain a memory controller that automatically performs such a write back.
Because a bit stored in the capacitor gradually discharges to ground, the RAM must refresh every bit storage block before the bits completely discharge and hence the stored bit is lost. To refresh a bit storage block, the RAM must read the block and then write the read bit back to the block. Such refreshing may be done every few microseconds. The RAM must include a built-in memory controller that automatically performs these refreshes.
Note that the RAM may be busy refreshing itself at a time that we wish to read the RAM. Furthermore, every read must be followed by an automatic write. Thus, RAM based on one-transistor plus capacitor technology may be slower to access.

(Margin note: DRAM chips first appeared in the early 1970s, and could hold only a few thousand bits. Modern DRAMs can hold many billions of bits.)

Because the stored bit changes (discharges) even when power is supplied and we are not writing the bit storage block, RAM based on the one transistor plus capacitor bit storage block is known as dynamic RAM, or DRAM.
Compared to SRAM, DRAM is even more compact, requiring only one transistor per bit storage block rather than six transistors. The tradeoff is that DRAM requires refreshing, which ultimately slows the access time. Another tradeoff, not alluded to above, is that creating the relatively large capacitor in a DRAM requires a special chip fabrication process, and thus incorporating DRAM with regular logic can be costly. In the 1990s, incorporating DRAM with regular logic on the same chip was nearly unheard of. Technology advancements, however, have led to DRAM and logic appearing on the same chip in more and more cases.
Figure 5.59 graphically depicts the compactness advantages of SRAM over register files, and DRAM over SRAM, for storing the same number of bits.

Figure 5.59 Depiction of compactness benefits of SRAM and DRAM (not to scale).

Using a RAM
Figure 5.60 shows timing diagrams describing how to write and read the RAM of Figure 5.52. The timing diagram in Figure 5.60(a) shows how to write 500 and 999 into locations 9 and 13 during clock edges 1 and 2, respectively. The next cycle shows how to read location 9 of the RAM, by setting addr=9, data=Z, and rw=0 (meaning read). Shortly after rw becomes 0, data becomes 500 (the value we had previously stored in location 9). Notice that we had to disable our writing of data first (by setting it to Z), so as not to interfere with the data being read from the RAM. Also notice that this RAM's read functionality is asynchronous.

Figure 5.60 Reading and writing a RAM: (a) timing diagrams, (b) setup, hold, and access times.

The delay between our setting the rw line to read and the read data stabilizing at the data output is known as the RAM's access time or read time.
We now provide an example of using a RAM in an RTL design.
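As an aside before the example, the discharge and refresh behavior of a DRAM bit storage block described earlier in this section can be sketched numerically. Everything quantitative here is an assumption made for illustration: the exponential leak, the 100-microsecond time constant, the 0.5 sensing threshold standing in for the "midway" voltage, and the names DRAMCell and stored_after. Only the sequence of events (leak, destructive read, write back, periodic refresh) comes from the text.

```python
import math


class DRAMCell:
    def __init__(self, tau_us=100.0):
        self.tau_us = tau_us     # assumed discharge time constant, microseconds
        self.charge = 0.0        # top-plate charge, 1.0 = full logic 1

    def write(self, bit):
        self.charge = 1.0 if bit else 0.0

    def wait(self, dt_us):
        """Let the capacitor leak toward ground for dt_us microseconds."""
        self.charge *= math.exp(-dt_us / self.tau_us)

    def read(self):
        """Sense against the midway level; the read drains the cell, so write back."""
        bit = 1 if self.charge > 0.5 else 0
        self.charge = 0.0        # reading discharges the capacitor
        self.write(bit)          # automatic write-back by the memory controller
        return bit

    def refresh(self):
        self.read()              # a refresh is a read followed by the write-back


def stored_after(total_us, refresh_every_us=None):
    """Store a 1, let total_us elapse (optionally refreshing), then read."""
    cell = DRAMCell()
    cell.write(1)
    elapsed = 0.0
    step = refresh_every_us or total_us
    while elapsed < total_us:
        cell.wait(step)
        elapsed += step
        if refresh_every_us:
            cell.refresh()
    return cell.read()


print(stored_after(500.0))                         # 0: the bit leaked away
print(stored_after(500.0, refresh_every_us=10.0))  # 1: refreshing preserved it
```

The sketch also shows where the speed penalty comes from: every read and every refresh occupies the cell with an extra write.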
EXAMPLE 5.12 Digital sound recorder using a RAM
Let's design a system that can record sound, and can play back that recorded sound. Such a recorder is found in various toys, in telephone answering machines, in cell phone outgoing announcements, and numerous other devices. We'll need an analog-to-digital converter to digitize the sound, a RAM to store the digitized sound, a digital-to-analog converter to output the digitized sound, and a processor to control both converters and the RAM. Figure 5.61 shows a block diagram of the system.

Figure 5.61 Utilizing a RAM in a digital sound recorder system.

To store digitized sound, the processor block can implement the high-level state machine segment shown in Figure 5.62. The machine first initializes its internal address counter a to 0 in state S. Next, in state T, the machine loads a value into the analog-to-digital converter to cause a new analog sample to be digitized, and sets the three-state buffer to pass that digitized value to the RAM's data lines. That state also sets the RAM address to the counter a's value, and sets the control lines to enable writing. The machine then transitions to state U, whose transitions check the value of a against 4095. That state also increments a. (Remember that the transitions from U will use the old value, not the incremented value, of a. Thus, the transitions compare with 4095, not 4096.) The machine returns to state T and hence continues writing samples in sequential memory addresses as long as the memory is not yet filled (a < 4095). Notice that the comparison is with 4095, not 4096. This is because the action in state U of a = a + 1 does not occur until the next clock edge, so the comparison of a < 4095 on state U's outgoing transition uses the old value of a, not the incremented value (see the Section 5.3 discussion of common pitfalls).

Figure 5.62 State machine for storing digitized sound in RAM.

To play back the stored digitized sound, the processor block can implement the high-level state machine segment shown in Figure 5.63. After initializing the counter a in state V, the machine enters state W. State W disables the three-state buffer, to avoid interfering with the RAM's output data that will appear during RAM reads. That state also sets the RAM address lines, and sets the RAM control lines to enable reading. Read data will thus appear on the data lines. The next state X loads a value into the digital-to-analog converter, to convert the data just read from RAM to the analog signal. That state also increments the counter a. The machine returns to state W to continue reading, until the entire memory has been read.

Figure 5.63 State machine for playing sound from the RAM.

Read-Only Memory (ROM)
A Read-Only Memory (ROM) is a memory that can be read from, but not written to. Because of being read only, the bit-storage mechanism in a ROM can be made to have several advantages over a RAM, including:

Compactness: A ROM's bit storage may be even smaller than a RAM's.

Nonvolatility: A ROM's bit storage maintains its contents even after the power supply to the ROM is shut off; when turned back on, the ROM's contents can be read again. In contrast, a RAM loses its contents when power is shut off. A memory that loses its contents when power is shut off is known as volatile, while a memory that maintains its contents without power is known as nonvolatile.

Speed: A ROM may be faster to read than a RAM, especially than a DRAM.

Low power: A ROM does not consume power to maintain its contents, in contrast to a RAM. Thus, a ROM consumes less power than a RAM.

Therefore, when the data stored in a memory will not change, we might choose to store that data in a ROM to gain the above advantages.
Figure 5.64 shows a block symbol of a 1024x32 ROM. The logical internal structure of an MxN ROM is shown in Figure 5.65. Notice that the internal structure is very similar to the internal structure of a RAM shown in Figure 5.53. Bit storage blocks forming a word are enabled by a decoder output, with the decoder input being the address. However, because a ROM can only be read and cannot be written, there is no need for a rw input control to specify read versus write, nor for wdata inputs to provide data being written. Also, because no synchronous writes occur in a ROM, the ROM does not have a clock input. In fact, not only is a ROM an asynchronous component, but a ROM can be thought of as a combinational component (when we only read from the ROM; we'll see variations later).

Figure 5.64 1024x32 ROM block symbol.

Some readers might at this point be wondering how we write the initial contents of a ROM that we then can only read. After all, if we can't write the contents of a ROM at all, then the ROM is really of no use to us. Obviously, there must be a way to write the contents of a ROM, but in ROM terminology, the writing of the initial contents of a ROM is known as ROM programming. ROM types differ in their bit storage block implementations, which in turn causes differences in the methods used for ROM programming. We now describe several popular bit storage block implementations for ROM.
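The view of a ROM as a combinational component, with a decoder enabling exactly one word, can be sketched as a pair of pure functions. The function names decoder and rom_read and the 4x8 contents are hypothetical, and returning 0 when en is 0 is a simplification of "nothing is driven onto the data lines".

```python
def decoder(addr, en, words):
    """One-hot word enables; all zeros when en is 0."""
    return [1 if (en and i == addr) else 0 for i in range(words)]


def rom_read(contents, addr, en=1):
    """Value on the data lines: only the enabled word drives them."""
    enables = decoder(addr, en, len(contents))
    out = 0
    for word, enabled in zip(contents, enables):
        if enabled:
            out |= word          # disabled words contribute nothing
    return out


CONTENTS = (0x0A, 0x1B, 0x2C, 0x3D)    # hypothetical 4x8 ROM contents
print(hex(rom_read(CONTENTS, 2)))      # 0x2c
```

There is no clock and no stored state that changes between calls: the output depends only on the present addr and en inputs, which is what makes the read path combinational.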
Figure 5.65 Logical internal structure of a ROM. (Let A = log2 M; the address inputs are addr0 through addr(A-1), and the data outputs are data0 through data(N-1).)

ROM Types

Mask-programmed ROM
Figure 5.66 illustrates the bit storage cell for a mask-programmed ROM. A mask-programmed ROM has its contents programmed when the chip is manufactured, by directly wiring 1s to cells that should store a 1, and 0s to cells that should store a 0. Recall that a "1" is actually a higher-than-zero voltage coming from one of several power input pins to a chip; thus, wiring a 1 means wiring the power input pin directly to the cell. Likewise, wiring a 0 to a cell means wiring the ground pin directly to the cell. Be aware that Figure 5.66 presents a logical view of a mask-programmed ROM cell; the actual physical design of such cells may be somewhat different. For example, a common design strings several vertical cells together to form a large NOR-like logic gate. We leave details for more advanced textbooks on CMOS circuit design.

Figure 5.66 Mask-programmed ROM cells: left cell programmed with 1, right cell with 0.

Wires are placed onto chips during manufacturing by using a combination of light-sensitive chemicals and light passed through lenses and "masks" that block the light from reaching regions of the chemicals. (See Chapter 7 for further details.) Hence the term "mask" in mask-programmed ROM.
Mask-programmed ROM has the best compactness of any ROM type, but the contents of the ROM must be known during chip manufacturing. This ROM type is best suited for high-volume well-established products in which compactness or very low cost is critical, and in which programming of the ROM will never be done after the ROM's chip is manufactured.

Fuse-Based Programmable ROM: One-Time Programmable (OTP) ROM
Figure 5.67 illustrates the bit storage cell for a fuse-based ROM. A fuse-based ROM uses a fuse in each cell. A fuse is an electrical component that initially conducts from one end to the other just like a wire, but whose connection from one end to the other can be destroyed ("blown") by passing a higher-than-normal current through the fuse. A blown fuse does not conduct and is instead an open circuit (no connection). In the figure, the cell on the left has its fuse intact, so when the cell is enabled, a 1 appears on the data line. The cell on the right has its fuse blown, so when the cell is enabled, nothing appears on the data line (special electronics will be necessary to convert that nothing to a logic 0).

Figure 5.67 Fuse-based ROM cells: left cell programmed with 1, right cell with 0.

A fuse-based ROM is manufactured with all fuses intact, so the initially stored contents are all 1s. A user of this ROM can program the contents by connecting the ROM to a special device, known as a programmer, that provides higher-than-normal currents to only those fuses in cells that should store 0s. Because a user can program the contents of this ROM, the ROM is known as a programmable ROM, or PROM.
A blown fuse cannot be changed back to its initial conducting form. Thus, a fuse-based ROM can only be programmed once. Fuse-based ROMs are therefore also known as one-time programmable (OTP) ROMs.

Erasable PROM: EPROM
Figure 5.68 depicts a logical view of an erasable PROM cell. An erasable PROM, or EPROM, cell uses a special type of transistor, having what is known as a floating gate, in each cell. The details of a floating gate transistor are beyond the scope of this section, but briefly, a floating gate transistor has a special gate in which electrons can be "trapped." A transistor with electrons trapped in its gate stays in the nonconducting situation, and thus is programmed to store a 0. Otherwise, the cell is considered to store a 1. Special electronic circuitry converts sensed currents on the data lines to logic 1 or 0.

Figure 5.68 EPROM cells: left cell programmed with 1, right cell with 0.

An EPROM cell initially has no electrons trapped in any floating gate transistors, so the initially stored contents are all 1s. A programmer device applies higher-than-normal voltage to those transistors in cells that should store 0s. That high voltage causes electrons to tunnel through a small insulator into the floating gate region. When the voltage is removed, the electrons do not have enough energy to tunnel back, and thus are trapped as shown in the right cell of Figure 5.68.
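Both the fuse-based and the floating-gate cells start out storing 1s, and programming can only turn selected 1s into 0s. That one-way behavior can be captured with a bitwise AND, as in the following sketch. The class name PROM, its erasable flag, and its method names are hypothetical, and the erase method anticipates the EPROM erasing capability that the name "erasable" implies.

```python
class PROM:
    """Sketch of OTP ROM and EPROM contents: programming only clears bits."""

    def __init__(self, words, width, erasable=False):
        self.all_ones = (1 << width) - 1
        self.mem = [self.all_ones] * words   # as manufactured: every cell holds 1
        self.erasable = erasable             # True models an EPROM, False an OTP ROM

    def program(self, addr, value):
        # Blowing a fuse (or trapping electrons) can turn a 1 into a 0 but
        # never a 0 back into a 1, so the result is old contents AND value.
        self.mem[addr] &= value

    def erase(self):
        if not self.erasable:
            raise RuntimeError("a blown fuse cannot be restored")
        self.mem = [self.all_ones] * len(self.mem)   # back to all 1s

    def read(self, addr):
        return self.mem[addr]


otp = PROM(words=4, width=8)
otp.program(0, 0b10100101)
otp.program(0, 0b11110000)       # a second pass can only clear more bits
print(hex(otp.read(0)))          # 0xa0
```

The second programming pass illustrates why a fuse-based ROM is effectively programmed once: the bits cleared by the first pass stay 0 no matter what value is applied later.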

The electrons can be freed by exposing the electrons to ultraviolet (UV) light of a particular wavelength. The UV light energizes the electrons such that they tunnel back through the small insulator, thus escaping the floating gate region. Exposing an EPROM chip to UV light therefore "erases" all the stored 0s, restoring the chip to having all 1s as contents, after which the EPROM can be programmed again. Hence the term "erasable" PROM. Such a chip can typically be erased and reprogrammed about ten thousand times or more, and can retain its contents without power for ten years or more. Because a chip usually appears inside a black package that doesn't pass light, a chip with an EPROM requires a window in that package through which UV light can pass, as shown in Figure 5.69.

Figure 5.69 The "window" in the package of a microprocessor that uses an EPROM to store programs.

EEPROM and Flash Memory
An electrically erasable PROM, or EEPROM, utilizes the EPROM programming method of using high voltage to trap electrons in a floating gate transistor. However, unlike an EPROM that requires UV light to free the electrons and hence erase the PROM, an EEPROM uses another high voltage to free the electrons. EEPROMs thus avoid the need for placing the chip under UV light.
Because EEPROMs use voltages for erasing, those voltages can be applied to specific cells only. Thus, while EPROMs must be erased in their entirety, EEPROMs can be erased one word at a time. Thus, we can erase and reprogram certain words in an EEPROM without changing the contents of other words.
Some EEPROMs require a special programmer device for programming. However, most modern EEPROMs do not require special voltages to be applied to the pins, and also include internal memory controllers that manage the programming process. Thus, we can reprogram an EEPROM's contents (or part of its contents) without ever removing the chip from the system that the EEPROM serves; such an EEPROM is known as being in-system programmable. Most such devices can therefore be read and written in a manner very similar to a RAM.
Figure 5.70 shows a block diagram of an EEPROM. Notice that the data lines are bidirectional, just as was the case for RAM. The EEPROM has a control input write: write=0 indicates a read operation (when en=1), while write=1 indicates that the data on the data lines should be programmed into the word at the address specified by the address lines. Programming a word into an EEPROM takes time, though, perhaps several, dozens, hundreds, or even thousands of clock cycles. Therefore, EEPROMs may have a control output busy to indicate that programming is not yet complete. While the device is busy, the EEPROM user should not try writing to a different word. Fortunately, most EEPROMs will load the data to be programmed and the address into internal registers, freeing the circuit that is writing the EEPROM from having to hold these values constant during programming.

Figure 5.70 1024x32 EEPROM block symbol.

Modern EEPROMs can be programmed tens of thousands to millions of times or more, and can retain their contents for tens to one hundred years or more without power.
While erasing one word at a time is fine for some applications that utilize EEPROM, other applications need to erase large blocks of memory quickly; for example, a digital camera application would need to erase a block of memory corresponding to an entire picture. Flash memory is a type of EEPROM in which all the words within a large block of memory can be erased very quickly, perhaps simultaneously, rather than one word at a time. A flash memory may be completely erased by setting an erase control input to 1. Many flash memories also allow only a specific region, known as a block or sector, to be erased while other regions are left untouched.

Using a ROM
We now provide examples of using a ROM in RTL designs.

EXAMPLE 5.13 Talking doll using a ROM
We wish to design a doll that speaks the message "Nice to meet you" whenever the doll's right arm is moved. A block diagram of the system is shown in Figure 5.71. A vibration sensor in the doll's right arm has an output v that is 1 when vibration is sensed. A processor detects the vibration and should then output a digitized version of the "Nice to meet you" message to a digital-to-analog converter attached to a speaker. The "Nice to meet you" message will be the prerecorded voice of a professional actress. Because that message will not change for the lifetime of the doll product, we can store that message in a ROM.

Figure 5.71 Utilizing a ROM in a talking doll system.

Figure 5.72 shows a high-level state machine segment that plays the message after detecting vibration. The machine starts in state S, initializing the ROM address counter a to 0, and waiting for vibration to be sensed. When vibration is sensed, the machine proceeds to state T, which reads the current ROM location. The machine moves on to state U, which loads the digital-to-analog converter with the read value from ROM, increments a, and proceeds back to T as long as a hasn't reached 4095 (remember that the transition from U uses the value of a before the increment, so it compares to 4095, not to 4096).
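The reason the transition out of state U compares against 4095 can be checked with a cycle-by-cycle sketch of this state machine. The function name play_message and the list standing in for the digital-to-analog converter are hypothetical, and returning to state S after the last word is an assumption, since the text only describes the loop back to T. Next-state and next-a values are computed from the old register values and applied together, imitating a clock edge.

```python
def play_message(rom, v_inputs):
    """Clock the S/T/U machine once per entry of v_inputs; return DAC samples."""
    last = len(rom) - 1                  # 4095 for a 4096-word ROM
    state, a, rdata = "S", 0, None
    dac = []                             # stand-in for the digital-to-analog converter
    for v in v_inputs:
        next_state, next_a = state, a
        if state == "S":
            next_a = 0                   # initialize the address counter
            if v:
                next_state = "T"         # vibration sensed
        elif state == "T":
            rdata = rom[a]               # read the current ROM location
            next_state = "U"
        elif state == "U":
            dac.append(rdata)            # load the converter with the read value
            next_a = a + 1               # takes effect at the clock edge
            next_state = "T" if a < last else "S"   # compares the old a
        state, a = next_state, next_a    # the clock edge
    return dac


rom = list(range(4096))                  # placeholder contents, one value per word
v = [0, 0, 1] + [0] * (2 * 4096)         # vibration sensed on the third cycle
samples = play_message(rom, v)
print(len(samples), samples[0], samples[-1])   # 4096 0 4095
```

Every one of the 4096 words is output exactly once; comparing against 4096 with the old value of a would instead attempt a read beyond the last address.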
Because this doll's message will never change, we might choose to use a mask-programmed ROM or an OTP ROM. We might utilize OTP ROM during prototyping or during initial sales of the doll, and then produce mask-programmed ROM versions during high-volume production of the doll.

EXAMPLE 5.14 Digital telephone answering machine using a flash memory
We are to design the outgoing announcement part of a telephone answering machine (e.g., "We're not home right now, leave a message."). That announcement should be stored digitally, should be recordable by the machine owner any number of times, and should be saved even if power is removed from the answering machine. Recording begins immediately after the owner presses a record button, which sets a signal rec to 1. Because we must be able to record the announcement, we cannot use a mask-programmed ROM or OTP ROM. Because removing power should not cause the announcement to be lost, we cannot use a RAM. Thus, we might choose an EEPROM or a flash memory. We'll use a flash memory, as shown in Figure 5.73. Notice that the flash memory has the same interface as a RAM, except that the flash memory has an extra input named erase. erase on this particular flash memory clears the contents of the entire flash. While the flash memory is erasing itself, the flash sets an output busy to 1, during which time we cannot write to the flash memory.

Figure 5.73 Utilizing a flash memory in a digital answering machine (4096x16 flash).

Figure 5.74 shows a high-level state machine segment for recording the announcement. The state machine segment begins when the record button is pressed. State S activates the erase of the flash memory (er=1), and then state T waits for the erasing to complete (bu'). Such erasing should occur in just a few milliseconds, so we shouldn't miss any of the spoken announcement. The state machine then transitions to state U, which copies a digitized sample from the analog-to-digital converter to the flash memory, writing to the current address a. State U also increments a. The next state (V) checks to see if the memory is filled with samples by checking if a<4096, returning to state U until the memory is filled.

Figure 5.74 State machine for storing digitized sound in a flash memory.

Notice that, unlike Examples 5.12 and 5.13, this state machine increments a before the state that checks for the last address (state V), so V's transitions use 4096, not 4095. We show this version just for variety. The version in Example 5.12 may be slightly better because that version requires that a, and the comparator, only be 12 bits wide (to represent 0 to 4095) rather than 13 bits wide (to represent 0 to 4096).
This state machine assumes that writes to the flash occur in one clock cycle. Some flash memories require more time for writes, asserting their busy output until the write has completed. For such a flash, we would need to add a state between states U and V, similar to the state T between S and U. To prevent missing sound samples while waiting, we might want to first save the entire sound sample in a 4096x16 RAM, and then copy the entire RAM contents to the flash.

The Blurring of the Distinction between RAM and ROM
Notice that EEPROM and flash ROM blur the distinction between RAM and ROM. Many modern EEPROM devices are writable just like a RAM, having nearly the same interface, with the only difference being longer write times to an EEPROM than to a RAM. However, the difference between those times is shrinking each year.
Further blurring the distinction are nonvolatile RAM (NVRAM) devices, which are RAM devices that retain their contents even without power. Unlike ROM, NVRAM write times are just as fast as regular RAM, typically one clock cycle. One type of NVRAM simply includes an SRAM with a built-in battery, with the battery able to supply power to the SRAM for perhaps ten years or more. Another type of NVRAM includes both an SRAM and an EEPROM: the NVRAM controller automatically backs up the SRAM's contents into the EEPROM, typically just at the time when power is being removed. Furthermore, extensive research and development into new bit storage technologies are leading to NVRAMs that are even closer to RAM in terms of performance and density while being nonvolatile. One such technology is known as MAGRAM, short for magnetic RAM, which uses magnetism to store bits, having access times similar to DRAM, but without the need for refreshing, and with nonvolatility.
Thus, digital designers have a tremendous variety of memory types available to them, with those types differing in their cost, performance, size, nonvolatility, ease-of-use, write time, duration of data retention, and other factors.

5.7 QUEUES (FIFOs)
Sometimes our data storage needs specifically require that we read items in the same order that we wrote them, and that reading removes the item from the list. For example, a busy restaurant may maintain a waiting list of customers: the host writes customer names to the rear of the list, but when a table becomes available, the host reads the next customer's name from the front of the list and removes that name from the list. Thus, the first customer written to the list is the first customer read from the list.

Figure 5.75 Conceptual view of a queue: items are written to the back of the queue, and read (and removed) from the front of the queue.

A queue is

a list that is written at the rear of the list but read from the beginning of the list, with a read also removing the read item from the list, as illustrated in Figure 5.75. The common term for a queue in American English is a "line": for example, you stand in a line at the grocery store, with people entering the rear of the line, and being served from the front of the line. In British English, the word queue is used directly in everyday language (which sometimes confuses Americans who visit other English-speaking countries). Because the first item written into the list will be the first item read out of the list, a queue is known as being first-in first-out (FIFO). As such, sometimes queues are called FIFO queues, although that term is redundant because a queue is by definition first-in first-out. The term FIFO itself is often used to refer to a queue. The term buffer is also sometimes used. A write to a queue is sometimes called a push or enqueue, and a read is sometimes called a pop or dequeue.

[Sign: PLEASE QUEUE FROM THIS END]

We can implement a queue using a memory, either a register file or a RAM, depending on the queue size needed. When using a memory, the front and rear will move to different memory locations as the queue is written and read, as illustrated in Figure 5.76. The figure shows an initially empty eight-word queue with front and rear both set to memory address 0. The first action on the queue is a write of item A, which goes to the rear (address 0), and the rear increments to address 1. The next action is a write of item B, which goes to the rear (address 1), and the rear increments to 2. The next action is a read, which comes from the front (address 0) and thus reads out item A, and the front increments to 1.

Figure 5.76 Writing and reading a queue implemented in a memory causes the front (f) and rear (r) to move.

Subsequent reads and writes continue likewise, except that when the rear or front reaches 7, its next value should be 0, not 8. In other words, the memory can be thought of as a circle, as shown in Figure 5.77.

Figure 5.77 Implementing a queue in a memory treats the memory as a circle.

Two conditions of a queue are of interest:
• Empty: there are no items in the queue. This condition can be detected as front = rear, as seen in the topmost queue of Figure 5.76.
• Full: there is no more room to add items to the queue, meaning there are N items in a queue of size N. This comes about when the rear wraps around and catches back up to the front, meaning front = rear.

Unfortunately, notice that the conditions detecting the queue being empty and the queue being full are the same: the front address equals the rear address. One way to tell the two conditions apart is to keep track of whether a write or a read preceded the front and rear addresses becoming equal.
In many uses of a queue, the circuit writing the queue operates independently from the circuit reading the queue. Thus, a queue implemented with a memory may use a two-port memory having separate read and write ports.
We can implement an 8-word queue using an 8-word two-port register file and additional components, as depicted in Figure 5.78. A 3-bit up-counter maintains the front address, while another 3-bit up-counter maintains the rear address. Notice that these counters will naturally wrap around from 7 to 0, as desired when treating the memory as a circle. An equality comparator detects whether the front counter equals the rear counter. A controller writes the write data to the register file and increments the rear counter during a write, reads the read data from the register file and increments the front counter during a read, and determines whether the queue is full or empty based on the equality comparison as well as whether the previous operation was a write or a read. We omit further description of the queue's controller, but it can be built by starting with an FSM.

Figure 5.78 Architecture of an 8-word 16-bit queue.

A user of the queue should never read an empty queue or write a full queue. Depending on the controller design, such an action might just be ignored or might put the queue into a misleading internal state (e.g., the front and rear addresses may cross over).
Most queues come with one or more additional control outputs that indicate whether the queue is half full, or perhaps 80% full.
Queues are commonplace in digital systems. Some examples include:
• A computer keyboard writes the pressed keys into a queue and meanwhile requests that the computer read the queued keys. You might at some time have typed faster than your computer was reading the keys, in which case your additional keystrokes were ignored, and you may have even heard beeps each time you pressed additional keys, indicating the queue was full.
• A digital video camera may write recently captured video frames into a queue, and concurrently may read those frames from the queue, compress them, and store them on tape or another medium.
• A computer printer may store print jobs in a queue while those jobs are waiting to be printed.
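
The queue mechanism described in this section (front and rear addresses that wrap around a fixed-size memory, plus a record of whether the last operation was a write) can be sketched in software. The following Python model is only an illustration, not the controller design that the text omits: the class name CircularQueue, its method names, and the choice to raise an exception on an illegal push or pop are all assumptions made for this sketch. The values pushed and popped in the usage lines follow the sequence of Example 5.15.

```python
class CircularQueue:
    """Software model of a queue held in a fixed-size memory (hypothetical helper)."""

    def __init__(self, size=8):
        self.size = size
        self.mem = [None] * size        # stands in for the register file or RAM
        self.front = 0                  # address of the next word to read
        self.rear = 0                   # address of the next word to write
        self.last_was_write = False     # tells full from empty when front == rear

    def is_empty(self):
        return self.front == self.rear and not self.last_was_write

    def is_full(self):
        return self.front == self.rear and self.last_was_write

    def push(self, item):
        # Assumption: an illegal push raises an error rather than being ignored.
        if self.is_full():
            raise OverflowError("push on a full queue")
        self.mem[self.rear] = item
        self.rear = (self.rear + 1) % self.size   # wraps from size-1 back to 0
        self.last_was_write = True

    def pop(self):
        # Assumption: an illegal pop raises an error rather than returning 0.
        if self.is_empty():
            raise IndexError("pop on an empty queue")
        item = self.mem[self.front]
        self.front = (self.front + 1) % self.size
        self.last_was_write = False
        return item


q = CircularQueue(8)
for value in (9, 5, 8, 5, 7, 2, 3):
    q.push(value)               # rear ends at address 7
first = q.pop()                 # reads address 0, front moves to 1
q.push(6)                       # written at address 7, rear wraps to 0
q.push(3)                       # written at address 0, rear = 1 = front
print(first, q.front, q.rear, q.is_full())   # prints: 9 1 1 True
```

Tracking the last operation in a single flag mirrors the text's suggestion for telling full from empty. A hardware controller might instead ignore an illegal push or pop, as the text notes, rather than signal an error.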

• A modem stores incoming data in a queue and requests a computer to read that data. Likewise, the modem writes outgoing data received from the computer into a queue and then sends that data out over the modem's outgoing medium.
• A computer network router receives data packets from an input port and writes those packets into a queue. Meanwhile, the router reads the packets from the queue, analyzes the address information in the packet, and then sends the packet along one of several output ports.

EXAMPLE 5.15 Using a queue

Show the internal state of an 8-word queue, and popped data values, after each of the following sequences of pushes and pops, assuming an initially empty queue:
1. Push 9, 5, 8, 5, 7, 2, and 3.
2. Pop
3. Push 6
4. Push 3
5. Push 4
6. Pop

Figure 5.79 shows the queue's internal states. After the first sequence of seven pushes (step 1), we see that the rear address points to address 7. The pop (step 2) reads from the front address of 0, returning data of 9. The front address increments to 1. Note that although the queue might still contain the value of 9 in address 0, that 9 is no longer accessible during proper queue operation, and thus is essentially gone. The push of 6 (step 3) increments the rear address, which wraps around from 7 to 0. The push of 3 (step 4) increments the rear address to 1, which now equals the front address, meaning the queue is now full. If a pop were to occur now, it would read the value 5. But instead, a push of 4 occurs (step 5). This push should not have been performed, because the queue is full. Thus, this push puts the queue into an erroneous state, and we cannot predict the behavior of any subsequent pushes or pops.

Figure 5.79 Example pushes and pops of a queue. [Panels: initially empty queue; 1. after pushing 9, 5, 8, 5, 7, 2, 3; 2. after popping (data: 9); 3. after pushing 6; 4. after pushing 3 (full); 5. after pushing 4: ERROR! Pushing a full queue results in unknown state.]

A queue could of course come with some error-tolerance behavior built in, perhaps ignoring pushes when full, or perhaps returning some particular value (like 0) if popped when empty.

5.8 HIERARCHY - A KEY DESIGN CONCEPT

Managing Complexity

Throughout this book, we have been utilizing a powerful design concept known as hierarchy. Hierarchy in general is defined as an organization with a few "things" at the top, and each thing possibly consisting of several other things. Perhaps the most widely known type of hierarchy involves a country. At the top is a country, which consists of many states or provinces, each of which in turn consists of many cities. A hierarchy involving a country, provinces, and cities is shown in Figure 5.80. That figure shows all three levels of the hierarchy: country, provinces, and cities.

Figure 5.80 Three-level hierarchy example: a country, made up of provinces, each made up of cities.

Figure 5.81 shows the same country, but this time showing only the top two levels of hierarchy: countries and provinces. Indeed, most maps of a country only show these top two levels (possibly showing key cities in each province/state, but certainly not all the cities); showing all the cities also makes the map far too detailed and cluttered. A map of a province/state, however, might then show all the cities within that state. Thus, we see that hierarchy plays an important role in understanding countries (or at least their maps).

Figure 5.81 Hierarchy showing just the top two levels.

Likewise, hierarchy plays an important role in digital design. In Chapter 2, we introduced the most fundamental component in digital systems: the transistor. In Chapters 2 and 3, we introduced several basic components composed from transistors, like AND gates, OR gates, and NOT gates, and then some slightly more complex components composed from gates: multiplexers, decoders, flip-flops, etc. In Chapter 4, we composed the basic components into a higher level of components, datapath components, like registers, adders, ALUs, multipliers, etc. In Chapter 5, we introduced components composed of datapath components, including controllers, datapaths, processors (made up of controllers and datapaths), memories, and queues.
Use of hierarchy enables us to manage complex designs. Imagine trying to comprehend the design of Figure 5.30 at the level of logic gates; that design likely consists of several thousand logic gates. Humans can't comprehend several thousand things at once. But they can comprehend a few dozen things. As the number of things grows beyond a few dozen, we therefore group those things into a new thing, to manage the complexity. However, hierarchy alone is not sufficient: we must also associate an understandable meaning to the higher-level things we create, a task known as abstraction.


Abstraction

Hierarchy may not only involve grouping things into a larger thing, but may also involve associating a higher-level behavior to that larger thing. So when we grouped transistors to form an AND gate, we didn't just say that an AND gate was a group of transistors; rather, we associated a specific behavior with the AND gate, with that behavior describing the behavior of the group of transistors in an easily understandable way. Likewise, when we grouped logic gates into a 32-bit adder, we didn't just say that an adder was a group of logic gates; rather, we associated a specific understandable behavior with the adder: A 32-bit adder adds two 32-bit numbers.
Associating higher-level behavior with a component to hide the complex inner details of that component is a process known as abstraction.
Abstraction frees a designer from having to remember, or even understand, the low-level details of a component. Knowing that an adder adds two numbers, a designer can use an adder in a design. The designer need not worry about whether the adder internally is implemented using a carry-ripple design, or using some complicated design that is perhaps faster but larger. Instead, the designer just needs to know the delay of the adder and the size of the adder, which are further abstractions.

Composing a Larger Component from Smaller Versions of the Same Component

A common design task is to create a larger version of a component from smaller versions of the same component. For example, suppose you have 3-input AND gates available to you, but you need a 9-input AND gate. You could compose several 3-input AND gates to form a 9-input AND gate, as shown in Figure 5.82. You could compose OR gates into a larger OR gate, and XOR gates into larger XOR gates, similarly. Some compositions might require more than two levels: composing an 8-input AND from 2-input ANDs requires four 2-input ANDs in the first level, two 2-input ANDs in the second level, and a 2-input AND in the third level. Some compositions might end up with extra inputs that must be hardwired to 0 or 1: an 8-input AND built from 3-input ANDs would look similar to Figure 5.82, but with the bottom input of the bottom AND gate hardwired to 1. After trying a few examples of composing AND gates into larger ones, you can come up with a general rule to compose any size AND gates into a larger gate: fill the first level with (the largest available) AND gates until the sum of their inputs equals the desired number of inputs, then fill the second level similarly (feeding first level outputs to the second level gates), until a level has just one gate (that's the last level). Connect any unused AND gate inputs to 1. Composing NAND, NOR, or XNOR gates into larger gates of the same kind would require a few more gates to maintain the same behavior.

Figure 5.82 Composing a 9-input AND gate from 3-input AND gates.

Multiplexers can also be composed together to form a larger multiplexer. For example, suppose you had 4x1 and 2x1 muxes available, but you needed an 8x1 mux. You could compose the smaller muxes into an 8x1 mux as shown in Figure 5.83. Notice that s2 selects among group i0-i3 and i4-i7 while s1 and s0 select one input from the group. You can check that select line values pass the appropriate input through, for example, s2s1s0 = 000 passes i0, s2s1s0 = 100 passes i4, and s2s1s0 = 111 passes i7.

Figure 5.83 An 8x1 mux composed from 4x1 and 2x1 muxes.

One particularly commonly occurring composition problem is that of creating a larger memory from smaller ones. The larger memory may have wider words, may have more words, or both.
For example, suppose you have available a large number of 1024x8 ROMs, but you want a 1024x32 ROM. Composing the smaller ROMs into the larger one is straightforward, and shown in Figure 5.84. We'll need four 1024x8 ROMs to obtain 32 bits per word. We connect the 10 address inputs to all four ROMs. Likewise, we connect the enable input to all four ROMs. We group the four 8-bit outputs into our desired 32-bit output. Thus, each ROM stores one byte of the 32-bit word. Reading a location, say location 99, results in four simultaneous reads, of the byte at location 99 of each ROM.

Figure 5.84 Composing a 1024x32 ROM from 1024x8 ROMs.

As another example using ROM, suppose you again have 1024x8 ROMs available, but this time you need a 2048x8 ROM. So you have an extra address line because you have twice as many words to address. Figure 5.85 shows how to use two 1024x8 ROMs to create a 2048x8 ROM. The top ROM represents the top half of the memory (1024 words), and the bottom ROM the bottom half (1024 words). We use the 11th address line (a10) to enable either the top ROM or the bottom ROM; the other 10 bits represent the offset into the ROM. That 11th bit feeds into a 1x2 decoder, whose outputs feed into the ROM enables. Figure 5.86 uses a table of addresses to show how the 11th bit selects among the two smaller ROMs.
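
The two ROM compositions just described (wider words, and more words) can be mimicked in a few lines of Python. This is a rough sketch under stated assumptions rather than a description of the figures: each small ROM is modeled as a list of 1024 byte values, the function names read_1024x32 and read_2048x8 are hypothetical, the first ROM in the list is assumed to supply the most significant byte, and a10 = 0 is assumed to enable the top ROM.

```python
WORDS = 1024       # words in each small ROM
ADDR_BITS = 10     # address lines of each small ROM (2**10 = 1024)


def read_1024x32(roms, addr):
    """Wider words: four 1024x8 ROMs share one 10-bit address.

    Assumption: roms[0] supplies the most significant byte.
    """
    word = 0
    for rom in roms:
        word = (word << 8) | rom[addr]   # append this ROM's byte
    return word


def read_2048x8(top, bottom, addr):
    """More words: address bit a10 picks a ROM, a9..a0 are the offset.

    Assumption: a10 = 0 enables the top ROM, a10 = 1 the bottom ROM.
    """
    a10 = (addr >> ADDR_BITS) & 1
    offset = addr & (WORDS - 1)
    enable_top, enable_bottom = (a10 == 0), (a10 == 1)   # 1x2 decoder outputs
    # Exactly one ROM is enabled, so exactly one supplies the 8-bit result.
    if enable_top:
        return top[offset]
    assert enable_bottom
    return bottom[offset]


# Made-up contents, only to exercise the two compositions.
byte_roms = [[(i + k) % 256 for i in range(WORDS)] for k in range(4)]
top_rom = [i % 256 for i in range(WORDS)]
bottom_rom = [255 - (i % 256) for i in range(WORDS)]

print(hex(read_1024x32(byte_roms, 99)))            # bytes 99, 100, 101, 102
print(read_2048x8(top_rom, bottom_rom, 5))         # offset 5 of the top ROM
print(read_2048x8(top_rom, bottom_rom, 1024 + 5))  # offset 5 of the bottom ROM
```

Reading location 99 of the wide ROM touches location 99 of all four small ROMs at once, while a read of the deep ROM touches only the one ROM that the decoder enables.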



Actually, we could use any bit to select between the top ROM and bottom ROM. Designers commonly use the lowest-order bit (a0) to select. The top ROM would thus represent all evenly addressed words, the bottom ROM all oddly addressed words.
Finally, since only one ROM will be active at any time, we can tie together the output data lines to form our 8-bit output, as shown in Figure 5.85.

Figure 5.85 Composing a 2048x8 ROM from 1024x8 ROMs.

a10  a9 a8 ... a0
0    0000000000
0    0000000001
0    0000000010     (first 1024x8 ROM)
...
0    1111111110
0    1111111111
1    0000000000
1    0000000001
1    0000000010     (second 1024x8 ROM)
...
1    1111111110
1    1111111111

Figure 5.86 When composing a 2048x8 ROM from two 1024x8 ROMs, we can use the highest address bit to choose among the two ROMs; the remaining address bits offset into the chosen ROM.

As a final example using ROM, suppose you needed a 4096x32 ROM, but had only 1024x8 ROMs available. In this situation, we need both to create more words, and wider words. The approach is straightforward: first, create a 4096x8 ROM by using 4 ROMs one on top of the other and by feeding the top two address lines to a 2x4 decoder to select the appropriate ROM, and then second, widen the ROM by adding 3 more ROMs to each row.
Most of the datapath components we introduced in Chapter 4 can be composed into larger versions of the same type of component.

5.9 RTL DESIGN OPTIMIZATIONS AND TRADEOFFS (SEE SECTION 6.5)

Previous sections in this chapter described how to perform register-transfer level design to create processors consisting of a controller and a datapath. This section, which physically appears in the book as Section 6.5, describes how to create processors that are better optimized, or that trade off one feature for another (e.g., size for performance). One use of this book covers such RTL optimizations and tradeoffs immediately after introducing RTL design, meaning now. Another use introduces them later.

5.10 RTL DESIGN USING HARDWARE DESCRIPTION LANGUAGES (SEE SECTION 9.5)

This section, which physically appears in the book as Section 9.5, describes use of HDLs during RTL design. One use of this book describes such HDL use immediately after introducing RTL design (meaning now). Another use describes use of HDLs later.

5.11 PRODUCT PROFILE: CELL PHONE

A cell phone, short for cellular telephone and also known as a mobile phone, is a portable wireless telephone that can be used to make phone calls while moving about a city. Cell phones have made it possible to communicate with distant people nearly anytime and anywhere. Before cell phones, most telephones were tied to physical places like a home or an office. Some cities supported a radio-based mobile telephone system using a powerful central antenna somewhere in the city, perhaps atop a tall building. Because radio frequencies are scarce and thus carefully doled out by governments, such a radio telephone system could only use perhaps tens or a hundred different radio frequencies, and thus could not support large numbers of users. Those few users therefore paid a very high fee for the service, limiting such mobile telephone use to a few wealthy individuals and to key government officials. Those users had to be within a certain radius of the main antenna, measured in tens of miles, to receive service, and that service typically didn't work in another city.

Cells and Basestations

Figure 5.87 Phone1 in cell A can use the same radio frequency as phone2 in cell C, increasing the number of possible mobile phone users in a city.

Cell phone popularity exploded in the 1990s, growing from a few million users to hundreds of millions of users in that decade (even though the first cell phone call was made way back in 1973, by Martin Cooper of Motorola, the inventor of the cell phone), and today it is hard for many people to remember life before cell phones. The basic technical idea behind cell phones divides a city into numerous smaller regions, known as cells (hence the term "cell phone"). Figure 5.87 shows a city divided into three cells. A typical city might actually be divided into dozens, hundreds, or even thousands of cells. Each cell has its own radio antenna and equipment in the center, known as a basestation. Each basestation can use dozens or hundreds of different radio frequencies. Each basestation antenna only needs to transmit radio signals powerful enough to reach the basestation's cell area. Thus, nonadjacent cells can actually reuse the same frequencies, so the limited number of radio frequencies allowed for mobile phones

can be thus shared by more than one phone at one time. Hence, far more users can be supported, leading to reduced costs per user. Figure 5.87 illustrates that phone1 in cell A can use the same radio frequency as phone2 in cell C, because the radio signals from cell A don't reach cell C. Supporting more users means greatly reduced cost per user, and more basestations means service in more areas than just major cities.
Figure 5.88(a) shows a typical basestation antenna. The basestation's equipment may be in a small building or commonly in a small box near the base of the antenna. The antenna shown actually supports antennas from two different cellular service providers: one set on the top, one set just under, on the same pole. Land for the poles is expensive, which is why providers share, or sometimes find existing tall structures on which to mount the antennas, like buildings, park light posts, and other interesting places (e.g., Figure 5.88(b)). Some providers try to disguise their antennas to make them more soothing to the eye, as in Figure 5.88(c): the entire tree in the picture is artificial.

Figure 5.88 Basestations found in various locations.

All the basestations of a service provider connect to a central switching office of a city. The switching office not only links the cellular phone system to the regular "landline" phone system, but also assigns phone calls to specific radio frequencies, and handles switching among cells of a phone moving between cells.

How Cellular Phone Calls Work

Suppose you are holding phone1 in cell A of Figure 5.87. When you turn on the cell phone, the phone listens for a signal from a basestation on a control frequency, which is a special radio frequency used for communicating commands (rather than voice data) between the basestation and cell phone. If the phone finds no such signal, the phone reports a "No Service" error. If the phone finds the signal from basestation A, the phone then transmits its own identification (ID) number to basestation A. Every cell phone has its own unique ID number. (Actually, there is a nonvolatile memory card inside each phone that has that ID number; a phone user can potentially switch cards among phones, or have multiple cards for the same phone, switching cards to change phone numbers.) Basestation A communicates this ID number to the central switching office's computer, and thus the service provider computer database now records that your phone is in cell A. Your phone intermittently sends a control signal to remind the switching office of the phone's presence.
If somebody then calls your cell phone's number, the call may come in over the regular phone system, which goes to the switching office. The switching office computer database indicates that your phone is in cell A. In one type of cell phone technology, the switching office computer assigns a specific radio frequency supported by basestation A to the call. Actually, the computer assigns two frequencies, one for talking, one for listening, so that talking and listening can occur simultaneously on a cell phone; let's call that frequency pair a channel. The computer then tells your phone to carry out the call over the assigned channel, and your phone rings. Of course, it could happen that there are so many phones already involved with calls in cell A that basestation A has no available frequencies; in that case, the caller may hear a message indicating that user is unavailable.
Placing a call proceeds similarly, but your cell phone initiates the call, ultimately resulting in assigned radio frequencies again (or a "system busy" message if no frequencies are presently available).
Suppose that your phone is presently carrying out a call with basestation A, and that you are moving through cell A toward cell B in Figure 5.87. Basestation A will see your signal weakening, while basestation B will see your signal strengthening, and the two basestations transmit this information to the switching office. At some point the switching office computer will decide to switch your call from basestation A to basestation B. The computer assigns a new channel for the call in cell B (remember, adjacent cells use different sets of frequencies to avoid interference), and sends your phone a command (through basestation A, of course) to switch to a new channel. Your phone switches to the new channel and thus begins communicating with basestation B. Such switching may occur dozens of times while a car drives through a city during a phone call, and is transparent to the phone user. Sometimes the switching fails, perhaps if the new cell has no available frequencies, resulting in a "dropped" call.

Inside a Cell Phone

Basic Components

Figure 5.89 Inside a cell phone: (a) handset, (b) battery and ID card on left, keypad and display in center, digital circuitry on a printed-circuit board on right, (c) the two sides of the printed-circuit board, showing several digital chip packages mounted on the board.

A cell phone requires sophisticated digital circuitry to carry out calls. Figure 5.89 shows the insides of a typical basic cell phone. The printed-circuit boards include several chips implementing digital circuits. One of those circuits performs analog-to-digital conversion

of a voice (or olher sou nd) 10 a signal Slream of Os and 1s, and anolher performs digital- FIR ~~I'S ~e some examples of lhe versalilily of an FIR fi lter. Assume we have a 5.tap
lo-analoll conversion of a received digital strea m back (0 an analog signal. Some of the ter. or slaners, 10 Simply pass a signal lhrough lhe filter unchanged, we sel cO 10
I d
circui ls. -lypicall y so ft ware on a microprocessor. exeCUle lasks lhal manage lhe various I ,an hwe el cl=c2-- c3-- c4-0 " amp I'Ify an IOpUI
-. "'0 . .
SIgnal, we can sel cO 10 a number
fealures of lhe phone. such as lhe menu syslem. address book. games, eiC. NOle that any arger t an I, perhaps selling cO 10 2. To creale a moothin o fil ler thai OUlputs the averaoe
daw Ihal you save on your cell phone (e.g" an add ress book. cuslomi zed ring lones, game . Ipresent val ue and lh e pasl ~our IOpUI
of the . va lues. we can"SImply
. sel all the conSlants "10
high score information. elc .) will likely be slOred on a fl ash memory, whose nonvolalilily eq ul va enl valu ~s lhat add 10 I, namely, c!=c2=c3=c4=c5=0.2. The results of uch a filter
en~u res lhe data Slays saved in memory even if Ihe ballery dies or is removed. Anolher applied 10 a nOIsy IOpul Signal are shown in Figure 5.90. To smoolh and amplify. we can
imponanl lask involves responding 10 commands from lhe Swilching office. Anolher task sel all conSlalllS 10 equi val I h .
c!=c2=c3=c4= ' _ enl va ues I at add W omethlOg grealer than I. for example,
carried ou l by lhe digilal circu ils is fi ltering. One lype of filt ering removes the canier . c5-1, resultlOg 10 5x ampllfi callon. To creale a smoothing filter thai only
radio signal from lhe incoming radio freque ncy. Anolher lype of fillerin g removes noise IOciudes lhe prev ious lWO rather lhan four inpul values, we simp ly sel c3 and c4 10 O. We
fro m lhe digili zed audi o Slrea m coming from lhe microphone, before transmitting lhal see that we can build alilhe above different fillers j usl by changing the conSlanl values of
stream on the outgoing radi o frequency. Let' examine fi ltering in more delail. an FIR fi lter. The FIR fi ller is indeed quile versatile.
Filtering, and FIR Filters
Filtering is pe rhaps lhe moSI common task performed in digi lal signal processing. Digilal
1.5
1 - - - - - - - - -- - - - - - - - - - - - - ,
____ original
signal processing operales on a slream of digi lal dala lhal comes from digitizing an inpul
si!:mal. such as an audio. video, or radio signal. Such streams of data are found in count-
1---1Jli/ol!~IIi;;:.---------- ---...-
noisy
-+- fir_av9-out
le;s electronic devices. such as CD players. cell phones. hean monilors, ultrasound
machines, rad ios. engine conlrollers. eiC. Filterillg a dala slream is the lask of removing
panicular aspec ls of lhe inpul signal , and OUlpulling a new signal wilhout lhose aspecls.
A com mon fi llering goa l is 10 remove noise from a signal. You 've cenainly heard Ilil

noise in aud io signals-ii 's thal hissi ng sound lhal 's so annoying on your slereo, cell
phone . or cord less phone. You 've also likely adjusled a fi ller 10 reduce lhal noi se, when -{) .5 r------------S)~~-----_~~
you adjusled the "lreble" conlrol of your Slereo (lhough lhat fil ler may have been imple-
mented using analog methods rather than digital). Noise can appear in any type of signal, not just audio. Noise might come from an imperfect transmitting device, an imperfect listening device (e.g., a cheap microphone), background noise (e.g., freeway sounds coming into your cell phone), electrical interference from other electric appliances, etc. Noise typically appears in a signal as random jumps from a smooth signal.

Another common filtering goal is to remove a carrier frequency from a signal. A carrier frequency is a signal added to a main signal for the purpose of transmitting that main signal. For example, a radio station might broadcast a radio signal at 102.7 MHz. 102.7 MHz is the carrier frequency. The carrier signal may be a sine wave of a particular frequency (e.g., 102.7 MHz) that is added to the main signal, where the main signal is the music signal itself. A receiving device locks on to the carrier frequency, and then filters out the carrier signal, leaving the main signal.

An FIR filter (usually pronounced by saying the letters "F" "I" "R"), short for "Finite Impulse Response," is a very general filter design that can be used for a huge variety of filtering goals. The basic idea of an FIR filter is very simple: multiply the present input value by a constant, and add that result to the previous input value times a constant, and add that result to the next earlier input value times a constant, and so on. A designer using an FIR filter achieves a particular filtering goal simply by choosing the FIR filter's constants.

Mathematically, an FIR filter can be described as follows:

   y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) + c3*x(t-3) + c4*x(t-4)

t is the present time step, x is the input signal, and y is the output signal. Each term (e.g., c0*x(t)) is called a tap. So the above equation represents a 5-tap FIR filter.

Figure 5.90 Results of a 5-tap FIR filter with c0=c1=c2=c3=c4=0.2 applied to a noisy signal. The original signal is a sine wave. The noisy signal has random jumps. The FIR output (fir_avg_out) is much smoother than the noisy signal, approaching the original signal. Notice that the FIR output is slightly shifted to the right, meaning the output is slightly delayed in time (probably a tiny fraction of a second delayed). Such slight shifting is usually not important to a particular application.

That versatility extends even further. We can actually filter out a carrier frequency using an FIR filter, by setting the coefficients to different values, carefully chosen to filter out a particular frequency. Figure 5.91 shows a main signal, in1, that we want to transmit. We can add that to a carrier signal, in2, to obtain the composite signal, in_total. The signal in_total is the signal that a radio station would transmit, for example, with in1 being the signal of the music, and in2 the carrier frequency.

Now say a stereo receiver receives that composite signal, and needs to filter out the carrier signal, so the music signal can be sent to the stereo speakers. To determine how to filter out the carrier signal, look carefully at the samples (the small filled squares in Figure 5.91) of that carrier signal. Notice that the sampling rate is such that if we take any sample, and add it to a sample from three time steps back, we get 0. That's because for a positive point, three samples earlier was a negative point of the same magnitude. For a negative point, three samples earlier was a positive point of the same magnitude. And for a zero point, three samples earlier was also a zero point.
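As a rough illustration of the FIR equation above, the following Python sketch computes each output sample as a weighted sum of the present and earlier input samples. The function name fir, the choice to treat samples before time step 0 as 0, the short noisy test list, and the sine-wave carrier sampled six times per period are all assumptions made for this illustration; they are not taken from the book's circuit design.

```python
import math

def fir(x, coeffs):
    """Return y where y[t] = coeffs[0]*x[t] + coeffs[1]*x[t-1] + ...

    Samples before time step 0 are treated as 0 (an assumption of this sketch).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, c in enumerate(coeffs):   # one tap per coefficient
            if t - k >= 0:
                acc += c * x[t - k]
        y.append(acc)
    return y

# 5-tap filter with c0=c1=c2=c3=c4=0.2: each output is the average of
# the present sample and the four previous samples.
noisy = [1.0, 1.2, 0.9, 1.1, 0.8, 1.0]   # hypothetical noisy samples
smoothed = fir(noisy, [0.2] * 5)

# Hypothetical carrier: a sine wave sampled six times per period, so a sample
# and the sample three steps back have equal magnitude and opposite sign.
carrier = [math.sin(2 * math.pi * t / 6) for t in range(24)]
cancelled = fir(carrier, [1, 0, 0, 1])   # y(t) = x(t) + x(t-3)
```

Only the coefficient list differs between the two calls; the structure of the computation stays the same, which is what makes choosing constants the main design decision.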

Figure 5.91 Adding a main signal, in1, to a carrier signal, in2, resulting in a composite signal in_total.

Likewise, adding a carrier signal sample to a sample three steps later also adds to zero. So to filter out the carrier signal, we can add each sample to a sample three time steps back. Or we can add each sample to 1/2 times a sample three steps back, plus 1/2 times a sample three steps ahead. We can achieve this using a 7-tap FIR filter with the following seven coefficients: 0.5, 0, 0, 1, 0, 0, 0.5. Since that sums to 2, we can scale the coefficients to add to 1, as follows: 0.25, 0, 0, 0.5, 0, 0, 0.25. Applying such a 7-tap FIR filter to the composite signal results in the FIR output shown in Figure 5.92. The main signal is restored. We should point out that we chose the main signal such that this example would come out very nicely; other signals might not be restored so perfectly. But the example demonstrates the basic idea.

Figure 5.92 Filtering out the carrier signal using a 7-tap FIR filter with constants 0.25, 0, 0, 0.5, 0, 0, 0.25. The slight delay in the output signal typically poses no problem.

While 5-tap and 7-tap FIR filters can certainly be found in practice, many FIR filters may contain tens or hundreds of taps. FIR filters can certainly be implemented using software (and often are), but many applications require that the hundreds of multiplications and additions for every sample be executed faster than is possible in software, leading to custom digital circuit implementations. Example 5.8 illustrated the design of a circuit for an FIR filter.

Many types of filters exist other than FIR filters. Digital signal filtering is part of a larger field known as digital signal processing, or DSP. DSP has a rich mathematical foundation and is a field of study in itself. Advanced filtering methods are what make cell phone conversations as clear as they are today.

5.12 CHAPTER SUMMARY

In this chapter, we described (Section 5.1) that much digital design today involves designing processor-level components, and that design is done at what is called the register-transfer level (RTL). We introduced (Section 5.2) a four-step RTL design method for converting RTL behavior to a processor implementation, with that implementation consisting of a datapath controlled by a controller. The RTL design method made use of the datapath components defined in Chapter 4, and the controller design process defined in Chapter 3, which built on the combinational design process of Chapter 2. We provided several examples of RTL design (Section 5.3), while pointing out several pitfalls and good design practices, and discussing the characteristics of control- versus data-dominated designs. We discussed (Section 5.4) how to set a circuit's clock frequency based on the circuit's critical path. We demonstrated (Section 5.5) how a sequential program, like a C program, could conceptually be converted to gates using some straightforward transformations that transform the C into RTL behavior, which as we know can then be converted to gates using the four-step RTL design method. That demonstration should make it clear that a digital system's functionality can be implemented as either software on a microprocessor or as a custom digital circuit (or even as both). The differences among software and custom circuit implementations are not related to what each can implement; they can both implement any functionality. The differences are instead related to design metrics like system performance, power consumption, size, cost, design time, and so on. Modern digital designers must therefore be comfortable migrating functionality between software on a microprocessor and custom digital circuits, in order to obtain the best overall implementation with respect to constraints on design metrics. We introduced (Section 5.6) several memory components commonly used in RTL design, including RAM and ROM components. We also introduced (Section 5.7) a queue component that can be useful during RTL design. We took a moment to discuss (Section 5.8) a general technique that we've been using throughout the book, hierarchy, which helps a designer to manage complexity.

In Chapters 1 through 5, we have emphasized straightforward design methods for increasingly complex systems, but we have not emphasized how to design those systems well. Improving on our designs will be the focus of the next chapter.

5.13 EXERCISES

Any problems noted with an asterisk (*) represent especially challenging problems.

SECTION 5.2: RTL DESIGN METHOD

5.1 (a) Create a high-level state machine that describes the following system behavior. The system has an 8-bit input A, a single-bit input d, and a 32-bit output S. On every clock cycle, if d=1, the system should add A to a running sum and output that sum on S. If d=0, the system should instead subtract. Ignore issues of overflow and underflow. Don't forget to include an initialization state. Hint: Declare and use an internal register to keep the sum.
    (b) Add a 1-bit input rst to the system. When rst=1, the system should clear its sum back to 0.

5.2 Create a high-level state machine for a simple data encryption/decryption device. If a bit-input b is 1, the device stores the data from a 32-bit input I as what is known as an offset value. If b is 0 and another bit-input e is 1, then the device "encrypts" its input I by adding the stored offset value to I, and outputs this encrypted value over a 32-bit output J. If instead another

bit-input d is 1, the device should "decrypt" the data on I by subtracting the offset value before outputting the decrypted value over J. Be sure to explicitly handle all possible combinations of the three input bits.

5.3 Create a high-level state machine for a digital bath-water controller. The system has a 3-bit input ratio indicating the desired ratio of cold water to hot water, and a bit input on indicating that the water should flow. The system has two 4-bit outputs hflow and cflow, controlling the hot water flow rate and the cold water flow rate. The sum of these two rates should always equal 16. Your high-level state machine should determine the output values for hflow and cflow such that the ratio of hot water to cold water is as close as possible to the desired ratio, while the total flow is always 16. Hint: As there are only 8 possible ratios, a reasonable solution may use one state for each ratio.

5.4 Create a high-level state machine that initializes a 16x32 register file's contents to all 0s, beginning the initialization when an input rst is 1.

5.5 (a) Create a high-level state machine that adds each register of one 128x8 register file to the corresponding registers of another 128x8 register file, storing the results in a third 128x8 register file. The system should only begin the addition when a bit-input add is 1, and should not perform the addition again until it has finished adding (only adding again if add is 1).
    (b) Extend this system to either add or subtract, using an additional bit-input op, where op=1 means add, and op=0 means subtract.

5.6 Design a high-level state machine for a 4-bit up-counter with count control input cnt, count clear input clr, and a terminal count output tc. Use the RTL design method of Table 5.1 to convert the high-level state machine to a controller and a datapath. Use a register and incrementer in the datapath, not a counter itself. Design the controller down to a state register and logic gates.

5.7 Compare the up-counter designed in Exercise 5.6 with the up-counter design shown in Figure 4.48.

5.8 Create a datapath for the high-level state machine in Figure 5.93.

Figure 5.93 Sample high-level state machine.
   Inputs: A, B, C (16 bits); go, rst (bit)
   Outputs: S (16 bits)
   Local registers: sum
   Transition conditions shown in the diagram: sum<5096, (sum<5096)'

5.9 *Starting with the soda machine dispenser design described in Example 5.1, create a block diagram and high-level state machine for a soda machine dispenser that has a choice of two soda types, and that also provides change to the consumer. A coin detector provides the circuit with a 1-bit input c that becomes 1 for one clock cycle when a coin is detected, and an 8-bit input a indicating the coin's value in cents. Two 8-bit inputs s1 and s2 indicate the cost of the two soda choices. The user's soda selection is controlled by two buttons b1 and b2 that when pushed will output 1 for one clock cycle. If the user has inserted enough change for their selection, the circuit should set either output bit d1 or d2 to 1 for one clock cycle, causing the selected soda to be dispensed. The soda dispenser circuit should also set an output bit cr to 1 for one clock cycle if change is required, and should output the amount of change required using an 8-bit output ca. Use the RTL design method shown in Table 5.1 to convert the high-level state machine to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only, as was done in Figure 5.26.

5.10 (a) Use the RTL design method of Table 5.1 to convert the high-level state machine in Figure 5.94 to a controller and a datapath. Design the datapath to structure, but design the controller to an FSM only, as was done in Figure 5.26.
     (b) *Design the controller's FSM down to structure.

Figure 5.94 High-level state machine of bus interface with bus wait signal.
   Inputs: start (bit), data (8 bits), addr (8 bits), w_wait (bit)
   Outputs: w_data (8 bits), w_addr (8 bits), w_wr (bit)
   Actions shown in the diagram: w_wr=1, w_addr=addr

5.11 Create an FSM that interfaces with the datapath in Figure 5.95. The FSM should use the datapath to compute the average value of the 16 32-bit elements of any array A. Array A is stored in a memory, with the first element at address 25, the second at address 26, and so on. Assume that putting a new value onto the address lines M_addr causes the memory to almost immediately output the read data on the M_data lines. Ignore the possibility of overflow.

Figure 5.95 Datapath for computing the average of 16 elements of an array.

5.12 Using the RTL design method shown in Table 5.1, create an RTL design of a reaction timer circuit that measures the time elapsed between the illumination of a light and the pressing of a button by a user. The reaction timer has three inputs, a clock input clk, a reset input rst, and a button input B, and three outputs, a light enable output len, a 10-bit reaction time output rtime, and a slow output indicating the user was not fast enough. The reaction timer works as follows. On reset, the reaction timer waits for 10 seconds before illuminating the light by setting len to 1. The reaction timer then measures the length of time in milliseconds before the user presses the button B, outputting the time as a 12-bit binary number on rtime. If the user did not press the button within 2 seconds (2000 milliseconds), the reaction timer will set the output slow to 1 and output 2000 on rtime. Assume your clock input has a frequency of 1 kHz. Hint: This is a control-dominated RTL design problem. Design the datapath to structure, but design the controller to an FSM only, as was done in Figure 5.26.

5.13 Use the RTL design method shown in Table 5.1 to convert the high-level state machine in Figure 5.74 to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only, as was done in Figure 5.26.

SECTION 5.3: RTL DESIGN EXAMPLES AND ISSUES

For the following problems, design the datapath to structure, but design the controller to an FSM only, as done in Figure 5.26.

5.14 Using the RTL design method shown in Table 5.1, create an RTL design that computes the sum of all positive numbers within a 512-word register file A consisting of 32-bit numbers stored in two's complement form.

5.15 Using the RTL design method shown in Table 5.1, create an RTL design that computes the sum of all positive numbers from a set of 16 separate 32-bit registers storing numbers in two's complement form. Make the design as fast as possible by performing as many computations concurrently as possible. Hint: This is a data-dominated design.

5.16 Using the RTL design method shown in Table 5.1, create an RTL design that outputs the maximum value found within a register file A consisting of 64 32-bit numbers.

5.17 Using the RTL design method shown in Table 5.1, create an RTL design that outputs a warning signal whenever the average temperature over the past four samples exceeds a user-defined value. The circuit has a 32-bit input CT indicating the current temperature reading, a 32-bit input WT indicating the user-specified temperature at which the warning should be enabled, and a button input clr that will disable the warning. When the average temperature exceeds the user-specified warning level, the circuit should assert the output W to enable the warning. The warning output should remain high until the clr button is pressed. Hint: You can use a right shift to implement the divide within your datapath.

5.18 Using the RTL design method shown in Table 5.1, create an RTL design for a digital filter that outputs the average of the current 32-bit input and the previous 32-bit sample. Hint: You can use a right shift to implement the divide within your datapath.

SECTION 5.4: DETERMINING CLOCK FREQUENCY

5.19 Assuming an inverter has a delay of 1 ns, all other gates have a delay of 2 ns, and wires have a delay of 1 ns, determine the critical path for the full-adder circuit shown in Figure 4.31.

5.20 Assuming an inverter has a delay of 1 ns, all other gates have a delay of 2 ns, and wires have a delay of 1 ns, determine the critical path for the 3x8 decoder of Figure 2.50.

5.21 Assuming an inverter has a delay of 1 ns, all other gates have a delay of 2 ns, and wires have a delay of 1 ns, determine the critical path for a 4x1 multiplexer.

5.22 Assuming an inverter has a delay of 1 ns, and all other gates have a delay of 2 ns, determine the critical path for an 8-bit carry-ripple adder:
     (a) assuming wires have no delay.
     (b) assuming wires have a delay of 1 ns.

5.23 (a) Convert the laser-based distance measurer's FSM, shown in Figure 5.21, to a state register and logic.
     (b) Assuming all gates have a delay of 2 ns and the 16-bit up-counter has a delay of 5 ns, and wires have no delay, determine the critical path for the laser-based distance measurer.
     (c) Calculate the corresponding maximum clock frequency for the circuit.

SECTION 5.5: BEHAVIORAL-LEVEL DESIGN: C TO GATES (OPTIONAL)

5.24 Convert the following C-like code, which calculates the greatest common divisor (GCD) of the two 8-bit numbers a and b, into a high-level state machine.

   Inputs: byte a, byte b, bit go
   Outputs: byte gcd, bit done
   GCD:
   while(1) {
      while(!go);
      done = 0;
      while ( a != b ) {
         if( a > b ) {
            a = a - b;
         }
         else {
            b = b - a;
         }
      }
      gcd = a;
      done = 1;
   }

5.25 Use the RTL design method shown in Table 5.1 to convert the high-level state machine you created in Exercise 5.24 to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.

5.26 Convert the following C-like code, which calculates the maximum difference between any two numbers within an array A consisting of 256 8-bit values, into a high-level state machine.

   Inputs: byte a[256], bit go
   Outputs: byte max_diff, bit done
   MAX_DIFF:
   while(1) {
      while(!go);
      done = 0;
      i = 0;
      max = 0;
      min = 255; // largest 8-bit value
      while( i < 256 ) {
         if( a[i] < min ) {
            min = a[i];
         }
         if( a[i] > max ) {
            max = a[i];
         }
         i = i + 1;
      }
      max_diff = max - min;
      done = 1;
   }

5.27 Use the RTL design method shown in Table 5.1 to convert the high-level state machine you created in Exercise 5.26 to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.

5.28 Convert the following C-like code, which calculates the number of times the value b is found within an array A consisting of 256 8-bit values, into a high-level state machine.

   Inputs: byte a[256], byte b, bit go
   Outputs: byte freq, bit done
   FREQUENCY:
   while(1) {
      while(!go);
      done = 0;
      i = 0;
      freq = 0;
      while ( i < 256 ) {
         if ( a[i] == b ) {
            freq = freq + 1;
         }
         i = i + 1;
      }
      done = 1;
   }

5.29 Use the RTL design method shown in Table 5.1 to convert the high-level state machine you created in Exercise 5.28 to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.

5.30 Develop a template for converting a do{ }while loop of the following form to a high-level state machine.

   do {
      // do while statements
   } while (cond);

5.31 *Convert the while ( a != b ) loop within the C code description of Exercise 5.24 into a do{ }while loop as described in Exercise 5.30. Using the do{ }while loop template you created in Exercise 5.30, convert the revised C code into a high-level state machine. Use the RTL design method shown in Table 5.1 to convert the high-level state machine you created in the previous problem to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.

5.32 Develop a template for converting a for() loop of the following form to a high-level state machine.

   for(i=start; i<cond; i++)
   {
      // for statements
   }

5.33 *Convert the while ( a != b ) loop within the C code description of Exercise 5.24 to a for() loop as described in Exercise 5.32. Using the for() loop template you created in Exercise 5.32, convert the revised C code into a high-level state machine. Use the RTL design method shown in Table 5.1 to convert the high-level state machine you created in the previous problem to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.

5.34 *Convert the while ( i < 256 ) loop within the C code description of Exercise 5.26 to a for() loop as described in Exercise 5.32. Using the for() loop template you created in Exercise 5.32, convert the revised C-like code into a high-level state machine. Use the RTL design method shown in Table 5.1 to convert the high-level state machine you created in the previous problem to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.

5.35 Compare the time required to execute the following computation using a custom circuit versus using software. Assume a gate has a delay of 1 ns. Assume a microprocessor executes one instruction every 5 ns. Assume that n=10 and m=5. Estimates are acceptable; you need not design the circuit, or determine exactly how many software instructions will execute.

   for (i = 0; i < n; i++) {
      s = 0;
      for (j = 0; j < m; j++)
         s = s + c[i]*x[i + j];
      y[i] = s;
   }

SECTION 5.6: MEMORY COMPONENTS

5.36 Calculate the approximate number of DRAM bit storage cells that will fit on an IC with a capacity of 10 million transistors.

5.37 Calculate the approximate number of SRAM bit storage cells that will fit on an IC with a capacity of 10 million transistors.

5.38 Summarize the main differences between DRAM and SRAM memories.

5.39 Draw a complete logic internal structure for a 4x2 DRAM (four words, 2 bits each), clearly labeling all internal components and connections.

5.40 Draw a complete logic internal structure for a 4x2 SRAM (four words, 2 bits each), clearly labeling all internal components and connections.

5.41 *Design an SRAM memory cell with a reset input that when enabled will set the memory cell's contents to 0.

SECTION: READ-ONLY MEMORY (ROM)

5.42 Summarize the main differences between EPROM and EEPROM memories.

5.43 Summarize the main differences between EEPROM and flash memories.

SECTION 5.7: QUEUES (FIFOS)

5.44 For an 8-word queue, show the queue's internal state and provide the value of popped data for the following sequences of pushes and pops: (1) push A, B, C, D, E, (2) pop, (3) pop, (4) push U, V, W, X, Y, (5) pop, (6) push Z, (7) pop, (8) pop, (9) pop.

5.45 Create an FSM describing the queue controller of Figure 5.78. Pay careful attention to correctly setting the full and empty outputs.

5.46 Create an FSM describing the queue controller of Figure 5.78, but with error-preventing behavior that ignores any pushes when the queue is full, and ignores pops of an empty queue (outputting 0).

SECTION 5.8: HIERARCHY - A KEY DESIGN CONCEPT

5.47 Compose a 20-input AND gate from 2-input AND gates.

5.48 Compose a 16x1 mux from 2x1 muxes.

5.49 Compose a 4x16 decoder with enable from 2x4 decoders with enable.

5.50 Compose a 1024x8 RAM using only 512x8 RAMs.

5.51 Compose a 512x8 RAM using only 512x4 RAMs.

5.52 Compose a 1024x8 ROM using only 512x4 ROMs.

5.53 Compose a 2048x8 ROM using only 256x8 ROMs.

5.54 Compose a 1024x16 RAM using only 512x8 RAMs.

5.55 Compose a 1024x12 RAM using 512x8 and 512x4 RAMs.

5.56 Compose a 640x12 RAM using only 128x4 RAMs.

5.57 *Write a program that takes a parameter N, and automatically builds an N-input AND gate from 2-input AND gates. Your program merely need indicate how many 2-input AND gates exist in each level, from which we could easily determine the connections.

Chi-Kai started college as an engineering major, and became a Computer Science major due to his developing interests in algorithms and in networks. After graduating, he worked for a Silicon Valley startup company that made chips for computer networking. His first task was to help simulate those chips before the chips were built. For over 10 years now, he has worked on multiple generations of networking devices that buffer, schedule, and switch ATM network cells and Internet Protocol packets. "The chips required to implement networking devices are complex components that must all work together almost perfectly to provide the building blocks of telecommunication and data networks. Each generation of devices becomes successively more complex."

When asked what skills are necessary for his job, Chi-Kai says "More and more, breadth of one's skill set matters more than depth. Being an effective chip engineer requires the ability to understand chip architecture (the big picture), to design logic, to verify logic, and to bring up the silicon in the lab. All these parts of the design cycle interplay more and more. To be truly effective at one particular area requires hands-on knowledge of the others as well. Also, each requires very different skills. For example, verification requires good software programming ability, while bring up requires knowing how to use a logic analyzer - good hardware skills."

High-end chips, like those involved in networking, are quite costly, and require careful design. "The software design process and the chip design process are fundamentally different. Software can afford to have bugs because patches can be applied. Silicon is a different story. The one time expenses to spin a chip are on the order of $500,000. If there is a show-stopping bug, you may need to spend another $500,000. This constraint means the verification approach taken is quite different - effectively: there can be no bugs." At the same time, these chips must be designed quickly to beat competitors to the market, making the job "extremely challenging and exciting."

One of the biggest surprises Chi-Kai encountered in his job is the "incredible importance of good communication skills." Chi-Kai has worked in teams ranging from 10 people to 30 people, and some chips require teams of over 100 people. "Technically outstanding engineers are useless unless they know how to collaborate with others and disseminate their knowledge. Chips are only getting more complex - individual blocks of code in a given chip have the same complexity as an entire chip only a few years ago. To architect, design, and implement logic in hardware requires the ability to convey complexity." Furthermore, Chi-Kai points out that "just like any social entity, there are politics involved. For example, people are worried about aspiration for promotion, financial gain, and job security. In this greater context, the team still must work together to deliver a chip." So, contrary to the conceptions many people have of engineers, engineers must have excellent people skills, in addition to strong technical skills. Engineering is a social discipline.

-- - - - ------------
6.1 Introduction 295

A tradeoff
gate-delays, as shown in Figure 6.2(c). Which circuit is bener. that for Gl or for G2? The
impro l'es some

6 criteria at the answer depends on whether the size or delay criteria is more imponant to us. When we

20L:
expellJe of (Jlher improve one criteria at the expense of another criteria of interest to us. we have per-
criteria oj imerest formed a tradeoff.
101lJ. A"
oplimiznlioll
improlres (II/ 14 transistors 12 transistors
criteria of illlereJI :grgate-delays :~3gate-delays '§' 15 eG l
Optimizations and Tradeoffs to liS, or improves
.wme of rhoJe
crirerioll'ir/wllr w
Gl y
z - _ _ _--l
G2
III ~
.~ '!?? 10
eG2

U'orJellillg ril e
y ~ 5
arhers,
z
1 2 3 4
G1 =wx+wy + z G2 = w(x+y} + z delay (gate-delays)
(a) (b) (e)
6.1 INTRODUCTION Figure 6.2 A circui t transformation that improves size bUl worsens de lay. lhal is. a Iradeoff:
<a) origi nal circuit. (b) transformed circuit. (c) plot of size and delay o f each circuit.
The previous chapters descri bed how to design digita l circui ts using straightforward tech-
niques. Thi s chapter will describe how to design belle,- circuits. For our purposes, beller You likely perform optimi zations and tradeoffs every day. Perhaps you regularly
means circuits that are smaller. faster. or consume less power. Real world design may commute by car from one city to another via a particular route. You might be interested in

2°L
involve additional criteria. two cri teria: com mute time and safety. Other criteria. such as scenery along the route.
may not be of interest to you. If you choose a new route that improves both commute
16 transistors 4 transistors time and safety. you have optimized you r commute. If you instead choo e a route that
'il' = D l g ate-delays 1 gate-delay ~15 e Fl improves safety at the expense of increased commute time, you have made a tradeoff (and
perhaps a wise one at that).
y- U Fl W- D F2 .~ .~ 10
x- Figure 6.3 illustrates optimi zations
r

;~: ,~, l~ :~
"' c
'il'::f'\
y=-LJ '"
:::. 5 eF2 versus tradeoffs for three different
staning designs, with the criteria of
F1 = wxy + wxy' F2 = wx 1 2 3 4
delay (gate·delays) delay and size, smaller being beller for
(a) (b) (e) each criteria. Obviously, we prefer opti-
mi zations over tradeoffs, since
Figure 6.1 A circuit tran sformalion that improves both size and delay. (hal is, an optimiza tion: optimizations improve both criteria (or
(a) original circuit, (b) optimized circuit, (c) plot of size and delay of each circuit.

Consider the circuit for the equation involving F1 shown in Figure 6.1(a). The circuit's size, assuming two transistors per gate input (and ignoring inverters for simplicity), is 8 * 2 = 16 transistors. The circuit's delay, which is the longest path from any input to the output, is two gate-delays. We could algebraically transform the equation into that for F2, shown in Figure 6.1(b). F2 represents the same function as F1, but requires only four transistors (instead of 16) and has a delay of only one gate-delay (instead of two). The transformation improved both size and delay, as shown in Figure 6.1(c). When we perform transformations that improve all criteria of interest to us, we have performed an optimization.

Now consider the circuit for a different function, implementing the equation for G1 in Figure 6.2(a). The circuit's size (assuming 2 transistors per gate input) is 14 transistors and the circuit's delay is two gate-delays. We could algebraically transform the equation into that shown for G2 in Figure 6.2(b), which results in a circuit having only 12 transistors. However, the reduction in transistors comes at the expense of a longer delay of three

at least improve one criterion without worsening another criterion, as shown by the horizontal and vertical arrows on the left side of the figure). But we can't always improve one criterion without worsening another criterion. For example, if a car designer wants to improve a car's fuel efficiency, the designer may have to make the car smaller: a tradeoff among the criteria of fuel efficiency and comfort.

Figure 6.3 (a) Optimizations, versus (b) tradeoffs.

Some general criteria commonly of interest to digital system designers include:
Performance: a measure of execution time for a computation on the system.
Size: a measure of the number of transistors, or silicon area, of a digital system.
Power: a measure of the energy consumed per second of a system, directly relating to both the heat generated by the system and to the battery energy consumed by computations.
Dozens of other criteria exist.
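The distinction between an optimization and a tradeoff can be stated as a small comparison over criteria values. The sketch below is only an illustration: the function name `classify` and the dictionary layout are assumptions, not something the text defines, and every criterion is treated as "lower is better". The numbers are the size and delay values quoted above for F1/F2 and G1/G2.

```python
def classify(before, after):
    """Compare two designs given as {criterion: value} dicts (lower is better).

    Hypothetical helper: returns "optimization" when at least one criterion
    improves and none worsens, "tradeoff" when some improve and some worsen.
    """
    improved = [c for c in before if after[c] < before[c]]
    worsened = [c for c in before if after[c] > before[c]]
    if improved and not worsened:
        return "optimization"
    if improved and worsened:
        return "tradeoff"
    return "neither"


# Size in transistors, delay in gate-delays, as quoted in the text.
f1 = {"size": 16, "delay": 2}
f2 = {"size": 4, "delay": 1}
g1 = {"size": 14, "delay": 2}
g2 = {"size": 12, "delay": 3}

print(classify(f1, f2))  # F1 -> F2 improves both size and delay
print(classify(g1, g2))  # G1 -> G2 shrinks the circuit but lengthens the delay
```

Criteria where larger is better would need their values negated before being passed in.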
296 Optimizations and Tradeoffs 6.2 Combinatio nal Logic Optimizations and Tradeoffs 297

Optimizations and tradeoffs can be made throughout nearly all stages of digital design. This chapter describes some common optimizations and tradeoffs for some common criteria, at various stages of digital design.

6.2 COMBINATIONAL LOGIC OPTIMIZATIONS AND TRADEOFFS
In Chapter 2, we described how to design combinational logic, namely, how to convert desired combinational behavior into a circuit of gates. There are optimization and tradeoff methods we can apply to make those circuits better.

Two-Level Size Optimization Using Algebraic Methods
Implementing a Boolean function using only two levels of gates (a level of AND gates followed by one OR gate) usually results in a circuit having minimum delay. Recall from Chapter 2 that any Boolean equation can be written in sum-of-products form, simply by "multiplying out" the equation; for example, xy(w+z) = xyw + xyz. Thus, any Boolean function can be implemented using two levels of gates, simply by converting its equation to sum-of-products form and then using AND gates for the products followed by an OR gate for the sum.

A popular optimization is to minimize the number of transistors of a two-level logic circuit implementation of a Boolean function. Such optimization is traditionally called two-level logic optimization, or sometimes two-level logic minimization. We'll refer to it as two-level logic size optimization, to distinguish such optimization from the increasingly popular optimizations of performance and power, as well as other possible optimizations.

In the 1970s/1980s, when transistors were costly (e.g., cents each), minimization meant size minimization, which dominated digital design. Today's cheaper transistors (e.g., 0.0001 cents each) make optimizations of other criteria equally or more critical.

To optimize size, we need a method to determine the number of transistors for a given circuit. We'll use a simple method for determining the number of transistors:
We'll assume every logic gate input requires two transistors. So a 3-input logic gate (whether an AND, OR, NAND, or NOR) would require 3 * 2 = 6 transistors. The circuits inside logic gates shown in Section 2.4 should clarify why we assume two transistors per gate input.
We'll ignore inverters when determining the number of transistors, for simplicity.

We can view the problem of two-level logic size optimization algebraically as the problem of minimizing the number of literals and terms of a Boolean equation that is in sum-of-products form. The reason we can view the problem algebraically is because, recall from Section 2.4, we can translate a sum-of-products Boolean equation directly to a circuit using a level of AND gates followed by an OR gate. For example, the equation F = wxy + wxy' from Figure 6.1(a) has six literals, w, x, y, w, x, and y', and two terms, wxy and wxy', for a total of 6 + 2 = 8 literals and terms. Each literal and each term translates approximately to a gate input in a circuit, as shown in Figure 6.1(a): the literals translate to AND gate inputs, and the terms to OR gate inputs. The circuit thus has 3 + 3 + 2 = 8 gate inputs. With two transistors per gate input, the circuit has 8 * 2 = 16 transistors. We can minimize the number of literals and terms algebraically: F = wxy + wxy' = wx(y+y') = wx, which has only two literals, w and x, resulting in 2 gate inputs, or 2 * 2 = 4 transistors, as shown in Figure 6.1(b). (Note that a one-term equation doesn't require an OR gate.)

EXAMPLE 6.1 Two-level logic size optimization using algebraic methods
Minimize the number of literals and terms in a two-level implementation of the equation:
F = xyz + xyz' + x'y'z' + x'y'z
Let's minimize using algebraic transformations:
F = xy(z + z') + x'y'(z + z')
F = xy*1 + x'y'*1
F = xy + x'y'
There doesn't seem to be any further minimization we can perform. Thus, we've reduced the circuit from 12 literals and 4 terms (meaning 12 + 4 = 16 gate inputs, or 32 transistors), down to only 4 literals and 2 terms (meaning 4 + 2 = 6 gate inputs, or 12 transistors).

The previous example showed the most common algebraic transformation used to simplify a Boolean equation in sum-of-products form, a transformation that generally can be written as:
ab + ab' = a(b+b') = a*1 = a
Let's call this transformation combining terms to eliminate a variable. More formally, this transformation is known as the uniting theorem. In the previous example, we applied this transformation twice, once with xy being a and z being b, and a second time with x'y' being a and z being b.

Sometimes we need to duplicate a term in order to increase opportunities for combining terms to eliminate a variable, as illustrated in the next example.

EXAMPLE 6.2 Reusing a term during two-level logic size optimization
Minimize the number of literals and terms in a two-level implementation of the equation:
F = x'y'z' + x'y'z + x'yz
You might notice two opportunities to combine terms to eliminate a variable:
1: x'y'z' + x'y'z = x'y'
2: x'y'z + x'yz = x'z
Notice that the term x'y'z appears in both opportunities, but that term only appears once in the original equation. We'll therefore first replicate the term in the original equation (such replication doesn't change the function, because a = a + a) so that we can use the term twice when combining terms to eliminate a variable, as follows:
F = x'y'z' + x'y'z + x'yz
F = x'y'z' + x'y'z + x'y'z + x'yz
F = x'y'(z+z') + x'z(y'+y)
F = x'y' + x'z

After we have combined terms to eliminate a variable, the resulting term might also be combinable with other terms to eliminate a variable, as shown in the following example.
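Before that example, the results of Examples 6.1 and 6.2 can be double-checked mechanically. The sketch below is an illustration only: the helper names (`parse`, `same_function`, `transistors`) and the string format for equations are assumptions of this sketch, and single-letter variable names are assumed. It compares truth tables by brute force and applies the size estimate described above (two transistors per gate input, inverters ignored, no AND gate for a one-literal term, no OR gate for a one-term equation).

```python
from itertools import product


def parse(expr):
    """Split a sum-of-products string such as "xy + x'y'" into terms.

    Each term becomes a list of (variable, polarity) pairs, where polarity
    is False for a complemented literal such as x'.
    """
    terms = []
    for chunk in expr.replace(" ", "").split("+"):
        literals, i = [], 0
        while i < len(chunk):
            negated = i + 1 < len(chunk) and chunk[i + 1] == "'"
            literals.append((chunk[i], not negated))
            i += 2 if negated else 1
        terms.append(literals)
    return terms


def evaluate(terms, env):
    """OR of ANDs: true if every literal of some term matches env."""
    return any(all(env[v] == pol for v, pol in term) for term in terms)


def same_function(a, b, variables):
    """Brute-force truth-table comparison of two sum-of-products strings."""
    ta, tb = parse(a), parse(b)
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if evaluate(ta, env) != evaluate(tb, env):
            return False
    return True


def transistors(expr):
    """Size estimate: 2 transistors per AND-gate and OR-gate input."""
    terms = parse(expr)
    and_inputs = sum(len(t) for t in terms if len(t) > 1)
    or_inputs = len(terms) if len(terms) > 1 else 0
    return 2 * (and_inputs + or_inputs)


# Example 6.1: same function, 32 transistors reduced to 12.
print(same_function("xyz + xyz' + x'y'z' + x'y'z", "xy + x'y'", "xyz"))
print(transistors("xyz + xyz' + x'y'z' + x'y'z"), transistors("xy + x'y'"))

# Example 6.2: duplicating x'y'z does not change the function.
print(same_function("x'y'z' + x'y'z + x'yz", "x'y' + x'z", "xyz"))
```

A brute-force comparison only confirms that two equations describe the same function; it says nothing about whether the smaller one is minimal.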
EXAMPLE 6.3 Repeatedly combining terms to eliminate a variable
Minimize the number of literals and terms in a two-level implementation of the equation:
G = xy'z' + xy'z + xyz + xyz'
We can combine the first two terms to eliminate a variable, and the last two terms also:
G = xy'(z'+z) + xy(z+z')
G = xy' + xy
We can combine the two remaining terms to eliminate a variable:
G = xy' + xy
G = x(y'+y)
G = x

In the previous examples, how did we "see" the opportunities to combine terms to eliminate a variable? The examples' original equations happened to be written in a way that made seeing the opportunities easy: terms that could be combined were side-by-side. Suppose instead the equation in Example 6.1 had been written as:
F = x'y'z + xyz + xyz' + x'y'z'
That's the same function, but the terms appear in a different order. We might see that the middle two terms can be combined:
F = x'y'z + xyz + xyz' + x'y'z'
F = x'y'z + xy(z+z') + x'y'z'
F = x'y'z + xy + x'y'z'
But then we might not see that the left and right terms can be combined. We therefore might stop minimizing, thinking that we had obtained a fully minimized equation.

There is a visual method to help us see opportunities to combine terms to eliminate a variable, a method we now describe.

A Visual Method for Two-Level Size Optimization: K-Maps
Karnaugh Maps, or K-maps for short, are a visual method intended to assist humans to algebraically minimize Boolean equations having a few (two to four) variables. They actually are not commonly used any longer in design practice, but nevertheless, they are a very effective means for understanding the basic optimization methods underlying today's automated tools. A K-map is essentially a graphical representation of a truth table, meaning a K-map is yet another way to represent a function (the other ways including an equation, truth table, and circuit). The idea underlying a K-map is to graphically place minterms adjacent to one another if those minterms differ in one variable only, so that we can actually "see" the opportunity for combining terms to eliminate a variable.

Three-Variable K-Maps
Figure 6.4 shows a K-map for the equation:
F = x'y'z + xyz + xyz' + x'y'z'
which is the equation from Example 6.1 but with terms appearing in a different order. The map has eight cells, one for each possible combination of variable values. Let's examine the cells in the top row. The upper-left cell corresponds to xyz=000, meaning x'y'z'. The next cell to the right corresponds to xyz=001, meaning x'y'z. The next cell to the right corresponds to xyz=011, meaning x'yz. And the rightmost top cell corresponds to xyz=010, meaning x'yz'. Notice that the ordering of those top cells is not in increasing binary order. Instead, the order is 000, 001, 011, 010, rather than 000, 001, 010, 011. The ordering is such that adjacent cells differ in exactly one variable. For example, the cells for x'y'z (001) and x'yz (011) are adjacent, and differ in exactly one variable, namely, y. Likewise, the cells for x'y'z' and xy'z' are adjacent, and differ only in variable x. The map is also assumed to have its left and right edges adjacent, so the rightmost top cell (010) is adjacent to the leftmost top cell (000); note those cells too differ in exactly one variable. Adjacent means abutted either horizontally or vertically, but not diagonally, because diagonal cells differ in more than one variable. Adjacent bottom row cells also differ in exactly one variable. And cells in a column also differ in exactly one variable.

Figure 6.4 Three-variable K-map.

In a K-map, adjacent cells differ in exactly one variable.

We can represent a Boolean function as a K-map by placing 1s in the cells corresponding to the function's minterms. So for the equation F above, we place a 1 in cells corresponding to minterms x'y'z, xyz, xyz', and x'y'z', as shown in Figure 6.4. We place 0s in the remaining cells. Notice that a K-map is just another representation of a truth table. Rather than showing the output for every possible combination of inputs using a table, a K-map uses a graphical map. Therefore, a K-map is yet another representation of a Boolean function, and in fact is another standard representation.

The usefulness of a K-map for size minimization is that, because the map is designed such that adjacent cells differ in exactly one variable, then we know that two adjacent 1s in a K-map indicate that we can combine the two minterms to eliminate a variable. In other words, a K-map lets us easily see when we can combine two terms to eliminate a variable. We indicate such combining by drawing a circle around two adjacent 1s, and then we show the resulting term after the differing variable is removed. We illustrate in the following example.

K-maps enable us to see opportunities to combine terms to eliminate a variable.

EXAMPLE 6.4 Two-level logic size optimization using a K-map
Minimize the number of literals and terms in a two-level implementation of the equation:
F = xyz + xyz' + x'y'z' + x'y'z
Figure 6.5 Minimizing a three-variable function using a K-map.
Note that this is the same equation as in Example 6.1. We create a K-map representing the function, shown in Figure 6.5. We see adjacent 1s at the upper left of the map, so we circle those 1s to yield the term x'y'; in other words, the circle is a shorthand notation for x'y'z' + x'y'z

= x'y'. Likewise, we see adjacent 1s at the bottom right of the map, so we draw a circle representing xyz + xyz' = xy. Thus, F = x'y' + xy.

Recall from Example 6.3 that sometimes terms can be repeatedly combined to eliminate a variable, resulting in even fewer terms and literals. We can redo that example using a different order of simplifications as follows:
G = xy'z' + xy'z + xyz + xyz'
G = x(y'z' + y'z + yz + yz')
G = x(y'(z'+z) + y(z+z'))
G = x(y'+y)
G = x
Notice that the second line above ANDs x with the OR of all possible combinations of variables y and z. Obviously, one of those combinations of y and z will be true for any values of y and z, and thus the subexpression in parentheses will always evaluate to 1, as we algebraically affirmed in the latter lines above.

K-maps also help us graphically see this situation. In addition to helping us see when we can combine two minterms to eliminate a variable, K-maps give us a graphical way to see when we can combine four minterms to eliminate two variables. We merely need to look for four adjacent cells, where the cells form either a rectangle or a square (but not a shape like an "L"). Those four cells will have one variable the same, and all possible combinations of the other two variables. Figure 6.6 shows the earlier function G as a three-variable K-map. The map has four adjacent 1s in the bottom row. The four minterms corresponding to those 1s are xy'z', xy'z, xyz, and xyz'; note that x is the same in all four minterms, while all four combinations of y and z appear in those minterms. We draw a circle around the bottom four 1s to represent the simplification of G shown in the equations above. The result is G = x. In other words, the circle is a shorthand notation for the algebraic simplification of G shown in the five equations above.

Figure 6.6 Four adjacent 1s.

Always draw the largest circles possible to cover the 1s in a K-map.

Note that we could have drawn circles around the left two 1s and the right two 1s of the K-map, as shown in Figure 6.7, resulting in G = xy' + xy. Clearly, G can be further simplified to x(y'+y) = x. Thus, we should always draw the biggest circle possible, in order to best minimize the equation.

Figure 6.7 Nonoptimal circles.

As another example of four adjacent 1s, consider the equation:
H = x'y'z + x'yz + xy'z + xyz
Figure 6.8 shows the K-map for that equation's function. Circling the four adjacent 1s yields the minimized equation, H = z.

Figure 6.8 Four adjacent 1s.

Sometimes, we need to draw circles that include the same 1 twice. That's okay. For example, consider the equation:
I = x'y'z + xy'z' + xy'z + xyz + xyz'
Figure 6.9 shows the K-map for that equation's function. We can draw a circle around the bottom four 1s to reduce those four minterms to just x. But that leaves the single 1 in the top row, corresponding to minterm x'y'z. We have to include that minterm in the minimized equation, since if we left that minterm out, we would be changing the function. We could include the minterm itself, yielding I = x + x'y'z. But that's not minimized, because the original equation included minterm xy'z, and xy'z + x'y'z = (x+x')y'z = y'z. On the K-map, we draw a circle around that top 1 that also includes the 1 in the cell below. The minimized function is thus I = x + y'z.

Figure 6.9 Circling a 1 twice.

It's OK to cover a 1 more than once to minimize multiple terms.

It's OK to include a 1 twice; that doesn't change the function. Think about it: the function doesn't change if we duplicate a minterm (don't forget, a = a + a), and duplicating a minterm can allow for more optimization. In other words:
I = x'y'z + xy'z' + xy'z + xyz + xyz'
I = x'y'z + xy'z + xy'z' + xy'z + xyz + xyz'
I = (x'y'z + xy'z) + (xy'z' + xy'z + xyz + xyz')
I = (y'z) + (x)
We duplicated a minterm, which resulted in better optimization.

On the other hand, there's no reason to circle 1s more than once if the 1s are already included in a minimized term. For example, the K-map for the equation:
J = x'y'z' + x'y'z + xy'z + xyz
appears in Figure 6.10. There's no reason to draw the circle resulting in the term y'z. The other two circles cover all the 1s, meaning those two circles' terms cause the equation to output 1 for all the required input combinations. The third circle just results in an extra term without changing the function. Thus, we not only want to draw the largest circles possible to cover all the 1s, but we also want to draw the fewest circles.

Draw the fewest circles possible, to minimize the number of terms.

We mentioned earlier that the left and right sides of a K-map are adjacent. Thus, we can draw circles that wrap around the sides of a K-map. For example, the K-map for the equation:
K = xy'z' + xyz' + x'y'z
appears in Figure 6.11. The two cells in the corners with 1s are adjacent, since the left and right sides of the map are adjacent, and therefore we can draw one circle that covers both, resulting in the term xz'.

Figure 6.11 Sides are adjacent.

Sometimes a 1 does not have any adjacent 1s. In that case, we simply circle the single 1, resulting in a term that is a minterm. The term x'y'z in Figure 6.11 is an example of such a term.

A circle in a three-variable K-map must involve one cell, two adjacent cells, four adjacent cells, or eight adjacent cells. A circle cannot involve only three, five, six, or seven cells. The reason is because the circle must represent algebraic transformations that eliminate variables appearing in all possible combinations, since those variables can be factored out and then combined to a 1. Three adjacent cells don't have all combinations of two variables; one combination is missing. Thus, the circle in Figure 6.12 would not be valid, since it corresponds to xy'z' + xy'z + xyz, which doesn't simplify down to one term. To cover that function, we would need two circles, one around the left pair of 1s, the other around the right pair.

Figure 6.12 Invalid circle.

If all the cells in a K-map have 1s, like for the function E in Figure 6.13, then we would have eight adjacent 1s. We can draw a circle around those eight cells. Since that circle represents the ORing of all possible combinations of the function's three variables, and since obviously one of those combinations will be true for any combination of input values, the equation would minimize to just E = 1.

Figure 6.13 Four adjacent 1s.

Whenever in doubt as to whether a circle is valid, just remember that the circle represents a shorthand for algebraic transformations that combine terms to eliminate a variable. A circle must represent a set of terms for which all possible combinations of some variables appear while other variables are identical in all terms. The changing variables can be eliminated, resulting in a single term without those variables.

Four-Variable K-Maps
K-maps are also useful for minimizing four-variable Boolean functions. Figure 6.14 shows a four-variable K-map for the following equation:
F = w'xy'z' + w'xy'z + w'x'yz + w'xyz + wxyz + wx'yz

Figure 6.14 Four-variable K-map.

Again, notice that every adjacent pair of cells differs by exactly one variable. The left and right sides of the map are considered adjacent, and the top and bottom edges of the map are also adjacent; note that the left and right cells differ by only one variable, as do the top and bottom cells.

We cover the 1s in the map with the two circles shown in Figure 6.14, resulting in the terms w'xy' and yz, so the minimized equation is F = w'xy' + yz.

A circle covering eight adjacent cells would represent all combinations of three variables, so algebraic manipulation would eliminate all three variables and yield one term. For example, the function in Figure 6.15 simplifies to a single term, z, as shown.

Figure 6.15 Eight adjacent cells.

Legal-sized circles in a four-variable K-map are one, two, four, eight, or sixteen adjacent cells. Circling all sixteen cells yields a function that equals 1.

Larger K-Maps
K-maps for five and six variables have been proposed, but are rather cumbersome to use effectively. Thus, we do not discuss them further.

K-maps for two variables also exist, as shown in Figure 6.16. However, they aren't particularly useful, because two-variable functions are very easy to minimize algebraically.

Figure 6.16 Two-variable K-map.

Using a K-Map
Given any Boolean function of three or four variables, the following method summarizes how to use a K-map to minimize the function:
1. Convert the function's equation into sum-of-minterms form.
2. Place a 1 in the appropriate K-map cell for each minterm.
3. Cover all the 1s by drawing the minimum number of largest circles such that every 1 is included at least once, and write the corresponding term.
4. OR all the resulting terms to create the minimized function.

The first step, converting to sum-of-minterms form, can be done algebraically, as was done in Chapter 2. Alternatively, many people find it easier to combine steps 1 and 2, by converting the function's equation to sum-of-products form (where each term is not necessarily a minterm), and then filling in the 1s on the K-map corresponding to each term. For example, consider the four-variable function:
F = w'xz + yz + w'xy'z'
The term w'xz corresponds to the two lightly shaded cells in Figure 6.17, so we put 1s in those cells. The term yz corresponds to the entire dark-shaded column in the figure. The term w'xy'z' corresponds to the single unshaded cell shown on the left with a 1.
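The shortcut of filling in a K-map directly from product terms can be sketched in code. This is an illustration only: the names `term_cells` and `kmap`, and the choice to write each term as a dictionary from variable to required value, are assumptions of this sketch. Rows are wx and columns are yz, both in the 00, 01, 11, 10 order used by the maps above.

```python
from itertools import product

GRAY = ["00", "01", "11", "10"]  # K-map row/column order


def term_cells(term, variables="wxyz"):
    """Return the cells (bit strings) covered by one product term.

    `term` maps each variable that appears in the term to 1 (uncomplemented)
    or 0 (complemented); variables missing from the term may take any value.
    """
    cells = set()
    for bits in product("01", repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(assignment[v] == str(val) for v, val in term.items()):
            cells.add("".join(bits))
    return cells


def kmap(terms, variables="wxyz"):
    """Build a 4x4 grid of 0s and 1s: rows are wx, columns are yz."""
    ones = set().union(*(term_cells(t, variables) for t in terms))
    return [[1 if row + col in ones else 0 for col in GRAY] for row in GRAY]


# F = w'xz + yz + w'xy'z'
terms = [
    {"w": 0, "x": 1, "z": 1},          # w'xz: two cells
    {"y": 1, "z": 1},                  # yz: a whole column
    {"w": 0, "x": 1, "y": 0, "z": 0},  # w'xy'z': a single cell
]

for row_label, row in zip(GRAY, kmap(terms)):
    print(row_label, row)
```

The resulting grid has 1s in exactly the six cells of the sum-of-minterms equation given earlier for Figure 6.14, so both equations describe the same function.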
Minimization would proceed by covering the 1s with circles and ORing all the terms. The function in Figure 6.17 is identical to the function in Figure 6.14, for which we obtained the minimized equation: F = w'xy' + yz.

Figure 6.17 w'xz and yz terms.

EXAMPLE 6.5 Two-level logic size optimization using a three-variable K-map
Minimize the following equation:
G = a + a'b'c' + b*(c' + bc')
Let's begin by converting the equation to sum-of-products:
G = a + a'b'c' + bc' + bc'
We place 1s in a three-variable K-map corresponding to each term, as in Figure 6.18. The bottom row corresponds to the term a, the top left cell to term a'b'c', and the right column to the term bc' (which appears twice in the equation).

Figure 6.18 Terms on the K-map.

We then cover the 1s using the two circles shown in Figure 6.19. ORing the resulting terms yields the minimized equation G = a + c'.

Figure 6.19 A cover.

EXAMPLE 6.6 Two-level logic size optimization using a four-variable K-map
Minimize the following equation:
H = a'b'(cd' + c'd') + ab'c'd' + ab'cd' + a'bd + a'bcd'
Converting to sum-of-products form yields:
H = a'b'cd' + a'b'c'd' + ab'c'd' + ab'cd' + a'bd + a'bcd'
We fill in the 1s corresponding to each term, resulting in the K-map shown in Figure 6.20. The term a'bd corresponds to the two cells whose 1s are in italics. All the other terms are minterms and thus correspond to one cell.

We cover the 1s using circles as shown. One "circle" covers the four corners, resulting in the term b'd'. That circle may look strange, but remember that the top and bottom cells are adjacent, and the left and right cells are adjacent. Another circle results in the term a'bd, and a third circle in the term a'bc. The minimized two-level equation is therefore:
H = b'd' + a'bc + a'bd

Figure 6.20 K-map example.

Note the bolded 1 in Figure 6.20. We covered that 1 by drawing a circle that included the 1 to the left, yielding the term a'bc. Alternatively, we could have drawn a circle that included the 1 above, yielding the term a'cd', resulting in the minimized equation:
H = b'd' + a'cd' + a'bd
Not only does that equation represent the same function as the previous equation, that equation would also require the same number of transistors as the previous equation. Thus, we see that there may be multiple minimized equations that are equally good.

Don't Care Input Combinations
Sometimes, we are guaranteed that certain input combinations of a Boolean function can never appear. For those combinations, we don't care whether the function outputs a 1 or a 0, because the function will never actually see those input values; the output for those inputs just doesn't matter. As an intuitive example, if you became ruler of the world, would you live in a palace or a castle? Your answer (the output) doesn't matter, because the input (you becoming ruler of the world) simply won't happen.

Thus, when given a don't care input combination, we can choose whether to output a 1 or a 0 for each input combination, such that we obtain the best minimization possible. We can choose whatever output yields the best minimization, because the output for those don't care input combinations doesn't matter, as those combinations simply won't happen.

Algebraically, we can use don't care terms by introducing them into an equation during algebraic minimization to create the opportunity to combine terms to eliminate a variable. As a simple example, consider a function F = xy'z', for which we are for some reason guaranteed that the terms x'y'z' and xy'z can each never evaluate to 1. We notice that adding the first don't care term to the equation would result in xy'z' + x'y'z' = (x + x')y'z' = y'z'. Thus, introducing that don't care term x'y'z' into the equation yields a minimization benefit. However, introducing the second don't care term does not yield such a benefit, so we choose not to introduce that term.

In a K-map, don't care input combinations can be easily handled by placing an X in a K-map for each don't care minterm. We don't have to cover the Xs with circles, but we can cover some Xs if that helps us draw bigger circles while covering the 1s, meaning fewer literals will appear in the term corresponding to the circle. For the above example, we would draw the K-map shown in Figure 6.21, having one 1 corresponding to xy'z', when the function must output 1, and having two Xs corresponding to x'y'z' and xy'z, when the function may output 1 if that helps us minimize the function. Drawing a single circle results in the minimized equation F = y'z'. (Be careful in this discussion not to confuse the uppercase X, corresponding to a don't care, with the lowercase x, corresponding to a variable.)

Figure 6.21 Map with don't cares.

Figure 6.22 Wasteful use of X.

Remember, don't cares don't have to be covered. The cover in Figure 6.22 gives an example of a

wasteful use of don't cares. The circle covering the bottom X, yielding term xy', is not needed. That term is not wrong, because we don't care whether the output is 1 or 0 when xy'z evaluates to 1. But that term would result in a larger circuit, because the resulting equation is F = y'z' + xy'. Since we don't care, why not make the output 0 when xy'z is 1, and thus obtain a smaller circuit?

EXAMPLE 6.7 Two-level logic size minimization with don't cares on a K-map
Minimize the following equation:
F = a'bc' + abc' + a'b'c
given that terms a'bc and abc are don't cares. Intuitively, those don't cares mean that bc can never be 11.

We begin by creating the 3-variable K-map in Figure 6.23. We place 1s in the three cells for the function's minterms. We then place Xs in the two cells for the don't cares. We can cover the upper-left 1 using a circle that includes an X. Likewise, including the two Xs in a circle covers the two 1s on the right with a bigger circle. The resulting minimized equation is F = a'c + b.

Figure 6.23 Using don't cares.

Without don't cares, the equation would have minimized to F = a'b'c + bc'. Assuming two transistors per gate input and ignoring inverters, the equation minimized without don't cares would require (3+2+2) * 2 = 14 transistors (3 gate inputs for the first AND gate, 2 for the second AND gate, and 2 for the OR gate, times 2 transistors per gate input). In contrast, the equation minimized with don't cares requires only (2 + 0 + 2) * 2 = 8 transistors.

EXAMPLE 6.8 Don't care input combinations in a sliding switch example
Consider a sliding switch, shown in Figure 6.24, that can be in one of five positions, with three outputs x, y, and z indicating the position in binary. So xyz can take on the values of 001, 010, 011, 100, and 101. The other values for xyz are not possible, namely, the values 000, 110, and 111 (or x'y'z', xyz', and xyz). We wish to design combinational logic, with x, y, and z inputs, that outputs 1 if the switch is in position 2, 3, or 4, corresponding to xyz values of 010, 011, or 100.

Figure 6.24 Sliding switch example.

A Boolean equation describing the desired logic is: G = x'yz' + x'yz + xy'z'. We can minimize the equation using a K-map, as shown in Figure 6.25. The minimized equation that results is: G = xy'z' + x'y.

None of the minterms x'y'z', xyz', and xyz can ever be true, because the switch can only be in one of the above-stated five positions. So it doesn't matter whether we output a 1 or a 0 for those three other minterms. We can include these don't care input combinations as Xs on the K-map, as shown in Figure 6.26. When covering the 1s in the top right, we can now draw a larger circle, resulting in the term y. When covering the 1 at the bottom left, we can draw a larger circle also, resulting in the term z'. Although we ended up covering all the Xs in this example, recall that we do not have to cover the Xs; we only use them if they help us cover the 1s with larger circles. The minimized equation that results is: G = y + z'.

Figure 6.26 With don't cares.

That minimized equation using don't cares looks a lot different than the minimized equation without don't cares. But keep in mind the circuit still works the same. For example, if the switch is in position 1, then xyz will be 001, so G = y + z' evaluates to 0, as desired.

Don't cares must be used with caution. We must balance the criteria of size with other criteria, like reliable, error-tolerant, and safe circuits, when deciding whether to use don't cares. We must ask ourselves: is it ever possible that the don't care input combination might occur, even if in an error situation? And if it is possible, then do we really not care at all what our circuit outputs in that situation? Often, we really do care, and will want to ensure our circuit outputs a particular value. For example, in the sliding switch example above, perhaps temporary values could appear at the xyz outputs as the switch is being moved. We might therefore want to ensure we output 0 for the don't care values.

Several common situations lead to don't cares. Sometimes don't cares come from physical limits on the inputs: a switch can't be in two positions at once, for example. If you've read Chapter 3, then you may realize that another common situation in which don't cares may appear is in controller design, when a controller uses a state register that can represent more states than the controller requires. For example, a controller with 17 states may use a 5-bit state register, meaning that 15 of the 32 possible states of the state register would be unutilized. Those 15 states could be treated as don't cares (although to be safe, we might actually want to transition back to an initial state if we ever enter one of those 15 unused states due to noise or some other error). If you've read Chapter 5, then you may realize that another common situation where don't cares arise is in a controller controlling a datapath. If we aren't reading or writing to a particular memory or register file in a given state, then we don't care what address appears at the memory or register file during that state. Likewise, if a mux feeds into a register and we aren't loading the register in a given state, then we really don't care which mux data input passes through the mux during that state. If we aren't going to load the output of an ALU into a register in a given state, then we really don't care what function the ALU computes during that state.

Automating Two-Level Logic Size Optimization
However, if we con~ ider dan', carc~. we 1
1\.1/ '11 0 0
Visua l sc of K-Maps Is Rather Limited
can obtain a Simpler minimi7cd cqUlllion, In
part ic ul ar. we "'now th'H nOne of the thrce
Although the visual K-map method is helpful in two-level optimization of three- and
Figure 6.25 Without d n' t cares.
four-variable functions. the visual method is unmanageable for functions \\ ith man> more
variables. One problem is that we can't effectively visualize maps beyond 5 or 6 variables. Another problem is that humans make mistakes, and might accidentally not draw the biggest circle possible on a K-map. Furthermore, the order in which a designer begins covering 1s may result in a function that has more terms than would have been obtained using a different order. For example, consider the function shown in the K-map of Figure 6.27(a). Starting from the left, a designer might first draw the circle yielding the term y'z', then the circle yielding x'y', then the circle yielding yz, and finally the circle yielding xy, for a total of four terms. The K-map in Figure 6.27(b) shows an alternative cover. After drawing the circle yielding the term y'z', the designer draws the circle yielding x'z, and then the circle yielding xy. The alternative cover uses only three terms instead of four.

Figure 6.27 A cover is not necessarily optimal: (a) a four-term cover, and (b) a three-term cover of the same function.

Concepts Underlying Automated Two-Level Size Optimization

Because of the above-mentioned problems, two-level logic size optimization is done primarily using automated computer-based tools executing heuristic or exact algorithms. A heuristic is a problem solving method that usually yields a good solution, which is ideally close to the optimal, but not necessarily optimal. An exact algorithm, or just algorithm, is a problem solving method that yields the optimal solution. An optimal solution is as good or better than any other possible solution, with respect to the criteria of interest to us.

We first define some concepts underlying heuristic and exact algorithms for two-level logic size optimization. We will illustrate those concepts graphically on K-maps, but such illustration is only intended to provide the reader with an intuition of the concepts; automated tools do not use K-maps.

Recall that a function can be written as a sum-of-minterms equation. A minterm is a product term that includes all the function's variables exactly once, in either true or complemented form. The on-set of a function is the set of minterms that define when the function should evaluate to 1 (i.e., when the function is "on"). For the function in Figure 6.28, the on-set is: {x'y'z, xyz, xyz'}. The off-set of a function is all the remaining minterms. For the function in Figure 6.28, the off-set is: {x'y'z', x'yz', x'yz, xy'z', xy'z}. Using compact minterm representation (see Section 2.6), the on-set is {1,6,7}, and the off-set is {0,2,3,4,5}.

Figure 6.28 Implicants.

An implicant is a product term that may include fewer than all the function's variables, but is a term that only evaluates to 1 if the function should evaluate to 1. In other words, an implicant of a function is a term that should evaluate to 1 for a particular set of variable values only if at least one of the function's on-set minterms evaluates to 1 for those variable values. For example, the function F = x'y'z + xyz' + xyz has four implicants: x'y'z, xyz', xyz, and xy. Graphically, an implicant is any legal (but not necessarily the biggest possible) circle on a K-map, as shown in Figure 6.28. All minterms are obviously implicants, but not all implicants are minterms.

We say that the implicant xy covers minterms xyz' and xyz of function F. Graphically, an implicant's circle encircles the 1s of the covered minterms. Intuitively, we know that we can replace the covered minterms by the covering implicant and still obtain the same function. In other words, we can replace xyz' + xyz by xy. A set of implicants that covers the on-set of a function (and covers no other minterms) is known as a cover of the function. For the above function, one function cover is x'y'z + xyz + xyz'; another cover is x'y'z + xy; yet another cover is x'y'z + xyz + xyz' + xy.

Removing a variable from a term is known as expanding the term, which is the same as expanding the size of a circle on a K-map. For example, for the function in Figure 6.28, expanding the term xyz to the term xy (by eliminating z) results in an implicant of the function. Expanding the term xyz' to xy also results in an implicant (the same one). But expanding xyz to xz (by eliminating y) does not result in an implicant: xz covers minterm xy'z, which is not in the function's on-set.

A prime implicant of a function is an implicant with the property that if any variable were eliminated from the implicant, the result would be a term covering a minterm not in the function's on-set. Graphically, a prime implicant corresponds to circles that are the largest possible; enlarging the circle further would result in covering 0s, which changes the function. In Figure 6.28, x'y'z and xy are prime implicants. Removing any variable from implicant x'y'z, say z, would result in a term (x'y') that covers a minterm that is not in the on-set; x'y' covers x'y'z', for example, which is not in the function's on-set. Likewise, removing x' or y' from that term would cover a minterm not in the function's on-set. Removing any variable from implicant xy, say y, would result in a term (x) that covers minterms not in the on-set. On the other hand, xyz is not a prime implicant, because z can be removed from that implicant without changing the function, since xy covers minterms xyz and xyz', both of which are in the on-set. Likewise, xyz' is not a prime implicant, because z' can be removed. There is no reason to cover a function with anything other than prime implicants, since a prime implicant achieves the same function with fewer literals than nonprime implicants (which is why we always draw the biggest circles possible in K-maps).

An essential prime implicant is a prime implicant that is the only prime implicant that covers a particular minterm in the function's on-set. Graphically, an essential prime implicant is the only circle (the largest possible, of course, since the circle must represent a prime implicant) that covers a particular 1. In Figure 6.28, x'y'z is an essential prime implicant, as is xy, because each is the only prime implicant covering a particular 1. A nonessential prime implicant is a prime implicant whose covered minterms are also covered by one or more other prime implicants. Figure 6.29 shows a different function that has four prime implicants, but only two of which are essential. x'y' is an essential prime implicant because it is the only prime implicant that covers minterm x'y'z'. xy
is an essential prime implicant because it is the only prime implicant that covers minterm xyz'. y'z is a nonessential prime implicant because both of its covered minterms are covered by other implicants (those other prime implicants may or may not be essential prime implicants). Likewise, xz is not essential. The importance of essential prime implicants is as follows: we know that we must include all essential prime implicants in a function's cover, otherwise there would be some minterms that could not be covered. We may or may not need to include nonessential prime implicants to completely cover the function, but we must include all essential prime implicants.

Figure 6.29 Essential prime implicants.

Given the notion of prime implicants and essential prime implicants, a simple approach for two-level logic optimization is given in Table 6.1.

TABLE 6.1 Approach for automated two-level logic size optimization.
Step 1: Determine prime implicants. For every minterm in the function's on-set, maximally expand the term (meaning eliminate literals from the term) such that the term still only covers minterms in the function's on-set (like drawing the biggest circle possible around each 1 in a K-map). Repeat for each minterm. If don't cares exist, use them to maximally expand minterms into prime implicants (like using X's to create the biggest circles possible for a given 1 in a K-map).
Step 2: Add essential prime implicants to the function's cover. Find any minterms covered by only one prime implicant (i.e., by an essential prime implicant). Add those prime implicants to the cover, and mark the minterms covered by those implicants as already covered.
Step 3: Cover remaining minterms with nonessential prime implicants. Cover the remaining minterms using the minimal number of remaining prime implicants.

The first two steps are exact. The last step is a bit tricky. How do we choose which prime implicants to use to cover the remaining minterms? Recall the example of Figure 6.27, in which the cover in Figure 6.27(a) used two prime implicants to cover the two 1s that would be left after adding essential prime implicants, while the cover in Figure 6.27(b) used only one prime implicant to cover those remaining two 1s. When there are only two possibilities, we can try each possibility and pick the one with fewest prime implicants in the final cover. But what if there were millions, or billions, of possibilities? We may not have enough compute time to try all those possibilities. For large functions with hundreds of minterms and thousands of prime implicants, there may indeed be millions of possible covers to consider in the last step.

If an approach tries all such possibilities, the approach is an exact algorithm. If an approach just tries a few such possibilities, the overall two-level size optimization approach may be a heuristic (unless the approach can guarantee that the ignored possibilities couldn't possibly be part of an optimal solution).

We'll demonstrate the approach for automated two-level logic size optimization with the following example.

EXAMPLE 6.9 Two-level logic size optimization with the approach of Table 6.1, illustrated on a K-map

Figure 6.30 shows a K-map for the function from Figure 6.27, for which we saw that different covers yielded different numbers of terms. The first step is to determine all prime implicants, shown in the top part of the figure. For each 1, we draw every possible circle involving adjacent 1s, ensuring that each circle is the largest possible.

The second step is to add essential prime implicants to the function's cover. Notice that the 1 corresponding to minterm x'yz (the top right 1) is covered only by one prime implicant, namely, x'z. Thus, we know we'll need to use that prime implicant, so we'll include prime implicant x'z in the cover. Also notice that the 1 corresponding to minterm xyz' (the bottom right 1) is only covered by one prime implicant, namely, xz', so we'll include that prime implicant in the cover too. We mark all the 1s covered by these essential prime implicants, noted by italicized 1s in the figure.

The last step is to cover the remaining 1s with the fewest number of prime implicants. There is only one 1 uncovered, and that 1 is covered by two prime implicants. We can choose either prime implicant for the cover; let's choose y'z'. Thus, the final cover is:

I = x'z + xz' + y'z'

This example uses a K-map merely to illustrate to the reader the steps occurring within an automated tool. Such a tool does not use K-maps internally, but rather other means of representing the terms of a function.

Figure 6.30 Illustration of two-level optimization: (a) all prime implicants, (b) including essential prime implicants in the cover, (c) covering remaining 1s.

Automated Two-Level Logic Size Optimization Using the Quine-McCluskey Method

The most well-known, and in fact the original, approach for automated two-level logic size optimization is the Quine-McCluskey method, sometimes called the tabular method.

The first step of this method finds all prime implicants. The step starts with the function's minterms; if we are minimizing a three-variable function, then we might call these three-variable terms. To find all the prime implicants, the method first compares each three-variable term with every other three-variable term, and if two terms are found that differ by only one variable, the method adds a new term (without the differing variable) to a new set of two-variable terms. For example, xyz' and xyz differ by one variable, z, resulting in a new term xy being added to the two-variable set. Once done comparing all three-variable terms, the method compares every pair of two-variable terms for terms that differ by only one variable, resulting in a set of one-variable terms. One-variable terms can then be compared for terms that differ by one variable, but if such terms are found,

then the function evaluates simply to 1. Actually, not all terms in a set need to be compared; only those terms whose number of uncomplemented literals differs by one need to be compared. For example, x'yz' and xyz need not be compared, because the number of uncomplemented literals differs by two, not one, and thus can't be simplified to a new term by eliminating a variable. If at any time in this step a term cannot be combined with any other term, we mark that term as a prime implicant. Thus, after this step, all marked terms represent all prime implicants. The method thus provides an approach for finding prime implicants, more efficient than just maximally expanding every term.

The second step is to add all the essential prime implicants to the cover, and to mark as "covered" all minterms covered by those prime implicants.

The final step is to cover all remaining uncovered minterms by selecting the fewest remaining prime implicants to cover those minterms. Trying all the possibilities results in a version of the Quine-McCluskey method that is an exact algorithm. Trying just a subset may result in a heuristic.

Methods That Enumerate All Minterms or Compute All Prime Implicants May Be Inefficient

The Quine-McCluskey method works reasonably for functions with perhaps tens of variables. However, for larger functions, just listing all the minterms could result in a huge amount of data. A function of 10 variables could have up to 2^10 minterms; that's 1024 minterms, which is fairly reasonable. But a function of 32 variables could have up to 2^32 minterms, or up to about four billion minterms. Representing those minterms in a table requires prohibitive computer memory. And comparing those minterms with other minterms could require on the order of (four billion)^2 computations, or about 1.6 x 10^19 computations, which is thousands of quadrillions (a quadrillion is a thousand times a trillion). Even a computer performing 10 billion computations per second would require over a billion seconds to perform all those computations, or roughly 50 years. And for 64 variables, the numbers go up to 2^64 possible minterms, or about 18 quintillion minterms, and the square of that in computations, which is far beyond any feasible amount of computation. Functions with 100 inputs, which are not that uncommon, would require an absurd amount of memory and computation. Even computing all prime implicants, without first listing all minterms, is computationally prohibitive for many modern-sized functions.

Iterative Heuristic for Two-Level Logic Size Optimization

Because enumerating all minterms of a function, or even just all prime implicants, is prohibitive in terms of computer memory and computation time for functions with many variables, most automated tools use methods that instead just iteratively transform the original function's equation, in an attempt to find improvements to the equation. Iterative improvement means repeatedly making small changes to an existing solution until we decide to stop, perhaps because we can't find a better solution, or perhaps because the tool has run for enough time. As an example of making small changes to an existing solution, consider the equation:

F = abcdefgh + abcdefgh' + jklmnop

Clearly, we can reduce this equation simply by combining the first two terms and removing variable h, resulting in F = abcdefg + jklmnop. However, enumerating the minterms, as required in the earlier-described size optimization methods, would have resulted in roughly 1000 minterms and then millions of computations to find the prime implicants, but such enumeration and computation are obviously not necessary to minimize this equation.

Modern automated logic optimization tools therefore don't try to enumerate all the minterms for functions with many variables. Instead, those tools start with a given sum-of-products equation of the function, like the description for F above. Those tools then try to transform the equation little by little into a better equation, meaning an equation with fewer terms and/or fewer literals. Those tools repeat, or iterate, until they find no further improvement or until some maximum time allocated for the tool's execution has expired.

Heuristics for such two-level logic optimization in modern tools can be quite complex. However, a simple heuristic that is reasonably effective uses repeated application of the expand operation. The expand operation means to remove a literal from a term and then check whether the new term is legal. Removing a literal makes that term cover more minterms, like drawing a bigger circle on a K-map, thus the name "expand." For example, consider the function F = x'z + xy'z + xyz. We might try to expand the term x'z by removing x', or by removing z. Note that expanding a term reduces the number of literals in the term, a concept that may take a while for you to get used to. Thinking of K-map circles may help, as shown in Figure 6.31: the bigger the circle, the fewer the resulting literals. An expansion is legal if the new term covers only minterms in the function's on-set, or equivalently, does not cover a minterm in the function's off-set. In other words, an expansion is legal if the new term is still an implicant of the function. Figure 6.31(a) shows that expanding term x'z to z for the given function is legal, as the expanded term covers only 1s, whereas expanding x'z to x' is not legal, as the expanded term covers at least one 0. If an expansion is legal, we replace the original term by the expanded term, and we look for and remove any other term covered by the expanded term. In Figure 6.31(a), the expanded term z covers terms xy'z and xyz, so both those latter terms can be removed.

Figure 6.31 Expansions of term x'z in the function F = x'z + xy'z + xyz: (a) legal, (b) not legal (because the expanded term covers 0s).

Note that we illustrated the expand operation on a K-map merely to aid in understanding the intuition of the operation. K-maps are nowhere to be found in heuristic two-level logic size minimization tools.

As another example, for the earlier introduced function:

F = abcdefgh + abcdefgh' + jklmnop

We might start by trying to expand the first term, abcdefgh. One expansion of that term is bcdefgh (i.e., we removed the literal a). However, that term covers the term a'bcdefgh, which covers minterms that are not in the function's on-set, so that expansion is not legal. We might try other expansions, finding them not legal either, until we come

across the expansion to abcdefg (i.e., we removed the literal h). That term strictly covers abcdefgh and abcdefgh', both of which are clearly implicants because they appear in the original function, and thus the new term must also be an implicant. Therefore, we replace the first term by the expanded term:

F = abcdefg + abcdefgh' + jklmnop

and we also remove the second term, since that term is covered by the expanded term:

F = abcdefg + jklmnop

Thus, using just the expand operation, we have improved the equation.

EXAMPLE 6.10 Iterative heuristic two-level logic size optimization using expand

Minimize the following equation, which was also minimized in Example 6.4, using repeated application of the expand operation:

F = xyz + xyz' + x'y'z' + x'y'z

In other words, the on-set consists of the minterms {7, 6, 0, 1}, and so the off-set consists of the minterms {2, 3, 4, 5}.

Let's expand the terms from left to right, so we'll start with xyz. We can try to expand xyz to xy. Is that a legal expansion? xy covers minterms xyz' (minterm 6) and xyz (minterm 7), both in the on-set. Thus, the expansion is legal, so we replace xyz by xy, yielding the new equation:

F = xy + xyz' + x'y'z' + x'y'z

We also look for implicants covered by the new implicant xy. xyz' is covered by xy, so we eliminate xyz', yielding:

F = xy + x'y'z' + x'y'z

Let's continue trying to expand that first term. We can try expanding it from xy to x. The term x covers minterms xy'z' (minterm 4), xy'z (minterm 5), xyz' (minterm 6), and xyz (minterm 7). The term x thus covers minterms 4 and 5, which are not in the on-set, but instead in the off-set. Thus, that expansion is not legal. We can also try expanding xy to y, but we'll find again the expansion is not legal.

We might then consider the next term, x'y'z'. Let's try to expand it to x'y'. That term covers minterms x'y'z' (minterm 0) and x'y'z (minterm 1), both in the on-set, so the expansion is legal. We thus replace the term by the expanded one:

F = xy + x'y' + x'y'z

We check for other terms covered by the expanded term, and find that x'y'z is covered by x'y', so we remove x'y'z, leaving:

F = xy + x'y'

We can try expanding the term x'y' further, but will find that both possible expansions (x' or y') are not legal. Thus, the above equation represents the minimized equation. Notice that this happens to be the same result as we obtained when we minimized the same initial equation in Example 6.4.

Even though the heuristic based on expand happened to generate the optimally minimized equation in the previous example, there is no guarantee the results from the heuristic will always be optimal.

More advanced heuristics utilize additional operations beyond just the expand operation. One such operation is the reduce operation, which can be thought of as the opposite of expand. The reduce operation takes a term, and tries to add a literal to the term, checking that the equation with the new term still covers the function. Adding a literal to a term is like reducing the size of a circle on a K-map. Adding a literal to a term reduces the number of minterms covered by the term, hence the name reduce. Another operation is irredundant, which tries to remove a term entirely, checking that the new equation still covers the function. If so, the removed term was "redundant," hence the name irredundant. Heuristics may iterate among the expand, reduce, irredundant, and other operations, such as in the following heuristic: Try 10 random expansion operations, then 5 random reduce operations, then 2 irredundant operations, and then repeat (iterate) the whole sequence until no improvement occurs from one iteration to the next. Modern two-level size optimization tools differ largely in their ordering of operations and their number of iterations.

Recall that we said that modern heuristics don't enumerate all of a function's minterms, yet in the previous example we did enumerate all the minterms; actually, we were given the minterms in the initial equation. When we don't initially know the minterms, many advanced methods exist to efficiently represent a function's on-set and off-set without enumerating the minterms in those sets, and also to quickly check if a term covers terms in the off-set. Those methods are beyond the scope of the book, and instead the subject of textbooks on digital design synthesis. But hopefully you now get the basic idea of heuristic two-level minimization.

One of the original tools that performed automated heuristics as well as exact two-level logic optimization was called Espresso, developed at the University of California, Berkeley. The algorithms and heuristics in Espresso formed the basis of many modern commercial logic optimization tools.

Multilevel Logic Optimization: Performance and Size Tradeoffs

We have thus far discussed two-level logic size optimization. However, in practice, we may not need the speed of two levels of logic. We may be willing to use three, four, or more levels of logic if those additional levels reduce the amount of required logic. As a simple example, consider the equation:

F1 = ab + acd + ace

This equation can't be minimized. The resulting two-level circuit is shown in Figure 6.32(a).

We could, however, algebraically manipulate the equation as follows:

F2 = ab + ac(d + e) = a(b + c(d + e))

That equation can be implemented with the circuit shown in Figure 6.32(b). That multilevel logic implementation results in fewer transistors, at the expense of more gate delays, as illustrated in Figure 6.32(c). The multilevel implementation thus represents a tradeoff compared to the two-level implementation.
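The size and delay comparison between the two-level equation F1 = ab + acd + ace and its multilevel form F2 = a(b + c(d + e)) can be checked with a small sketch. It assumes the sizing rule used earlier in this chapter (two transistors per gate input, inverters ignored) and counts delay as the number of gates on the longest path. The tuple-based expression encoding and the function names are illustrative assumptions, not something defined in the text.

```python
from itertools import product

# An expression is either a variable name (str) or a gate: (op, [inputs]),
# where op is "and" or "or". This encoding is an assumption for illustration.
F1 = ("or", [("and", ["a", "b"]),
             ("and", ["a", "c", "d"]),
             ("and", ["a", "c", "e"])])
F2 = ("and", ["a", ("or", ["b", ("and", ["c", ("or", ["d", "e"])])])])


def evaluate(expr, values):
    """Evaluate an expression for a dict mapping variable names to 0/1."""
    if isinstance(expr, str):
        return values[expr]
    op, inputs = expr
    results = [evaluate(i, values) for i in inputs]
    return int(all(results)) if op == "and" else int(any(results))


def transistors(expr):
    """Two transistors per gate input, summed over every gate."""
    if isinstance(expr, str):
        return 0
    _, inputs = expr
    return 2 * len(inputs) + sum(transistors(i) for i in inputs)


def gate_delays(expr):
    """Number of gates on the longest input-to-output path."""
    if isinstance(expr, str):
        return 0
    _, inputs = expr
    return 1 + max(gate_delays(i) for i in inputs)


def equivalent(f, g, names):
    """Compare two expressions on every input combination."""
    return all(
        evaluate(f, dict(zip(names, bits))) == evaluate(g, dict(zip(names, bits)))
        for bits in product((0, 1), repeat=len(names))
    )


names = ["a", "b", "c", "d", "e"]
print(equivalent(F1, F2, names))          # do F1 and F2 compute the same function?
print(transistors(F1), gate_delays(F1))   # two-level circuit: size, delay
print(transistors(F2), gate_delays(F2))   # multilevel circuit: size, delay
```

Under these assumptions the sketch reports 22 transistors and 2 gate-delays for F1 against 16 transistors and 4 gate-delays for F2, which is the kind of size versus delay tradeoff plotted in Figure 6.32(c).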
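Looking back at the two-level heuristic, the repeated expand operation of Example 6.10 can also be sketched in code. This is a minimal illustration only: it enumerates minterms explicitly to test legality, which is exactly what real tools avoid for large functions, so it is suitable only for small examples. The dictionary encoding of terms and the names covered_minterms, expand_heuristic, and show are assumptions made for this sketch.

```python
from itertools import product

VARS = ("x", "y", "z")   # variable order; x is the most significant bit


def covered_minterms(term):
    """Return the set of minterm numbers covered by a term.

    A term maps each variable it contains to 1 (true form) or 0
    (complemented form); variables missing from the term are free.
    """
    result = set()
    for bits in product((0, 1), repeat=len(VARS)):
        if all(bits[VARS.index(v)] == val for v, val in term.items()):
            result.add(int("".join(map(str, bits)), 2))
    return result


def expand_heuristic(terms, on_set):
    """Apply expand to each term, left to right, until no term can grow."""
    terms = [dict(t) for t in terms]
    i = 0
    while i < len(terms):
        expanded = False
        for var in list(terms[i]):
            candidate = {v: p for v, p in terms[i].items() if v != var}
            cover = covered_minterms(candidate)
            # Legal only if the bigger term covers no off-set minterm.
            if cover <= on_set:
                terms[i] = candidate
                # Remove every other term covered by the expanded term.
                kept = [t for j, t in enumerate(terms)
                        if j == i or not covered_minterms(t) <= cover]
                i = next(k for k, t in enumerate(kept) if t is candidate)
                terms = kept
                expanded = True
                break
        if not expanded:
            i += 1
    return terms


def show(terms):
    """Format terms in the book's notation, e.g. xy + x'y'."""
    return " + ".join(
        "".join(v + ("" if t[v] else "'") for v in VARS if v in t) or "1"
        for t in terms)


# Example 6.10: F = xyz + xyz' + x'y'z' + x'y'z, on-set {7, 6, 0, 1}
example = [{"x": 1, "y": 1, "z": 1}, {"x": 1, "y": 1, "z": 0},
           {"x": 0, "y": 0, "z": 0}, {"x": 0, "y": 0, "z": 1}]
print(show(expand_heuristic(example, {7, 6, 0, 1})))
```

The sketch tries removing literals in the order they appear in each term, so the attempts differ slightly from the order walked through in Example 6.10, but for that equation it arrives at the same two-term result, xy + x'y'.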

Figure 6.32 Using multilevel logic to tradeoff performance and size: (a) two-level circuit, F1 = ab + acd + ace, (b) multilevel circuit with fewer transistors, F2 = a(b+c(d+e)), (c) illustration of the size versus delay tradeoff. Numbers inside gates represent transistor counts.

Automated heuristics for multilevel logic optimization iteratively transform the initial function's equation, much like for two-level logic optimization, optimizing one of the criteria at the expense of another.

EXAMPLE 6.11 Multilevel logic optimization

Minimize the following function's circuit size, at the expense of perhaps slower performance, using algebraic manipulation. Plot the tradeoff of the initial and size-optimized circuits with respect to size and delay.

F1 = abcd + abcef

The circuit corresponding to this equation is shown in Figure 6.33(a). The circuit requires 22 transistors and has a delay of 2 gate-delays.

Figure 6.33 Multilevel logic to tradeoff performance and size: (a) two-level circuit, F1 = abcd + abcef, (b) multilevel circuit with fewer transistors, F2 = abc(d + ef), (c) tradeoff of size versus delay. Numbers inside gates represent transistor counts.

We can algebraically manipulate the equation by factoring out the abc term from the two terms, as follows:

F2 = abcd + abcef = abc(d + ef)

The circuit for that equation is shown in Figure 6.33(b). The circuit requires only 18 transistors, but has a longer delay of 3 gate-delays. The plot in Figure 6.33(c) shows the size and performance for each design.

EXAMPLE 6.12 Reducing noncritical path size with multilevel logic

Use multilevel logic to reduce the size of the circuit in Figure 6.34(a), without extending the circuit's delay. Note that the circuit initially has 26 transistors. Furthermore, the longest delay from any input to the output is three gate-delays. That delay occurs through the path shown by the dashed line in the figure. The longest path through a circuit is the circuit's critical path.

Figure 6.34 Multilevel optimization that reduces size without increasing delay, by altering a noncritical path: (a) original circuit, F1 = (a+b)c + dfg + efg, (b) new circuit with fewer transistors but same delay, F2 = (a+b)c + (d+e)fg, (c) illustration of the size optimization with no tradeoff of delay.

The other paths through the circuit are only two gate-delays. Thus, if we reduce the size of the logic for the noncritical paths and extend those paths to three gate-delays, we would not have extended the overall delay of the circuit. We focus on the noncritical parts of the equation for F1 in Figure 6.34(a); the equation has its noncritical parts italicized. We can algebraically modify the noncritical parts by factoring out the term fg, resulting in the new equation and circuit shown in Figure 6.34(b). One of the modified paths is now also three gate-delays, so we now have two equally long critical paths, both having three gate-delays. The resulting circuit has only 22 transistors compared to 26 in the original circuit, yet still has the same delay of three gate-delays, as illustrated in Figure 6.34(c). Overall, we've performed a size optimization with no penalty in performance.

Generally, multilevel logic optimization uses factoring (e.g., abc + abd = ab(c+d)) to reduce the number of gates.

Multilevel logic optimization is probably more commonly used today than two-level logic optimization. Multilevel logic optimization is also extensively used by automatic tools that map circuits to FPGAs. FPGAs will be discussed in a later chapter.

6.3 SEQUENTIAL LOGIC OPTIMIZATIONS AND TRADEOFFS

In Chapter 3, we described the design of sequential logic, namely, of controllers. When creating the FSM, and converting the FSM to a state-register and logic, we can apply some optimizations and tradeoffs.

State Reduction

State reduction, also known as state minimization, is an optimization that reduces the number of FSM states without changing the FSM's behavior. By reducing the number of states, we may reduce the size of the required state register that implements the FSM,


thus reducing circuit size.
Reducing the number of states is possible when the FSM contains states that are equivalent to one another. For example, consider the FSM of Figure 6.35(a), having input x and output y. Examination reveals that states S2 and S3 appear to be the same as states S0 and S1. Regardless of whether we start in S0 or S2, the outputs will be identical. For example, if we start in S0 and the input sequence for four clock edges is 1, 1, 0, 0, the state sequence will be S0, S1, S1, S2, S2, so the output sequence will be 0, 1, 1, 0, 0. If instead we start in S2, the same input sequence will result in a state sequence of S2, S3, S3, S0, S0, so the output sequence will again be 0, 1, 1, 0, 0. In fact, if we tried all possible input sequences, we would find that the output sequence starting from state S0 would be identical to the output sequence starting from state S2. States S0 and S2 are thus equivalent. Likewise, states S1 and S3 are equivalent for the same reason. Thus, we can redraw the FSM as in Figure 6.35(b). The FSMs in Figure 6.35(a) and (b) have exactly the same behavior: for any sequence of inputs, the two FSMs provide exactly the same sequence of outputs. If we encapsulate the FSM as a box as in Figure 6.35(c), the outside world cannot distinguish between the two FSMs based on the outputs.

Figure 6.35 Eliminating redundant states: (a) original FSM, (b) equivalent FSM with fewer states, (c) the FSMs are indistinguishable from the outside, providing identical output behavior for any input sequence.

Two states are equivalent if:
• they assign the same values to outputs, AND
• for all possible sequences of inputs, the FSM outputs will be the same starting from either state.

For large FSMs, visual inspection cannot guarantee that we've removed all redundant states; a more systematic approach is needed, which we now introduce.

Implication Tables
Intuitively, we know that two states cannot be equivalent if they produce different outputs for the same sequence of inputs. Consider the FSM in Figure 6.36, which is almost identical to the FSM in Figure 6.35 with a slight modification: in state S2, the FSM now outputs y=1 instead of y=0. States S0 and S2 therefore clearly are not equivalent, because they have different output values. States S1 and S3 produce the same output, but when we transition from either state to the corresponding next state, the output differs. For example, if the FSM starts in state S1 and x becomes 0, the next state (S2) outputs y=1, but if the FSM had started in S3, the next state (S0) would output y=0. Thus, S1 and S3 cannot be equivalent, because the same input sequence results in a different output sequence.

Figure 6.36 A variant of the FSM in Figure 6.35: states S0 and S2 cannot be equivalent because they output different values, and states S1 and S3 can't be equivalent because they have nonequivalent next states for the same input values.

If two states' outputs are not equivalent, the two states clearly are not equivalent. Furthermore, if two states' next states are not equivalent for a given input value, then the two states are also not equivalent. Using these concepts of nonequivalent states, Table 6.2 describes an algorithm for reducing an FSM's number of states.

TABLE 6.2 Algorithm for state reduction.
Step 1: Mark state pairs having different outputs as nonequivalent.
        Description: States having different outputs obviously cannot be equivalent.
Step 2: For each unmarked state pair, write the next state pairs for the same input values.
Step 3: For each unmarked state pair, mark state pairs having nonequivalent next-state pairs as nonequivalent. Repeat this step until no change occurs, or until all states are marked.
        Description: States with nonequivalent next states for the same input values can't be equivalent. Each time through this step is called a pass.
Step 4: Merge remaining state pairs.
        Description: Remaining state pairs must be equivalent.

When comparing all possible pairs of states by hand, using a graphical table ensures that we don't miss any pairs. Consider the FSM of Figure 6.35(a). The FSM has 4 states; therefore there are 4^2 = 16 possible state pairs. Figure 6.37(a) shows those possible pairs graphically in a table, with the states listed along the row and column headings. Each cell corresponds to a state pair. We can simplify the table size by removing redundant cells (e.g., row S0, column S1 is the same as row S1, column S0) and removing meaningless cells along the diagonal of the table (state S0 is obviously equivalent to state S0). The reduced table is shown in Figure 6.37(b).

Figure 6.37 Table of state pairs: (a) original table comparing all pairs, (b) smaller table with only unique and relevant pairs, (c) table after being filled in with next state information.

Figure 6.37(c) steps through the state reduction algorithm of Table 6.2 for the FSM of Figure 6.35(a).
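As a rough illustration, the four steps of Table 6.2 can be sketched in Python. The function name reduce_states, the dictionary layout, and the variable names are assumptions made for this sketch and do not come from the text; the transition data is read off the description of the FSM of Figure 6.35(a) and of its variant in Figure 6.36.

```python
from itertools import combinations

def reduce_states(states, outputs, next_state, input_values):
    """Group equivalent states using the steps of Table 6.2 (illustrative sketch)."""
    pairs = [frozenset(p) for p in combinations(states, 2)]

    # Step 1: mark state pairs whose outputs differ.
    marked = {p for p in pairs if len({outputs[s] for s in p}) > 1}

    # Step 2: for each unmarked pair, list the next-state pair for each input value.
    implied = {}
    for p in pairs:
        if p not in marked:
            a, b = tuple(p)
            implied[p] = [frozenset((next_state[a][v], next_state[b][v]))
                          for v in input_values]

    # Step 3: repeat passes, marking pairs that have a marked next-state pair,
    # until a pass makes no change.
    changed = True
    while changed:
        changed = False
        for p, nexts in implied.items():
            if p not in marked and any(n in marked for n in nexts):
                marked.add(p)
                changed = True

    # Step 4: merge the remaining (unmarked) pairs into groups of equivalent states.
    groups = {s: {s} for s in states}
    for p in pairs:
        if p not in marked:
            a, b = tuple(p)
            merged = groups[a] | groups[b]
            for s in merged:
                groups[s] = merged
    return sorted({tuple(sorted(g)) for g in groups.values()})

# Transitions of Figure 6.35(a), as described in the text (inner keys are values of x).
STATES = ["S0", "S1", "S2", "S3"]
NEXT = {"S0": {0: "S0", 1: "S1"}, "S1": {0: "S2", 1: "S1"},
        "S2": {0: "S2", 1: "S3"}, "S3": {0: "S0", 1: "S3"}}
OUT_6_35 = {"S0": 0, "S1": 1, "S2": 0, "S3": 1}
OUT_6_36 = {"S0": 0, "S1": 1, "S2": 1, "S3": 1}  # variant: S2 outputs y=1

print(reduce_states(STATES, OUT_6_35, NEXT, [0, 1]))
print(reduce_states(STATES, OUT_6_36, NEXT, [0, 1]))
```

For the FSM of Figure 6.35(a) the sketch groups S0 with S2 and S1 with S3; for the variant of Figure 6.36 every state remains in its own group, so nothing can be merged.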

Step 1 involves looking at every table cell and marking that cell with a large "X" if the states for that cell have different outputs. We refer to such cells as being marked. The first state pair (S1,S0) is not equivalent because S0 outputs y=0, while S1 outputs y=1. We then look at state pairs (S2,S0), (S2,S1), and so on, and finally (S3,S2), marking state pairs having different outputs, resulting in the Xs shown in Figure 6.37(c).

Step 2 involves writing the next state pairs for each remaining unmarked cell. There are two unmarked cells:
• (S2,S0) (circled in Figure 6.37(c)): When x=1, state S2's next state is S3, while state S0's next state is S1 (we see this by looking at the FSM in Figure 6.35(a)). Thus, we write "(S3,S1)" in that cell (the order doesn't matter), meaning that for states S2 and S0 to be equivalent, S3 and S1 must be equivalent. We then consider the case when input x=0, in which case the next states are S2 and S0, so we write "(S2,S0)" in that cell also.
• (S3,S1): When x=0, the next states are S0 and S2, so we write (S0,S2) in the cell. For x=1, we write (S3,S1) in the cell.

Step 3 involves marking as nonequivalent any unmarked cells whose next state pairs are already marked as nonequivalent. Looking at cell (S2,S0), the next state pair (S3,S1) is not marked, nor is next state pair (S2,S0) (which happens to be the current cell), so we can't mark this cell. Likewise, for cell (S3,S1), the next state pair (S0,S2) is not marked, nor is the next state pair (S3,S1), so we can't mark this cell. Because we made a pass through step 3 without any changes, we don't repeat step 3, and instead move on to step 4.

Step 4 involves declaring the unmarked state pairs as equivalent, so S2 and S0 are equivalent, and S3 and S1 are equivalent. To finalize step 4 of the algorithm, we combine the equivalent states in the FSM. After combining states S2 and S0, and combining states S3 and S1, we obtain the FSM in Figure 6.35(b).

The method we have just employed is known as the implication table method for state reduction.

Naturally, not every FSM can have its number of states reduced. For example, let's use the implication table method on the FSM in Figure 6.36. With 4 states, the FSM's implication table will be the same size as the previous example, as shown in Figure 6.38(a). Step 1 marks state pairs with different outputs, shown in Figure 6.38(a). Step 2 lists, for each unmarked cell, the next state pairs for identical input values, as also shown in Figure 6.38(a).

In step 3's first pass, we first examine the cell for state pair (S2,S1). Naturally, the next state pair (S2,S2) is equivalent. The next state pair (S3,S1) is unmarked, so we cannot mark (S2,S1). We then examine the cell for state pair (S3,S1), and find that the next state pair (S0,S2) has its cell marked. This tells us that S3 and S1 cannot be equivalent (because they could transition to nonequivalent states for the same input values), so we mark the cell for (S3,S1). Similarly, we mark (S3,S2) since its first next state pair, (S0,S2), has its cell marked. Completing step 3's first pass results in the table of Figure 6.38(b).

Figure 6.38 Implication table for FSM in Figure 6.36: (a) table after initial setup and steps 1 and 2, (b) after step 3's first pass through the table, (c) after step 3's second and final pass through the table.

Because the table changed during the first pass (we marked two state pairs), we must make a second pass, because changes in the table may affect state pairs that we already looked at and left unmarked. In the second pass, we again look at state pair (S2,S1). Naturally, the next state pair (S2,S2) is equivalent. The next state pair (S3,S1), however, is now marked, and therefore we mark (S2,S1).

With all pairs in the table marked, as seen in Figure 6.38(c), we can conclude that no states in the FSM are equivalent, and thus we leave the FSM unchanged.

We now provide another example of state reduction.

EXAMPLE 6.13 Minimizing states in an FSM using an implication table
Consider the FSM in Figure 6.39(a). Unlike previous examples, this FSM has 5 states, resulting in more possible state pairs than in previous examples. The first task in minimizing the FSM's states is to construct an implication table so we can compare every state with each other as a state pair.

Figure 6.39 FSM needing state reduction: (a) original FSM, (b) implication table after steps 1 and 2.

In step 1 of our state reduction algorithm, we mark with an X the state pairs that are not equivalent because their outputs differ, as shown in Figure 6.39(b).


In step 2, we write in all the next state pairs for unmarked cells of the implication table, as shown in Figure 6.39(b). Since there are only two possible combinations of inputs (either x=0 or x=1), each unmarked cell will have two next state pairs.

In step 3's first pass, we mark each state pair if one of its next state pairs is marked. During our first pass through the table, we will examine four state pairs. Starting with (S2,S1), we see that both of its next state pairs are unmarked. Looking at (S3,S0), we see one of its next state pairs, (S3,S2), is marked, so we mark (S3,S0)'s cell. We also mark (S4,S0) because its next state pair (S4,S2) is marked. We leave (S4,S3) unmarked as both of its next state pairs are unmarked, thus completing the first pass. Figure 6.40(a) reflects the results of our first pass through the implication table.

Because we marked new state pairs in the first pass, we conduct a second pass through step 3. During that pass, we find no new cells to mark, leaving the table unchanged. We thus move on to step 4.

In step 4, we declare the unmarked state pair (S2,S1) as equivalent, and the unmarked state pair (S4,S3) as equivalent. We combine states S2 and S1, and we combine states S4 and S3, resulting in the new FSM shown in Figure 6.40(b). Note that the two transitions with conditions x' and x from S0 could be replaced by one transition with no conditions.

Figure 6.40 Implication table and minimized FSM: (a) implication table after first pass, (b) minimized state machine with states S1 and S2 combined, and S3 and S4 combined.

In this example, by reducing the number of states from 5 down to 3, we have reduced the minimum state register size from 3 bits down to 2 bits, perhaps reducing circuit size.

Sometimes equivalent states may overlap. For example, assume that for some FSM with states {T0, T1, T2, T3, T4}, you find that state pairs (T0,T1), (T1,T2), and (T2,T0) are equivalent. How do you deal with the overlapping equivalencies? The answer is simple: the three states T0, T1, and T2 can be combined into a single state.

The implication table method is suitable for hand-optimizing small FSMs such as those introduced in the previous examples, but can quickly become unwieldy for FSMs with more states. Consider the 15-state FSM in Figure 6.41. Its reduced implication table would require 14 rows and 14 columns, and 105 state pairs. With two combinations of inputs (namely, a = 0 or a = 1), each state pair would have two next state pairs, and, in the worst case, we would need to check 105*2 = 210 next state pairs during our first pass alone. What if the same FSM had four inputs (say, a, b, c, and d) instead of one? With four inputs, there would be 2^4 = 16 combinations of inputs (i.e., a'b'c'd', a'b'c'd, a'b'cd', ..., abcd) and up to 16 next state pairs in each cell in the implication table. If instead the FSM had, say, 100 states (a reasonable number), the implication table would have on the order of 100*100 = 10,000 state pairs.

Figure 6.41 A 15-state FSM.

State reduction is therefore typically performed using automated tools. For smaller FSMs, the tools may implement the implication table method. For larger FSMs, the tools may need to resort to heuristics to avoid inordinately large table sizes or numbers of next state pairs.

Even when we reduce the number of states, we are not guaranteed that such state reduction actually reduces the size of the resulting logic. One reason is that reducing the states might not reduce the number of required state-register bits: reducing the states from 15 down to 12 does not reduce the minimum state register size, which is four bits in either case. Another reason is that, even if the state reduction reduces the state register size, the combinational logic size could possibly increase with a smaller state register, due to the logic having to decode the state bits. Thus, automated state reduction tools may need to actually implement the combinational logic before and after state reduction, to determine if state reduction ultimately yields improvements for a particular FSM.

State Encoding
State encoding is the task of assigning a unique bit representation for each state in an FSM. Some state encodings may optimize the resulting controller circuit by reducing circuit size, or may trade off size and performance in the circuit. We now discuss several methods for state encoding.

Alternative Minimum-Bitwidth Binary Encodings
Previously, we assigned a unique binary encoding to each state in an FSM using the fewest number of bits possible, representing a minimum-bitwidth binary encoding. If there were four states, we used two bits. If there were five, six, seven, or eight states, we used three bits. The encoding represented the state in the controller's state register. There are many ways to map minimum-bitwidth binary encodings to a set of states. Say we are given four states, A, B, C, and D. One encoding is A:00, B:01, C:10, D:11. Another encoding is A:01, B:10, C:11, D:00. In fact, there are 4*3*2*1 = 4! = 24 possible encodings into two bits (4 encoding choices for the first state, 3 for the next state, 2 for the next, and 1 for the last state). For eight states, there are 8!, or over 40,000, possible encodings into three bits. For N states, there are N! (N factorial) possible encodings, a huge number for any N greater than 10 or so. One encoding may result in less
combinational logic than another encoding. Automated tools may try several different encodings (but not all N! encodings) to reduce combinational logic in the controller.

EXAMPLE 6.14 Alternative binary encoding for three-cycles-high laser timer
In Example 3.7, we encoded states using a straightforward binary encoding, starting with 00, then 01, then 10, and then 11. The resulting design had 15 gate inputs (ignoring inverters). We can try instead the alternative binary encoding shown in Figure 6.42.

Figure 6.42 Laser timer state diagram with alternative binary state encoding.

Table 6.3 provides the state table for the new encoding, showing the differences from the original encoding.

TABLE 6.3 State table for laser timer controller with alternative encoding.
            Inputs        Outputs
         s1  s0  b       x  n1  n0
  Off     0   0  0       0   0   0
          0   0  1       0   0   1
  On1     0   1  0       1   1   1
          0   1  1       1   1   1
  On2     1   1  0       1   1   0
          1   1  1       1   1   0
  On3     1   0  0       1   0   0
          1   0  1       1   0   0

From the state table, we obtain the following equations for the three combinational logic outputs of a controller:

    x  = s1 + s0   (note from the table that x=1 if s1=1 or s0=1)

    n1 = s1's0b' + s1's0b + s1s0b' + s1s0b
    n1 = s1's0 + s1s0
    n1 = s0

    n0 = s1's0'b + s1's0b + s1's0b'
    n0 = s1's0'b + s1's0b + s1's0b + s1's0b'
    n0 = s1'b(s0' + s0) + s1's0(b + b')
    n0 = s1'b + s1's0

The resulting circuit would have only 8 gate inputs: 2 for x, 0 for n1 (n1 is connected to s0 directly with a wire), and 4 + 2 for n0. The 8 gate inputs are significantly fewer than the 15 gate inputs needed for the binary encoding of Example 3.7. This encoding reduces size without any increase in delay, thus representing an optimization.

One-Hot Encoding
There is no requirement that we encode a set of states using the fewest number of bits. For example, we could encode four states A, B, C, and D using three bits instead of just two bits, such as A:000, B:011, C:110, D:111. Using more bits requires a larger state register, but possibly less logic. A popular encoding scheme is called one-hot encoding, wherein we use the same number of bits for encoding as there are states, and each bit corresponds to exactly one state. For example, a one-hot encoding of four states A, B, C, and D uses four bits, such as A:0001, B:0010, C:0100, D:1000. The main advantage of one-hot encoding is speed: because the state can be detected from just one bit and thus need not be decoded using an AND gate, the controller's next state and output logic may involve fewer gates and/or gates with fewer inputs, resulting in a shorter delay.

EXAMPLE 6.15 One-hot encoding example
Consider the simple FSM of Figure 6.43, which repeatedly generates the output sequence 0, 1, 1, 1, 0, 1, 1, 1, etc. A straightforward minimal binary encoding is shown, which is then crossed out and replaced with a one-hot encoding.

Figure 6.43 FSM for given sequence. (Inputs: none; Outputs: x)

The binary encoding results in the state table shown in Table 6.4. The resulting equations are:

    n1 = s1's0 + s1s0'
    n0 = s0'
    x  = s1 + s0

TABLE 6.4 State table using binary encoding.
          Inputs      Outputs
         s1  s0      n1  n0  x
  A       0   0       0   1  0
  B       0   1       1   0  1
  C       1   0       1   1  1
  D       1   1       0   0  1

The one-hot encoding results in the state table shown in Table 6.5. The resulting equations are:

    n3 = s2
    n2 = s1
    n1 = s0
    n0 = s3
    x  = s3 + s2 + s1

TABLE 6.5 State table using one-hot encoding.
             Inputs             Outputs
         s3  s2  s1  s0     n3  n2  n1  n0  x
  A       0   0   0   1      0   0   1   0  0
  B       0   0   1   0      0   1   0   0  1
  C       0   1   0   0      1   0   0   0  1
  D       1   0   0   0      0   0   0   1  1

Figure 6.44 shows the resulting circuits for each encoding. The binary encoding yields more gates, but more importantly, requires two levels of logic. The one-hot encoding in this example requires only one level of logic. Notice that the logic to generate the next state is just wires in this example (other examples may require some logic). Figure 6.44(c) illustrates that the one-hot encoding has less delay, meaning we could use a faster clock frequency for that circuit.

Figure 6.44 One-hot encoding can reduce delay: (a) minimum binary encoding, (b) one-hot encoding, (c) though total sizes may be roughly equal (one-hot encoding uses fewer gates but more flip-flops), one-hot yields a shorter critical path.
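To check that the two sets of equations in Example 6.15 describe the same behavior, one can simulate both controllers for a few clock cycles. The following is only a sketch: the function names step_binary, step_one_hot, and run are made up for this illustration, and each function simply evaluates the equations given in the example on 0/1 integers.

```python
def step_binary(s1, s0):
    """Binary encoding (A:00, B:01, C:10, D:11): returns (next state, output x)."""
    n1 = ((1 - s1) & s0) | (s1 & (1 - s0))   # n1 = s1's0 + s1s0'
    n0 = 1 - s0                              # n0 = s0'
    x = s1 | s0                              # x  = s1 + s0
    return (n1, n0), x

def step_one_hot(s3, s2, s1, s0):
    """One-hot encoding (A:0001, B:0010, C:0100, D:1000): returns (next state, x)."""
    n3, n2, n1, n0 = s2, s1, s0, s3          # next-state logic is just wires
    x = s3 | s2 | s1                         # x = s3 + s2 + s1
    return (n3, n2, n1, n0), x

def run(step, state, cycles):
    """Collect the output x of the present state over the given number of cycles."""
    outputs = []
    for _ in range(cycles):
        state, x = step(*state)
        outputs.append(x)
    return outputs

print(run(step_binary, (0, 0), 8))           # start in state A = 00
print(run(step_one_hot, (0, 0, 0, 1), 8))    # start in state A = 0001
```

Both runs produce the sequence 0, 1, 1, 1, 0, 1, 1, 1, so the encodings differ only in register width and in the logic needed, not in observable behavior.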
EXAMPLE 6.16 Three-cycles-high laser timer using one-hot encoding
In Example 3.7, we encoded states using a straightforward binary encoding, starting with 00, then 01, then 10, and then 11. Here, we'll perform a one-hot encoding of the four states, requiring four bits, as shown in Figure 6.45.

Figure 6.45 One-hot encoding of laser timer.

Table 6.6 shows a state table for the FSM of Figure 6.45, using the one-hot encoding of the states. We don't show all possible rows, since the table would be too large.

TABLE 6.6 State table for laser timer controller with one-hot encoding.
               Inputs                 Outputs
         s3  s2  s1  s0  b      x  n3  n2  n1  n0
  Off     0   0   0   1  0      0   0   0   0   1
          0   0   0   1  1      0   0   0   1   0
  On1     0   0   1   0  0      1   0   1   0   0
          0   0   1   0  1      1   0   1   0   0
  On2     0   1   0   0  0      1   1   0   0   0
          0   1   0   0  1      1   1   0   0   0
  On3     1   0   0   0  0      1   0   0   0   1
          1   0   0   0  1      1   0   0   0   1

The last step is to design the combinational logic. Deriving equations for each output directly from the table (assuming all other input combinations are don't-cares), and minimizing those equations algebraically, results in the following:

    x  = s3 + s2 + s1
    n3 = s2
    n2 = s1
    n1 = s0*b
    n0 = s0*b' + s3

This circuit would require 3+0+0+2+(2+2) = 9 gate inputs. Thus, the circuit has fewer gate inputs than the original binary encoding's 15 gate inputs, but one must also consider that a one-hot encoding uses more flip-flops.

More importantly, the circuit with one-hot encoding is slightly faster. The critical path for that circuit is n0 = s0*b' + s3. The critical path for the circuit with regular binary encoding is n0 = s1's0'b + s1s0'. The regular binary encoded circuit requires a 3-input AND gate feeding into a 2-input OR gate, whereas the one-hot encoded circuit has a 2-input AND gate feeding into a 2-input OR gate. Because a 2-input AND gate actually has slightly less delay than a 3-input AND gate, the one-hot encoded circuit has a shorter critical path.

For examples with more states, the critical path reductions from one-hot encoding may be even greater, and reductions in logic size may also be more pronounced. At some point, of course, one-hot encoding results in too big a state register. For example, an FSM with 1000 states would require a 10-bit state register for a binary encoding, but would require a 1000-bit state register for a one-hot encoding, which is probably too big to consider. In such cases, we might consider encodings using a number of bits in between that for a binary encoding and that for a one-hot encoding.

Output Encoding
Some problem descriptions require us to generate a particular sequence of values on a set of outputs. For example, a problem might require us to repeatedly output the following sequence on a pair of outputs x and y: 00, 11, 10, 01. We can capture the behavior using the FSM with four states, A, B, C, and D, as shown in Figure 6.46. A straightforward binary encoding for those states would be A:00, B:01, C:10, D:11, as shown in Figure 6.46. When we design a controller for this system, we'll have a two-bit state register, logic to determine the next state, and logic to generate the output from the present state. But might it make more sense to use a state encoding that is identical to the output values in each state? If we use such an encoding, then we will still have a two-bit state register, and we will still have logic to generate the next state, but we won't have logic to generate the output from the present state. Instead, each output will simply be connected by a wire to a bit in the state register, thus reducing the required number of logic gates.

Figure 6.46 FSM for given sequence. (Inputs: none; Outputs: x, y)

If an FSM has at least as many outputs as needed for a binary encoding, and if each state has a unique output combination, then we may consider using a state's output combination as the state's encoding. Such an encoding may reduce the amount of logic required, by eliminating the need for logic to generate the outputs from the present state encoding; that logic is reduced to just wires.

Output encoding requires that the system have at least as many outputs as it has bits in a minimal binary encoding; otherwise the outputs can't represent enough encodings to uniquely identify each state. Furthermore, we can't use output encoding if the desired output sequence contains the same output values in two different states, since every state's encoding must be unique. For example, if we wish to repeatedly generate the sequence 00, 11, 01, 11, we cannot use output encoding, because if we did, then two states would have the same encoding. Even in such a situation, though, we might try to output encode as many states as possible.

EXAMPLE 6.17 Sequence generator using output encoding
Example 3.10 involved design of a sequence generator, in which we were to generate the sequence 0001, 0011, 1100, 1000 on a set of four outputs, as shown in Figure 6.47. In that example, we encoded the states using a two-bit binary encoding, with A being 00, B being 01, C being 10, and D being 11. In this example, we'll instead use output encoding. The outputs have enough bits, four, whereas we need at least two bits to encode the four states. The sequence also has a different output combination for each state. Thus, we can consider output encoding for this example.

Figure 6.47 Sequence generator FSM. (Inputs: none; Outputs: w, x, y, z)
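The two conditions for output encoding (enough output bits, and a unique output combination per state) can be checked mechanically. The sketch below is illustrative only: the function name can_use_output_encoding is an assumption, and each state's output combination is represented as a string of 0s and 1s, one string per state.

```python
def can_use_output_encoding(output_sequence):
    """Check the two conditions for using each state's outputs as its encoding."""
    num_states = len(output_sequence)
    # Bits needed by a minimum-bitwidth binary encoding of num_states states.
    min_bits = max(1, (num_states - 1).bit_length())
    # Condition 1: at least as many outputs as bits in a minimal binary encoding.
    enough_bits = all(len(o) >= min_bits for o in output_sequence)
    # Condition 2: every state has a unique output combination.
    unique = len(set(output_sequence)) == num_states
    return enough_bits and unique

print(can_use_output_encoding(["0001", "0011", "1100", "1000"]))  # Example 6.17
print(can_use_output_encoding(["00", "11", "10", "01"]))          # Figure 6.46
print(can_use_output_encoding(["00", "11", "01", "11"]))          # repeated 11
```

The first two sequences satisfy both conditions, while the third fails because two states would share the encoding 11.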

Table 6.7 shows a partial state table for the sequence generator, using an output encoding. Notice that the outputs themselves, w, x, y, and z, don't need to appear in the table, as they will be the same as s3, s2, s1, and s0. We use a partial table to avoid having to show all 16 rows, and we assume that all unspecified rows represent don't cares.

TABLE 6.7 Partial state table for sequence generator controller using output encoding.
             Inputs             Outputs
         s3  s2  s1  s0     n3  n2  n1  n0
  A       0   0   0   1      0   0   1   1
  B       0   0   1   1      1   1   0   0
  C       1   1   0   0      1   0   0   0
  D       1   0   0   0      0   0   0   1

From the table, we derive equations for each output as follows:

    n3 = s1 + s2
    n2 = s1
    n1 = s1's0
    n0 = s1's0 + s3s2'

We obtained those equations by looking at all the 1s for a particular output, and visually determining a minimal input equation that would generate those 1s and 0s for the other shown column entries (all other output values, not shown, are don't cares).

Figure 6.48 shows the final circuit. Notice that there is no output logic: the outputs w, x, y, and z connect directly to the state register.

Figure 6.48 Sequence generator controller with output encoding.

Compared to the circuit obtained in Example 3.10 using a binary encoding, the output encoded circuit in Figure 6.48 actually appears to use more transistors. In other examples, an output encoded circuit might use fewer transistors.

Whether one-hot encoding, binary encoding, output encoding, or some variation thereof results in the fewest transistors or a shorter critical path depends on the example itself. Thus, modern tools may try a variety of different encodings for a given problem to see which works best.

Moore versus Mealy FSMs

Basic Mealy Architecture
The FSMs described in this book have thus far all been a type of FSM known as a Moore FSM. A Moore FSM is an FSM whose outputs are a function of the FSM's state. An alternative type of FSM is a Mealy FSM. A Mealy FSM is an FSM whose outputs are a function of the FSM's state and inputs. Sometimes a Mealy FSM results in fewer states than a Moore FSM, representing an optimization. Sometimes those fewer states come at the expense of timing complexities that must be handled, representing a tradeoff.

Recall the standard controller architecture of Figure 3.48, reproduced in Figure 6.49. The architecture shows one block of combinational logic, responsible for converting the present state and external inputs into the next state and external outputs.

Figure 6.49 Standard controller architecture: general view.

Because a Moore FSM's outputs are solely a function of the present state (and not the external inputs), we can refine the architecture to have two combinational logic blocks: the next-state logic block converts the present state and external inputs into a next state, and the output logic block converts the present state (but not the external inputs) into external outputs, as shown in Figure 6.50(a).

In contrast, a Mealy FSM's outputs are a function of both the present state and the external inputs. Thus, the output logic block for a Mealy FSM takes both the present state and the external FSM inputs as input, rather than just the present state, as shown in Figure 6.50(b). The next-state logic is the same as for a Moore FSM, taking as input both the present state and the external FSM inputs.

Figure 6.50 Controller architectures for: (a) a Moore FSM, (b) a Mealy FSM.

Figure 6.51 A Mealy FSM associates outputs with transitions. (Inputs: b; Outputs: x)

Graphically, the FSM output assignments of a Mealy FSM would be listed with each transition, rather than each state, because each transition represents a present state and a particular input value. Figure 6.51 shows a two-state Mealy FSM with an input b and an output x. When in state S0 and b=0, the FSM outputs x=0 and stays in state S0, as indicated by the transition labeled "b' / x=0". When in state S0 and b=1, the FSM outputs x=1 and goes to state S1. We use the "/" symbol to separate the transition's
input cundi tions from the output assignments-the .. r
does not mcan "di vide". here.
Because the transition from S1 to S0 is taken no matter what the input value, we list the transition simply as "/x=0", meaning there's no input condition, but there is an output assignment.

Mealy FSMs May Have Fewer States
The seemingly minor difference between a Mealy and a Moore FSM, namely, that a Mealy FSM's output is a function of the state and the current inputs, can lead to fewer states for some behaviors when implemented as a Mealy machine. For example, consider the simple soda dispenser controller FSM in Figure 6.52(a). Setting d=1 dispenses a soda. The FSM starts in state Init, which sets d=0 and sets an output clear=1, which we assume clears a device keeping count of the amount of money deposited into the soda dispenser machine. The FSM transitions to state Wait, where the FSM waits to be informed, through the enough input, that enough money has been deposited. Once enough money has been deposited, the FSM transitions to state Disp, which dispenses a soda by setting output d=1, and the FSM then returns to state Init. (Readers who have read Chapter 5 may notice this example is a simplified version of Example 5.1; familiarity with that example is not required, though, for the present discussion.)

Figure 6.52 FSMs for soda dispenser controller: (a) Moore FSM has actions in states, (b) Mealy FSM has actions on transitions, resulting in this case in fewer states. (Inputs: enough (bit); Outputs: d, clear (bit))

Figure 6.52(b) shows a Mealy FSM for the same controller. The initial state Init has no actions itself, but rather has a conditionless transition to state Wait that has the initialization actions d=0 and clear=1. In state Wait, a transition with condition enough' returns to state Wait without any actions listed. Another transition with condition enough has the action d=1, and takes the FSM back to the Init state. Notice that the Mealy FSM does not need the Disp state to set d=1; that action occurs on a transition. Thus, we were able to create a Mealy FSM with fewer states than the Moore FSM.

The Mealy state diagram in Figure 6.52(b) uses a convention similar to the convention we used for Moore FSMs (Section 3.4), namely, that any outputs not explicitly assigned on a transition are implicitly assigned a 0. As with Moore FSMs, we still list an assignment to 0 explicitly if the assignment is key to the FSM's behavior (such as the assignment of d=0 in Figure 6.52(b)).

EXAMPLE 6.18 Beeping wristwatch FSM using a Mealy machine
Create an FSM for a wristwatch that can display one of four registers by setting two outputs s1 and s0, which control a 4x1 multiplexer that passes one of the four registers through. The four registers correspond to the watch's present time (s1s0=00), the alarm setting (01), the date (10), and a stopwatch (11). The FSM should sequence to the next register, in the order listed above, each time a button b is pressed (assume b is synchronized with the clock so as to be high for only 1 clock cycle on each unique button press). The FSM should set an output p to 1 each time the button is pressed, causing an audible beep to sound.

Figure 6.53 FSM for a wristwatch with beeping behavior (p=1) when button is pressed (b=1): (a) Mealy, (b) Moore. (Inputs: b; Outputs: s1, s0, p)

Figure 6.53(a) shows a Mealy FSM describing the desired behavior. Notice that the Mealy FSM easily captures the beeping behavior, simply by setting p=1 on the transitions that correspond to button presses. In the Moore FSM of Figure 6.53(b), we had to add an extra state in between each pair of states in Figure 6.53, with each extra state having the action p=1 and having a conditionless transition to the next state.

Notice that the Mealy FSM has fewer states than the Moore machine. A drawback is that we aren't guaranteed that a beep will last at least one clock cycle, due to timing issues that we will now discuss.

Timing Issues with Mealy FSMs
Mealy FSM outputs are not synchronized with clock edges, but rather can change in between clock edges if an input changes. For example, consider the timing diagram shown in Figure 6.52(a) for the soda dispenser's Moore FSM. Note that the output d becomes 1 not right after the input enough became 1, but rather on the first clock edge after enough became 1. In contrast, the timing diagram for the Mealy FSM in Figure 6.52(b) shows that the output d becomes 1 right after the input enough becomes 1. Moore outputs are synchronized with the clock; in particular, Moore outputs only change on entering a new state, which means Moore outputs only change slightly after a rising clock edge loads a new state into the state register. In contrast, Mealy outputs can change not just on entering a new state, but also any time an input changes, because Mealy outputs are a function of both the state and the inputs. We took advantage of this fact to eliminate the Disp state from the soda dispenser's Mealy FSM in Figure 6.52(b). Notice, however, in the timing diagram that the d output of the Mealy FSM does not stay 1 for a complete clock cycle. If we are unsure as to whether d's high time is long enough, we could include a Disp state in the Mealy FSM. That state would have a single transition, with no condition and with action d=1, pointing back to state Init. In that case, d would be 1 for longer than one clock cycle (but less than two cycles).

The Mealy FSM feature of outputs being a function of states and inputs, which enables the reduction in number of states in some cases, also has an undesirable characteristic: the outputs may glitch if the inputs glitch in between clock cycles. A designer using a Mealy FSM should determine whether such glitching could pose a problem in a particular circuit. One solution to the glitching is to insert flip-flops between an asynchronous Mealy FSM's inputs and the FSM logic, or between the FSM logic and the outputs. Such flip-flops make the Mealy FSM synchronous, and the outputs will change at predictable intervals. Of course, such flip-flops introduce a one clock cycle delay.

Implementing a Mealy FSM
We create a controller implementing a Mealy FSM in nearly the identical way that we created a controller for Moore FSMs in Section 3.4, using the method of Table 3.2. The only difference is that when we create a state table, the FSM outputs' values for all the rows of a particular state won't necessarily be identical. For example, Table 6.8 shows a state table for the Mealy FSM of Figure 6.52(b). Notice that the output d should be 0 in state Wait (s0=1) if enough=0, but should be 1 if enough=1. In contrast, in a Moore state table, an output's values were identical within a given state. Given the state table of Table 6.8, we would proceed to implement the combinational logic in the same manner as described in Section 3.4.

TABLE 6.8 Mealy state table for soda dispenser.

          Inputs          Outputs
        s0  enough      n0   d   clear
Init     0    0          1   0     1
         0    1          1   0     1
Wait     1    0          1   0     0
         1    1          0   1     0

Combining Moore and Mealy FSMs
Designers often utilize FSMs that are a combination of Moore and Mealy types. Such a combination allows the designer to specify some actions in states, and others on transitions. Such a combination provides the reduced number of states advantage of a Mealy FSM, yet avoids having to replicate a state's actions on every outgoing transition of a state. This simplification is really just a convenience to a designer describing the FSM; the underlying implementation will likely be the same as for the Mealy FSM having replicated actions on a state's outgoing transitions.

EXAMPLE 6.19 Beeping wristwatch FSM using a combined Moore/Mealy machine
Figure 6.54 shows a combined Moore/Mealy FSM state diagram describing the beeping wristwatch of Example 6.18. The FSM has the same number of states as did the Mealy FSM in Figure 6.53(a), because the FSM still associates the beep behavior p=1 with transitions, avoiding the need for extra states to describe the beep. But the combined FSM state diagram is easier to comprehend than the Mealy FSM state diagram, because the assignments to s1s0 are associated with each state, and not duplicated on every outgoing transition.

Figure 6.54 Combining Moore and Mealy FSMs yields a simpler wristwatch FSM. (Inputs: b; Outputs: s1, s0, p)

6.4 DATAPATH COMPONENT TRADEOFFS
In Chapter 4, we created several components that are useful in datapaths. In that chapter, we created basic, easy to understand versions of those components. In this section, we describe methods to build faster or smaller versions of some of those components.

Faster Adders
Adding two numbers is an extremely common operation in digital circuits, so it makes sense for us to try to create an adder that is faster than a carry-ripple adder. Recall that a carry-ripple adder requires that the carry bits ripple through all the full-adders before all the outputs are correct. The longest path through the circuit, shown in Figure 6.55, is known as the circuit's critical path. Since each full-adder has a delay of two gate-delays, then a 4-bit carry-ripple adder has a delay of 4 * 2 = 8 gate-delays. A 32-bit carry-ripple adder's delay is 32 * 2 = 64 gate-delays. That's rather slow, but the nice thing about a carry-ripple adder is that it doesn't require very many gates. If a full-adder uses 5 gates, then a 4-bit carry-ripple adder requires only 4 * 5 = 20 gates, and a 32-bit carry-ripple adder would only require 32 * 5 = 160 gates.

Figure 6.55 4-bit carry-ripple adder, with the longest path (the critical path) shown.
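The carry-ripple behavior and the delay and size arithmetic above can be sketched in Python. This is a behavioral model only, not a gate-level netlist; the function names are illustrative assumptions, and the per-full-adder figures (2 gate-delays, 5 gates) are the ones assumed in the text.

```python
def full_adder(a, b, c):
    """One full-adder: s = a xor b xor c, co = bc + ac + ab."""
    s = a ^ b ^ c
    co = (b & c) | (a & c) | (a & b)
    return s, co


def carry_ripple_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (index 0 is the least significant bit).

    The carry out of each full-adder feeds the next one, which is why the
    critical path grows linearly with the number of bits.
    """
    carry, sums = cin, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def to_bits(value, n):
    """Integer to an n-element bit list, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Bit list (least significant bit first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_delay(n, fa_delay=2):
    """Critical-path delay in gate-delays for an n-bit carry-ripple adder."""
    return n * fa_delay


def ripple_gates(n, fa_gates=5):
    """Gate count for an n-bit carry-ripple adder."""
    return n * fa_gates


if __name__ == "__main__":
    sums, cout = carry_ripple_add(to_bits(6, 4), to_bits(11, 4))
    print(from_bits(sums), cout)                # 6 + 11 = 17 -> sum 1, carry 1
    print(ripple_delay(32), ripple_gates(32))   # 64 gate-delays, 160 gates
```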
We would like to design an adder that is much closer to the delay of just a few gates, perhaps about 5 or 6 gate-delays, at the possible expense of more gates.

Two-Level Logic Adder
One obvious way to create a faster adder at the expense of more gates is to use our earlier-defined two-level combinational logic design process. An adder designed using two levels of logic has a delay of only two gate-delays. That's certainly fast. But recall from Figure 4.15 that building an N-bit adder using two levels of logic results in excessively large circuits as N increases beyond 8 or so. To be sure you get this point, let's restate the previous sentence slightly:

Building an N-bit adder using two levels of logic results in shockingly large circuits as N increases beyond 8 or so.

For example, we estimated (in Chapter 4) that a two-level 16-bit adder would require about 2 million transistors, and that a two-level 32-bit adder would require about 100 billion transistors.

On the other hand, building a 4-bit adder using two levels of logic results in a big, but reasonably sized adder of about 100 gates, as was shown in Figure 4.15. We could build a larger adder by cascading such fast 4-bit adders together. Say we wanted an 8-bit adder. We could build this by cascading two fast 4-bit adders together, as shown in Figure 6.56. If each 4-bit adder is built from two levels of logic, then each 4-bit adder has a delay of 2 gate-delays. The 4-bit adder on the right takes 2 gate-delays to generate the sum and carry out bits, after which the 4-bit adder on the left takes another 2 gate-delays to generate its outputs, resulting in a total delay of 2 + 2 = 4 gate-delays. For a 32-bit adder built from eight 4-bit adders, the delay would be 8 * 2 = 16 gate-delays, and the size would be about 8 * 100 gates = 800 gates. That's much better than the 32 * 2 = 64 gate-delays of a carry-ripple adder, though the improved speed comes at the expense of more gates than the 32 * 5 = 160 gates of the carry-ripple adder. Which design is better? The answer depends on your requirements: the design using two-level logic 4-bit adders is better if you require more speed and can afford the extra gates, whereas the design using carry-ripple 4-bit adders is better if you don't need the speed or can't afford the extra gates. It's a tradeoff.

Figure 6.56 8-bit adder built from two fast 4-bit adders.

Carry-Lookahead Adder
A carry-lookahead adder improves on the speed of a carry-ripple adder, but without using as many gates as a two-level logic adder. The basic idea is to "look ahead" into lower stages to determine whether a carry will be created in the present stage. This lookahead concept is very elegant and generalizes to other problems. We will therefore spend some time introducing the intuition underlying lookahead. Consider the addition of two 4-bit numbers shown in Figure 6.57(b), with the carries in each column labeled c0, c1, c2, c3, and c4.

Figure 6.57 Adding two binary numbers by a naive inefficient carry-lookahead scheme: each stage looks at all earlier bits and computes whether the carry-in bit to that stage would be a 1. The longest delay is stage 3, which has 2 logic levels for the lookahead, and 2 for the full-adder, for a total delay of only four gate-delays.

A Naive Inefficient Carry-Lookahead Scheme. One simple but not very efficient way of carry-lookahead is as follows. Recall that the output equations for a full-adder having inputs a, b, and c, and outputs co and s, are:

    s  = a xor b xor c
    co = bc + ac + ab

So we know that the equations for c1, c2, and c3 in a 4-bit adder will be:

    c1 = co0 = b0c0 + a0c0 + a0b0
    c2 = co1 = b1c1 + a1c1 + a1b1
    c3 = co2 = b2c2 + a2c2 + a2b2

In other words, the equation for the carry-in to a particular stage is the same as the equation for the carry-out of the previous stage.
We can substitute the equation for c1 into c2's equation, resulting in:

    c2 = b1c1 + a1c1 + a1b1
    c2 = b1(b0c0 + a0c0 + a0b0) + a1(b0c0 + a0c0 + a0b0) + a1b1
    c2 = b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 + a1a0b0 + a1b1

We can then substitute the equation for c2 into c3's equation, resulting in:

    c3 = b2c2 + a2c2 + a2b2

    c3 = b2(b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 + a1a0b0 + a1b1)
         + a2(b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 + a1a0b0 + a1b1)
         + a2b2

    c3 = b2b1b0c0 + b2b1a0c0 + b2b1a0b0 + b2a1b0c0 + b2a1a0c0 + b2a1a0b0 + b2a1b1
         + a2b1b0c0 + a2b1a0c0 + a2b1a0b0 + a2a1b0c0 + a2a1a0c0 + a2a1a0b0 + a2a1b1
         + a2b2

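As a sanity check on the substitution, the fully expanded c3 equation can be compared against the carry obtained by rippling through three full-adder carry equations. The sketch below does this exhaustively over all 128 input combinations; the function names are illustrative assumptions, not names from the text.

```python
from itertools import product


def ripple_c3(a2, b2, a1, b1, a0, b0, c0):
    """Carry into stage 3 computed stage by stage, using co = bc + ac + ab."""
    c = c0
    for a, b in ((a0, b0), (a1, b1), (a2, b2)):
        c = (b & c) | (a & c) | (a & b)
    return c


def lookahead_c3(a2, b2, a1, b1, a0, b0, c0):
    """Fully expanded two-level equation for c3: 15 AND terms ORed together."""
    return (
        b2 & b1 & b0 & c0 | b2 & b1 & a0 & c0 | b2 & b1 & a0 & b0
        | b2 & a1 & b0 & c0 | b2 & a1 & a0 & c0 | b2 & a1 & a0 & b0
        | b2 & a1 & b1
        | a2 & b1 & b0 & c0 | a2 & b1 & a0 & c0 | a2 & b1 & a0 & b0
        | a2 & a1 & b0 & c0 | a2 & a1 & a0 & c0 | a2 & a1 & a0 & b0
        | a2 & a1 & b1
        | a2 & b2
    )


def equations_agree():
    """True if both forms give the same c3 for every input combination."""
    return all(
        ripple_c3(*bits) == lookahead_c3(*bits)
        for bits in product((0, 1), repeat=7)
    )


if __name__ == "__main__":
    print(equations_agree())
```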
We'll omit the equation for c4, in order to save a few pages of paper.
We could create each stage with the needed inputs, and include a lookahead logic component implementing the above equations, as shown in Figure 6.57(c). Notice that there is no rippling of carry bits from stage to stage: each stage computes its own carry-in bit by "looking ahead" to the values of the previous stages.
While the above demonstrates the basic idea of carry-lookahead, the scheme is not very efficient. c1 requires 4 gates, c2 requires 8 gates, and c3 requires 16 gates, with each gate requiring more inputs in each stage. If we count gate inputs, c1 requires 9 gate inputs, c2 requires 27 gate inputs, and c3 requires 71 gate inputs. Building a larger adder, say an 8-bit adder, using this lookahead scheme would thus likely result in excessively large size. While the presented scheme is therefore not practical, it serves to introduce the basic idea of carry-lookahead: by having each stage looking ahead at the inputs to the previous stage and computing for itself whether that stage's carry-in bit should be 1, rather than waiting for the carry-in bit to ripple from previous stages, we get a four-bit adder with a delay of only 4 gate-delays.

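The gate and gate-input counts quoted for the naive scheme can be reproduced by performing the substitution symbolically. The sketch below is one way to do that, under the assumption that every product term costs one AND gate and each carry equation costs one additional OR gate; the function names and the tuple representation of terms are illustrative choices, not part of the text.

```python
def expand_carries(stages):
    """Flatten the carry equations for c1 .. c<stages>.

    Returns one list per carry; each list holds the AND terms of the
    sum-of-products form, and each term is a tuple of literal names
    such as ("b1", "a0", "c0").
    """
    prev = [("c0",)]  # c0 itself is a single literal
    result = []
    for i in range(stages):
        a, b = f"a{i}", f"b{i}"
        # c(i+1) = b*c(i) + a*c(i) + a*b, with c(i) already flattened
        terms = [(b,) + t for t in prev] + [(a,) + t for t in prev] + [(a, b)]
        result.append(terms)
        prev = terms
    return result


def gate_cost(terms):
    """Return (gates, gate_inputs) for one flattened carry equation."""
    gates = len(terms) + 1                                   # ANDs plus one OR
    gate_inputs = sum(len(t) for t in terms) + len(terms)    # AND inputs + OR inputs
    return gates, gate_inputs


if __name__ == "__main__":
    for name, terms in zip(("c1", "c2", "c3"), expand_carries(3)):
        print(name, gate_cost(terms))
```

Calling `expand_carries` with a larger stage count shows how quickly the term count grows, since each stage roughly doubles the number of terms of the previous one.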
An Efficient Carry-Lookahead Scheme. A more efficient carry-lookahead scheme is as follows. Consider again the addition of two 4-bit numbers A and B, shown in Figure 6.58(a). Suppose that we add each column's two operand bits (e.g., a0 + b0) using a half-adder, ignoring the carry-in bit of that column. The resulting half-adder outputs (carry-out and sum) give us some useful information about the carry for the next stage. In particular:

• If the addition of a0 with b0 results in a carry-out of 1, then we know for sure that c1 will be 1, regardless of whether c0 is a 1 or 0. Why? Because considering adding a0+b0+c0, then 1+1+0=10, and 1+1+1=11 (the "+" represents add here, not OR); both cases generate a carry-out of 1. Recall that a half-adder computes its carry-out as ab.
• If the addition of a0 with b0 results in a sum of 1, then c1 will be 1 only if c0 is 1. In particular, considering a0+b0+c0, then 1+0+1=10 and 0+1+1=10. Recall that a half-adder computes its sum as a xor b.

In other words, c1 will be 1 if a0b0=1, OR if a0 xor b0 = 1 AND c0=1. So we get the following equations for the carry bits:

    c1 = a0b0 + (a0 xor b0)c0
    c2 = a1b1 + (a1 xor b1)c1
    c3 = a2b2 + (a2 xor b2)c2
    c4 = a3b3 + (a3 xor b3)c3

Figure 6.58 Adding two binary numbers using a fast carry-lookahead scheme: (a) idea of using propagate and generate terms, (b) computing the propagate and generate terms and providing them to the carry-lookahead logic, (c) using the propagate and generate terms to quickly compute the carries of each column. The correspondence between c1 in figures (b) and (c) is shown by two circles connected by a line; similar correspondences exist for c2 and c3.

Let's include a half-adder in each stage to add the two operand bits for that column, as shown in Figure 6.58(b). Each half-adder outputs a carry-out bit (which is ab) and a sum bit (which is a xor b). Note in the figure that for a given column, we just need to XOR the
half-adder's sum output with the column's carry-in bit to compute that column's sum bit, because the sum bit for a column is just a xor b xor c (see Section 4.3, page 188).

Let's rename the carry-output of the half-adder generate, symbolized as G, so G0 means a0b0, G1 means a1b1, G2 means a2b2, and G3 means a3b3. Let's also rename the sum output of the half-adder as propagate, so P0 means a0 xor b0, P1 means a1 xor b1, P2 means a2 xor b2, and P3 means a3 xor b3. In short:

    Gi = aibi        (generate)
    Pi = ai xor bi   (propagate)

Why those names? When a0b0=1, we know we should generate a 1 for c1. When a0 xor b0=1, we know we should propagate the c0 value as the value of c1, meaning c1 should equal c0.

When we perform carry-lookahead, rather than looking directly at the operand bits of previous stages as we did in the naive lookahead scheme (e.g., stage 1 looking at a0 and b0), let's look instead at the half-adder outputs of the previous stage (e.g., stage 1 looks at G0 and P0). Why? Because the lookahead logic will turn out to be simpler than in the naive lookahead scheme.
We can therefore rewrite our equations for each carry bit as follows:

    c1 = G0 + P0c0
    c2 = G1 + P1c1
    c3 = G2 + P2c2
    c4 = G3 + P3c3

Substituting like we did for the naive scheme, we get the following carry-lookahead equations:

    c1 = G0 + P0c0
    c2 = G1 + P1c1 = G1 + P1(G0 + P0c0)
    c2 = G1 + P1G0 + P1P0c0
    c3 = G2 + P2c2 = G2 + P2(G1 + P1G0 + P1P0c0)
    c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
    c4 = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0

Remember, the P and G symbols represent simple terms: Gi = ai * bi, Pi = ai xor bi.
Figure 6.58(c) shows the circuits implementing the carry-lookahead equations for computing each stage's carry.
Figure 6.59 shows a high-level view of the carry-lookahead adder's design from Figure 6.58(b) and (c). The four blocks on the top are responsible for generating the sum, propagate, and generate bits. Let's call those "SPG blocks," and you'll recall from Figure 6.58(b) that each SPG block consists of just three gates. The 4-bit carry-lookahead logic uses the propagate and generate bits to precompute the carry bits for high-order stages, using only two levels of gates.

Figure 6.59 High-level view of 4-bit carry-lookahead adder.

The complete 4-bit carry-lookahead adder requires only 26 gates (4*3=12 gates for the nonlookahead logic, and then 2+3+4+5=14 gates for the lookahead logic).
The delay of this 4-bit adder is only 4 gate-delays: 1 gate through the half-adder, 2 gates through the carry-lookahead logic, and 1 to finally generate the sum bit (we can see those gates in Figure 6.58(b) and (c)). An 8-bit adder built using the same carry-lookahead scheme would still have a delay of only 4 gate-delays, but would require 68 gates (8*3=24 gates for the nonlookahead logic, and 2+3+4+5+6+7+8+9=44 gates for the lookahead logic). A 16-bit carry-lookahead adder would still have a delay of 4 gate-delays, but would require 200 gates (16*3=48 gates for the nonlookahead logic, and 2+3+4+5+6+7+8+9+10+11+12+13+14+15+16+17=152 gates for the lookahead logic). A 32-bit carry-lookahead adder would have a delay of 4 gate-delays, but would require 656 gates (32*3=96 gates for the nonlookahead logic, and 152+18+19+20+21+22+23+24+25+26+27+28+29+30+31+32+33=560 gates for the lookahead logic).

Unfortunately, there are problems that make the size and delay of large carry-lookahead adders less attractive. First, the above analysis counts gates, but not gate inputs, whereas gate inputs better tell us the number of transistors needed. Notice in Figure 6.58 that the gates keep getting wider in higher stages. For example, stage 3 has a 4-input OR gate and 4-input AND gate, while stage 4 has a 5-input OR gate and 5-input AND gate, as highlighted in Figure 6.60. Stage 32 of a 32-bit carry-lookahead adder would have 33-input OR and AND gates, along with other large gates. Since gates with more inputs need more transistors, then in terms of transistors, the carry-lookahead design is actually quite large. Furthermore, those huge gates would not have the same delay as a 2-input AND or OR gate. Such huge gates are typically built using a tree of smaller gates, so we would have more gate-delays.

Figure 6.60 Gate size problem.

Hierarchical Carry-Lookahead Adders. Building a 4-bit or even 8-bit carry-lookahead adder using the previous section's method may be reasonable with respect to gate sizes, but larger carry-lookahead adders begin to involve gates with too many inputs.
We can build a larger adder by connecting smaller adders in a carry-ripple manner. For example, suppose we have 4-bit carry-lookahead adders available. We can build a 16-bit adder by connecting four 4-bit carry-lookahead adders, as shown in Figure 6.61. If each 4-bit carry-lookahead adder had a 4-gate-delay, then the total delay of the 16-bit adder would be 4+4+4+4 = 16 gate-delays. Compare this to the delay of a 16-bit carry-ripple adder: if each full-adder has a two gate-delay, then a 16-bit carry-ripple adder would have a delay of 16*2 = 32 gate-delays. Thus, the 16-bit adder built from four carry-lookahead adders connected in a carry-ripple manner is twice as fast as the 16-bit carry-ripple adder.
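The 16-bit adder built from four rippled 4-bit carry-lookahead blocks can be modeled behaviorally in Python. The sketch below is an illustration rather than a gate-level design: `cla4` evaluates the generate/propagate lookahead equations directly, `add16_rippled_cla` chains four such blocks through their carry-outs, and `cla_gate_count` reproduces the gate-count arithmetic used in the text (3 gates per SPG block plus 2+3+...+(n+1) lookahead gates). All three names are assumptions made for this example.

```python
def cla4(a, b, c0):
    """4-bit carry-lookahead add of integers a, b (0..15) with carry-in c0.

    Returns (sum, carry_out). No carry is computed from another carry.
    """
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [x & y for x, y in zip(A, B)]  # generate:  Gi = ai bi
    P = [x ^ y for x, y in zip(A, B)]  # propagate: Pi = ai xor bi
    c1 = G[0] | P[0] & c0
    c2 = G[1] | P[1] & G[0] | P[1] & P[0] & c0
    c3 = G[2] | P[2] & G[1] | P[2] & P[1] & G[0] | P[2] & P[1] & P[0] & c0
    c4 = (G[3] | P[3] & G[2] | P[3] & P[2] & G[1] | P[3] & P[2] & P[1] & G[0]
          | P[3] & P[2] & P[1] & P[0] & c0)
    carries = [c0, c1, c2, c3]
    # Each sum bit is the half-adder sum (Pi) XORed with that column's carry-in.
    s = sum((P[i] ^ carries[i]) << i for i in range(4))
    return s, c4


def add16_rippled_cla(a, b):
    """16-bit add using four cla4 blocks whose carries ripple between blocks."""
    carry, total = 0, 0
    for k in range(4):
        s, carry = cla4((a >> 4 * k) & 0xF, (b >> 4 * k) & 0xF, carry)
        total |= s << (4 * k)
    return total, carry


def cla_gate_count(n):
    """Gate count per the text: 3 gates per SPG block plus lookahead gates."""
    return 3 * n + sum(range(2, n + 2))


if __name__ == "__main__":
    print(add16_rippled_cla(12345, 54321))
    print([cla_gate_count(n) for n in (4, 8, 16, 32)])
```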
(Actually, careful observation of Figure 6.58 reveals that the carry-out of a four-bit carry-lookahead adder would be generated in three gate-delays rather than four, resulting in even faster operation of the 16-bit adder built from four carry-lookahead adders. But for simplicity, let's not look inside the components for such detailed timing analysis.) Sixteen gate-delays is good, but can we do better? Can we avoid having to wait for the carries to ripple from the lower-order 4-bit adders to the higher-order adders?

Figure 6.61 16-bit adder implemented using four 4-bit adders connected in a carry-ripple manner.

In fact, avoiding the rippling is exactly what we did in developing the 4-bit carry-lookahead adder itself. Thus, we can repeat the same carry-lookahead process outside of the 4-bit adders, to quickly provide the carry-in value to the higher-order 4-bit adders. To accomplish this, we add another 4-bit carry-lookahead logic block outside the four 4-bit adders, as shown in Figure 6.62. The carry-lookahead logic block has exactly the same internal design as was shown in Figure 6.58(c). Notice that the lookahead logic needs propagate (P) and generate (G) signals from each adder block. Previously, each input block output the P and G signals just by ANDing and XORing the block's ai and bi input bits. However, in Figure 6.62, each block is a 4-bit carry-lookahead adder. We therefore must modify the internal design of a 4-bit carry-lookahead adder to output P and G signals, so that those adders can be used with a second level carry-lookahead generator.

Figure 6.62 16-bit adder implemented using four CLA 4-bit adders and a second level of lookahead.

Thus, let's extend the 4-bit carry-lookahead logic of Figure 6.58 to output P and G signals. The equations for the P and G outputs of a 4-bit carry-lookahead adder can be written as follows:

    P = P3P2P1P0
    G = G3 + P3G2 + P3P2G1 + P3P2P1G0

To understand these equations, recall that propagate meant that the output carry for a column should equal the input carry of the column (hence propagating the carry through the column). For that to be the case for the carry in and carry out of a 4-bit adder, the first stage of the 4-bit adder must propagate its input carry to its output carry, the second stage must propagate its input carry to its output carry, and so on for the third and fourth stages. In other words, each internal propagate signal must be 1, hence the equation P = P3P2P1P0.
Likewise, recall that generate meant that the output carry of a column should be a 1 (hence generating a carry of 1). Generate should thus be 1 if the first stage generates a carry (G0) and all the higher stages propagate the carry through (P3P2P1), yielding the term P3P2P1G0. Generate should also be a 1 if the second stage generates a carry and all higher stages propagate the carry through, yielding the term P3P2G1. Likewise for the third stage, whose term is P3G2. Finally, generate should be 1 if the fourth stage generates a carry, represented as G3. ORing all four of these terms yields the equation G = G3 + P3G2 + P3P2G1 + P3P2P1G0.
We could then revise the 4-bit carry-lookahead logic of Figure 6.58(c) to include two additional gates in stage four, one AND gate to compute P = P3P2P1P0, and one OR gate to compute G = G3 + P3G2 + P3P2G1 + P3P2P1G0 (note that stage four already has AND gates for each term, so we need only add an OR gate to OR the terms). For conciseness, we omit a figure showing these two new gates.

Figure 6.63 View of multilevel carry-lookahead, showing tree structure, which enables fast addition with reasonable numbers and sizes of gates. Each level adds only two gate-delays.

We can introduce additional levels of 4-bit carry-lookahead generators to create even larger adders. Figure 6.63 illustrates a high-level view of a 32-bit adder built using 32 SPG blocks and three levels of 4-bit carry-lookahead logic. Notice that the carry-lookahead logic forms a tree. Total delay for the 32-bit adder is only two gate-delays for the
SPG blocks, and two gate-delays for each level of carry-lookahead (CLA) logic, for a total of 2+2+2+2 = 8 gate-delays. (Actually, closer examination of gate delays within each component would demonstrate that total delay of the 32-bit adder is actually less than 8 gate-delays.) Carry-lookahead adders built from multiple levels of carry-lookahead logic are known as multilevel or hierarchical carry-lookahead adders.
In summary, the carry-lookahead approach results in faster additions of large binary numbers (more than 8 bits or so) than a carry-ripple approach, at the expense of more gates. However, by clever hierarchical design, the carry-lookahead gate size is kept reasonable.

Carry-Select Adders
Another way to build a larger adder from smaller adders is known as carry-select. Consider building an 8-bit adder from 4-bit adders. A carry-select approach uses two 4-bit adders for the high-order four bits, which we've labeled HI4_1 and HI4_0 in Figure 6.64. HI4_1 assumes the carry-in will be 1, while HI4_0 assumes the carry-in will be 0, so both generate stable output at the same time that LO4 generates stable output, after 4 gate-delays (assuming the 4-bit adder has a delay of four gate-delays). We use the LO4 carry-out value to select among HI4_1 or HI4_0, using a 5-bit-wide 2x1 multiplexer, hence the term carry-select adder.

Figure 6.64 8-bit carry-select adder implemented using three 4-bit adders.

select, using a multiplexer, the appropriate adder for Nibble1. Nibble1's selected carry-out would then select the appropriate adder for Nibble2. Nibble2's selected carry-out would finally select the appropriate adder for Nibble3. The delay of such an adder would be 6 gate-delays for Nibble1, plus 2 gate-delays for Nibble2's selection, plus 2 gate-delays for Nibble3's selection, for a total of only 10 gate-delays. Cascading four 4-bit adders would have required 4+4+4+4 = 16 gate-delays. The speedup of the carry-select version over the cascaded version would be 16 / 10 = 1.6. Total size would be 7*26 = 182 gates, plus the gates for the three 5-bit 2x1 muxes. That's pretty efficient size for pretty good speed.
Figure 6.65 illustrates the tradeoffs among adder designs. Carry-ripple is the smallest but has the longest delay. Carry-lookahead is the fastest but has the largest size. Carry-select is a compromise between the two, involving some lookahead and some rippling. The choice of the most appropriate adder for a design depends on the speed and size constraints of the design.

Figure 6.65 Adder tradeoffs.

Smaller Multiplier: Sequential (Shift-and-Add) Style
An array-style multiplier can be fast but may require a lot of gates for wide-bitwidth multipliers, like 32-bit multipliers. In this section, we create a sequential multiplier instead of a combinational one, in order to reduce the size of the multiplier. The idea of a sequential multiplier is to keep a running sum of the partial products and compute each partial product one at a time, rather than computing all the partial products at once and summing them.
Figure 6.66 provides an example of 4-bit multiplication. Assume we start with a running sum of 0000. Each step corresponds to a bit in the multiplier (the second number). In step 1, we compute the partial product as 0110, which we add to the running sum of 0000 to obtain 00110. In step 2, we compute the partial product as 0110, which we add to the proper columns of the running sum of 00110 to obtain 010010. In step 3, we compute the partial product as 0000, which we add to the proper columns of the running sum. Likewise for step 4. The final running sum is 00010010, which is the correct product of 0110 and 0011.

      Step 1     Step 2     Step 3      Step 4
       0110       0110       0110        0110
     x 0011     x 0011     x 0011      x 0011
       0000      00110     010010     0010010    (running sum)
     + 0110     + 0110     + 0000      + 0000    (partial product)

The delay of a 2x1 mux is 2 gate-delays, so the total delay of the 8-bit adder is 4 gate-delays for HI4_1 and HI4_0 to generate correct sum bits (LO4 executes in parallel), plus 2 gate-delays for the mux (whose select line is ready after only 3 gate-delays), for a total of 6 gate-delays. Compared with a carry-lookahead implementation using two 4-bit adders, we've reduced the total delay from 7 gate-delays down to 6 gate-delays. The cost is one extra 4-bit adder. If a 4-bit carry-lookahead adder requires 26 gates, then the design with two 4-bit adders requires 2*26=52 gates, while the carry-select adder requires
3*26= 78 gate,. plu~ the ga tes for the 5-bit 2x I mu x. o0 t 10 0 10 0 10 00 1 00 1 0 0 0 0 1 0 0 1 0 (new runlllng sum)
We could " 1,0 bu ild a 16-bit carry-se lect adder u,in g 4-bi l ca rry- Iookahead adders.
by u,ing multiple levels of multiplexing. Each nibble ( four bits) wou ld have IWO 4-bit Figure 6.60 Multiplication done by generuling n p:u-tial produ'l for ~3l-h bil in the multipher (the
number on the boIlOI11 ). nccul1lulatlllg the part ud products III a rulllllllg ~um.
adder~. one a"umi ng a carry- in of l. the other a~, umin g O. ibb leO':. carry-out would
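The running-sum multiplication of Figure 6.66 can be sketched in software. The following Python function is only an illustrative model, not the hardware design; the name `sequential_multiply` and its `bits` parameter are assumptions introduced here. It forms one partial product per multiplier bit and adds it to the proper columns of the running sum.

```python
def sequential_multiply(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Multiply two unsigned `bits`-wide numbers one partial product at a time.

    Hypothetical software model of the running-sum scheme in Figure 6.66.
    """
    running_sum = 0
    for step in range(bits):
        # Current multiplier bit, starting with the rightmost bit.
        bit = (multiplier >> step) & 1
        # The partial product is a copy of the multiplicand if the bit is 1, else 0.
        partial_product = multiplicand if bit else 0
        # Shifting left by `step` adds the partial product to the proper columns.
        running_sum += partial_product << step
    return running_sum


print(format(sequential_multiply(0b0110, 0b0011), "08b"))  # prints 00010010
```

Shifting each partial product left is equivalent to the view taken in hardware, where the running sum moves right by one bit after each step.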
Computing each partial product is easy: we just AND the current multiplier bit with every bit in the multiplicand to obtain the partial product. So if the current multiplier bit is 1, the AND creates a copy of the multiplicand as the partial product. If the current multiplier bit is 0, the AND creates 0 as the partial product.

We need to determine how to add each partial product to the proper columns of the running sum. Notice that the partial product should be moved to the left by one bit relative to the running sum after each step. We can look at this another way: the running sum should be moved to the right by one bit after each step. Look at the multiplication illustration in Figure 6.66, until you "see" how the running sum moves one bit to the right relative to each partial product.

Therefore, we can compute the running sum by initializing an 8-bit register to 0. In each step we add the partial product for the current multiplier bit to the leftmost four bits of the running sum, and we shift the running sum one bit to the right, shifting in a 0 into the leftmost bit. So the running sum register should have a clear function, a parallel load function, and a shift right function. A circuit showing the running sum register and an adder to add each partial product to that register is shown in Figure 6.67.

Figure 6.67 Internal design of a 4-bit by 4-bit sequential multiplier (control signals shown: mdld, mrld, mr3, mr2, mr1, mr0, rsload, rsclear, rsshr, start; output: product).

The last thing we need to figure out is how to control the circuit so that the circuit does the right thing during each step. That's exactly what controllers are for. Figure 6.68 shows an FSM describing the desired controller behavior of our sequential multiplier.

Figure 6.68 FSM describing the controller for the 4-bit multiplier.

In terms of performance, the sequential multiplier requires two cycles per bit, plus 1 cycle for initialization. So a 4-bit multiplier would require 9 cycles, while a 32-bit multiplier would require 65 cycles. The longest register-to-register delay is from a register through the adder to a register. If we built the adder as a carry-lookahead adder having only 4 gate-delays, then the total delay for a 4-bit multiplication would be 9 cycles * 4 gate-delays/cycle = 36 gate-delays. The total delay for a 32-bit multiplication would be 65 cycles * 4 gate-delays/cycle = 260 gate-delays. While slow, notice that this multiplier's size is quite good, requiring only an adder, a few registers, and a state register and some control logic for the controller. For a 32-bit multiplier, the size would be far smaller than an array-style multiplier requiring 31 adders.

The multiplier's design can be further improved by using a shifter in the datapath, but we omit details of that improved design.

6.5 RTL DESIGN OPTIMIZATIONS AND TRADEOFFS

In Chapter 5, we described the process of RTL design. While creating the datapath during RTL design, there are several optimizations and tradeoffs that we might make to create smaller or faster designs.

Pipelining

Figure 6.69 Applying pipelining to dishwashing: washing and drying dishes can be done concurrently.

Microprocessors continue to become smaller, faster, and less expensive, and thus designers use microprocessors whenever possible to implement desired digital system behavior. But designers continue to choose to build their own digital circuits to implement desired behavior of many digital systems, with the main reason for that choice being speed. One method of obtaining speed from digital circuits is through the use of pipelining. Pipelining means to break a large task down into a sequence of stages
such that data moves through the stages like parts move through a factory assembly line. Each stage produces output used by the next stage, and all stages operate concurrently, resulting in better performance than if data had to be fully processed by the task before new data could begin being processed. An example of pipelining is washing dishes with a friend, with you washing and your friend drying (Figure 6.69). You (the first stage) pick up a dish (dish 1) and wash it, then hand it to your friend (the second stage). You pick up the next dish (dish 2) and wash it concurrently with your friend drying dish 1. You then wash dish 3 while your friend dries dish 2. Dishwashing this way is nearly twice as fast as when washing and drying aren't done concurrently.

Consider a system with data inputs W, X, Y, and Z, that should repeatedly output the sum S = W + X + Y + Z. We could implement the system using an adder tree as shown in Figure 6.70(a). The fastest clock for this design must not be faster than the longest path between any pair of registers, known as the critical path. There are four possible paths from any register output to any register input, and each path goes through two adders. If each adder has a delay of 2 ns, then each path is 2+2 = 4 ns long. Thus, the critical path is 4 ns, and so the fastest clock has a period of at least 4 ns, meaning a frequency of no more than 1 / 4 ns = 250 MHz.

Figure 6.70 Nonpipelined versus pipelined datapaths: (a) four register-to-register paths of 4 ns each, so longest path is 4 ns, meaning minimum clock period is 4 ns, or 1 / 4 ns = 250 MHz, (b) six register-to-register paths of 2 ns each, so longest path is 2 ns, meaning minimum clock period of 2 ns, or 1 / 2 ns = 500 MHz.

Figure 6.70(b) shows a pipelined version of this design. We merely add registers between the first and second row of adders. Since the purpose of these registers is solely related to pipelining, they are known as pipeline registers, though their internal design is the same as any other register. The computations between pipeline registers are known as stages. By inserting those registers and thus creating a two-stage pipeline, we've reduced the critical path from 4 ns down to only 2 ns, and so the fastest clock has a period of at least 2 ns, meaning a frequency of no more than 1 / 2 ns = 500 MHz. In other words, just by inserting those pipeline registers, we've doubled the performance of our design!

Latency versus Throughput

The term "performance" needs to be refined due to the pipelining concept. Notice in Figure 6.70(b) that the first result S(0) doesn't appear until after two cycles, whereas the design in Figure 6.70(a) outputs the first result after only one cycle. That's because data must now pass through an extra row of registers. The term latency refers to the delay for new input data to result in new output data. Latency is one kind of performance. Both designs in the figure have a latency of 4 ns. Figure 6.70(b) also shows that a new value for S appears every 2 ns, versus every 4 ns for the design in Figure 6.70(a). The term throughput refers to the rate at which new data can be input to the system, and similarly, the rate at which new outputs appear from the system. The throughput of the design in Figure 6.70(a) is 1 sample every 4 ns, while the throughput of the design in Figure 6.70(b) is 1 sample every 2 ns. Thus, we can more precisely describe the performance improvement of our pipelined design as having doubled the throughput of the design.

EXAMPLE 6.20 Pipelined FIR filter

Recall the 100-tap FIR filter from Example 5.8. We estimated that implementation on a microprocessor would require 4000 ns, while a custom digital circuit implementation would require only 34 ns. That custom digital circuit utilized an adder tree, with seven levels of adders: 50 additions, then 25, then 13 (roughly), then 7, then 4, then 2, then 1. The total delay was 20 ns (for the multiplier) plus seven adder-delays (7*2 ns = 14 ns), for a total delay of 34 ns. We can further improve the throughput of that filter using pipelining. Noticing that the multipliers' delay of 20 ns is roughly equal to the adder tree delay of 14 ns, we might decide to insert pipeline registers (100 of them, since there are 100 multipliers feeding into the 50 adders at the top of the adder tree) between the multipliers and adder tree, resulting in dividing the computation task into two stages, as shown in Figure 6.71. Those pipeline registers shorten the critical path from 34 ns down to only 20 ns, meaning we can clock the circuit faster and hence improve the throughput.

Figure 6.71 Pipelined FIR filter.

The throughput speedup of the unpipelined design compared to the microprocessor implementation was 4000/34, or about 118, while the throughput speedup of the pipelined design is 4000/20 = 200. Quite a nice additional speedup for just inserting some registers!

Although we could pipeline the adder tree also, that would not gain us higher throughput, since the multiplier stage would still represent the critical path. We can't clock a pipelined system any faster than the longest stage, since otherwise that stage would fail to load correct values into the stage's output pipeline registers.

The latency of the nonpipelined design is one cycle of 34 ns, or 34 ns total. The latency of the pipelined design is two cycles of 20 ns, or 40 ns total. Thus, we see that pipelining improves the throughput at the expense of latency.
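The latency and throughput arithmetic for the pipelined examples can be checked with a small calculation. This is a minimal sketch, assuming every pipeline stage takes exactly one clock cycle and the clock period equals the delay of the longest stage; the function name `pipeline_metrics` is hypothetical.

```python
def pipeline_metrics(stage_delays_ns):
    """Return (clock period in ns, latency in ns) for a list of stage delays.

    Assumes one clock cycle per stage and a clock limited by the longest stage.
    """
    clock_period = max(stage_delays_ns)            # critical path sets the clock
    latency = clock_period * len(stage_delays_ns)  # data crosses every stage
    return clock_period, latency


# Adder tree of Figure 6.70: one 4 ns stage versus two 2 ns stages.
print(pipeline_metrics([4]))     # nonpipelined adder tree
print(pipeline_metrics([2, 2]))  # pipelined adder tree

# 100-tap FIR filter: one 34 ns stage versus a 20 ns multiplier stage
# followed by a 14 ns adder-tree stage.
for stages in ([34], [20, 14]):
    period, latency = pipeline_metrics(stages)
    speedup = 4000 / period  # throughput speedup over the 4000 ns software estimate
    print(stages, period, latency, round(speedup, 1))
```

One new sample is accepted per clock period, so throughput is the reciprocal of the returned period, while latency grows with the number of stages.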
Concurrency

A key reason for designing a custom digital circuit, rather than writing software that executes on a microprocessor, is to achieve improved performance. A common method of achieving performance in a custom digital circuit is through concurrency. Concurrency in digital design means to divide a task into several independent subparts, and then to execute those subparts simultaneously. As an analogy, if we have a stack of 200 dishes to wash, we might divide the stack into 10 substacks of 20 dishes each, and then give 10 of our neighbors each a substack. Those neighbors simultaneously go home, wash and dry their respective substacks, and return to us their completed dishes. We would get a ten times speedup in dishwashing (ignoring the time to divide the stack and move substacks from home to home).

We have used concurrency in several examples already. For example, the FIR filter datapath of Figure 5.38 had three multipliers executing concurrently.

Let's use concurrency to create a faster version of an earlier example.

EXAMPLE 6.21 Sum-of-absolute-difference component with concurrency

In Example 5.7, we designed a custom circuit for a sum-of-absolute-difference (SAD) component, and we estimated that component to be three times faster than a software-on-microprocessor solution. We can do even better. Notice that comparing one pair of corresponding pixels of two frames is independent of comparing another pair. Thus, such comparisons are an ideal candidate for concurrency.

We first need to be able to read the pixels concurrently. We can do this by redesigning the block memories A and B, which earlier were designed as 256-byte memories. Instead, let's design them as 16-word memories, where each word is 16 bytes (the total is still 256 bytes). Thus, each memory read corresponds to reading an entire pixel row of a 16x16 block. We can then concurrently determine the differences among all 16 pairs of pixels from A and B. Figure 6.72 shows a new datapath and controller FSM for a more concurrent SAD component.

Figure 6.72 SAD datapath using concurrency for speed, along with the controller FSM.

The datapath consists of 16 subtractors operating concurrently on the 16 pixels of a row, followed by 16 absolute value components. The 16 resulting differences feed into an adder tree, whose result gets added with the present sum for writing back into the sum register. The datapath compares its counter i with 16, since there are 16 rows in a block, and so we must compute the difference between rows 16 times. The controlling FSM loops 16 times to accumulate the differences of each row, and then loads the final result into the register sad_reg, which connects to the SAD component's output sad.

In Example 5.7, we estimated that a software solution would require about six cycles per pixel pair comparison. Since there are 256 pixels in a 16x16 block, the software would require 256*6 = 1536 cycles to compare a pair of blocks. Our SAD circuit with concurrency instead requires only 1 cycle to compare each row of 16 pixels, which the circuit must do 16 times for a block, resulting in only 16*1 = 16 cycles. Thus, the SAD circuit's speedup over software is 1536 / 16 = 96. In other words, the relatively simple SAD circuit using concurrency runs nearly 100 times faster than a software solution. That sort of speedup eventually translates to better quality digitized video from whatever video appliance we are designing.

Pipelining and concurrency can be combined to achieve even greater performance improvements.

Component Allocation

When the same operation is used in two different states of a high-level state machine, we can choose to either instantiate two functional units, one for each state, or one functional unit, which will be shared among the two states. For example, Figure 6.73 shows a portion of a state machine with two states, A and B, that each have a multiplication operation (t1 = t2 * t3 in state A, and t4 = t5 * t6 in state B). We can choose to use two distinct multipliers as shown in Figure 6.73(a) (we assume the t variables represent registers). The figure also shows the control signals that would be set in each state of the FSM controlling that datapath, with the t1 register being loaded in the first state (t1ld=1), and the t4 register being loaded in the second state (t4ld=1).

Figure 6.73 Two different component allocations: (a) two multipliers, (b) one multiplier, (c) the one-multiplier allocation represents a tradeoff of smaller size for slightly more delay.

However, because a state machine can't be in two states at the same time, we know that the FSM will perform only one multiplication at a time, so we can share one multiplier among the two states. Because fast multipliers are big, such sharing could save

a 101 of gates. A datapath wi th on ly one multipli er appears in Figure 6.73(b). In each state gates. wi th no performance loss . . .
that bindin o not onl y - an opuml za uon. as shown in Figure 6.74(c). ote that
of the state machine. the comrolle r FSM wo uld confi gure the multipl exer select lines to map to which co maps operators to components. but a lso chooses wh ich operand to
pass the appropriate operands through to the multiplie r. as well as loadin g the appropriate mponent IIlput · If we had d t3
Figure 6.74(b). the n MULA w • mappe to the left o perand of MULA in
destination register as before. So in the first state A. the FSM wo uld set the select line for M " ould have reqUi red two muxes rathe r th an just one
the le ft Illultiplexer to 0 to le t t2 pass through (s 1=0) and wo ul d set the select line for applllg a give n set of operations to a . I .
operator billdillg. Automated to I . II partlcu ar component allocation is known as
the righ t multi plexer to 0 to le t t3 pass through ( S r=O). in addition to selling tll d=l to given component allocation. 0 s typlca y ex plore hundreds of different bindings for a
load the result of the mutliplicati on in to the t 1 reg iste r. Likew ise. the FSM in state B sets
the muxes to pass t5 and t6 . and loads t4. Of course the tas ks of co .
demo If we ail ocate onl y one mponent allocallon and ope rator binding are interdepen-
Fi gure 6.73(c) illustrates that the one-multiplier design wo uld have smaller size. at
component. If we allocate tw component, then all operators mu t be bound to that
allocate many components. t~e~o:~~~:~t% then we have some choices in binding. If we
the expense perhaps of slightl y more de lay due to the multiplexers.
TI,e remlS A component library mi ght consist of numerous different functional units that could
will perform allocation and binding s· I any mlore bllldlllg chOIces. Thus. some tools
"opu(I!or" {/rid pOlenrially imple ment a desired operation- for a multipli cation. there may be several Imu taneous y, or the tools will iterate a th
"oper-arion"
mu lti pl ie r components: MULl might be very fas t but large. while MUL2 might be very two tasks. TOg,ether. component allocati on and operator binding are sometimes r~~:~d t e
refers 10 belial'ior;
small but slow, and MUL3 ma y be somewhere in between. There may also be fast but as reSOurce S IQrll1g . 0
like addition or
lIIuttiplicatioll, large adders. small but slow adders. and several choices in between. Furthermore. some
TI,e/enll
"compolllfli/"
components might support multiple operations. like an adder/s ubtractor component. or an Operator Scheduling
(aka 'jimcriollal ALU. C hoosing a panicu lar set of fu nction al units to implement a set of operations is
unit") refers 10 kn own as compoll ellt allocatioll . Automated RTL design tools conside r dozens or hun- Given a high-level state machine, we may in troduce add lliona
'. I sta tes to enable u to
hard\\'ore, like (III create a sma II er datapath. For example consider the h'gh-I I '..
adder or (I
dreds of possible component a llocations to find the best ones that represent a good 675( ) Th h' • I eve state machine III Fl oure
multiplier. tradeoff among size and performance. . a . e state mac lIle has three states. with State B having rwo multipl ications. Since

Operator Binding '"0---0--0


Gi ve n an allocation of components. we still have to choose whic h operations to map to (some tl = t2 • t3 (some (some 11 = t2 • 13 t4 = 15 • 16 (some
operat ions) 14 = 15 • t6 operations) operations) ~ _______ _
which components. For example. Fi gure 6.74 shows three multiplication operations. one operations)
in state A, one in state B, and one in state C. Figure 6.74(a) shows one possible compo-
nen t bind ing to two multipli ers. resulting in two multiplexers. Figure 6.74(b) shows an
3-state schedule
alternative binding to two multipliers. whi ch results in onl y o ne multipl exer, since the
same operand (t3) is fed to the same multiplier MULA in two differe nt states and thus

that mUltip lier's input doesn' t require a mu x. Thus. the second binding results in fewer •
4-slate schedule

(a) delay
(e)

t4 = t5 \\t6 t~,~,~~/ t3 Figure 6.75 Schedu ling: (a) initial 3-state schedute requires two muttipliers. (h) new 4-smte
schedule requires onl y one multiplier. (c) new schedule trades off size for delay (extra late).

t5 ~ t8~\ (f 6 f 3 • Binding 1
• Binding 2
those
. two multi plications occur
. in the .same state. and we know th a t eac h state Will
slllg ie clock cycle. then we wIll need two mUltipliers (at least) in the datapath to u
. be a

the two Slltlultaneous


. . multiplications III state B. But what if we 0 n Iy h ave enouoh ppon Oates
sl - T ii f s r
for o ne muluplier? In that case .. we can reschedule the operations so that there i " at 7no t

~~~,
onl y one multiplication needed III anyo ne state. as in Figure 6.75(b). Thus. when we allo-
cate compone nts. we need ? nl y allocate one multiplie r as hown. a nd as was also done in
Figure 6.73(b). The result IS a smaller but slower destgn. a illustrated in Fioure 6 -
(a)
That schedulin g example a sumed that the computation of t4 uld no t be moved
'" t . Sl3te
).
A or state C. perhaps because .those states already used a multiplier. r perha beenu
Figure 6.74 Two different operator bindings: (a) binding I uses two muxe., (b) binding 2 uses only
t 5 and t 6 were not ready yet III state A. and the new re ult in t4 \\ as no ed0 d PIII ' tate C.
se
one mu x. (c) binding 2 represents an optimization compared 10 binding I .
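The link between a schedule and the number of multipliers it requires can be illustrated with a short sketch. The dictionaries below are a hypothetical encoding of the two schedules of Figure 6.75 (the unspecified "some operations" in states A and C are left as empty placeholders), and `multipliers_needed` is a name introduced here for illustration only.

```python
def multipliers_needed(schedule):
    """Return how many multipliers a schedule requires.

    Each state executes in a single clock cycle, so the datapath needs as
    many multipliers as the busiest state has multiplications.
    """
    return max(
        sum(operation.count("*") for operation in operations)
        for operations in schedule.values()
    )


# Hypothetical encodings of Figure 6.75(a) and 6.75(b); states A and C are
# placeholders because their operations are not specified.
three_state = {"A": [], "B": ["t1 = t2 * t3", "t4 = t5 * t6"], "C": []}
four_state = {"A": [], "B": ["t1 = t2 * t3"], "B2": ["t4 = t5 * t6"], "C": []}

print(multipliers_needed(three_state), len(three_state))  # prints 2 3
print(multipliers_needed(four_state), len(four_state))    # prints 1 4
```

The 4-state schedule halves the multiplier count at the cost of one extra state, which is the size-for-delay tradeoff described above.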
Converting a computation from occurring concurrently in one state to occurring across several states is known as serializing a computation.

Of course, the inverse rescheduling is also possible. Suppose we started with the high-level state machine of Figure 6.75(b). If we have plenty of gates available and want to improve our design's performance, we might reschedule the operations such that we merge the operations of states B2 and B into the one state B, as in Figure 6.75(a). The result is a faster but larger design, requiring two multipliers instead of one.

Generally, introducing or merging states, and assigning operations to those states, is a task known as operator scheduling.

You may have noticed that operator scheduling is interdependent with component allocation, which you may recall was interdependent with operator binding. Thus, the tasks of scheduling, allocation, and binding are all interdependent. Modern tools may combine the tasks somewhat, and/or may iterate among the tasks several times, in search of good designs.

EXAMPLE 6.22 Smaller FIR filter using operator scheduling

Consider the 3-tap FIR filter of Example 5.8. That design had no controller, meaning the high-level state machine actually had just one state containing all the datapath actions, as shown in Figure 6.76(a). We could reduce the size of the datapath by scheduling the operations across several states, such that at most one multiplication and one addition occurs per state, as shown in Figure 6.76(b). The first state loads the x registers with samples (note that the ordering of those actions next to the state doesn't matter, since all the actions occur simultaneously). That state also clears a new register named sum, which we had to introduce to keep track of the intermediate tap sums to be computed in the later states. The second state computes the first tap of the filter result, the next state computes the second tap, and the next state computes the third tap. The last state outputs the result, and then the state machine returns to the first state again.

(a) Inputs: X (N bits). Outputs: Y (N bits). Local registers: xt0, xt1, xt2 (N bits).
    S1: xt0 = X; xt1 = xt0; xt2 = xt1; Y = xt0*c0 + xt1*c1 + xt2*c2

(b) Inputs: X (N bits). Outputs: Y (N bits). Local registers: xt0, xt1, xt2, sum (N bits).
    S1: sum = 0; xt0 = X; xt1 = xt0; xt2 = xt1
    S2: sum = sum + xt0*c0
    S3: sum = sum + xt1*c1
    S4: sum = sum + xt2*c2
    S5: Y = sum

Figure 6.76 High-level state machine for 3-tap FIR filter: (a) original one-state machine, (b) five-state machine with at most one add and one multiply per state. We ignore the writing of the constant registers (c0, c1, c2) for simplicity in the example.

A datapath for this state machine is shown in Figure 6.77. The datapath requires only one multiplier and one adder because there is at most one multiplication and one addition in any given state in Figure 6.76. The particular configuration of the multiplier, adder, and register in Figure 6.77 is extremely common in signal processing circuits, and is generally known as a multiply-accumulate (MAC) unit. The datapath multiplexes the inputs to the MAC unit.

Figure 6.77 Serial FIR filter datapath. The components in the dashed box comprise what is known as a multiply-accumulate (MAC) component.

One further difference between this datapath and the concurrent datapath of Example 5.8 is that this datapath has load lines on the x registers and on yreg. The concurrent design loaded those registers every clock cycle, but the serial design only loads those registers during particular states; other states compute intermediate results.

We estimated the performance of the concurrent design of Example 5.8 assuming 1 ns per gate, 2 ns per adder, and 20 ns per multiplier. The design had a critical path of 20 ns for the multiplier and then 4 ns for two adders in series, for a total of 24 ns. That was also the time between new results being taken in at the inputs and generated at the output: 24 ns. Using the more precise performance measures of latency and throughput defined in Section 6.5, the concurrent design has a latency of 24 ns (delay from input to output), and a throughput of 1 sample every 24 ns. The serial design has a critical path equal to the delay through a mux, multiplier, and adder. Assuming two gate-delays for the mux, we obtain a delay of 2 ns + 20 ns + 2 ns, or 24 ns. The latency from input to output is five states, meaning 5 * 24 ns = 120 ns. The throughput is 1 sample every 120 ns. Thus, the concurrent 3-tap FIR filter has 120/24 = 5 times faster latency, as well as 5 times faster throughput, compared to the serial FIR filter. Recall from Example 6.20 that a pipelined concurrent FIR filter has even faster throughput.

The performance difference between serial and concurrent becomes even more pronounced if we look at an FIR filter with more taps. We estimated the latency of a concurrent 100-tap FIR filter in Section 5.3, after Example 5.8, to be 34 ns (the delay is greater than that of the concurrent 3-tap filter because the 100-tap filter needs an adder tree). The serial design would still have a 24 ns critical path, but would require 102 states (1 to initialize, 100 to compute the taps, and 1 to output), for a latency of 102 * 24 ns = 2448 ns. Thus, the latency speedup of the concurrent design would be 2448 / 34 = 72.

We should also consider the size difference between the serial and concurrent designs. Let's assume for illustrative purposes that an adder requires approximately 500 gates and a multiplier
requires 5000 gates. The serial design's one multiplier and one adder would thus require only 5500 gates. For a 3-tap FIR filter, the concurrent design's 3 multipliers and 2 adders would require 5000*3 + 500*2 = 16,000 gates. For a 100-tap FIR filter, the concurrent design's 100 multipliers alone would require 100*5000 = 500,000 gates, roughly 100 times more gates than the serial design.

Intuitively, these numbers make sense. A concurrent design for 100 taps uses about 100 times more gates (due to using 100 multipliers instead of just 1) compared to a serial design, yet achieves about 100 times better performance (due to computing 100 multiplications concurrently rather than computing one multiplication at a time).

Depending on our performance needs and size constraints, we might consider designs in between the two extremes of serial and concurrent, such as a design with two multipliers, which would be roughly twice as big and twice as fast as the serial design, or ten multipliers, which would be roughly ten times as big and ten times as fast as the serial design. Figure 6.78 illustrates tradeoffs among serial and concurrent designs for an FIR filter.

Figure 6.78 FIR design tradeoffs (designs plotted against delay: concurrent FIR, compromises, serial FIR).

The above sections should have made it quite clear that RTL design presents an enormous range of possible solutions to the designer. A single high-level state machine can be implemented as any of a huge variety of possible implementations that differ tremendously in their sizes and performance.

Moore versus Mealy High-Level State Machines

In the same way that we can create either a Moore or a Mealy FSM (see Section 6.3), we can create Moore or Mealy high-level state machines. In the case of a high-level state machine, a Moore type can only have actions associated with the states, while a Mealy type can have actions associated with the transitions. As was the case with FSMs, a Mealy type may result in fewer states. Mixing Moore and Mealy types is commonly done in high-level state machines.

6.6 MORE ON OPTIMIZATIONS AND TRADEOFFS

Serial versus Concurrent Computation

Having seen in this chapter numerous examples of tradeoff techniques at various levels of design, we can detect a common theme underlying some of those tradeoffs. The common theme is that of serial versus concurrent computation. Serial means to perform tasks one

... compared to concurrent operations in a single state. Example 6.21 and Example 6.22 both illustrated serial versus concurrent computation tradeoffs, for an SAD circuit and an FIR circuit, respectively.

Trading off between serial and concurrent computation is a fundamental concept spanning all levels of digital design. As a general rule, a concurrent design is faster but larger, while a serial design is smaller but slower.

Typically, numerous design options exist that span the range in between fully serial and fully concurrent designs.

Optimizations and Tradeoffs at Higher versus Lower Levels of Design

As a general rule, the optimizations and tradeoffs made at the higher levels of design may have a much greater impact on design criteria than the optimizations and tradeoffs made at lower levels of design. For example, imagine wanting to drive to a city on the other side of the country in as little time as possible. We could reduce time by reducing the number of stops we make to eat, meaning we carry our own food in the car. We could also reduce time by reducing stops for fuel, meaning we use a car with the longest driving capacity per gas tank. Some people (not you, of course) might even consider driving faster than the legal speed limit. But those are not the first things you typically think of when trying to reduce driving time for a cross-country trip. The most important decision is which route to take. One route might be 4000 miles long, while another route may be only 2000 miles. The high-level decision of which route to take has far more impact than all the lower-level decisions mentioned previously. Those lower-level decisions are only really useful to us if we made the right high-level decision, and then if we still want to reduce the time further.

Figure 6.79 Higher- versus lower-level decisions: (a) higher-level decisions (denoted by the larger two circles) focus the design into a region, while lower-level decisions tune within the region, (b) spotlight analogy.

In digital design, optimization/tradeoff decisions at the higher levels (e.g., RTL decisions) may have a much larger impact than decisions at the lower levels (e.g., datapath component decisions or multilevel logic decisions). For example, the RTL decision to build a serial or concurrent FIR filter (Example 6.22) will have a far greater impact on circuit size and performance than the datapath-component-level decision to use a carry-ripple or carry-lookahead adder, or the combinational-logic-
at a lime. COllcurrellt means 10 perform lasks aI Ihe . ame lime.
level decision to u e two-level or mullilevel logic. Those lower-level decision mereh rune
For example. in combinalional logic design , we can reduce logic size by faclOring
the size and performance of Ihe higher-level decision. Figure 6.79(a) illustrates thi co-n cpt.
out teml~. By factoring OUI lerms. we are essenti all y seriali zi ng the compulalion. by com-
An analogy might be a spotlight shining down on land. illu trated in Figure 6. 9(b>-
pUling the factored out terms firsl. and then combining Ihe resul ts with other terms. In
movin a Ihe spotlightlefl or right at high altitude (higher-level decisions) has a larger impact
datapalh componenl design, we can improve an adder's speed by compuling carries can-
on which land region (possible solutions) is illuminated than d I wer-allitude mo\ emems
currenlly. rather than wai ling for the carry to ripp le !.eri all y. In RTL design, we can
(lower-level decisions).
schedule operation, across ,evera l . Iates . scri aliling Ihe opcralions 10 reduce size
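To make the serial-versus-concurrent FIR tradeoff concrete, the following Python sketch estimates how many steps an output takes when the 100 multiplications of the FIR filter are shared among a given number of multipliers. It is only a rough model under stated assumptions: size is counted in multipliers, each multiplier performs one multiplication per step, and adders, registers, and control logic are ignored. The name `fir_design_point` is hypothetical.

```python
import math

TAPS = 100  # multiplications per output, as in the 100-multiplier FIR design


def fir_design_point(num_multipliers, taps=TAPS):
    """Estimate size and speed of an FIR design sharing `num_multipliers` multipliers.

    Assumptions (illustrative only): one multiplication per multiplier per
    step; adders, registers, and control logic are not counted.
    """
    if not 1 <= num_multipliers <= taps:
        raise ValueError("num_multipliers must be between 1 and taps")
    steps = math.ceil(taps / num_multipliers)  # steps needed per output
    return {
        "multipliers": num_multipliers,  # relative size versus the serial design
        "steps_per_output": steps,
        "speedup": taps / steps,         # relative to the serial design
    }


if __name__ == "__main__":
    # From fully serial (1 multiplier) to fully concurrent (100 multipliers).
    for m in (1, 2, 10, 100):
        print(fir_design_point(m))
```

Under this model, the in-between designs land where the text places them: two multipliers give about twice the size and twice the speed of the serial design, and ten multipliers give about ten times each.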


Algorithm Selection

When attempting to implement a system as a digital circuit, perhaps the highest-level design
decision, having therefore the most significant impact on design criteria like size,
performance, power, etc., is the selection of an algorithm. An algorithm is a set of steps
that solve a problem. The same problem can be solved by different algorithms. Algorithms for
the same problem, when implemented as a digital circuit, may result in tremendously different
performance and/or size. Some algorithms may simply be better than others (optimization
without much tradeoff), while other algorithms may represent tradeoffs between performance,
size, and other criteria. Selecting an algorithm for a digital design problem is perhaps the
highest level of design, and can have the biggest impact on design criteria. For example,
earlier examples showed various implementations of an FIR filter. But there are many other
algorithms for filtering very different from the algorithm used in FIR. Some algorithms may
provide higher-quality filtering at the expense of more required computation, others may
provide lower quality but need less computation.

We illustrate algorithm selection using an example.

EXAMPLE 6.23 Data compression using different table lookup algorithms

We wish to compress data being sent over a long-distance computer network in order to achieve
faster communication by sending fewer bits. One method for such compression is to use short
codes for frequently appearing data values. For example, suppose each data item is 32 bits
long. We might analyze the data we expect to send and find the 256 most frequently appearing
data values. We could then assign a unique 8-bit code to each of those 256 values. When
sending data over the network, we first send a bit indicating whether we are about to send an
encoded 8-bit data item or a raw 32-bit data item; if the first bit is 1, that might mean
encoded, and a 0 might mean raw. If all the data items being sent happen to be among the top
256 most frequent ones, then we'd be sending 9 bits per data item (1 bit indicating whether
encoded, plus 8 bits of encoded data) rather than 32 bits per data item, a compression of
nearly 4x, which could translate to about 4 times faster communication.

We might design the encoder using a 256-word memory that stores the 256 most frequent values
in sorted order, from smallest to largest in binary. The code would then be the address of
that word in the memory. Figure 6.80 shows sample contents of such a memory, in hexadecimal.
The contents vary depending on the communicating applications we are considering.

    0: 0x00000000
    1: 0x00000001
    2: 0x0000000F
    3: 0x000000FF
    ...
   96: 0x00000F0A
    ...
  128: 0x0000FFAA
    ...
  255: 0xFFFF0000
  (256x32 memory)

Figure 6.80 Searching a sorted memory for the key 0x00000F0A: linear search requires 97
reads/compares, binary search only 3.

One algorithm for searching a list of values in a memory is known as linear search. Starting
at address 0, we compare each memory word's contents with the data item we are looking for
(known as the key), incrementing the address and repeating until we find a match, at which
point we treat the address at which there was a match as the encoded value. If we get to
address 255 and don't find a match, we will transmit the raw data. The linear search algorithm
is a slow way to search a sorted list in memory. The algorithm requires 256 reads and compares
for data items that aren't in the memory, which may translate to 256 cycles. For data items
that are in the memory, we would require on average 128 reads.

A faster algorithm for searching a list of items in a memory is known as binary search. We
first sort the list and then store the list in the memory (we need only sort once). To look up
an item, we start in the middle of the memory, meaning address 128, and compare that word's
contents with the key. If the contents are greater than the key, then we know that the key, if
it exists in the memory, must be somewhere between 0 and 127. So we go to the middle of that
range, meaning address 64, and again compare. If the value there is greater than the key, we
search 0 to 63; if less, we search 65 to 127. So after each comparison, we decrease the
remaining possible range of addresses in which the key lies by one half. Halving 256
repeatedly can only be done 8 times: 256, 128, 64, 32, 16, 8, 4, 2, 1. In other words, after
at most 8 comparisons, we've either seen the key, or shrunk the range to 1, meaning the key
can't be found in the memory. Binary search is 256/8 = 32 times faster than linear search when
the key does not exist in the memory, and roughly that much faster when the key exists in the
memory too. Yet binary search only requires a slightly smarter controller.

We see that the choice of the right algorithm makes a big difference in performance for this
example, much bigger a difference than determined by, say, the speed of the comparator being
used.

Power Optimization

Power is becoming an important design criterion, both in high-end computing as well as in
embedded computing. The unit of power is watts, which represents the energy per second (i.e.,
Joules per second). In high-end computing, like desktop PCs, servers, or video-game consoles,
the chips inside a computer consume a lot of power, causing the chips to become very hot. For
example, a typical chip inside a PC may consume 60 watts; think about touching a 60 watt light
bulb (but don't actually touch one) to understand how hot that is. Designing low-power chips
reduces the need for expensive chip cooling methods beyond simple fans in high-end computing,
and also reduces the electricity costs, which can be quite significant for companies operating
large numbers of computers.

In embedded computing, even simple cooling methods like fans are often not available. For
example, your cell phone does not have a fan (if it did, people might find their tie or scarf
getting stuck in that fan). Portable embedded devices might have chips that run at only 1 watt
or less.

Furthermore, portable devices typically get their energy from batteries, and thus low power
chips are necessary to extend battery life, especially considering the fact that batteries are
not improving fast enough to keep pace with increasing power consumption. By some measures,
energy demand per chip is doubling about every three years (going along with Moore's Law).
Figure 6.81 plots such energy demands compared to battery energy densities improving at their
present rate of only about 8% per year. The increasing gap shown translates to shorter battery
lifetimes for a device like a cell phone, or translates to bigger batteries.

Figure 6.81 Battery energy density is improving slower than the increasing energy demands of
digital chips.

The most popular IC technology today uses CMOS transistors, and the biggest contributor to
power consumption in CMOS is the switching of values from 0 to 1. The reason for this is that
wires aren't perfect, having capacitance (we don't put a capacitor there on purpose; it is
simply a result of the fact that wires aren't perfect conductors of electricity). Switching
the wire from 0 to 1 requires charging that capacitor. Switching from 1 back to 0 causes that
charge to be discharged to ground. That switching results in power being consumed. This power
is known as dynamic power, since this power comes from the changing of signals (dynamic means
changing). Dynamic power consumption of a CMOS wire is proportional to the size of the
capacitance (C) of the wire, multiplied by the voltage (V) squared, multiplied by the
frequency at which the wire switches (f), namely:

$$P = k \cdot C \cdot V^{2} \cdot f$$
(equation for CMOS dynamic power consumption)

where k is some constant. To compute the dynamic power of a circuit, we would add up the power
computed by the above equation for every wire.

Looking at the above equation, one can clearly see that lowering the voltage will cause the
greatest reduction in dynamic power, because of the voltage having a quadratic (squared)
contribution to dynamic power. Low-level circuit designers seek to reduce power by creating
transistors that operate at the lowest voltage possible, to reduce the V term, and that have
the smallest wire capacitance possible, to reduce the C term. Digital designers can therefore
choose to utilize gates that operate with a lower voltage. Unfortunately, lower voltage gates
have a longer delay than higher voltage gates, resulting in a tradeoff between power and
performance.

Another way to reduce the dynamic power consumed by a circuit is to reduce the circuit's clock
frequency, which obviously reduces the f term for all the clock wires in the circuit, as well
as for the many other wires that change on each clock edge (like register wires and the logic
connected to those registers' outputs). But again, reducing the clock frequency slows
performance, resulting in a tradeoff between power and performance.

The chief technical officer at a major chip design company told me in 2004 that, for their
company, "Power is enemy number one." The reason is that they had scaled their voltage down
nearly as low as possible, yet are putting more transistors on each IC every year due to the
shrinking of transistor sizes, meaning more wires switching. And capacitance isn't decreasing
at the same rate as transistor sizes. The result is that an IC consumes more power as we put
more transistors on the IC, which can result in problems due to too much heat and due to fast
battery energy consumption.

Clock Gating (Advanced Technique)

Assuming the C and V terms have been reduced to the extent possible using transistor-level
design techniques, power can be reduced further by reducing f, the frequency at which wires
switch. One method for reducing such power is known as clock gating. Clock gating is the
disabling of the clock signal in regions of the chip that we know are not computing anything
at a given time. Clock gating saves power because a significant percentage of the wires
switching in a chip are the wires that distribute the clock to all the registers and
flip-flops; perhaps 20%-30% of the power consumption is due to the clock signal switching
throughout the chip. Clock gating reduces f without slowing the clock frequency itself.

In clock gating, the clock signal is disabled by ANDing the clock signal with an enable signal
that is set in the state machine. Recall that a register with parallel load internally reloads
the same value from the register's flip-flops back into the flip-flops on a rising clock edge.
Preventing the clock edge from appearing keeps the same values in the flip-flops, yielding the
same net result: the register's contents don't change.

Clock gating is not something that digital designers typically do themselves. Rather, modern
synthesis tools may allow us to specify clock enable and disable using special commands in
each state. These tools must use extreme caution, because adding a gate on a clock signal
delays the clock signal, resulting in clock signals in different parts of the circuit being
slightly different from one another, an effect known as clock skew. The tools must perform
careful timing analysis to ensure that the clock skew does not change overall circuit
behavior. Furthermore, putting gates on a clock signal can reduce the sharpness of the clock
edges, and so must be done carefully, sometimes using special gates. Nevertheless, the
technique is widely used by low-power tools in practice.

We demonstrate clock gating with an example.

EXAMPLE 6.24 Serial FIR filter with clock gating to reduce power

We designed a serial FIR filter in Example 6.22. A five-state state machine controlled the
datapath. The state machine loaded the three xt registers only in the first state, state S1,
and loaded the yreg register only in the last state, state S5. Yet, the design routed the
clock signal to all four registers utilizing four wires, labeled n1-n4 in Figure 6.82(a).
Notice from the timing diagram at the top of the figure that n1-n4 change identically as the
clock signal changes, and remember that every such change consumes dynamic power.

Figure 6.82 Clock gating: (a) the clock signal switches every cycle on all the heavily bolded
wires, but the xt registers are only loaded in state S1 and the yreg in state S5, so most of
the clock switching is wasted; (b) gating the clock reduces the switching on the clock wires.
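The clock activity in Example 6.24 can be tallied with a small behavioral sketch in Python (a count of pulses, not a circuit description). It assumes that wires n1-n3 feed the three xt registers and n4 feeds yreg, and that a gated wire pulses only in the state where its register loads (S1 for the xt registers, S5 for yreg). The wire-to-register assignment and the names `LOAD_STATE` and `clock_pulses` are hypothetical.

```python
STATES = ["S1", "S2", "S3", "S4", "S5"]

# Assumed mapping from each clock wire to the state in which its register loads.
LOAD_STATE = {"n1": "S1", "n2": "S1", "n3": "S1", "n4": "S5"}


def clock_pulses(gated):
    """Count clock pulses per wire over one pass through all five states."""
    pulses = {wire: 0 for wire in LOAD_STATE}
    for state in STATES:
        for wire, load_state in LOAD_STATE.items():
            # A gated clock is the clock ANDed with an enable signal, so the
            # wire pulses only while the enable is 1. Ungated wires always pulse.
            enable = (state == load_state) if gated else True
            if enable:
                pulses[wire] += 1
    return pulses


if __name__ == "__main__":
    print(sum(clock_pulses(gated=False).values()))  # pulses without gating
    print(sum(clock_pulses(gated=True).values()))   # pulses with gating
```

Under these assumptions the four wires carry 20 clock pulses per five-state pass without gating and 4 with gating. The sketch covers only n1-n4; the rest of the clock distribution is not modeled.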

Figure 6.82(b) shows a design using clock gating. The controller gates the clock to the xt
registers by setting s1 to 0 in all states but S1. Likewise, the controller gates the clock to
the yreg register by setting s5 to 0 in all states but S5. Notice the significant decrease in
signal switching on the clock's wires n1-n4, shown at the bottom of Figure 6.82.

Low-power gates on noncritical paths

Not all gates are equally fast. Engineers that build gates from transistors can make a gate
faster by increasing the size of the gate's transistors, or by operating the gate at a higher
voltage, or by any of several other means. Thus, one two-input AND gate might have a 1 ns
delay, while another two-input AND gate might have a 2 ns delay. The latter AND may consume
less power, due to its smaller size or lower voltage.

If we want to reduce the power consumed by a circuit, we can build the entire circuit using
low-power gates to achieve low power at the expense of slower performance, as illustrated in
Figure 6.83. Alternatively, we can put low-power gates only on the noncritical paths, such
that we lengthen those paths to have delays no longer than the critical path, as shown in the
following example.

Figure 6.83 Using low-power gates.

EXAMPLE 6.25 Reducing noncritical path power with multilevel logic

In Example 6.12, we reduced the size of a noncritical path by using multilevel logic. In this
example, we instead reduce the power consumed by the noncritical path by using low-power
gates. Assume that normal gates have a delay of 1 ns and consume 1 nanowatt of power, and that
low-power gates have a delay of 2 ns and consume 0.5 nanowatts of power.

The left side of Figure 6.84 shows the same circuit from Example 6.12, having a critical path
of 3 gate-delays. Assume that all the gates are normal gates, meaning the critical path delay
is 3 ns, and the total power consumption is 5 nanowatts.

Figure 6.84 Using low-power gates on noncritical paths. Numbers inside a gate represent the
gate's delay in nanoseconds, and the gate's power consumption in nanowatts.

The bottom two AND gates lie on two noncritical paths having delays of only 2 ns. We can thus
replace those AND gates by low-power AND gates. The result is that the two paths' delays
lengthen to 3 ns, so become equal to the critical path delay, but not longer. The result is
also that the total power becomes only 4 nanowatts instead of 5 nanowatts (a 20% reduction).

6.7 PRODUCT PROFILE: DIGITAL VIDEO PLAYER/RECORDER

Digital Video Overview

In the 1990s, the digitization of video became practical due to faster, smaller, and
lower-power digital circuits. Previously, video was largely captured, stored, and played using
analog methods. Digitized video works by sampling an analog video signal and transforming the
samples to digital values. Such digitization is similar to the audio digitization example from
Figure 1.1, but with some additional work.

A video is actually a series of quickly displayed still pictures, known as frames, as shown in
Figure 6.85(a). One second of video might consist of about 30 frames; the human eyes and brain
see such a rapid sequence of frames as a smooth, continuous video.

A digital display may be divided into several hundred thousand tiny "picture elements," or
pixels. A typical size might be about 720 across and 480 down. For each frame, a digitized
sample captures several values for each pixel, like the intensity of the red, blue, and green
components of the light at that pixel, converting analog measurements of those intensities
into digital numbers. The result is the representation of a digitized frame as a (large)
series of 0s and 1s, and the representation of a digitized video as a large series of
digitized frames. Digitized video can be transmitted, stored, replayed, and copied with much
higher quality than analog video. Furthermore, digitized video can be compressed, resulting
possibly in higher quality video than analog video transmitted or stored using the same
medium.

Figure 6.85 Video: (a) is a series of pictures, or frames, with much interframe redundancy,
(b) can be constructed from I (intra) frames and P (predicted) frames, shown with relative bit
encoding sizes.

DVD: One Form of Digital Video Storage

Digital video discs (also known as digital versatile discs), or DVDs, store video in a digital
format. First sold in 1997, DVDs replaced the analog video technology known as VHS tape. DVD
players appear in home entertainment centers, personal computers, automobiles (especially
family-oriented vehicles), and even as stand-alone portable units. In 2001, consumer
electronic companies introduced the first DVD recorder to market, allowing individuals to
record television shows to special recordable DVDs. The popularity of DVDs compared to the
previously popular analog-based VHS technology stems from several advantages, including better
quality video, no deterioration in video quality over time, and the ability to jump directly
to particular parts in a video without having to sequentially forward or rewind.

DVDs store large amounts of data on a thin reflective layer of metal. Although the metal layer
within a DVD looks flat from our perspective, there are actually billions of tiny pits on the
metal layer that store the data. These pits, or lack of pits (called lands), store the binary
data on the DVD. Figure 6.86 shows how a DVD player reads the information off a DVD. Using a
very precise laser, the laser's light is focused onto the metal layer within the DVD. The
metal layer reflects the light onto an optical sensor that can detect if the light is
reflected off of a pit or a land. By detecting the different regions, the optical sensor
creates a stream of binary values as it reads the DVD.

Figure 6.86 How a DVD player reads a DVD. The DVD player's optical pickup element shines a
laser on the surface of the DVD. The DVD reflects the laser back to an optical sensor, and the
optical sensor uses the intensity of the reflected laser to output the sequence of 0s and 1s
stored on the DVD. A video decoder circuit converts the binary data to a sequence of frames
that humans interpret as a moving picture.

The DVD's binary data is organized into a series of tracks that spiral outward from the center
of the DVD. As the DVD player is reading the data, the laser and optical sensor must slowly
move outward from the center of the DVD to the outer edge. If a DVD is dual-layered, the data
on the disk's second layer is stored in a spiral that moves from the disk's outer to inner
edge. The motivation for the second layer's reverse spiral is to prevent the laser and optical
sensor from needing to reposition itself to the center of the disk after focusing on the
second layer during a layer change. (You may have noticed a DVD pause momentarily at a certain
point in a movie during a layer change.)

A single-layer single-sided DVD can store 4.7 gigabytes of data (meaning 37.6 gigabits), but
that amount is not enough for a movie unless the data is compressed. Consider a video with a
resolution of 720 pixels by 480 pixels, using 24 bits of information per pixel, and displayed
at 30 frames per second. One frame would require 720*480*24 = 8,294,400 bits, or about
8 Mbits. One second of video, or 30 frames, would require 30*8,294,400 = 248,832,000 bits, or
about 250 Mbits. A 100-minute movie would thus require about 250 Mbits/sec * 100 min *
60 sec/min = 1500 Gbits. But a DVD can only hold 37.6 Gbits. To store a movie, a DVD must
store the video in a compressed format.

A DVD is only one of many different digital video storage media. Digitized video may be stored
on any storage media capable of storing 0s and 1s in some form, such as on tape (used in many
digital video cameras), on a flash memory (used in digital cameras and cell phones with video
recording capability), on a CD, or on a computer hard drive. All such media are typically
still quite limited and thus require compression methods.

MPEG-2 Video Encoding: Sending Frame Differences Using I-, P-, and B-Frames

MPEG-2 video compression was defined and standardized by the Motion Picture Expert Group in
1994 (as an improvement over the 1992 MPEG-1 standard), and is used in DVDs, digital
television, and numerous other digital video devices. MPEG-2 compression ratios range from
30:1 to 100:1, or more. The compression ratio is determined by dividing the number of bits of
the digitized video before compression, by the number of bits after compression. So if a
digitized video requires 400 gigabytes uncompressed but only 4 gigabytes compressed, the
compression ratio would be 400/4 = 100:1. Note that packing 1500 Gbits of a movie into
37.6 Gbits would require a compression ratio of 1500 Gbits/37.6 Gbits = 40:1.

The key observation leading to MPEG-2's compression method is that typically very little
difference exists between two successive frames in a video; in other words, video typically
has much interframe redundancy. For example, a frame may consist of a person standing in front
of a mountain, as in Figure 6.85(a). The next frame (which represents perhaps 1/30th of a
second later) may be almost identical to the previous frame, except that the person's mouth
has opened slightly. The next frame may still be almost identical, with the person's mouth
opened slightly more. And so on.

Therefore, MPEG-2 does not merely encode each frame as a distinct picture. Instead, to take
advantage of the interframe redundancy, MPEG-2 may choose to encode each frame as one of the
following:

• An I-frame, or Intracoded frame, is a complete picture.
• A P-frame, or Predicted frame, is a frame that merely describes the difference between the
  current frame and the previous frame. Thus, to derive the picture for this frame, one must
  combine the P-frame with the previous frame.

For example, Figure 6.85(b) shows P-frames that contain only the differences from the previous
frame. A P-frame will obviously require fewer bits than an I-frame. Example frame sizes might
be about 8 Mbits for an I-frame, but only 2 Mbits for a P-frame. Thus, instead of representing
30 frames as 30 complete pictures (30 I-frames), a compression method might represent those
frames using the following sequence of frames: I P P P P P P P P P P P P P P I P P P P P P P P
P P P P P P. The compression ratio in this example would thus be 8 Mbits * 30 / (2 * 8 Mbits +
28 * 2 Mbits) = 240 / 72 = 3.3:1. Obviously, a picture created by combining predicted frames
with a previous frame won't be a perfect representation of the original picture, especially if
there is a lot of motion in the video. MPEG-2 thus trades off some quality for compression.

To achieve even further reduction, MPEG-2 uses a third frame type:

• A B-frame, or Bidirectional predicted frame, is a frame that can store differences from
  previous and future frames.

B-frames can thus be even smaller than P-frames. An example B-frame size might be just 1 Mbit.
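The storage and compression arithmetic above can be checked with a short Python sketch. It uses the example figures from the text (720x480 pixels, 24 bits per pixel, 30 frames per second, and frame sizes of 8, 2, and 1 Mbits for I-, P-, and B-frames). The function names are hypothetical, and the movie-size line keeps the text's rounding of about 250 Mbits per second.

```python
FRAME_MBITS = {"I": 8, "P": 2, "B": 1}  # example frame sizes from the text


def raw_bits_per_frame(width=720, height=480, bits_per_pixel=24):
    """Bits needed for one uncompressed frame."""
    return width * height * bits_per_pixel


def compression_ratio(sequence, sizes=FRAME_MBITS):
    """Ratio of the all-I-frame size to the encoded size of `sequence`.

    `sequence` is a string of frame types such as "IPPP"; spaces are ignored.
    """
    frames = sequence.replace(" ", "")
    uncompressed = sizes["I"] * len(frames)     # every frame as a full picture
    compressed = sum(sizes[f] for f in frames)  # actual mix of frame types
    return uncompressed / compressed


if __name__ == "__main__":
    print(raw_bits_per_frame())        # bits in one raw frame
    print(30 * raw_bits_per_frame())   # bits in one second of raw video
    print(250e6 * 100 * 60 / 1e9)      # Gbits for a 100-minute movie at ~250 Mbits/sec
    ip_sequence = ("I" + "P" * 14) * 2  # the 30-frame I/P sequence above
    print(round(compression_ratio(ip_sequence), 2))
```

The same `compression_ratio` helper accepts sequences containing B-frames, since it only looks up each frame type's size.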
EXAMPLE 6.26 Computing compression ratios involving I-, P- and B-frames

Assume a 30-frame MPEG-2 sequence has the following frame sequence: I B B P B B P B B P B B P
B B I B B P B B P B B P B B P B B. Assume average frame sizes of 8 Mbits for I-frames, 2 Mbits
for P-frames, and 1 Mbit for B-frames. Compute the compression ratio.

The compression ratio in this example would be 8 Mbits * 30 / (2 * 8 Mbits + 8 * 2 Mbits +
20 * 1 Mbits) = 240 / 52 = 4.6:1.

The example sequence of frames is in fact fairly typical for MPEG-2 video, with I-frames
occurring about every 12-15 frames.

MPEG-2 video encoders may seek to create about 30 frames per second. With hundreds of
thousands of pixels per frame that must be compared with another frame, MPEG-2 encoding
requires a large amount of computation to determine which frames should be I, P, and B, and
what should be the values for the P- and B-frames. Furthermore, much of that computation will
consist of the same computation performed between corresponding regions of two frames. Thus,
many MPEG-2 encoders utilize custom digital circuits to parallelize those computations at the
expense of more hardware size. For example, Example 6.21 built a sum-of-absolute-differences
circuit using more parallelism than in Example 5.9, at the expense of a larger circuit size.
Such a circuit would be useful in a video encoder needing to quickly determine whether a frame
should be encoded as a P- or B-frame, or instead should be encoded as an I-frame. Additional
circuits might compute the actual values of P- and B-frames.

Likewise, an MPEG-2 video decoder might use circuits to quickly recompose I-, P- and B-frames
back into full picture frames, although decoding MPEG-2 video is easier than encoding because
the actual determination of P- and B-frame contents is only done during encoding; decoding
merely needs to combine P- and B-frames with their surrounding frames.

Transforming to the Frequency Domain for Further Compression
DCT: Discrete Cosine Transform

We saw in the previous section that sending a frame (P or B) that is just the difference from
a previous or future frame can result in some compression. However, the compression ratios
achieved were only about 4:1. Recall earlier that a DVD needs perhaps a 40:1 compression ratio
to store a full length movie. Thus, further compression is needed.

MPEG-2 therefore further compresses each I-, P- and B-frame individually. The compression
method involves applying what is known as a discrete cosine transform to 8x8 blocks of pixel
values within each frame. The discrete cosine transform is also used in the well-known JPEG
standard for compressing still images, like those in a digital camera. The discrete cosine
transform, or DCT, transforms information from the spatial domain to the frequency domain.
(The DCT is similar to another popular technique known as the Fast Fourier Transform, or FFT,
also used for translating to the frequency domain.)

Translating to the frequency domain is a powerful concept, which is widely used in digital
signal processing. To understand this concept, consider wanting to digitally store the analog
signal shown in Figure 6.87, using the fewest bits possible. The signal is a 1 Hz cosine wave
with an amplitude of 10. To store the signal digitally, we could sample the signal at frequent
intervals, perhaps every millisecond, and record the measured signal value as a binary number,
perhaps 8 bits wide. One second would thus result in 1000 * 8 = 8000 bits. On the other hand,
we could just store the fact that the signal is a cosine wave with a frequency of 1 Hz and an
amplitude of 10. If we store each of those numbers as an 8-bit value, then we only need to
store 8 + 8 = 16 bits. Sixteen bits is far less than 8000 bits.

Figure 6.87 Digitizing signals by translating to the frequency domain.

Of course, not all signals that we want to digitize are simple cosine waves. But (and this is
the key idea underlying frequency domain representation) we can approximate any original
signal as a sum of cosine waves of different frequencies and amplitudes. If we break the
original signal into small regions, we obtain an even better approximation. For example, we
might approximate one region as the sum of a 1 Hz cosine wave of amplitude 5 plus a 2 Hz
cosine wave of amplitude 3. We might approximate another region as the sum of 50 different
cosine waves of different frequencies and amplitudes. The smaller the region we consider, and
the more different cosine wave frequencies we consider, the more accurate will be our
approximation to the real signal.

Rather than storing the actual frequencies along with the amplitudes of the cosine waves, we
could instead decide only to consider using particular frequencies, such as: 1 Hz, 2 Hz, 4 Hz,
8 Hz, 16 Hz, and so on. Then, we can simply send the amplitudes of those particular cosine
waves: (5, 3, 0, 0, 0, ...). Let's refer to these amplitudes as coefficients.

The DCT in MPEG-2 converts an input 8x8 block, whose values represent pixel intensities, to an
8x8 block representing the coefficients of predetermined "frequencies." In the video domain,
each frequency represents a different block pattern, with low frequency being an almost
constant pattern and high frequency being a changing pattern (like a checkerboard). The DCT
determines a set of coefficients such that adding the predetermined patterns together with
each pattern multiplied by its coefficient yields one resulting pattern very similar to the
original input block.

The equation for a two-dimensional DCT applied to an 8x8 block of numbers is:

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D[x,y]\, \cos\left(\frac{\pi (2x+1) u}{16}\right) \cos\left(\frac{\pi (2y+1) v}{16}\right)$$

$$C(h) = \begin{cases} \dfrac{1}{\sqrt{2}}, & h = 0 \\ 1, & \text{otherwise} \end{cases}$$

The input is an 8x8 block, D[x, y]. The output is another 8x8 block, with F(u,v) computing the
coefficient at row u, column v for the output block.

An MPEG-2 encoder may utilize custom digital circuits for fast DCT computation. Notice that
computing each coefficient requires evaluating the rightmost term (let's call that term the
inner term) 64 times, and that must be done for each of the 64 coefficients,
meaning 64*64 = 4096 evaluations of the term. And that inner term itself requires several multiplications. Furthermore, the DCT operates on 8x8 blocks, but in a 720x480 I-frame there will be 5400 such blocks. Thus, the DCT for one I-frame could require 5400*4096 = 22 million computations of the inner term. And that encoding may have to occur at 30 frames per second. You can begin to see why an MPEG-2 encoder may need to use custom digital circuits to compute the DCT quickly, using extensive parallelism and pipelining to obtain the necessary performance.

The DCT computation can be sped up further by precomputing the cosine terms of the inner term. Notice that the DCT computes two cosines based on the input values of u and x and the input values of v and y. However, because the DCT operates on 8x8 blocks, the variables u, v, x, and y only range in value from 0 to 7. Therefore, we can precompute the 64 possible cosine values needed for the DCT computation and store those values in an 8x8 table, which may be programmed into a ROM. We can then rewrite the DCT transform as follows:

$$F(u,v) = \frac{1}{4}\,C(u)\,C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D[x,y]\,\cos[x,u]\,\cos[y,v]$$

Using a ROM to store the precomputed cosine values speeds up the computation of the inner term of the DCT.

Quantization

Translating to the frequency domain using the DCT does not directly perform compression; we merely converted an input 8x8 block to an output 8x8 block. That output 8x8 block represents amplitudes of particular cosine wave frequencies. We can achieve compression by rounding those amplitudes, such that we use fewer bits to represent the amplitudes. For example, suppose we use 8 bits to represent the amplitude, meaning we can represent amplitudes ranging from 0 to 255. Suppose we only represent even amplitudes, meaning 2, 4, ..., 254. In that case, we can drop the lowest order bit in the representation of the amplitude, resulting in only 7 bits. The decoder would merely append a 0 to the 7-bit number to obtain an 8-bit number again. For example, the 8-bit number 00001111 would be compressed to the 7-bit number 0000111 with an implicit 0 in the eighth bit. The decoder would expand that 7-bit number back to the 8-bit number 00001110; notice that the decoded number is slightly different than the original, being 14 rather than the original 15 (an example of why MPEG-2 compression loses some image quality). We could take this rounding concept further, only representing amplitudes that are multiples of 4 (thus dropping the two lowest order bits, yielding a 6-bit representation), or are multiples of 8 (dropping the three lowest order bits, yielding a 5-bit representation). 00001111 might be represented as 00001 with three implicit 0s, thus decoded back to 00001000. The decoded number of 8 is different from the original number 15 due to the rounding.

The rounding described above, achieved by dropping low order bits to achieve compression, is known as quantization. Notice the tradeoff: more rounding yields more compression, at the expense of accuracy. Fortunately, humans don't notice such rounding in the high frequency components of the picture; our vision just isn't that precise. We also don't notice minor differences in the high-frequency components of sound; our hearing isn't that precise. Think of a very high-pitched sound, so high it could perhaps break glass. You probably couldn't tell the difference between two such high-pitched sounds of slightly different frequencies; they are both just high. Likewise, our eyes can't detect slight rounding of color values in a highly complex scene. So MPEG-2 applies quantization more aggressively on the DCT output block's high-frequency coefficients than on the low-frequency coefficients.

After quantization, the 64 values in the 8x8 block are treated as a list of 64 numbers. Those 64 numbers are then run-length encoded. Run-length encoding is a compression method that reduces consecutive occurrences of zeros by a number indicating the number of consecutive zeros rather than representing those zeros themselves. For example, consider wanting to represent the following 5 numbers: 0, 0, 0, 0, 24. If each value is 6 bits, the 5 numbers require 5*6 = 30 bits. On the other hand, we could just send a pair of numbers, the first indicating the number of leading zeros, the second indicating the nonzero number. So 0, 0, 0, 0, 24 would be encoded as (4, 24): 4 leading zeros, followed by the number 24. If each value is 6 bits, the run-length encoded version requires only 2*6 = 12 bits. Any sequence of numbers could similarly be replaced by a sequence of number pairs, each pair replacing a sequence of zeros and a number. The sequence 0, 0, 0, 0, 24, 0, 0, 8, 0, 0, 0, 0, 0, 0, 16 could thus be replaced by three pairs: (4, 24), (2, 8), (6, 16), reducing the number of bits from 15*6 = 90 down to 6*6 = 36 bits. Note that the number of zeros at the beginning of the sequence or in between nonzero numbers may be zero, and the last number may be zero. For example, the sequence 2, 0, 0, 63, 2, 0, 0, 0, 0, 0 could be encoded as (0, 2), (2, 63), (0, 2), (4, 0).

Run-length encoding achieves good compression only if there are many 0s in the sequence of numbers. Fortunately, the nature of the DCT leads to many 0 numbers (not all cosine frequencies are needed to approximate a signal region, so those frequencies will have 0 coefficients), especially after quantization (many coefficients are just small numbers, which become 0 during quantization). Thus, applying run-length encoding after quantization leads to further compression.

EXAMPLE 6.27 Computing compression ratios involving quantization and run-length encoding

Continuing Example 6.26, assume that the 30-frame MPEG-2 sequence has the same frame sequence and average sizes as that example, but that each frame is further compressed by DCT conversion to the frequency domain followed by quantization and run-length encoding. Assume the DCT output block consists of 64 8-bit numbers, that quantization reduces the average number size to 5-bit numbers, and that run-length encoding reduces the resulting number sequence size to 30% of its size.

The compression ratio would be 8 Mbits * 30 / (5/8 * 0.30 * (2 * 8 Mbits + 8 * 2 Mbits + 20 * 1 Mbits)) = 240 / 9.7 = 25:1.

Huffman Coding

After run-length encoding, each block consists of a sequence of numbers. Some numbers will occur in that sequence more frequently than others. Huffman coding is a method of reducing the number of bits required to represent a set of values, by creating shorter encodings for the frequently occurring values, and longer encodings for the less frequent values.
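The DCT, quantization, and run-length encoding steps described earlier in this section can be sketched in software. The following Python sketch is only an illustration under stated assumptions: the function names (`dct_8x8`, `quantize`, `dequantize`, `run_length_encode`) are hypothetical, the cosine table `COS` stands in for the ROM of precomputed values, and quantization is shown only for non-negative integer amplitudes, as in the 0-to-255 example in the text. A real MPEG-2 encoder would use custom circuits and a more elaborate quantization scheme.

```python
import math

N = 8  # the DCT operates on 8x8 blocks

# Precomputed cosine table (playing the role of the ROM):
# COS[x][u] = cos(pi * (2x + 1) * u / 16), for x and u in 0..7.
COS = [[math.cos(math.pi * (2 * x + 1) * u / 16) for u in range(N)]
       for x in range(N)]

def c(h):
    """Scale factor C(h) from the DCT equation."""
    return 1 / math.sqrt(2) if h == 0 else 1.0

def dct_8x8(block):
    """Two-dimensional DCT of an 8x8 block, using table lookups for the cosines."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):          # the inner term is evaluated
                for y in range(N):      # 64 times per coefficient
                    total += block[x][y] * COS[x][u] * COS[y][v]
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

def quantize(value, dropped_bits):
    """Drop the low-order bits of a non-negative integer amplitude."""
    return value >> dropped_bits

def dequantize(code, dropped_bits):
    """Decoder side: append implicit 0 bits to restore the original width."""
    return code << dropped_bits

def run_length_encode(values):
    """Replace each run of zeros plus the following number by a (zeros, number) pair.
    The final value always closes a pair, even if that value is 0."""
    pairs = []
    zeros = 0
    for i, v in enumerate(values):
        is_last = i == len(values) - 1
        if v == 0 and not is_last:
            zeros += 1
        else:
            pairs.append((zeros, v))
            zeros = 0
    return pairs

# Usage, mirroring the examples in the text:
print(dequantize(quantize(15, 1), 1))          # 14 (one bit dropped)
print(dequantize(quantize(15, 3), 3))          # 8  (three bits dropped)
print(run_length_encode([0, 0, 0, 0, 24]))     # [(4, 24)]
```

A software loop like `dct_8x8` performs the 4096 inner-term evaluations per block one after another, which is the work that a custom circuit could instead spread across parallel and pipelined hardware.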
Huffman coding, a form of encoding known as entropy encoding, is another powerful concept in digital data compression. Suppose you wish to represent an original sequence of 16 numbers 0, 3, 3, 31, 0, 3, 5, 8, 9, 7, 15, 14, 3, 0, 3, 0. Assuming 5 bits per number, a straightforward binary encoding would be: 00000 00011 00011 11111 00000 00011 00101, and so on, for a total of 16*5 = 80 bits. We can reduce this total by first observing that there are only 9 unique symbols: 0, 3, 5, 7, 8, 9, 14, 15, and 31. We really only need 4 bits to uniquely identify each symbol. We could thus assign the nine unique symbols to 4-bit encodings using the following definitions: 0=0000, 3=0001, 5=0010, 7=0011, ..., 31=1001 (note that the encodings are no longer the binary number representations of the original numbers). Thus, the original sequence of numbers (0, 3, 3, 31, 0, 3, 5, ...) would be encoded as 0000 0001 0001 1001 0000 0001 0010 etc., for a total of 16*4 = 64 bits. The key observation here is that we can encode numbers using any arbitrary unique bit patterns we desire, as long as the encoder and decoder are both aware of the encoding definitions.

We can take this definitions concept a step further, by using encodings of different lengths. Observing that 3 and 0 occur more frequently than the other numbers, we might give 3 and 0 shorter encodings. So we might create the following encoding definitions: 0=00, 3=10, 5=010, 7=0110, 8=0111, 9=1100, 14=1101, 15=1110, 31=1111. How these definitions were created is just beyond the scope of this discussion, though it's really not hard to learn. Notice that the encodings are such that the shorter encodings do not appear at the left of any of the longer encodings. For example, 00 does not appear at the left of any of the longer encodings, like 010, 0110, 0111, etc. This feature allows the decoder to know when it has reached the end of the code word: when the decoder has seen 00, it knows it has found an encoded 0 (because no other encoding starts with 00); when it sees 10, it knows it has found a 3 (because no other encoding starts with 10). But when the decoder sees 01, it must look at the next bit, and if it sees 010, it knows it has found a 5 (because no other encoding starts with 010). Using this variable-length encoding scheme, the original sequence (0, 3, 3, 31, 0, 3, 5, ...) would be encoded as 00 10 10 1111 00 10 010 etc. We have inserted the spaces just for readability; the actual encoding would just be 00101011110010010 etc. The total number of bits would be 4*2 (for the four 0s, encoded with the two bits 00) + 5*2 (for the five 3s, encoded with the two bits 10) + 1*3 (for the one 5, encoded with the three bits 010) plus 6*4 (for the six remaining numbers 31, 8, 9, 7, 15, and 14, each encoded as 4 bits), totaling 45 bits, much reduced from the original 80 bits required by the straightforward binary encoding.

Huffman coding achieves good compression when some numbers occur much more frequently than other numbers in the sequence of numbers to be encoded. Fortunately, this is indeed the case after DCT, quantization, and run-length tasks are performed on a block of a frame. For example, there may be plenty of 0s, 1s, 2s, etc., and fewer occurrences of higher numbers.

EXAMPLE 6.28 Computing compression ratios involving Huffman coding

Continuing Example 6.27, assume that pairs of numbers after quantization and run-length encoding are Huffman coded, and that such encoding reduces the number of bits by 50%.

The compression ratio would thus be 240 / (0.50 * 9.7) = 50:1.

Summary

Summarizing MPEG-2 video encoding:

• The use of I-, P-, and B-frames achieves compression by not resending redundant information of successive frames, but rather just sending the differences.
• DCT transforms 8x8 blocks of frames to the frequency domain, which doesn't achieve compression itself, but rather enables compression in the next steps.
• Quantization achieves further compression by reducing the number of bits needed to represent the DCT coefficients, through rounding.
• Run-length encoding achieves further compression by replacing sequences of zero coefficients by a number indicating the number of such zeros.
• Huffman coding achieves further compression by encoding frequently occurring coefficient numbers with shorter encodings than less frequently occurring coefficient numbers.

The sequence of steps is shown graphically in Figure 6.88.

Figure 6.88 MPEG-2 video compression encoding overview.

Our example compression ratio calculations yielded a ratio of about 50:1. In fact, the compression ratio can be varied by varying each of the above steps. We can use fewer I-frames to achieve even higher compression at the cost of degraded video quality, or more I-frames for improved video quality at the cost of more bits. Likewise, we can vary the amount of quantization to trade off quality and compression ratio. Because a typical movie will have some slow-changing scenes and other rapidly changing scenes, and some complex colored frames and other simpler frames, the compression ratio for different parts of a video may actually vary. Notice the permeating presence of tradeoffs (primarily between quality and compression ratio) throughout MPEG-2 encoding.

Figure 6.89 MPEG-2 video decoding overview.
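The variable-length code table from the Huffman coding discussion can be checked with a short Python sketch. This is only an illustration: the names `CODES`, `encode`, and `decode` are hypothetical, and the sketch simply uses the encoding definitions given in the text rather than constructing them (how such definitions are derived is outside the scope of the discussion). The decoder relies on the property that no encoding appears at the left of any longer encoding.

```python
# Encoding definitions taken from the text: frequent numbers get shorter codes.
CODES = {0: "00", 3: "10", 5: "010", 7: "0110", 8: "0111",
         9: "1100", 14: "1101", 15: "1110", 31: "1111"}

def encode(numbers):
    """Concatenate the code word of each number into one bit string."""
    return "".join(CODES[n] for n in numbers)

def decode(bits):
    """Read bits left to right; emit a number as soon as a full code word is seen.
    This works because no code word is a prefix of another code word."""
    lookup = {code: n for n, code in CODES.items()}
    numbers = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:
            numbers.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return numbers

sequence = [0, 3, 3, 31, 0, 3, 5, 8, 9, 7, 15, 14, 3, 0, 3, 0]
bits = encode(sequence)
print(len(bits))                  # 45, versus 80 bits at 5 bits per number
print(decode(bits) == sequence)   # True
```

The 45-bit total matches the count worked out by hand: 4*2 + 5*2 + 1*3 + 6*4.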
An MPEG-2 decoder merely needs to apply the above steps in reverse, as illustrated in Figure 6.89, to convert an MPEG-2 stream of bits back into a series of pictures, or video.

Clearly, MPEG-2 encoding and decoding require a lot of computations performed at speeds fast enough to create smooth-looking, good-quality video. Custom digital circuits can help achieve those required speeds.

6.8 CHAPTER SUMMARY

In this chapter, we introduced (Section 6.1) the idea that sometimes we can improve a particular design criterion without hurting other criteria (optimization), but usually we can improve one criterion only at the expense of another (tradeoff). We described (Section 6.2) the problem of two-level size minimization, introducing K-maps as a visual method, and then describing automated heuristics for two-level as well as multilevel logic size minimization. We discussed (Section 6.3) methods for optimization and tradeoffs in designing sequential logic, including state minimization, state encoding, and Moore versus Mealy type FSMs. We highlighted (Section 6.4) several alternative methods for implementing some datapath components, including a faster adder using carry-lookahead, and a smaller multiplier using sequential multiplication. We described (Section 6.5) methods for RTL optimizations and tradeoffs, including the powerful concepts of pipelining and concurrency as means of achieving parallel execution, a key purpose of custom digital design. We also described the RTL methods of component allocation, operator binding, and operator scheduling. We briefly mentioned (Section 6.6) some higher-level methods, including the general idea of serial versus concurrent computation, and the selection of efficient algorithms. We also introduced some basic concepts of power reduction, including clock gating, and using low-power gates.

As you can see from this chapter, there are many methods for improving our design. Yet, this chapter just scratched the surface of such methods. An entire multibillion-dollar-per-year industry exists that specializes in making automated tools for converting behavioral descriptions of desired system functionality into highly optimized circuit implementations; that industry is known as Electronic Design Automation (EDA) or as Computer-Aided Design (CAD). This chapter hopefully gave you enough exposure at least to understand the basic idea behind circuit optimization at various levels of design abstraction, ranging from the gate level up to the RTL level and beyond.

6.9 EXERCISES

SECTION 6.1: INTRODUCTION

6.1 Define the terms "optimization" and "tradeoff," and provide everyday examples of each.

SECTION 6.2: COMBINATIONAL LOGIC OPTIMIZATIONS AND TRADEOFFS

6.2 Perform two-level logic size optimization for the equation F(a,b,c) = ab'c + abc + a'bc + abc' using (a) algebraic methods, (b) a K-map. Express the answers as sum-of-products.

6.3 Perform two-level logic size optimization for the equation F(a,b,c) = a + a'b'c + a'c using a K-map. Express the answer as sum-of-products.

6.4 Perform two-level logic size optimization for the equation F(a,b,c,d) = a'bc' + [...] + abd using a K-map. Express the answer as sum-of-products.

6.5 Perform two-level logic size optimization for the equation F(a,b,c,d) = ab + [...] using a K-map. Express the answer as sum-of-products.

6.6 Perform two-level logic size optimization for the equation F(a,b,c) = a'b'c + abc, assuming that input combinations a'bc and ab'c can never occur (those two minterms represent don't cares). Express the answer as sum-of-products.

6.7 Perform two-level logic size optimization for the equation F(a,b,c,d) = a'bc'd + ab'cd', assuming that a and b can never both be 1 at the same time, and that c and d can never both be 1 at the same time (i.e., there are don't cares).

6.8 Consider the equation F(a,b,c) = a'c + ac + a'b. Using a K-map, determine which of the following terms are implicants (but not necessarily prime implicants) of the equation: abc, ab, a'bc, a'c, c, bc, a'bc', a'b.

6.9 Repeat the previous problem, but this time determine which of the terms are prime implicants of the function.

6.10 For the equation F(a,b,c) = a'c + ac + a'b, determine all prime implicants and all essential prime implicants of the function.

6.11 For the equation F(a,b,c,d) = ab'c' + abc'd + abcd + a'bcd + a'bcd', determine all prime implicants, and all essential prime implicants.

6.12 For the previous problem, use the heuristic method of Table 6.1 to obtain a two-level size optimized equation expressed in sum-of-products form.

6.13 Use repeated application of the expand operation to heuristically minimize the equation F(a,b,c) = a'b'c + a'bc + abc. Try expanding each term for each variable. Give the minimized equation in sum-of-products form.

6.14 Use repeated application of the expand operation to heuristically minimize the equation F(a,b,c,d,e) = abcde + abcde' + abcd'e'. Try expanding each term for each variable.

6.15 Using algebraic methods, reduce the number of gate inputs for the following equation by creating a multilevel circuit: F(a,b,c,d,e,f,g) = abcde + abcd'e'fg + abcd'e'f'g'. Assume only AND, OR, and NOT gates will be used. Draw the circuit for the original equation and for the multilevel circuit, and clearly list the delay and number of gate inputs for each circuit.

SECTION 6.3: SEQUENTIAL LOGIC OPTIMIZATIONS AND TRADEOFFS

6.16 Reduce the number of states for the FSM in Figure 6.90 by eliminating redundant states by using an implication table.

Figure 6.90 FSM example.

6.17 Reduce the number of states for the FSM in Figure 6.91 by using an implication table.

Figure 6.91 Sequence detector for bit patterns "01" and "10". (Inputs: x; Outputs: y)

6.18 Reduce the number of states for the FSM in Figure 6.92 by using an implication table.

Figure 6.92 FSM example.

6.19 Compare the logic size (as number of gate inputs) and the delay (as number of gate-delays) of a straightforward 2-bit binary encoding of the FSM in Figure 6.93 with a 3-bit output encoding and with a one-hot encoding of the same FSM.

Figure 6.93 FSM example. (Outputs: w, x, y)

6.20 Compare the logic size (as number of gate inputs) and the delay (as number of gate-delays) of a minimal bitwidth state encoding and an output encoding for the laser-based distance measurer FSM shown in Figure 5.20.

6.21 Compare the logic size (as number of gate inputs) and the delay (as number of gate-delays) of a minimum binary encoding (if not possible, indicate why), output encoding, and one-hot encoding of the FSM in Figure 3.39.

6.22 Convert the Moore FSM for the code detector circuit shown in Figure 3.46 to the nearest Mealy FSM equivalent.

6.23 Convert the following Moore FSM to the nearest Mealy FSM equivalent. (Inputs: s, r; Outputs: a, en)

6.24 Convert the following Mealy FSM to the nearest Moore equivalent. (Inputs: s, r; Outputs: u, y)

6.25 Convert the following Mealy FSM to the nearest Moore equivalent. (Inputs: g, r; Outputs: x, y, z)

SECTION 6.4: DATAPATH COMPONENT TRADEOFFS

6.26 Trace the execution of the 4-bit carry-lookahead adder shown in Figure 6.59 when a = 11 and b = 7.

6.27 Trace the execution of the 4-bit carry-lookahead adder shown in Figure 6.59 when a = 5 and b = 4.

6.28 Trace the execution of the 16-bit carry-lookahead adder shown in Figure 6.59 when a = 43690 and b = 21845. Do not trace internal behavior of the individual 4-bit carry-lookahead adders.

6.29 Design a 64-bit hierarchical carry-lookahead adder using 4-bit carry-lookahead adders. What is the total delay through the 64-bit adder? How much faster is the carry-lookahead adder compared to a 64-bit carry-ripple adder (compute as slower time/faster time)?

6.30 Design a 24-bit hierarchical carry-lookahead adder using 4-bit carry-lookahead adders.

6.31 Design a 16-bit carry-select adder using 4-bit ripple carry adders.

SECTION 6.5: RTL DESIGN OPTIMIZATIONS AND TRADEOFFS

6.32 The adder tree shown in Figure 6.94 is used to compute the sum of eight inputs on every clock cycle, where the sum is S = R + T + U + V + W + X + Y + Z.

(a) Design a pipelined version of the adder tree to maximize the speed at which we can operate our clock input clk.
(b) Create a timing diagram.

Figure 6.94 Adder tree used to compute the sum of eight inputs every clock cycle.

6.33 Assume the delay of an adder is 3 ns. How fast can we execute the adder tree shown in Figure 6.94 and how fast can we execute the pipelined adder tree designed in Exercise 6.32?

6.34 What are the latency and throughput of the pipelined adder tree you designed in Exercise 6.32?

6.35 (a) Convert the following C-like code to a high-level state machine.
(b) Use the RTL design process shown in Table 5.1 to convert the high-level state machine for the C code to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.
(c) Redesign your datapath to allow for concurrency in which four multiplications and two additions can be performed concurrently.

    Inputs: byte a[256], byte b[256]
    Outputs: byte sum, byte c[256]
    MULT:
    int i = 0;
    int sum = 0;
    while ( i < 256 ) {
        c[i] = a[i] * b[i];
        sum = sum + c[i];
        i++;
    }

6.36 Redesign the datapath and controller designed in Exercise 6.35 by allowing up to four concurrent additions and inserting pipeline registers to your datapath and updating the controller if necessary. Assuming an adder has a delay of 3 ns and a multiplier has a delay of 20 ns, how long will the circuit take to finish its computation?

6.37 (a) Convert the following C-like code to a high-level state machine.
(b) Use the RTL design process shown in Table 5.1 to convert the high-level state machine for the C code to a controller and a datapath. Design the datapath to structure, but design the controller to the point of an FSM only.
(c) Redesign your datapath to allow for concurrency in which three comparisons, three additions, and three multiplications can be performed concurrently.

    Inputs: byte a[256], byte b[256], byte cy
    Outputs: byte sumx, byte sumy, byte c[256]
    MULT_OR_ADD:
    int i = 0;
    int sumx = 0;
    int sumy = 0;
    while ( i < 256 ) {
        if ( a[i] > 128 ) {
            c[i] = a[i] * b[i];
            sumx = sumx + c[i];
        }
        else {
            c[i] = a[i] * ( b[i] + cy );
            sumy = sumy + c[i];
        }
        i++;
    }

6.38 Redesign the datapath and controller designed in Exercise 6.37 by allowing up to nine concurrent additions and inserting pipeline registers to the datapath and updating the controller if necessary. Assuming a comparator has a delay of 4 ns, an adder has a delay of 3 ns, and a multiplier has a delay of 20 ns, how long will the circuit take to finish its computation?

6.39 Given the high-level state machine in Figure 6.95, create two different designs: one design optimized for minimum circuit speed and one design optimized for minimum circuit size. Be sure to clearly indicate the component allocation, operator binding, and operator scheduling used to design the two circuits.

Figure 6.95 High-level state machine for Exercise 6.39.

SECTION 6.6: MORE ON OPTIMIZATIONS AND TRADEOFFS

6.40 Trace through the execution of the binary search algorithm when searching for the number 86 in the following sorted list of 15 numbers: 1, 10, 25, 62, 74, 75, 80, 84, 85, 86, 87, 100, 106, 111, 121. How many comparisons were required to find the number using the binary search and how many comparisons would have been required using a linear search?

6.41 Trace through the execution of the binary search algorithm when searching for the number 99 in the following list of 15 numbers: 1, 10, 25, 62, 74, 75, 80, 84, [...], 87, 99, 100, 106, 111, 121. How many comparisons were required to look for the number using the binary search and how many comparisons are required using a linear search?

6.42 Trace through the execution of the binary search algorithm when searching for the number [...] in the list of numbers from the previous example. How many comparisons were required to find the number using the binary search and how many comparisons are required using a linear search?

6.43 Using the list of 15 numbers from Exercise 6.41, how many numbers could we find faster using a linear search algorithm compared with the binary search algorithm?
SECTION 6.7: POWER OPTIMIZATION

6.44 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power consumption without increasing the circuit's delay.

Figure 6.96 Logic gate library. 2/0.5 format means 2 ns delay/0.5 nW power.

6.45 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power consumption without increasing the circuit's delay.

6.46 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power consumption without increasing the circuit's delay.

6.47 Given the logic gates shown in Figure 6.96, optimize the following circuit by reducing power consumption without increasing the circuit's delay.

DESIGNER PROFILE

Smita has degrees in Electronics Engineering and in Computer Science, and has worked in the digital design field for nearly a decade. She spent a lot of time thinking about the choice of a college major. "What major should I invest my focus, energy, heart, and soul for what will be some of the most productive years of my life?" She chose engineering, for several reasons. "First, engineering is a career in itself; unlike some other majors, jobs specifically for engineering majors are out there. With engineering, I would learn the most valuable and universal of skills: problem solving. Second, engineers have many options, because engineers are highly valued for their problem solving skills by other professions, such as management consulting, marketing, and investment banking. And electrical and computer engineers can choose from a range of industries in which to work: telecommunications, image processing, medical devices, IC fabrication, and even banking. This was a phenomenal discovery for me!"

Smita continued her education by doing graduate studies in Computer Science, researching methods for automatically designing integrated circuits (IC) or chips, "a fascinating field because it involves a mix of hardware and software skills and knowledge. I continued in this profession after school and worked for a company that develops Computer-Aided Design (CAD) software used by hardware designers who work with a type of chip called an FPGA (Field Programmable Gate Array). FPGAs can be used for an amazing variety of applications all the way from high-speed telecommunication chips to low-speed and low-cost chips that go into electronic toys and games. Our software saves designers many months or even years of time. In fact, without our software, it would be absolutely impossible for people to design most chips even if they had a decade or more to do it."

Smita (shown mountain climbing above) loves her work. "My work is intellectually stimulating and I have an opportunity to innovate, create, and actually build something really useful." She also enjoys the people-aspect of her work. "I work in teams of dynamic people because most projects, hardware or software, are done in teams of 3-8 people these days. The people on my team are also my friends and it's a lot of fun to work with them."

In her decade of work so far, Smita has taken on some management responsibilities. "As manager of one of the four products that my company develops, I play a variety of different roles. I work with my team of 7 software developers to determine what features to build in the product and how best to build those features. I work with the marketing and sales team to understand what the customers need and how best to message and position our product. Finally, I work with other groups that are involved in releasing a product: technical publications, application engineering, and product engineering. The diversity of my job makes it very interesting."

Smita enjoys the respect that engineers receive. "As an engineer, I am highly respected by customers, partner companies, and by our marketing and sales organizations because I have a deep understanding of our products. I really know my stuff since I built it and I get recognized for it." And regarding the pay: "I get compensated very well for my skills." She also likes the lifestyle: "I get in to work around 10 a.m. and leave around 7 p.m. I don't have early morning meetings unlike the folks in marketing and sales, and I can work from home once a week or more often if I wish. This is also a great career for women: I can take time off and return to my job without much penalty when I have children. I can tailor my work hours as I need as my children are growing up. Lastly, I realize that I can move from engineering to other functions such as marketing and sales, but not the other way around! That's a great benefit of being an engineer: more options."

Smita recommends engineering and computer science students focus on certain things while in college. "First, get a good understanding of both hardware and software. Systems are highly integrated today and there are very few companies that develop one without paying very close attention to the other. For instance, though I write software, I need to completely understand the hardware for which it will be used. My husband, on the other hand, designs telecommunication chips but works very closely with his software team, especially during the initial design stages when they decide what to implement in hardware versus software and how to design the hardware interface so that the software algorithms work efficiently."

"So, what do I mean by a good understanding of hardware and software? In software, I think it is most important to develop good software "habits". Treat your program like a well-landscaped garden: you want it
378 6 Optimizations and Tradeoffs

~ DESIGNER PROFILE (continued)


beautiful and weed- free. Understand claw ~lruc tures we ll
and know when ant: is morc appropri ate than th e ot her.
Organize your code, be disciplined. cross the Ts and dOl
"Other than these hardware and soft ware skills, become
adept at math and analysis. Learn to frame problems and
break them down until you can sol ve them.
experimental and try diffcrcllI tools and methods. Have a
Be
7
the h. document diligently. have your code reviewed by
friends. and fina ll y. don'! be afraid to throwaway code hypothesis and thcn go about proving or disproving it. If
and rewrite it if yo u disCQVC! f a better way," YOll haven' t already, you wi ll soon discover thai
"In hardware. understand the b'1Sics of logic design and
then make sure you also understand the capac iti ve.
cngineering is nOI onl y fun . bUl also provides you with
many fulfilling career opportun.ities-so stick with it and
Physical Implementation
inductive. and resisti ve propertie s of circuits since these play make th e mos t of it !"
a big role in designing the high-speed circuits of today."

7.1 INTRODUCTION
A digital circuit design that we've created but just drawn out, perhaps with pencil on paper or as a figure in this book, is just a drawing. Somehow, we must eventually implement that digital circuit design on a real physical device, so that the device can then be placed in some electronic product to carry out the desired function. Nowadays, such a device is usually some form of integrated circuit, or IC, also known as a computer chip, or just chip. In other words, looking at Figure 7.1, how do we get from (a), the seat belt warning light circuit we designed in Chapter 2, to (b), a physical implementation using an IC?

In this chapter, we will describe several popular physical implementation technologies for digital circuits.

Figure 7.1 How do we get from (a), a digital circuit design, to (b), a physical implementation?

7.2 MANUFACTURED IC TECHNOLOGIES


If we are willing to wait weeks or months for a physical implementation of our digital circuit design, and if we are willing to spend tens of thousands of dollars to millions of dollars for that physical implementation, then we might consider implementing our circuit using one of several technologies that involve the manufacture of a custom or semicustom IC.

Full-Custom Integrated Circuits


One physical implementation technology is known as a custom IC. A full-custom IC is a chip created specifically to implement the gates (actually, the transistors) of the desired digital circuit design (Figure 7.2). We digital designers wouldn't usually build full-custom ICs ourselves, but rather we would send our desired digital circuit design out to a group or company that specializes in transforming digital designs into custom ICs. Engineers, assisted by computer-aided design (CAD) tools, convert our desired digital circuit design into a circuit of transistors, and then decide where to place each transistor on the surface of the chip, how to orient each transistor (e.g., left to right, right to left, top to bottom, etc.), how big to make each transistor, etc. All that information about how the transistors should be laid out on a chip's surface is known as a layout. Then, the full-custom IC engineers send that layout information to a special factory that specializes in fabricating ICs, known as a fabrication plant, or fab for short. Fabricating an IC is often referred to as a silicon spin.

Figure 7.2 Full-custom IC design: the BeltWarn circuit is converted to a custom layout, which a fab then turns into an IC over a period of months.

Fabricating an IC is an extremely costly, delicate, error-prone process, utilizing state-of-the-art photographic, laser, and chemical equipment that costs hundreds of millions of dollars. The fabrication process may take many weeks or even months, because transistors and wires are formed as layers on the surface of a chip, and each layer may take hours or even days to form through chemical processes.

Implementing a digital circuit on a full-custom IC is a complex and expensive task. Costs for setting up the fabrication of an IC, known as nonrecurring engineering (NRE) costs, can easily exceed many millions of dollars for a full-custom IC. Furthermore, that setup takes time, perhaps months, and that time may be costly to us too: the product for which we are fabricating the chip may be losing market share to a competing product already completed and being sold while we wait for our chip to be fabricated. Once we've set up the details needed for fabrication, the fabrication process itself is less expensive. But because we custom designed everything, the probability is high that we made a mistake somewhere in the transistors or wiring. Therefore, after fabricating a full-custom IC, we may find errors that necessitate refabricating the IC, known as a respin. Respinning may happen two or three times, each time requiring weeks or months, thus costing us even more. We ought to either be making millions of chips, or charging large amounts of money per chip, to earn back the large NRE costs.

Needless to say, full-custom IC fabrication is not extremely common. Designers choose to implement a digital circuit on a full-custom IC when they know they will produce the chip in extremely high volumes, such as a mass-produced chip found inside calculators or wristwatches, or a mass-produced microprocessor chip like a Pentium. High volumes in the tens of millions or more are needed to offset the cost and time needed to produce a custom IC. Alternatively, designers may choose to implement a digital circuit on a custom IC if cost is not tightly constrained but maximum performance is a must, as might be the case in military or space applications.

Note: According to one survey, only about 10% of 2002 digital circuits were implemented as custom ICs.

Semicustom (Application-Specific) Integrated Circuits: ASICs

Because physical implementation on full-custom ICs is so costly and time-consuming, semicustom technologies evolved during the 1980s and 1990s that reduce the costs and the time of fabricating a chip, known as Application-Specific Integrated Circuits, or ASICs. Two popular ASIC technologies are gate array and standard cell.

Gate Arrays

The hardest part of custom IC design is designing and fabricating the transistors that will go onto the surface of the chip. Designing and fabricating the wires that connect those transistors is somewhat simpler. Gate array ASIC technology utilizes a chip whose transistors are predesigned to form rows (arrays) of logic gates on the chip, as shown in Figure 7.3. Gate arrays are sometimes referred to as sea-of-gates. To implement a desired digital circuit on a gate array chip, we merely need to create the wires that connect those gates. Creating the wires represents just the last steps of fabrication, and thus gate array technology eliminates much of the time and cost of fabricating a chip for a particular design. A gate array company predesigns and mass-produces the gate array chip, and then customizes some of those chips for each client's circuit. The chip is somewhat customized, hence the term semicustom, and the customization is for a particular circuit application, hence the term application-specific. Figure 7.3 illustrates how we might implement our seat belt warning light circuit (Figure 7.3(a)) using a gate array chip (Figure 7.3(b)). Figure 7.3(c) shows how we might map the desired 3-input AND gate to two 2-input gate array AND gates, and the inverter to one of the gate array inverters. The figure also shows how we might implement the desired wiring among the gate array's pins, the gate array AND gates, and the gate array inverter. The remaining gates and pins on the gate array chip would be unutilized. Fabricating these wires would result in the IC being customized to our seat belt application (Figure 7.3(d)).

Figure 7.3 Gate array technology: (a) desired circuit, (b) gate array before wires are added, (c) gate array after wires are added, thus implementing the desired circuit, (d) fabricating the wires completes the IC. Note: real gate arrays have many thousands or millions of gates, not just a few.

We point out that the actual mapping of our desired digital circuit to a gate array would typically be carried out by an automated tool. Designers rarely, if ever, carry out that mapping manually, and in fact usually don't even see that mapping in any form: the mapping is all done by tools, resulting in huge data files that can be processed by other tools at a fab to control the fabrication process. We also point out that a typical gate array chip may hold many thousands or millions of gates; the gate array shown in Figure 7.3, having fewer than ten gates, is trivially small and is for illustration purposes only. Gate arrays with only 10 gates do not exist. Furthermore, we would typically not use gate arrays unless our design contained thousands of gates or more. For designs with only a few gates, we would instead use logic ICs; see Section 7.4.
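The gate array mapping described for Figure 7.3(c) can be sanity-checked in a few lines of Python. This is only an illustrative sketch: the function names (`and2`, `inv`, `belt_warn_desired`, `belt_warn_on_gate_array`) are hypothetical, and the code models logic values only, not pins, placement, or wire routing. It assumes the seat belt warning equation w = kps' used elsewhere in this chapter.

```python
from itertools import product

def and2(a, b):
    """A 2-input AND gate of the kind pre-placed on the gate array."""
    return a & b

def inv(a):
    """An inverter of the kind pre-placed on the gate array."""
    return 1 - a

def belt_warn_desired(k, p, s):
    """Desired circuit: a single 3-input AND of k, p, and NOT s."""
    return 1 if (k == 1 and p == 1 and s == 0) else 0

def belt_warn_on_gate_array(k, p, s):
    """Same function built only from 2-input ANDs and one inverter."""
    kp = and2(k, p)          # first 2-input AND gate
    not_s = inv(s)           # one gate array inverter
    return and2(kp, not_s)   # second 2-input AND gate

def mapping_is_equivalent():
    """Compare both versions over all 8 input combinations."""
    return all(
        belt_warn_desired(k, p, s) == belt_warn_on_gate_array(k, p, s)
        for k, p, s in product((0, 1), repeat=3)
    )

print(mapping_is_equivalent())
```

An exhaustive comparison like this is practical only for small circuits; the automated mapping tools mentioned above work on far larger designs.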
Notice that our standard cell implementation places the cells such that wiring is minimized, whereas the gate array implementation of Figure 7.4 required us to run the wires to the pre-existing gate locations, resulting in longer wires. Thus, the standard cell implementation may be faster than the gate array implementation, since shorter wires typically have shorter delay.

Figure 7.6 Half-adder using standard cells (co = ab, s = a'b + ab').

Implementing Circuits Using Only NAND Gates

You may recall from Chapter 2 that CMOS transistors lend themselves more readily to creating NAND and NOR gates rather than AND and OR. The stated underlying reason was that pMOS transistors conduct 1s well but not 0s, while nMOS transistors conduct 0s well but not 1s. In any case, gate arrays typically contain plenty of NAND and/or NOR gates, rather than AND and OR gates. And standard cell designs will also be more efficient if implemented using NAND or NOR gates rather than AND and OR. Furthermore, creating a gate array is much easier using just one type of gate, like just NANDs, or just NORs, rather than having to decide how many AND gates, OR gates, and NOT gates to pre-instantiate in the arrays. Given the ready availability of NAND or NOR gates in CMOS ASIC technologies, we therefore want a method for converting AND/OR circuits to NAND circuits or to NOR circuits.

Fortunately, converting any AND/OR circuit to a NAND-only circuit is possible because NAND is a universal gate, as was mentioned in Section 2.8. A universal gate is a logic gate type that can implement any Boolean function using gates of that one type only. One way to understand NAND's universality is to recognize that we can implement a NOT gate, an AND gate, and an OR gate by substituting each by an equivalent circuit of NAND gates. Therefore any circuit of NOT, AND, and OR gates can be implemented using NAND gates only.

To implement a NOT gate using NAND gates, we can substitute the NOT gate by a two-input NAND gate with its two inputs tied together, as shown in Figure 7.7. The truth table in the figure shows that the NAND gate with its inputs tied together acts the same as an inverter. When the input x is 0, both inputs of the NAND gate are 0, causing the NAND gate to output 1. When the input x is 1, both inputs of the NAND gate are 1, causing the NAND gate to output 0.

Figure 7.7 Implementing a NOT gate using a NAND gate. Truth table: x = 0 gives NAND inputs a = 0, b = 0 and output F = 1; x = 1 gives NAND inputs a = 1, b = 1 and output F = 0.

Alternatively, we could simply connect x to one NAND input, and a 1 to the other NAND input. Then if x is 0, the NAND outputs 1, and if x is 1, the NAND outputs 0, achieving the desired NOT gate behavior.

To implement an AND gate using NAND gates, we can substitute the AND gate by a NAND gate followed by a NOT gate (which we know to be a two-input NAND gate with its inputs tied together), as shown in Figure 7.8. This works because given inputs a, b, the first NAND computes (ab)', and then the NOT gate computes (ab)'' = ab, which is AND.

Figure 7.8 Implementing an AND gate using NAND gates.

To implement an OR gate using NAND gates, we can substitute the OR gate by a NAND gate with each input inverted, as shown in Figure 7.9. This works because given inputs a, b, the circuit of NAND gates in Figure 7.9 computes (a'b')', which by DeMorgan's Law is a'' + b'', which simplifies to a + b, which is OR.

Figure 7.9 Implementing an OR gate using NAND gates: F = (a'b')' = a'' + b'' = a + b.

When we replace a circuit originally consisting of AND/OR/NOT gates by a circuit with NAND gates only using the above substitutions, we may find that certain signals get double-inverted: the signal feeds into an inverter and then immediately feeds into another inverter. Double-inverting a signal yields the original signal, so double inversions can be replaced by just a wire, as shown in Figure 7.10. Such elimination reduces the transistors needed without changing the circuit's function.

Figure 7.10 Double inversions can be eliminated.

EXAMPLE 7.3 Implementing a half-adder's sum circuit using NAND gates

Figure 7.11(a) shows the sum circuit for a half-adder (see Section 4.3), using AND, OR, and NOT gates. We can implement that circuit using NAND gates only by substituting each gate with an equivalent NAND circuit, as shown in Figure 7.11(b). After the substitutions, we note that there are two signals that are double-inverted. Eliminating the double inversions results in the circuit shown in Figure 7.11(c).

Figure 7.11 Implementing a half-adder's sum circuit using NAND gates only: (a) original AND/OR/NOT circuit, (b) circuit obtained after substituting equivalent NAND circuits for each gate, (c) circuit after eliminating double inversions.
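The NAND substitutions and the double-inversion elimination of Example 7.3 can be checked exhaustively with a short Python sketch. The function names below are hypothetical, and the code is a truth-table check under the assumption that the half-adder sum is s = a'b + ab'. It is not a model of the actual figures.

```python
from itertools import product

def nand(a, b):
    """2-input NAND gate on single-bit values 0/1."""
    return 1 - (a & b)

def not_nand(x):
    """NOT built from a NAND with its inputs tied together."""
    return nand(x, x)

def and_nand(a, b):
    """AND built from a NAND followed by a NAND-based NOT."""
    return not_nand(nand(a, b))

def or_nand(a, b):
    """OR built from a NAND with each input inverted (DeMorgan)."""
    return nand(not_nand(a), not_nand(b))

def sum_original(a, b):
    """Half-adder sum with AND/OR/NOT: s = a'b + ab'."""
    return ((1 - a) & b) | (a & (1 - b))

def sum_nand_substituted(a, b):
    """Every NOT, AND, and OR replaced by its NAND equivalent."""
    t1 = and_nand(not_nand(a), b)
    t2 = and_nand(a, not_nand(b))
    return or_nand(t1, t2)

def sum_nand_simplified(a, b):
    """Double inversions removed: each AND's trailing inverter
    cancels the inverter on the matching OR input."""
    return nand(nand(not_nand(a), b), nand(a, not_nand(b)))

for a, b in product((0, 1), repeat=2):
    print(a, b, sum_original(a, b),
          sum_nand_substituted(a, b), sum_nand_simplified(a, b))
```

In this sketch the directly substituted version uses nine NAND gates and the simplified version uses five, which illustrates why removing double inversions is worthwhile.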
When converting AND/OR/NOT circuits by hand to NAND circuits, some people find it easier to simply draw inversion bubbles rather than the NAND-based inverters, as shown in Figure 7.12. Then, double inversion bubbles on a signal cancel. Any remaining isolated inversion bubbles become a NAND-based NOT gate. Thus, the circuit in Figure 7.12 would end up identical to the circuit in Figure 7.11(c).

Figure 7.12 Drawing inverters as inversion bubbles during conversion to NAND.

If NAND gates with a fixed number of inputs are available, such as 2-input NAND gates only, we can first modify the AND/OR circuit to use only 2-input AND/OR gates (by composing larger gates from smaller ones; see Section 5.8), before converting to NAND gates.

Implementing Circuits Using NOR Gates

Converting AND/OR/NOT circuits to NOR gate circuits is similar to converting to NAND circuits, as a NOR gate is also a universal gate. The process of transforming a circuit into NOR gates replaces each AND, OR, and NOT gate with equivalent NOR-based circuits, as shown in Figure 7.13. We can replace a NOT gate with a two-input NOR gate with the inputs tied together (or alternatively, by a two-input NOR gate with one input tied to 0). We can replace an OR gate with a NOR gate followed by an inverter, yielding (a+b)'' = a+b. We can substitute an AND gate with a NOR gate having inverted inputs, yielding (a'+b')' = a''*b'' = ab (notice the use of DeMorgan's Law).

Figure 7.13 NOR gate equivalencies.

EXAMPLE 7.4 Implementing a half-adder's sum circuit using NOR gates

Earlier, we demonstrated how to represent the half-adder's sum output with NAND gates; we can just as easily implement the sum output using NOR gates. The half-adder's sum circuit is shown again in Figure 7.14(a). We replace each NOT, AND, and OR gate by its equivalent NOR circuit in Figure 7.14(b), using inversion bubbles instead of NOR-based NOT gates for convenience. We eliminate double inversions, and replace stand-alone inversion bubbles by NOR-based NOT gates, as shown in Figure 7.14(c).

Figure 7.14 Implementing an AND/OR/NOT circuit using NOR gates only: (a) original circuit, (b) circuit obtained by substituting AND/OR/NOT gates by equivalent NOR circuits, using inversion bubbles for ease of drawing, (c) final circuit after eliminating double inversions and replacing stand-alone inversion bubbles by NOR-based NOT gates.

The half-adder's sum circuit was implemented with fewer NAND gates than NOR gates. Depending on the original circuit, the reverse could be true. We saw that NAND gates were well-suited for circuits in the sum-of-products form. NOR gates are best used when a circuit is in product-of-sums form (a level of OR gates feeding into a single AND gate).

Gate array and standard cell libraries typically include additional components, beyond just NAND or NOR gates, that have efficient CMOS implementations. For example, a popular such component is known as AND-OR-INVERT, or AOI for short. Such a component has two 2-input AND gates (thus four inputs total), feeding into a 2-input NOR gate. That circuit can be efficiently designed using CMOS transistors. Thus, we would want to utilize AOI components, and other similarly compact available components in a library, as much as possible.

The task of converting a general logic circuit to a circuit using only components from a particular technology's library (e.g., a particular gate array library or standard cell library) is known as technology mapping. The task of determining where to place those components on a chip is known as placement, and the task of connecting those components by wires is known as routing. All three tasks, collectively known as physical design, are typically done by automated tools today.

EXAMPLE 7.5 Implementing the seat belt warning light on a NOR-based gate array

Implement the BeltWarn circuit of Figure 7.15(a) using the NOR-based gate array of Figure 7.15(a). Noticing that the gate array has only 2-input NOR gates, we first convert the BeltWarn circuit to use AND/OR gates with 2 inputs only, as shown in Figure 7.15(b). We then convert the AND/OR circuit to the NOR-only circuit in Figure 7.15(c), using the equivalencies in Figure 7.13, and using inversion bubbles rather than NOR-based inverters. We then see a double inversion on the wire from input s, so we eliminate those two inversions. Note that we do not eliminate the double inversion between points 3 and 4 in Figure 7.15(c), because the first inversion is part of a NOR gate; eliminating that first inversion would convert the NOR gate to an OR, defeating our goal of having NOR gates only. After converting remaining stand-alone inversions to NOR-based inverters, we map the circuit to the gate array's 2-input NOR gates as in Figure 7.15(d). We numbered the NOR gates of Figure 7.15(c) and (d) to show the correspondence between the two circuits.
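The NOR equivalencies of Figure 7.13, and a NOR-only seat belt circuit in the spirit of Example 7.5, can be sketched in Python as follows. The function names and the gate labels g1 through g5 are hypothetical and chosen for illustration. They are not claimed to match the numbering used in Figure 7.15. The sketch assumes w = kps' and 2-input NOR gates only.

```python
from itertools import product

def nor(a, b):
    """2-input NOR gate on single-bit values 0/1."""
    return 1 - (a | b)

def not_nor(x):
    """NOT built from a NOR with its inputs tied together."""
    return nor(x, x)

def or_nor(a, b):
    """OR built from a NOR followed by an inverter: (a+b)'' = a+b."""
    return not_nor(nor(a, b))

def and_nor(a, b):
    """AND built from a NOR with inverted inputs: (a'+b')' = ab."""
    return nor(not_nor(a), not_nor(b))

def belt_warn(k, p, s):
    """Reference function: w = k p s'."""
    return k & p & (1 - s)

def belt_warn_nor(k, p, s):
    """w = (kp) AND s' using five 2-input NOR gates."""
    g1 = nor(k, k)    # k'
    g2 = nor(p, p)    # p'
    g3 = nor(g1, g2)  # (k' + p')' = kp
    g4 = nor(g3, g3)  # (kp)', a stand-alone inverter that must remain
    g5 = nor(g4, s)   # ((kp)' + s)' = kp s'; s needs no inverters here
                      # because its double inversion was eliminated
    return g5

for k, p, s in product((0, 1), repeat=3):
    print(k, p, s, belt_warn(k, p, s), belt_warn_nor(k, p, s))
```

The inverter g4 sits directly after the NOR gate g3. As the example explains for its own circuit, such a pair cannot be cancelled, because the first inversion is part of the NOR gate itself.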
Figure 7.15 Implementing the BeltWarn circuit on a NOR-based gate array IC: (a) original gate array, (b)-(c) converting the desired circuit to two-input NOR gates only, (d) final gate array with wires.

7.3 PROGRAMMABLE IC TECHNOLOGY: FPGA

Manufactured IC technologies require at least a few weeks, and usually more like several months, to convert a desired digital circuit design into a physical IC. What if we are developing a circuit that we want to implement today? In that case, we can utilize one of several programmable IC technologies. In a programmable IC technology, we implement a desired circuit simply by writing a particular sequence of bits into a memory (or number of memories) contained in the IC. Using a programmable IC technology has the drawback of worse performance, size, and power compared to custom or semicustom IC technologies. But we get our implementation today, and the benefits of that fact may outweigh the drawbacks.

The most popular form of programmable IC technology is known as a Field-Programmable Gate Array, or FPGA. An FPGA company prefabricates an FPGA chip, meaning that the chip contains all the transistors and all wires that the chip will ever have. We buy those FPGA chips, and then program the chip to implement our desired circuit. To program in this context means simply to download a series of bits into the chip's memories, not to be confused with writing high-level software programs like C or C++ code. Such programming occurs in the field, meaning in our lab, or office, or home, as opposed to in a fabrication plant. Hence the words "field-programmable" in the FPGA name. Furthermore, programming typically takes only seconds, or perhaps minutes at most. Figure 7.16 shows some FPGA chips. The chip at the top, with its front and back shown, measures about 3/4 inch on each side. The chip on the bottom measures just over 1 inch on each side.

Figure 7.16 FPGA chips.

Note: Field-programmable gate arrays (FPGAs) have no "gate arrays" inside them; the name is there due to historical reasons.

The words "gate array" are there in the name because, when FPGAs first became popular in the mid-1980s, they were marketed as an alternative to gate array technology, which was very popular at that time. Thus, an FPGA was a semicustom IC (nearly synonymous with gate array at that time) that could be programmed in the field instead of at a fabrication plant. However, be forewarned that the internal design of an FPGA chip looks nothing like a gate array; the naming is somewhat unfortunate.

The two basic types of components inside an FPGA are lookup tables and switch matrices. Those components are replicated hundreds of times in regular patterns inside an FPGA. We now describe each type of component.

Lookup Tables

A basic idea underlying FPGAs is that a memory can implement combinational logic. More specifically, a 1-bit wide memory with N address lines, and hence 2^N words, configured to read the word corresponding to the present address, can implement any Boolean combinational function of N variables.

Note: The key idea underlying FPGAs is that a memory with N address lines can implement any combinational function with N inputs.

Recall that a memory configured to be read will output the contents of the word corresponding to the present address at the memory's address lines. So if a 4x1 memory's address lines a1a0 are 00, the memory will output the contents of word 0. If the address lines are 01, the memory outputs the contents of word 1. Likewise, 10 reads word 2, and 11 reads word 3.

Implementing a Boolean function with a memory can therefore be done simply by connecting the function's inputs to the memory address lines, and storing a 0 or 1 in each memory word to match the desired function output for each combination of input values. For example, consider the function F(x,y) = x'y' + xy. The truth table for the function is shown in Figure 7.17(a). To implement the example function, we can connect x and y to a 4x1 memory's address lines a1 and a0, respectively, and based on the truth table, we store a 1 in word 0, a 0 in word 1, a 0 in word 2, and a 1 in word 3. In other words, we store the truth table outputs in the memory. The memory then implements the desired function, as shown in Figure 7.17(b). For example, when xy=00, we want the output to be 1. Figure 7.17(c) shows that when xy=00, the memory's address lines will be 00, and thus the memory will output the contents of word 0, which is the value 1, as desired.

Figure 7.17 Implementing logic functions using a memory: (a) 2-input function truth table, (b) corresponding memory contents and connections, (c) the proper output appears for the given input values, (d) two functions having the same two inputs, (e) memory contents for the two functions. The functions shown are F = x'y' + xy (4x1 memory) and G = xy' (stored with F in a 4x2 memory).

x y | F G
0 0 | 1 0
0 1 | 0 0
1 0 | 0 1
1 1 | 1 0
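The idea that a memory holding a truth table implements a combinational function can be illustrated with a small Python sketch. The helper names `program_lut` and `read_lut` are hypothetical, and a Python list stands in for the memory's words. The example uses F = x'y' + xy and, following parts (d) and (e) of Figure 7.17, a second function G = xy' sharing the same inputs.

```python
from itertools import product

def program_lut(func, n_inputs):
    """Build the memory contents: word i holds func's output for the
    input combination whose bits, first input as MSB, spell out i."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def read_lut(words, *inputs):
    """Form the address from the input bits and read that word."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return words[address]

def f(x, y):
    """F(x, y) = x'y' + xy."""
    return ((1 - x) & (1 - y)) | (x & y)

def g(x, y):
    """G(x, y) = xy'."""
    return x & (1 - y)

# 4x1 memory: one bit per word, x on a1 and y on a0.
F_LUT = program_lut(f, 2)

# 4x2 memory: each word is an (F, G) pair, so one memory
# implements two functions of the same inputs.
FG_LUT = program_lut(lambda x, y: (f(x, y), g(x, y)), 2)

print(F_LUT)                   # words 0..3 of the 4x1 memory
print(read_lut(F_LUT, 0, 0))   # address 00 reads word 0
print(FG_LUT)
```

Programming the lookup table amounts to evaluating the function once per address; afterwards the function itself is never consulted, only the stored words.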
390 Physical Implementation
7.3 Progra mmable Ie Technology-FPGA 391
A 111~1110ry with JII bits per word. rather than just I bi t per word. can implement M
16-word memory a 16 . .
runctions. as long as all those M functions have the same inputs. For example. consider func tio Id · ' -tnput functIon would require a 64 K word memory; a 32-input
thetworunctions F(x . y) = x ' y ' + xy and G(x , Y) - xy '.The tmthtableror nthwou. reqUIre a 4-billion-word memory. The needed memory size grows the
same as e size or the f ' . .
the e two functions is shown in Figure 7. 17(d). A 4x2 memory. which has 2 bits per numb ff . . unction s truth table, whIch we know grows as 2'v ·
. where N IS the
er 0 unclton tnputs I h . . .
word. can implcmcm those two functions. as shown in Figure 7. 17(e). ' t' < f . . n SOrt, a tmth table IS 1101 an effiCIent Boolean function rep-
resenl a Ion .or unctions . h ' .
A memory used to implement a combinational circuit is known (in FPGA termi- imple ' WIt nu merou tnpu ts, and thus a lookup table IS not an efficient
nol ogy) as a lookup table. When used as a lookup table. we typ icall y rerer to the memory mentatIOn ror runctions with numerous inpu ts.
Partltlomng a funct' ' . .
by the numbcr of iI/pillS (address li nes) and the number or out pu ts (bi ts per word), rather . . Ion s CirCU it among multiple lookup table can yield more effi-
cIent Implementations ~ I .
than by the number or \I'onls and the num ber or out puts. For exampl e, we would refer to . . f or arger fun ctIOns. Consider the extended eat belt warning
CirCUI t ro m Example 28 L '
an 8x2 mcmory being used as a lookup table as a "3- in put 2-output lookup table," rather . " . " et s eX lend the ci rcui l even more by addin a a third "diag-
nostlc IIlput called d th r . 0>
than as an 8x2 lookup table . . al orces the warnlllg light 10 tu m on when d=l-perhaps a
From this point forwa rd. we' ll assume Ihe memory is configu red for read, and thus mechamc IIlvestigating a raulty warning light mighl want to force the warnina liaht on to
we \\,on't show Ihe read line sellO 1. Isolate whether the lighl has blown OUI or to help determine ir a seat bel~ se~sor has
fatled.
. The ex. tended circul't 's
I ShOwn .III F'Igure 7. 19(3). That CirCUit
. . can . t be mapped to a
EXAMPLE 7.6 Implementtng the seat belt warning light with a lookup table 3-lIlput I-output lookup table because the circui t has 5 inputs, bUI the circuit could be
mapped onto a 5-lIlput I-output lookup table. Alternatively, we could implement the
Use a lookup lable 10 implemenl Ihe seal belt CIrCUIt by UStng a 3-input I-ou lput lookup table connected to another 3-input l-output
p s w
warn ing light circuit from Figure 7.1. whose
!0 ' lookup table, as shown in Figure 7. 19(c). We do so by partitionina the oriainal circuit iOlo
circuit appears in Figure 7. 18(a) and whose
0 0
,
0
iO two groups. such thaI the fi rst group has 3 inputs and I output. a~d the s~cond group has
equation is: 0 0
, :0 3 IIlputs and I output. as ci rcled in Figure 7.19(b). The fir t group's output. whicb we've
,./ = kps '
0
0 , 0
:0 labeled as x, has the eq uation x = kps '. The second group 'S output has the equation
\Ve generate the truth table for the fune· 0
, :0 vi = x + t + d. We would program the lookup tables to implement these functions_
tion . as shown in Fi gure 7.18(b). Because the 0 :0
:,
as shown III FIgure 7.19(c), thus implementing the desired circuit using two lookup tables.
circuit has three inpu ts. we kn ow we' ll need 0
I
a 3- inpul I-oulpul lookup lable (memory). :0I
o 0 ' ~~.. BeltWarn BellWarn ax, Mem.

-----
\Ve connect the inputs 10 the memory's (b) 8x' Mem.
, 0
address lines. and store the truth table in the
memory. as shown in Figure 7.18(c). Ihus 2 0
Programming
,
- .... 0 0
0
o
, 0

implementing the desired runction . Ir the 3-


o .
(seconds) 0 2
input I-output memory is an Ie. then we are o 0
k
o
done implementing our design. and can
, P
insen the Ie into the electronic system Wilh
"hich Ihe Ie should inleracl.
(c) Ie
7 0
0
X (a) 3 inputs
, oulpUI
3 inputs
, OUIPUI
,
0 __ ..... 7

You've ju t seen an example of a very x=kps' w=x+t+d 0 o


imple programmable IC technology-a w (b) t
d
memory. We can use a memory chip Figure 7.18 Lookup lable implemenlation. (c) w
with N address lines and hence 2N
Figure 7.19 Partilioning a circuil onlO IWO lookup lables: (a) desired circuit. (b) circuil partitioned inlo !!fOUpS with 31
wo rd ,. and with M biL~ per word . to l11os13 inpuls and I OUlpUl, (c) groups mapped 10 IwO 3-inpul I-oulpul lookup lables. -
implement M dirrerent Boolean functions of the sallle N inputs. We can purchase a
memory chip before we need it for our design. and then we can "program" the memory Notice that the implementation with two lookup tables has a total of + = 16
chip in our lab to implement 3 desired Boolean function . 1V0rds, compared to 32 words that 1V0uid have been present with a 5-input lookup table.
Thus. partlllol1lng a CIrcUIt among small lookup tables an re -ult in better effi ienc\ than
Partitioning a Circuit among Lookup Tables using one larger lookup table. -
This efficiency can be seen even more dramatically f r e,amples \\ ith rn re
Unr rtunatc ly. u,i ng a memory to implement a Boolean function doc not work well for inputs. For example. the runctlon F - abc + de + ghi . , h \\n in Figure
function, with numerou, input,. For exa mple, while a <I -input function wou ld need only a 7.20(a). has 9 inputs. Implementing the funcli n on a single lool..up tabk \\~uld
require a table with 2^9 = 512 words. However, we can partition the circuit into groups such that each group has 3 inputs and 1 output: the first group would compute abc, the second def, the third ghi, and the fourth would OR the outputs of the first three groups to generate the output F. Each group could be implemented using a 3-input 1-output lookup table, meaning 8x1 memories. The resulting implementation would have four such lookup tables, as shown in Figure 7.20(b). The total words for that four-table implementation would be a mere 8 + 8 + 8 + 8 = 32 words, far less than the 512 words required for a single 9-input lookup table. Figure 7.20(c) compares the relative sizes of a 512-word memory and four 8-word memories. Notice the tremendous reduction in size.
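The word counts above can be checked with a small Python sketch that models a lookup table as a list indexed by the input bits read as an address. This is only an illustration under our own assumptions: the helper names (build_lut, read_lut, f_direct, f_partitioned) are hypothetical and do not come from the text.

```python
from itertools import product

def build_lut(func, n_inputs):
    """Store func's truth table as a list; the input bits, read MSB first, form the address."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def read_lut(lut, *bits):
    """Read the word stored at the address formed by the given input bits."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return lut[address]

def f_direct(a, b, c, d, e, f, g, h, i):
    """F = abc + def + ghi, computed directly."""
    return (a & b & c) | (d & e & f) | (g & h & i)

# One 9-input 1-output lookup table.
single_lut = build_lut(f_direct, 9)

# Four 3-input 1-output lookup tables: three ANDs feeding one OR.
and_luts = [build_lut(lambda x, y, z: x & y & z, 3) for _ in range(3)]
or_lut = build_lut(lambda x, y, z: x | y | z, 3)

def f_partitioned(a, b, c, d, e, f, g, h, i):
    t1 = read_lut(and_luts[0], a, b, c)
    t2 = read_lut(and_luts[1], d, e, f)
    t3 = read_lut(and_luts[2], g, h, i)
    return read_lut(or_lut, t1, t2, t3)

single_words = len(single_lut)
partitioned_words = sum(len(t) for t in and_luts) + len(or_lut)
same_function = all(
    read_lut(single_lut, *bits) == f_partitioned(*bits)
    for bits in product((0, 1), repeat=9)
)
```

Here single_words and partitioned_words correspond to the 512-word and 32-word totals discussed above, and same_function confirms that both implementations compute the same F for every input combination.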
Figure 7.20 Dividing a many-input circuit among smaller lookup tables reduces total lookup table size: (a) 9-input circuit, (b) circuit mapped to 3-input 1-output lookup tables, (c) size savings compared to a 9-input 1-output lookup table.

Partitioning a function among small lookup tables is more efficient than implementing a function on one large lookup table. But what is a "small" lookup table: a table with 2 inputs, 3 inputs, 4 inputs, 7 inputs, or maybe even 10 inputs? Researchers have conducted numerous studies on large numbers of typical circuits, and found that 3-input or 4-input lookup tables seem to work best for most circuits. Furthermore, researchers found that 2-output lookup tables also seem to work well for most examples. Thus, we'll use 3-input 2-output lookup tables from this point forward.

EXAMPLE 7.7 Partitioning a circuit among 3-input 2-output lookup tables

Implement the circuit shown in Figure 7.21(a) using 3-input 2-output lookup tables. We begin by trying to partition the circuit into groups such that each group has at most 3 inputs and 2 outputs. However, the 4-input AND gate prevents us from successfully performing such partitioning, because whatever group that gate is in will have at least four inputs. To remedy this problem, we decompose that gate into two smaller gates, while maintaining the same functionality, as shown in Figure 7.21(b). We can then partition the circuit into two groups, each with 3 inputs and 1 output, as shown in the figure. We've numbered the inputs to each group to make clear that each group has three inputs. We then map those groups onto two 3-input 2-output lookup tables as shown in Figure 7.21(c). Notice that the first lookup table's D1 output is unused, and the second table's D0 output is unused. The first table's D0 column implements t = abc. The second table's D1 column implements F = td + e.

Figure 7.21 Partitioning a circuit onto two lookup tables: (a) original circuit, (b) transformed circuit that breaks the 4-input AND gate into two smaller gates, and that then shows the 3-input 1-output groupings, (c) mapping of each group to a lookup table, with the group's function converted to programmed bits in the lookup table. Italicized bits are unused.

In the previous example, notice that we did not use one of the columns in the first lookup table, and did not use one of the columns in the second lookup table either. Using lookup tables sometimes results in unused memory cells. Using lookup tables also sometimes results in unused lookup table words, as illustrated in the following example.

EXAMPLE 7.8 Mapping a 2x4 decoder to 3-input 2-output lookup tables

Let's implement a 2x4 decoder, without enable, using 3-input 2-output lookup tables. A 2x4 decoder has two inputs, i1 and i0, and four outputs, d0, d1, d2, and d3. A mapping is shown in Figure 7.22. The equations for each output are d0 = i1'i0', d1 = i1'i0, d2 = i1i0', and d3 = i1i0. The lookup tables implement those equations using the top halves of the tables' words; the bottom halves are unused.

Figure 7.22 Mapping a 2x4 decoder to two 3-input 2-output lookup tables: (a) desired circuit, (b) mapping to two lookup tables. Italicized bits are unused.
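The decoder mapping of Example 7.8 can be sketched in Python as follows. The stored words are our reading of the table contents in Figure 7.22, and we assume that address input a2 is tied to 0, that a1 and a0 carry i1 and i0, and that the first table's D1 and D0 columns drive d0 and d1 while the second table's drive d2 and d3. The names LEFT_LUT, RIGHT_LUT, read_word, and decoder_2x4 are hypothetical.

```python
# Each 8x2 memory is a list of eight 2-bit words; bit 1 is column D1, bit 0 is column D0.
# Words 4 through 7 (the bottom halves) are never addressed, so they stay 00.
LEFT_LUT = [0b10, 0b01, 0b00, 0b00, 0b00, 0b00, 0b00, 0b00]   # D1 = d0, D0 = d1
RIGHT_LUT = [0b00, 0b00, 0b10, 0b01, 0b00, 0b00, 0b00, 0b00]  # D1 = d2, D0 = d3

def read_word(lut, a2, a1, a0):
    """Return (D1, D0) for the word at address a2 a1 a0."""
    word = lut[(a2 << 2) | (a1 << 1) | a0]
    return (word >> 1) & 1, word & 1

def decoder_2x4(i1, i0):
    """2x4 decoder built from two lookup tables, with a2 assumed tied to 0."""
    d0, d1 = read_word(LEFT_LUT, 0, i1, i0)
    d2, d3 = read_word(RIGHT_LUT, 0, i1, i0)
    return d0, d1, d2, d3

# Only addresses 0 through 3 are ever used in each table.
unused_words_per_table = len(LEFT_LUT) - 4
```

Calling decoder_2x4 with each of the four input combinations sets exactly one output to 1, which matches the equations given in the example.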
An FPGA may come with tens, hundreds, or even thousands of lookup tables, and thus can implement large amounts of combinational logic.

Programmable Interconnects (Switch Matrices)

In the previous examples, we have been creating customized connections between lookup tables. However, the point of FPGAs is that the entire chip is prefabricated, including the wires. FPGAs therefore come with programmable interconnects, sometimes called switch matrices, which allow us to program the connections among lookup tables. Figure 7.23 shows a simple FPGA chip having six inputs (P0-P5), two 3-input 2-output lookup tables, one 4-input 2-output switch matrix, and four outputs (P6-P9). All three of the left lookup table's inputs come from the external inputs P1, P2, and P3; that lookup table's inputs can't be changed. However, two of the right lookup table's inputs may come from either the left lookup table's outputs, or from the external inputs P4 and P5. The switch matrix determines which of those connections will be made.

Figure 7.23 A simple FPGA architecture: (a) an FPGA that includes a switch matrix, and (b) the switch matrix's internals, showing two 4x1 muxes controlled by two 2-bit registers. Note: real FPGAs have hundreds of lookup tables and switch matrices, not just a few.

The switch matrix's internal design appears on the right of Figure 7.23. It consists of two 4x1 multiplexers. The top mux connects the switch matrix output o0 to one of the matrix's four inputs. The bottom mux connects the output o1 to one of the matrix's four inputs. A two-bit memory (which is actually a 2-bit register, but called a memory for consistency with the memory inside a lookup table) holds the two bits that set each mux's two select lines. Thus, we can program the desired connections simply by writing the appropriate bits into those two 2-bit memories. Notice that each switch matrix output can be configured independently of the other. In fact, we could even make the same input appear at both outputs, though that's probably not useful in this FPGA design.

We'll illustrate the use of the switch matrix with an example.

EXAMPLE 7.9 A 2x4 decoder on an FPGA with a switch matrix

We repeat Example 7.8 here using the FPGA shown in Figure 7.23(a). We can easily get the proper inputs to the first lookup table, the same as in Example 7.8, by connecting external input i1 and external input i0 to the appropriate FPGA inputs, as shown in Figure 7.24(a). To get the proper inputs to the second lookup table, we first connect external input i1 and external input i0 to the FPGA inputs that feed into the switch matrix. We then configure the switch matrix such that switch matrix input m2 passes through to switch matrix output o0, which means that external input i1 passes through to switch matrix output o0. We achieve that configuration by programming 10 into the top 2-bit register in the switch matrix, as shown in Figure 7.24(b). Likewise, we configure the switch matrix such that switch matrix input m3 passes through to switch matrix output o1, meaning external input i0 passes through to switch matrix output o1. We achieve that configuration by programming 11 into the bottom 2-bit register in the switch matrix. Because the switch matrix outputs connect to the right lookup table's inputs, we've successfully connected external inputs i1 and i0 to the second lookup table's inputs, as desired. We program the two lookup tables as we did in Example 7.8. Thus, external outputs d0-d3 can be found at the FPGA external pins as shown in Figure 7.24(a).

Figure 7.24 Implementing a 2x4 decoder on the FPGA fabric having a switch matrix: (a) external connections, and programmed bits in the lookup tables and switch matrix, and (b) a look inside the switch matrix, showing the programmed connections between the outputs and inputs. Italicized bits in the lookup tables are unused.

EXAMPLE 7.10 Extended seat belt warning light on an FPGA

We are to implement the extended seat belt warning light system from an earlier example on the FPGA shown in Figure 7.23. (Figure 7.19 showed how to partition a similar circuit into two groups, with equations x = kps' and w = x + t + d. For this example, w = x + t.) We connect k, p, and s to the FPGA pins going to the left lookup table, and we program that lookup table to implement the function kps', as shown in Figure 7.25. We connect an output of the left lookup table, representing x, to the right lookup table, by programming the switch matrix to connect m0 to o0. We connect t to the right lookup table also, by connecting t to an external pin connected to the switch matrix input m2, and then by configuring the switch matrix to connect m2 to o1. We then program the right lookup table to implement the function x + t, as shown in Figure 7.25.
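The switch matrix behavior described above, and its use in Example 7.10, can be sketched in Python. This is a simplified model under stated assumptions: each lookup table is shown with 1-bit words for brevity, the left table's output is assumed to arrive on m0, t is assumed to arrive on m2, and the right table's a2 input is assumed tied to 0. All function and variable names are hypothetical.

```python
from itertools import product

def switch_matrix(m, top_select, bottom_select):
    """Two 4x1 muxes: each 2-bit select value picks which input m0..m3 reaches o0 or o1."""
    return m[top_select], m[bottom_select]

def make_lut(func):
    """Build an 8-word table addressed by (a2, a1, a0), with a2 as the most significant bit."""
    return [func(a2, a1, a0) for a2, a1, a0 in product((0, 1), repeat=3)]

def read_lut(lut, a2, a1, a0):
    return lut[(a2 << 2) | (a1 << 1) | a0]

LEFT_LUT = make_lut(lambda k, p, s: k & p & (1 - s))  # x = kps'
RIGHT_LUT = make_lut(lambda a2, x, t: x | t)          # w = x + t (a2 is ignored)

def warning_light(k, p, s, t):
    x = read_lut(LEFT_LUT, k, p, s)
    m = (x, 0, t, 0)  # assumed wiring: m0 = left table output, m2 = pin carrying t
    # Select 00 routes m0 to o0; select 10 routes m2 to o1.
    o0, o1 = switch_matrix(m, top_select=0b00, bottom_select=0b10)
    return read_lut(RIGHT_LUT, 0, o0, o1)

# Example 7.9 uses selects 10 and 11 instead, routing m2 and m3 to the outputs.
example_7_9_routing = switch_matrix(("m0", "m1", "i1", "i0"), 0b10, 0b11)
```

Changing only the two select values and the table contents yields a different circuit on the same fabric, which is the point the two examples make.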
Figure 7.25 Implementing the extended seat belt warning light circuit on the FPGA fabric having a switch matrix: (a) external connections and programmed bits, (b) a look inside the switch matrix, showing the programmed connections. Italicized bits in the lookup tables are unused.

Notice that, in the previous two examples, we implemented two different circuits using the same FPGA chip. To implement the two different circuits, we merely had to program different bits into the lookup tables and switch matrices. That's the appeal of FPGAs: they implement our circuit just by programming.

Configurable Logic Block

In the previous section, the illustrated FPGAs were missing a critical element needed to implement general circuits, namely, flip-flops. Without flip-flops, FPGAs could not implement sequential circuits.

FPGAs may include a flip-flop with each output of a lookup table, meaning two flip-flops in the case of a 2-output lookup table. The lookup table and its flip-flops together are known as a configurable logic block, or CLB. A simple CLB is shown in Figure 7.26. Each configurable logic block has a 3-input 2-output lookup table, and has two outputs and two flip-flops. Each flip-flop is loaded every clock cycle with the corresponding lookup table output. Each output of the CLB can be configured to either come from the output's flip-flop, or directly from the corresponding lookup table output. That configuration is done by programming a 1-bit memory (which itself is a flip-flop, but we'll call it a memory to avoid confusion), shown in Figure 7.26, that controls a 2x1 mux for each CLB output. The output flip-flops enable us to implement sequential circuits, that is, circuits having registers, on the FPGA.

Figure 7.26 An FPGA with configurable logic blocks, which contain flip-flops along with a lookup table. We've put 0s in all the configuration memory bit cells in the figure.

EXAMPLE 7.11 Implementing a sequential circuit on an FPGA

We wish to implement the circuit shown in Figure 7.27(a) on the FPGA of Figure 7.26. We first connect a and b to the left lookup table, and c and d to the right lookup table through the switch matrix, as shown in Figure 7.27(c). We program the left lookup table to output the functions a' and b', as shown in Figure 7.27(b). Likewise, we program the right lookup table to output c' and d'. We program all the configurable logic block outputs to connect to their flip-flops by programming 1s into the CLB output configuration memories, as shown in Figure 7.27(c).
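A configurable logic block as described above can be modeled with a short Python sketch. The class name CLB, its method names, and the convention that a configuration bit of 1 selects the flip-flop are our assumptions (the last one follows Example 7.11, which programs 1s to connect outputs to flip-flops). The table contents assume a2 = 0, a1 = a, a0 = b, with D1 = a' and D0 = b'.

```python
class CLB:
    """A 3-input 2-output lookup table with one flip-flop and one 2x1 mux per output."""

    def __init__(self, lut_words, output_config):
        self.lut = list(lut_words)          # eight 2-bit words
        self.output_config = output_config  # (cfg for D1, cfg for D0); 1 = flip-flop, 0 = direct
        self.flip_flops = [0, 0]            # [flip-flop on D1, flip-flop on D0]

    def lookup(self, a2, a1, a0):
        word = self.lut[(a2 << 2) | (a1 << 1) | a0]
        return [(word >> 1) & 1, word & 1]

    def clock(self, a2, a1, a0):
        """On a clock edge, each flip-flop loads its lookup table output."""
        self.flip_flops = self.lookup(a2, a1, a0)

    def outputs(self, a2, a1, a0):
        """Each 2x1 mux picks the flip-flop value or the direct lookup table output."""
        direct = self.lookup(a2, a1, a0)
        return [ff if cfg else d
                for cfg, ff, d in zip(self.output_config, self.flip_flops, direct)]

# Left CLB of Example 7.11: D1 = a', D0 = b'; words 4 through 7 are unused.
LEFT_WORDS = [0b11, 0b10, 0b01, 0b00, 0b00, 0b00, 0b00, 0b00]

registered = CLB(LEFT_WORDS, output_config=(1, 1))
before_clock = registered.outputs(0, 1, 0)   # flip-flops still hold their initial 0s
registered.clock(0, 1, 0)                    # load a' = 0 and b' = 1 for a = 1, b = 0
after_clock = registered.outputs(0, 0, 0)    # outputs follow the flip-flops, not the inputs

combinational = CLB(LEFT_WORDS, output_config=(0, 0))
direct_outputs = combinational.outputs(0, 0, 0)
```

The same table contents give either a registered or a purely combinational circuit, depending only on the two output configuration bits.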
Figure 7.27 Implementing a sequential circuit on an FPGA: (a) desired sequential circuit, (b) left CLB's lookup table program bits, (c) programmed FPGA. Unused bits are italicized.

Care should be taken to avoid confusing the output flip-flops themselves and the CLB output configuration "memories": the configuration memories store bits that program the FPGA to implement the desired circuit, before circuit operation, while the output flip-flops store the bits that the circuit loads during circuit operation.

The storage elements for the lookup tables, the CLB output configuration, and the switch matrices are collectively known as an FPGA's configuration memory, although that "memory" consists of numerous smaller memories and even registers or flip-flops.

Overall FPGA Architecture

A commercial FPGA contains hundreds or even thousands of CLBs and switch matrices, arranged in a regular pattern on the chip. A sample arrangement is shown in Figure 7.28. CLBs connect with horizontal and vertical routing channels, which connect to switch matrices. A sample connection of a CLB to the routing channels is shown for the top center CLB. The routing channels consist of tens of wires, represented in the figure just as single bolded wires.

Figure 7.28 FPGA architecture: a grid of CLBs and switch matrices.

CLBs and switch matrices in commercial FPGAs are more complex than described in this chapter. For example, CLBs may contain two lookup tables, or direct connections to adjacent CLBs to support carry chains. Switch matrices may contain more inputs and outputs and more flexible switching options. Furthermore, commercial FPGAs may also include large embedded RAM memories for data storage, and embedded multipliers or multiply-accumulate units for fast multiplications.

Programming an FPGA

We haven't said anything yet about how we actually program the lookup tables, switch matrix configuration memories, and CLB-output configuration memories; in particular, how do we get the program bits into the configuration memories? The configuration memories are all the lookup table memories, the switch matrix memories, and the CLB-output configuration memories. Conceptually, programming is enabled by the FPGA having all the configuration memory bit storage cells connected as one big shift register. That shift register's bit storage cells are spread out across the chip, so they don't represent a traditional register whose bits are usually in one place, but thinking of them as a shift register helps in understanding their connectivity. Actually, storage cells connected as a shift register are typically referred to as a scan chain. The FPGA will have an extra input pin for programming that serves as the shift input for the shift register. Another extra input pin indicates that programming is taking place. During programming, we shift in the bits necessary to implement our desired circuit.

Remember that the configuration memory cells only get written during programming of the FPGA; during normal FPGA operation, those configuration memory cells become read-only. Thus, one can conceive of FPGAs whose configuration memories are made from programmable read-only memory technology (PROM, EPROM, or EEPROM), although today most FPGAs use RAM and flip-flop components for configuration memories. RAM and flip-flops are probably used because those components need to be programmed quickly using the scan chain method, which is easily achieved using RAM/flip-flop components, but not so easily using EPROM or EEPROM components.

Automated tools that program FPGAs usually start with a file containing the bits to be shifted into the FPGA chain; that file is known as a bit file. The tool that creates the bit file obviously must know the number and purpose of every bit cell in the FPGA scan chain, so such tools will generate a different bit file for different FPGA devices.

EXAMPLE 7.12 Programming an FPGA

This example demonstrates programming an FPGA for the FPGA and desired circuit shown in Example 7.11. Figure 7.27 from Example 7.11 showed the required contents of the configuration memory on the FPGA to implement the desired circuit. We replicate the contents in Figure 7.29(a), this time illustrating the manner in which the FPGA has the configuration memory bits connected as a scan chain. Figure 7.29(b) shows how that scan chain conceptually forms a 40-bit shift register. Figure 7.29(c) shows the contents of a bit file that could be used to program the FPGA to implement the desired circuit. We created that bit file simply by following the dashed line that represents the scan chain, placing 1s and 0s into the bit file as we see them in the figure.
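The scan chain idea can be sketched in Python as a shift register that receives the bit file one bit per programming clock. The cell ordering below (left lookup table, left CLB output configuration, switch matrix, right lookup table, right CLB output configuration) and the switch matrix select values are purely hypothetical, since the actual order follows the dashed line in Figure 7.29; only the 40-bit total comes from the example.

```python
def words_to_bits(words, width=2):
    """Flatten a list of fixed-width words into a list of bits, MSB first."""
    return [(w >> shift) & 1 for w in words for shift in range(width - 1, -1, -1)]

def program_scan_chain(bit_file, chain_length):
    """Shift bits in at cell 0; every existing bit moves one cell further down the chain."""
    cells = [0] * chain_length
    for bit in bit_file:
        cells = [bit] + cells[:-1]
    return cells

# Hypothetical desired contents, listed in chain order starting at the shift input.
left_lut = words_to_bits([0b11, 0b10, 0b01, 0b00, 0, 0, 0, 0])   # outputs a' and b'
right_lut = words_to_bits([0b11, 0b10, 0b01, 0b00, 0, 0, 0, 0])  # outputs c' and d'
clb_output_config = [1, 1]                                        # both outputs use flip-flops
switch_matrix_selects = [1, 0, 1, 1]                              # assumed select values

desired_cells = (left_lut + clb_output_config + switch_matrix_selects
                 + right_lut + clb_output_config)

# The first bit shifted in ends up in the last cell, so the bit file lists
# the desired contents starting from the far end of the chain.
bit_file = desired_cells[::-1]
programmed_cells = program_scan_chain(bit_file, len(desired_cells))
```

Because the chain length and the meaning of each cell differ between devices, a bit file built this way is only valid for the device whose chain layout it assumes.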
Figure 7.29 Programming an FPGA: (a) all configuration bit cells exist in a scan chain, (b) a scan chain conceptually is a big shift register, (c) a bit file's contents would be shifted in during programming; some relationships between the file's bits and configuration bit cells are shown.

How Many Gates Does an FPGA Implement?

We usually think of a digital circuit's size using the notion of "gates" to represent design size. A design with 3000 gates is likely bigger than a design with 2000 gates. Of course, whether that statement is true depends on the type of gates used in each design (e.g., because XOR gates are bigger than NAND gates, 2000 XOR gates may actually be bigger than 3000 NAND gates), as well as the number of inputs to each gate (a 20-input gate is bigger than a 2-input gate). Thus, a common method of indicating design size for a circuit approximates the number of 2-input NAND gates that would be required to implement the circuit. So when we say that a circuit consists of 3000 gates or 2000 gates, we typically mean that if those circuits were implemented using 2-input NAND gates, they would require 3000 2-input NAND gates and 2000 2-input NAND gates, respectively.

FPGAs have lookup tables and switch matrices inside, not gates. FPGA sizes are therefore typically reported by considering how large a circuit made up of 2-input NAND gates could be implemented using the FPGA architecture. FPGA vendors may report FPGA size by saying a particular FPGA has a density of "100,000 system gates" or "100,000 typical gates." These numbers are approximations, and many people view such reported numbers very skeptically (because sometimes companies like to exaggerate). FPGA vendors might also describe FPGA size as the number of "logic blocks" or "lookup tables," which is useful when comparing sizes of FPGAs having the same type of logic blocks or lookup tables.

FPGA versus ASICs and Microprocessors

FPGAs are less efficient than ASICs in terms of delay, size, and power. For example, the circuit of Figure 7.22(a) could be implemented with a delay of just one gate-delay in a custom or semicustom IC technology. However, when mapped to the FPGA of Figure 7.26, that circuit will have a longer delay: the inputs must pass through the left CLB's lookup table (which may have a delay of two gate-delays), through the left CLB's output muxes (another two gate-delays), through the switch matrix (another two gate-delays), through the right CLB's lookup table (another two gate-delays), and finally through the right CLB's output muxes (two more), resulting in a total of ten gate-delays. In terms of size, an ASIC implementation of the circuit of Figure 7.22(a) would require about 20 transistors, whereas the FPGA implementation using two CLBs and a switch matrix would require several hundred transistors.

An FPGA implementation of a circuit will therefore be slower and bigger than an ASIC implementation of the same circuit. Some studies have shown that FPGAs are approximately 10 times slower, and 10-30 times bigger, than ASIC implementations of the same circuit. Similarly, a circuit implemented on an FPGA may consume about 10 times more power than when implemented on an ASIC. But the advantage of being able to program FPGAs immediately and for almost no cost, rather than having to wait weeks or months while spending tens of thousands of dollars, often outweighs those disadvantages.

Despite the performance, size, and power overhead compared to ASICs, FPGAs are still much faster than software on a microprocessor for many tasks, in part because FPGAs can effectively implement concurrency, pipelining, and bit-level operations. Thus, FPGAs possess the programming flexibility of software on a microprocessor, yet approach the performance of an ASIC, representing an excellent implementation option for many designs.

7.4 OTHER TECHNOLOGIES

In this section, we describe other technologies for physically implementing digital circuits. Some of those technologies are older technologies that are still useful for particular situations. Others are newer technologies that are beginning to gain popularity.

Off-the-Shelf Logic (SSI) IC

Sometimes we need only implement a circuit having just a few gates. In these cases, using an FPGA may be overkill, as FPGAs typically support thousands or millions of gates. Likewise, using an ASIC would also be overkill. For cases where we only need a
few gates, we might instead use one or more off-the-shelf logic ICs. A logic IC typically contains a few, perhaps ten or fewer, gates connected directly to the IC's pins, as shown in Figure 7.30. The IC shown has four AND gates and 14 pins. One pin is for power to the IC (known as VCC), another for ground (GND). The remaining pins connect to the four AND gates in the IC, as shown in the figure. Different logic ICs have gate types other than AND, such as OR, NAND, NOR, or NOT. To build a small circuit from these off-the-shelf logic ICs, we would simply place the ICs on a board and connect the appropriate pins. ICs with only a few gates are known as Small-Scale Integration chips, or SSI chips.

Figure 7.30 Example logic IC.

7400 ICs

The most popular off-the-shelf SSI ICs are known generally as 7400-series ICs. A 7400 IC typically contains four to six logic gates, and about 14 pins. A particular 7400 IC is shown in Figure 7.31. The IC measures about 1/2 inch across. The IC package shown has two rows, or lines, of pins, and is thus known as a dual-inline package, or DIP.

Figure 7.31 7400-series IC.

7400 ICs first became available in the early 1960s. The original 7400 chip had four NAND gates, and cost about $1000 each, in 1962. That's right: $1000. And that's in 1960s dollars, when a U.S. engineer earned only about $10,000/year. The price dropped significantly during that decade, thanks in large part to the use of huge numbers of the devices by the U.S. Minuteman Missile and the Apollo rocket programs, and has continued to drop since then due to cheaper transistors and huge volumes. Today, you can buy 7400-series ICs for just tens of cents each.

Parts with different gates have different part numbers. Table 7.1 shows some commonly used 7400 parts from Fairchild's 74LS00 subfamily of the 7400 series. In addition to basic gates, the table shows ICs with D flip-flops, full-adders, or a magnitude comparator. Parts also exist for XOR, XNOR, buffers, decoders, multiplexers, up-counters, up-down-counters, and more.

TABLE 7.1: Commonly used 7400-series ICs.

Part     Description                                                        Pins
74LS00   Four 2-input NAND                                                  14
74LS02   Four 2-input NOR                                                   14
74LS04   Six inverters                                                      14
74LS08   Four 2-input AND                                                   14
74LS10   Three 3-input NAND                                                 14
74LS11   Three 3-input AND                                                  14
74LS14   Six inverters (Schmitt trigger)                                    14
74LS20   Two 4-input NAND                                                   14
74LS27   Three 3-input NOR                                                  14
74LS30   One 8-input NAND                                                   14
74LS32   Four 2-input OR                                                    14
74LS74   Two D flip-flops, positive edge triggered, with preset and reset   14
74LS83   4-bit binary full-adder                                            16
74LS85   4-bit magnitude comparator                                         16

There are several different subfamilies of 7400-series parts; parts from a subfamily can be used with other parts from the subfamily, but generally not with parts from other subfamilies. The reason is that the voltage and current settings of a subfamily are designed such that the ICs can be connected without worrying about adjusting the voltage and current between ICs. The 74 series (e.g., 7400, 7402, etc.) is the basic subfamily, based on a type of transistor circuitry known as TTL; designers using logic ICs today only use 74-series ICs if they must integrate with old designs, and typically don't use the series for new designs. The 74LS subfamily (e.g., 74LS00, 74LS02) uses a special type of TTL technology known as Schottky that results in lower power and slightly higher speed than the 74 series; the "L" in the name means low-power, the "S" means Schottky. The 74HC subfamily uses high-speed (denoted by the "H") CMOS (denoted by the "C") transistors. The 74F subfamily was introduced by Fairchild, consisting of fast (hence the "F") advanced Schottky TTL logic. Numerous other 7400 subfamilies exist, with new subfamilies still being introduced. Furthermore, additional series of off-the-shelf SSI ICs also exist in addition to the 7400 series. Another popular series is the 4000 series of ICs, a CMOS series that evolved in the 1970s as a low-power alternative to the TTL-based 7400 series. More series exist too.

EXAMPLE 7.13 Seat belt warning implementation using off-the-shelf 7400 ICs

Using 74LS-series ICs shown in Table 7.1, physically implement the seat belt warning light circuit of Figure 7.1, shown again in Figure 7.32(a). We could implement the inverter using a 74LS04. The 74LS08 has 2-input AND gates, and we need a 3-input AND gate. A simple solution is to decompose the 3-input AND into two 2-input ANDs, as shown in Figure 7.32(b). The final implementation is shown in Figure 7.32(c).

Figure 7.32 Implementing the seat belt warning circuit with 74LS-series ICs: (a) desired circuit, (b) circuit transformed to use 2-input AND gates, (c) circuit mapped to two 74LS ICs. Additional connections not shown would be power to the I14 pin, and ground to the I7 pin, on each IC.
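The mapping just described can be double-checked with a small Python sketch that counts how many packages a gate list needs. The PARTS dictionary copies a few rows of Table 7.1; the function names and the gate-list representation are hypothetical, and the sketch assumes each gate is mapped to a part with exactly matching gate type and input count.

```python
from collections import Counter
from math import ceil

# (gate type, inputs per gate, gates per package), taken from Table 7.1.
PARTS = {
    "74LS04": ("NOT", 1, 6),
    "74LS08": ("AND", 2, 4),
    "74LS11": ("AND", 3, 3),
    "74LS27": ("NOR", 3, 3),
    "74LS32": ("OR", 2, 4),
}

def ics_needed(gates):
    """Return {part number: package count} for a list of (gate type, input count) gates."""
    result = {}
    for (kind, n_inputs), count in Counter(gates).items():
        part = next(name for name, (k, n, _) in PARTS.items()
                    if (k, n) == (kind, n_inputs))
        result[part] = ceil(count / PARTS[part][2])
    return result

def warn_two_level(k, p, s):
    """The decomposed circuit of Figure 7.32(b): (k AND p) AND (NOT s)."""
    return (k & p) & (1 - s)

# One inverter plus two 2-input AND gates.
packages = ics_needed([("NOT", 1), ("AND", 2), ("AND", 2)])
total_packages = sum(packages.values())
```

The result reflects the example's observation that the decomposed circuit occupies two ICs, with most of the gates in each package left unused.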
Preferably. we \\Q uld implement the circuit usi ng j U ~1 onL' IC. (0 reduce board size. cost. and 11 12 13
power. onvcning the circuit to use only one type of gatl!. like AN D gates onl y, or NOR gUles
'-:±'= i---f'-~,:-'-'-'-'''-'-''-'---''l figure 7.34 A basic example of a programmable
onl y. could result in just one IC. For example. if we could t:onvcrt to 3-i npul OR gales. we could
use the 74LS27 chip. \Ve s lart by conve ni ng the circui t to NOR~ onl y. as in Figure 7.33(1.1). We logic device. (AND gales are wired·AND.)
, },--
remml! the doub le in\'e rsion. and replace the si ngle in ve rsions by 3-i npul NOR gates. The im ple-
mentation u~i n g a 7-tLS27 Ie is shown in Figure 7.33(c),
)
programmable node
~

,.'~-f]-
r;~
, ~ ,

01
Figure 7.33 lmplementing Ihe seal
belt warning circui t with one 74LS
IC. namely. the 74LS17 consisting " unblown" fuse " blown" fuse
of three 3·inpul NOR gales: (a) (a)
desired circuil transformed 10 OR j - ---- -- ----- - -- --- i
:0 :
gales with inversion bubbles. (b) , ' p :
circuit wiLh double inversio ns
eliminated and si ngle inversion.s
, i i ~=f11==~t2~=t~3--lt4~1t5--lI6r--t7r­ ..-.------.+-...__._._..__ ~~~_~c;j
replaced by I·input OR gales. (c) 'w figure 7.35 Two Iypes of programmable
circuil mapped to a 74LS27 chip. l __~'-._-4~-..-<_~""._-._-.-_.""'.~.....__.....J.~mmm ... programmable nodes nodes: (al fuse based. (b) memory based.
Additional conneclions not shown ,,, (c )
"ould be power to the 11 4 pin. and , olle·time programmable (OTP) devices. Fuse· based PLDs are po I .
. 1' " ".
I .
pu ar In e ecmcally
ground to the 17 pin.
... - -- - - - ------ - - --- -..! nOIsy app Icatlon~, ltke space apphcauons, smce memories can have their contents
(b )
changed from radl3uon In space. Th~y are also popular in applications demanding b.igh
securIty, smce mahclous enemIes can t reprogram the device. Memory·based devices are
Simple Programmable Logic Device (SPLDI more common. however. SInce they can be reprogrammed and thus reduce costs when we
A programmable logic device, or PLD, is an IC that can be configured to implement a variety of logic functions, ranging from tens to thousands of gates. PLDs became popular in the 1970s (thus predating FPGAs), as they could implement far more functionality in a single IC than possible using SSI ICs.

A PLD device contains a prefabricated circuit with a set of external inputs feeding into a large AND-OR circuit structure, with the special feature that the user can configure (via "programming") which external inputs connect to the AND gates. For example, Figure 7.34 shows a basic PLD with three inputs feeding into three AND gates followed by an OR gate. The inputs feed into the AND gates in both true and complemented forms. Each wire feeding into each AND gate passes through a programmable node, which can either pass the node's input to the node's output, or disconnect the node's input from the node's output. Thus, by programming the programmable nodes, we can program the PLD to implement any 3-term function of three inputs.

The programmable node design varies among types of PLDs. Figure 7.35 shows two types. The type shown in Figure 7.35(a) is based on a fuse. A fuse conducts like a wire, unless we "blow" the fuse, meaning we pass a higher-than-normal current through the fuse, causing the fuse to literally burn up and break. A blown fuse obviously does not conduct electricity. The type shown in Figure 7.35(b) is based on memory and a transistor: we program a 1 into the memory to cause the transistor to conduct, and a 0 to cause the transistor to not conduct. We omit the details of how to program the fuses or program the memories themselves. Memory-based PLDs can usually be reprogrammed, in contrast to fuse-based PLDs that can only be programmed once and are known as [...] make design changes. The memories used are almost always nonvolatile, meaning the memories don't need power to retain their stored bits. (See Section 5.6 for more information on nonvolatile memories.)

You might be wondering how those AND gates work when the programmable node is programmed to disconnect an input: how does the AND gate treat an input with no connection? As a 0, as a 1, or as something else? Actually, PLDs don't use normal AND gates. Instead, they typically use what is known as "wired-AND." Explaining how wired-AND works is beyond the scope of this book, and instead the subject of a course on transistor-level circuits. For our purposes, we can think of a wired-AND gate as an AND gate that simply ignores unconnected inputs.

Real PLDs have more than just three inputs, three AND gates, and one output. PLD structure drawings thus need a more concise way of drawing the circuits. A concise method of drawing PLDs is shown in Figure 7.36. Such a drawing doesn't show the programmable nodes, and simply utilizes an "x" to indicate a connection. In the drawing, wires that cross each other are not connected unless an "x" exists at the crossing. Furthermore, such a drawing uses a single wire to represent all the AND gate inputs, representing the wired-AND.

Figure 7.36 Simplified PLD drawing.
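To make the programmable-node idea concrete, the following is a minimal Python sketch of the basic PLD described above: three inputs, three AND gates, and one OR gate. The encoding of a "program" as sets of connected literals, and the names wired_and and pld_output, are assumptions made for illustration and do not correspond to any real PLD tool. The wired-AND is modeled, as suggested above, as an AND gate that simply ignores unconnected inputs; the sketch further assumes that a gate with nothing connected outputs 1.

```python
from itertools import product

I1, I2, I3 = 0, 1, 2  # positions of the three external inputs


def wired_and(inputs, connections):
    """AND only the connected literals; unconnected inputs are ignored.

    `connections` is a set of (input_index, complemented) pairs, one pair
    per programmable node that is left connected (hypothetical encoding).
    """
    values = [1 - inputs[i] if complemented else inputs[i]
              for i, complemented in connections]
    # Assumption: an AND gate with no connected inputs outputs 1.
    return int(all(values))


def pld_output(inputs, and_gates):
    """OR together the outputs of all AND gates (the fixed OR stage)."""
    return int(any(wired_and(inputs, gate) for gate in and_gates))


# One possible 3-term function of three inputs:
# F = I1*I2' + I2*I3 + I1'*I3'
program = [
    {(I1, False), (I2, True)},   # I1 * I2'  (I3 left unconnected)
    {(I2, False), (I3, False)},  # I2 * I3   (I1 left unconnected)
    {(I1, True), (I3, True)},    # I1' * I3' (I2 left unconnected)
]

truth_table = {bits: pld_output(bits, program)
               for bits in product((0, 1), repeat=3)}

for bits, out in truth_table.items():
    print(bits, out)
```

Reprogramming the PLD in this model just means supplying a different list of connection sets; the AND-OR structure itself never changes.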
The figure shows how we would use such a drawing to indicate the connections needed to generate the term I3*I2'. The "x" on the left represents I2' feeding into the top AND gate. The "x" on the right indicates I3 feeding into the top AND gate.

EXAMPLE 7.14 Seat belt warning light using a simple PLD

We are to implement the seat belt warning light system of Figure 7.1 using the PLD of Figure 7.36. We can do so by programming the PLD as shown in Figure 7.37. We generate the desired term kps' by programming the connections for the top AND gate as shown. We want the bottom two AND gates to output 0, so that the OR gate's output equals the top AND gate's output. We can achieve 0s by ANDing an input with its complement, since the result of a*a' is always 0. The figure shows two ways of achieving 0s, with the middle gate using just one of the inputs, and the bottom gate using all three inputs; the result is the same.

Figure 7.37 Seat belt warning system on a simple PLD.

PLDs typically have more than just one output. Figure 7.38(a) shows a PLD with two outputs instead of just one. Each output is an OR of up to three terms.

Many PLDs have a D flip-flop that stores each output's bit, and the PLD's output pin can be programmed to connect from the OR gate output or the flip-flop output, known as combinational or registered output, respectively. A PLD supporting combinational/registered output is shown in Figure 7.38(b).

Figure 7.38 (a) PLD with two outputs, (b) PLD with programmable registered outputs.

Another extension is to allow the PLD output to be either the true or complemented value of the OR gate or flip-flop output, using a 2x1 mux controlled by a programmable bit. Yet another extension is for the output to feed back to the input array. One use of feedback is to implement functions with more terms, achieved by feeding back the combinational output value. Another very common use of feedback, achieved by feeding back the registered output, is to implement a state register and control logic (i.e., a controller): the AND array gets its inputs from the registered outputs and external inputs, and the OR gates then generate the external outputs and the next values for the state register.

Some PLDs not only have a programmable AND array, but also have a programmable OR array, meaning the OR gate can get its inputs from any of the AND gates.

SPLD versus PAL versus GAL versus PLA

Like so many names in the rapidly evolving field of high technology, names for PLDs are a bit blurred and confusing. Originally (1970s), PLDs consisted of programmable AND arrays and programmable OR arrays, and were known as programmable logic arrays, or PLAs. In the mid-1970s, a company named AMD (Advanced Micro Devices, Inc.) developed PLDs that instead had OR gates with fixed rather than programmable inputs, as in Figure 7.38 and the other PLD figures we've shown, and referred to such devices as Programmable Array Logic, or PALs ("PAL" is a registered trademark of AMD). PALs were originally fuse-based and hence one-time-programmable. A company named Lattice Semiconductor Corporation developed a PLD using a memory-based programming approach rather than fuses, resulting in reprogrammability, and referred to such devices as Generic Array Logic, or GAL (which are registered trademarks of Lattice Semiconductor Corporation). As PLDs became more complex (as we'll discuss in the next section), PLDs based on PAL or GAL architectures (PLA architectures seem to be pretty rare) became known as Simple PLDs, or SPLDs, to contrast them with the more complex PLD varieties. Today, numerous companies manufacture SPLDs, and often state that their SPLD architecture is based on "PAL" or "PAL/GAL" architectures, with the distinction between PAL and GAL not seemingly relevant in that context.

SPLDs typically support tens to hundreds of logic gates.

Complex Programmable Logic Device (CPLD)

As IC transistor densities grew in the 1980s, companies began to build PLDs to support thousands of gates. However, the PLD architecture described in the previous section does not scale well to thousands of gates. Who needs one big huge circuit of two-level logic? Instead, architectures evolved that consisted of numerous SPLDs on a single device, connected using switch matrices (also known as programmable interconnect); see Section 7.3 for details on switch matrices. These devices today are known as Complex PLDs, or CPLDs. CPLDs can typically implement designs with thousands of gates.

SPLDs versus CPLDs versus FPGAs

What's the difference among SPLDs, CPLDs, and FPGAs? In general, the term SPLD is used for devices that support tens to hundreds of gates, CPLD for devices that support thousands of gates, and FPGA for devices that support tens of thousands of gates to millions of gates.

Furthermore, today, SPLDs and CPLDs are almost always nonvolatile, meaning they can store their program even after power is removed, whereas FPGAs are almost always volatile, meaning they lose their program when power is removed, and thus must include external circuitry that stores the program in nonvolatile memory and that programs the FPGA from that memory on power up of the FPGA. FPGAs today are likely volatile because of the way they are programmed using a scan chain, which is easy using flip-flops and RAM cells, but would be difficult using nonvolatile memory bits. However,
conceptually, any of SPLDs, CPLDs and FPGAs could be made to be volatile or nonvolatile, and one might anticipate that future FPGAs will include FPGAs that are nonvolatile.

FPGA-to-ASIC Flows

An interesting new technology that has evolved in the early 2000s is that of creating an ASIC from an FPGA-based design. Many designers use FPGAs for ASIC prototyping. They use automated tools to implement their circuit on FPGAs, and they then extensively test the circuit in the circuit's environment, for example, in a prototype DVD player. The FPGA-based prototype implementation may be larger, costlier, and more power-hungry than an ASIC-based implementation, but can be very useful for detecting and correcting errors in the circuit, as well as for demonstrating the eventual product. Once satisfied with the circuit, designers might then use automated tools to reimplement the circuit on an ASIC. The ASIC implementation traditionally did not utilize any information from the FPGA implementation.

Implementing large circuits on ASICs is a difficult task, even with automated tools. Nonrecurring engineering costs may exceed hundreds of thousands or even millions of dollars, and fabricating the IC may take many weeks or months. Furthermore, any problem with the fabricated ASIC may require a second fabrication cycle, requiring additional weeks or months. Problems may arise in the ASIC that didn't appear in the FPGA due to the completely new implementation of the circuit as an ASIC; perhaps timing problems might arise, for example, due to the circuit being placed and routed in a completely different fashion than was the case in the FPGA.

To ease the migration of a circuit from FPGA to ASIC, some FPGA vendors offer a structured ASIC approach. In a structured ASIC approach, an automated tool converts the FPGA implementation to an ASIC implementation, in contrast to converting the original circuit to an ASIC implementation. In other words, a structured ASIC will reflect the lookup table and switch matrix structure of the original FPGA. However, the structured ASIC will not be programmable, and thus will have faster lookup tables and faster switch matrices, because their contents will have been "hard wired" into the ASIC. The structured ASIC's cells can be preplaced, with only wires left to be completed to implement a particular circuit. The result is less NRE cost (tens of thousands of dollars rather than hundreds of thousands or millions) and less time-to-silicon (weeks rather than months), as well as less chance of unforeseen problems. The drawback is that the ASIC will be larger, slower, and more power-hungry than a traditional ASIC, but still better than an FPGA.

The advent of ICs containing a billion transistors has led to ICs that contain what used to exist on multiple ICs. Thus, a single IC may contain dozens or hundreds of microprocessors, custom digital circuits, memories, buses, etc. An IC with numerous processors, custom circuits, and memories is known as a System-on-a-Chip, or SOC.

While many SOCs are created by designers for a particular application (e.g., for a particular DVD player), other SOCs are created to be used in a variety of different applications. Such platform SOCs might contain processors and custom circuits specifically for an application domain. For example, a platform SOC for video processing might contain custom digital circuits having hardware optimized for high-speed low-power video compression and decompression (known as codecs). Such platforms often contain codecs for a wide variety of protocols (e.g., MPEG 2, MPEG 4, H.264, etc.), since the platform could be used in different products supporting different standards. An example is the Nexperia platform from Philips. Furthermore, some platform SOCs contain FPGAs in addition to one or more microprocessors and custom digital circuits on the IC. Examples include the Virtex II Pro platform from Xilinx and the Excalibur platform from Altera. Designers might utilize a platform SOC to prototype an ASIC, or to physically implement a system in a final product.

7.5 IC TECHNOLOGY COMPARISONS

Relative Popularity of IC Technologies

We've described numerous technologies in this chapter. In this section, we'll give you some idea of the relative popularity of some of those technologies. Table 7.2 provides the relative percentage of designs that were physically implemented in various technologies in 2001, based on a particular study. The table considers each new unique design only once, meaning that it doesn't matter how many copies of the same design were manufactured. That table's data does not include off-the-shelf SSI ICs or SPLDs (both represent only a tiny fraction of the IC market from a total dollars perspective, and are thus often excluded from such surveys). A different study describes 2002 IC revenues (as opposed to unique designs) totaling $11 billion as follows: standard cell [...]%, full custom 20%, gate array 10%, PLD/FPGA 17%, and other 5% (source: WSTS, IC Insights). Yet another study lists 2002 ASIC revenues at $10.9 billion, PLD/FPGA revenue at $[...] billion, and SOC revenues at $7.6 billion (source: Business Communications Company, 2003). Numbers from different studies vary; we provide these numbers just to give you a general feel for the popularity of the various technologies.

TABLE 7.2 Sample % of new implementations in various technologies. Total is more than 100% due to overlap among categories.

Technology          %
Standard cell       [...]
Gate array          5%
System-on-a-Chip    30%
Full custom         [...]
CPLD/FPGA           10%
Other               5%

Source: Synopsys, DAC 2002 panel.

In 2002 alone, nearly 80 billion ICs (of all types) were produced. (Source: IC Insights McClean Report, 2003.)

Some general trends seem to include the increasing popularity of FPGAs, the increasing use of structured ASIC approaches, and the increasing appearance of system-on-a-chip.

The tools used to map digital designs to physical implementations, collectively known as Electronic Design Automation tools, or EDA, themselves form a market with revenues of $3 billion in 2002, $3.6 billion in 2003, and predicted revenues of $[...] billion in 2006 (source: Gartner Dataquest, 2004).

Tradeoffs among IC Technologies

Figure 7.39 illustrates the general tradeoffs among the key IC technologies described in this chapter. Technologies toward the right can be more customized to a particular desired circuit, and thus may have faster performance, higher density (smaller chip for a given circuit),
lower power, and larger chip capacity (more circuits on a single chip). But such customized technologies will be more costly to design and will take time to design. Technologies toward the left are less customized to a particular desired circuit, and thus may be more quickly available and have lower design cost, but at the expense of slower performance, less density, higher power, and less chip capacity (fewer of our circuits on a single chip). More generally, technologies toward the right allow for more optimization. Technologies to the left yield less optimization, but yield easier design.

Figure 7.39 Tradeoffs among several IC technologies. From left to right: PLD, FPGA, gate array (semicustom), standard cell (semicustom), full-custom. Toward the left: quicker availability, lower design cost, easier design. Toward the right: faster performance, higher density, lower power, larger chip capacity, more optimized.

Furthermore, FPGAs and PLDs not only enable easier design, but may be reprogrammable, a feature that enables changes to the circuit late in the design cycle, or even after the circuit's IC has been deployed in a final product.

Choosing an IC technology for a particular design will therefore depend on the constraints imposed on that design. If a design needs to get to market quickly, that constraint favors PLD and FPGA technologies. If a design must be extremely fast, that constraint favors semicustom or full-custom technology. If a design must consume very little power or take up very little space, those constraints favor semicustom or full-custom technology. If changes to the circuit are likely, that constraint favors PLD and FPGA technologies. Choosing the best technology is a hard problem, requiring careful consideration of numerous competing constraints.

IC Technologies versus Processor Varieties

IC technologies and processor varieties are orthogonal implementation features. Two implementation features are orthogonal if we can select each independently (in mathematics, orthogonal means forming a right angle). We know that there are several processor varieties that can each implement a desired system function, including a custom processor, or a programmable processor. Figure 7.40 illustrates that the choice of processor variety is independent of the choice of IC technology. Point 1 illustrates the choice of implementing desired system functionality using a custom processor circuit with a full-custom IC technology. That choice results in a highly optimized design. Point 2 illustrates the choice of implementing a custom processor circuit on an FPGA. While the circuit may be optimized, the FPGA IC technology results in a less-optimized implementation (compared to full-custom) but an easier design. Point 3 illustrates the choice of implementing desired system functionality as software executing on a programmable processor, where the programmable processor is implemented in standard cells. Point 4 illustrates the choice of implementing software on a programmable processor, where the programmable processor is actually implemented on an FPGA. While that concept may seem strange, a programmable processor is just another circuit, so that circuit can be mapped to an FPGA just like any other circuit. Programmable processors mapped to FPGAs are in fact becoming increasingly popular, because a designer can choose how many processors to put on a single IC (perhaps the designer wants 9 programmable processors on one IC), and because a designer can put single-purpose processors alongside programmable processors, all without having to fabricate a new IC.

Figure 7.40 IC technologies and processor varieties are orthogonal implementation features. Four of the ten possible choices are shown.

Of course, programmable processors can often be purchased as off-the-shelf ICs, so a designer using a programmable processor may not have to worry about the processor's IC technology.

But increasingly, designers must place a programmable processor within their own IC, coexisting with other processors. When a programmable processor coexists on an IC along with other processors (programmable or custom), that programmable processor is often referred to as a core.

Our discussion of IC technologies and processor varieties has thus far assumed just one type of each item (e.g., one type of FPGA). In reality, each type itself has many varieties. For example, dozens of different types of FPGAs are available, varying in their size, speed, power, cost, etc. Likewise, dozens of different types of programmable processors are available, also varying in those features. And we know that we can create different types of custom processors, varying also in their size, speed, power, etc. Thus, each point in Figure 7.39 and Figure 7.40 is actually a large collection of points that spread out in different directions on the plots, and may even overlap with other types. Furthermore, other IC technologies as well as processor varieties exist and continue to evolve.

We also point out that a single IC may actually incorporate several different IC technologies. So a single IC may have some circuits created using full-custom technology, and other circuits created using ASIC or even FPGA technology. Likewise, a single processor may have different parts implemented in different IC technologies. For example, a common situation is for a programmable processor to have its datapath implemented in full-custom technology, but its controller implemented in ASIC technology, the reason being that the datapath is very regular, while the controller is mostly unstructured combinational logic.

In summary, designers have a huge number of choices in choosing processor varieties and IC technologies to implement their systems.
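Because the two features are orthogonal, the set of possible implementations is simply the cross product of the two lists. The short Python sketch below enumerates it. The list contents are an assumption based on the five technologies of Figure 7.39 and the two processor varieties named in this section, which would account for the "ten possible choices" mentioned in the caption of Figure 7.40; the variable names are illustrative only.

```python
from itertools import product

# Assumed lists, taken from the technologies and processor varieties
# discussed in this section (not an exhaustive catalog).
ic_technologies = ["PLD", "FPGA", "gate array", "standard cell", "full-custom"]
processor_varieties = ["custom processor", "programmable processor"]

# Orthogonal features: every processor variety pairs with every IC technology.
choices = list(product(processor_varieties, ic_technologies))

# The four points discussed for Figure 7.40.
highlighted = {
    1: ("custom processor", "full-custom"),
    2: ("custom processor", "FPGA"),
    3: ("programmable processor", "standard cell"),
    4: ("programmable processor", "FPGA"),
}

for number, (processor, technology) in highlighted.items():
    print(f"Point {number}: {processor} on {technology}")
print(len(choices), "combinations in total")
```

As the text notes, each entry in this list really stands for a whole family of devices, so the actual design space is far larger than the enumeration suggests.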


IC Technology Trend: Moore's Law

Understanding the trends of IC technologies requires knowledge of Moore's Law. Moore's Law roughly states that IC capacity doubles every 18 months. Figure 7.41 plots such doubling, beginning with about 10 million transistors per IC in 1997. The plot uses a logarithmic scale for the y-axis: each tick mark represents 10 times more than the previous tick mark. The growth rate is astounding. ICs increase from 10 million transistors in 1997 to over 10 billion transistors in 2015. That means that the 2015 IC can hold 1000 times more transistors than the 1997 IC. In other words, the 2015 IC is as powerful as about 1000 1997 ICs. This increasing capacity trend has also resulted in the cost per transistor dropping at nearly the same astounding rate.

Figure 7.41 The trend of increasing transistors per IC.

In a 2004 speech, an Intel vice-president suggested that we would soon consider transistors essentially free.

The IC capacity trend has many implications. One implication is that digital designers can create massively parallel designs that use huge numbers of functional units and registers, to create high-performance systems not previously practical. The number of required transistors for such designs might have been considered absurd just a decade earlier. Another implication is that the size overhead of FPGAs compared to ASICs (about 10x) becomes less relevant, making FPGAs an increasingly popular choice in more systems. Yet another implication is that designers increasingly need automated tools to help build these multimillion transistor circuits, and may increasingly wish to use RTL and even higher levels of design (e.g., C-based design) as the method for describing circuits, leaving the remaining design to tools.

At some point, Moore's Law must come to an end, because transistors cannot shrink to an infinitely small size. When that end will occur has been a subject of debate for many years. Some people have predicted Moore's Law will continue a couple decades into the 2000s.

7.6 PRODUCT PROFILE: GIANT VIDEO DISPLAY

In the late 1990s and 2000s, giant color video displays became popular at sport stadiums, car dealerships, casinos, freeway billboards, and various other locations. Most such video displays utilize a huge grid of light-emitting diodes (LEDs) driven by digital circuits.

A light-emitting diode (LED) is a semiconductor device that emits light when current passes through the device. In contrast, a traditional "incandescent" light bulb emits light when current passes through the bulb's internal filament, which is a high-resistance wire that heats up and glows when current flows through the wire. The wire, however, doesn't burn because it is enclosed in a vacuum or inert gas within the bulb. Because LED light comes from a semiconductor material and not from a hot glowing filament in a bulb, LEDs use less power, last longer, and can handle vibrations that would break a regular light bulb.

LEDs have long been used to display simple device status (e.g., on or off), text messages, or even simple graphics. However, until recently, LEDs were only available in white, yellow, red, and green colors, and were not very bright. Thus, earlier LED video displays were typically small, used only a single color, and were designed for indoor use. However, with the development of the blue LED in 1993, and the development of brighter LEDs, full-color LED displays evolved that can display video in much the same way as a computer monitor or television, even in sunny outdoor environments. In fact, LEDs, being a semiconductor technology, have been improving at a rate similar to transistors (which also use semiconductor technology). The improvement has followed what is known as Haitz's Law (the LED equivalent of Moore's Law), stating that the LED "flux per package" doubles every 18-24 months, which has been the case for several decades. Due to this improvement, many people predict that LEDs will replace incandescent light bulbs for home and office lighting. LEDs have already begun to replace incandescent bulbs in traffic lights, as illustrated in Figure 7.42.

Figure 7.42 LEDs are replacing incandescent bulbs in traffic lights, as well as other areas. (Panels: traffic light using incandescent light and red plastic cover; traffic light made from several hundred red LEDs.)

Figure 7.43(a) shows a large LED video display capable of displaying full-color video on a 15 x 8 yard screen. Because each LED is relatively large (1/8th of an inch wide, for example) in comparison to the pixels of a computer monitor, one has to stand several feet away from the LED display to view the image without noticing the individual LEDs. If we look closer at the LED display, as seen in Figure 7.43(b), we can see the individual lines of the displays. If we look even closer at the display, we can finally see the individual LEDs within the display, as shown in Figure 7.43(c). That figure shows that the LEDs are clustered into groups of red, green, and blue LEDs, where each cluster represents one pixel. For the LED video display shown in Figure 7.43, each cluster of LEDs consists of five LEDs: two red, two green, and one blue LED. Giant video displays are indeed intended to be viewed from a distance, so most viewers don't see the individual LEDs.

Figure 7.43 LED video display: (a) a large LED display (about 10 yards wide and [...] yards tall), (b) a closer view showing about 1 square yard, (c) a very close view showing about 1 square inch; the "pixels" can be seen, each pixel having 2 red (upper-left and lower-right of pixel), 2 green (upper-right and lower-left of pixel), and 1 blue LED (center of pixel).
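Both Moore's Law and Haitz's Law are doubling laws, so a few lines of arithmetic are enough to sanity-check the growth figures quoted above. The Python sketch below does so; the function name growth_factor is an assumption made for illustration. Note that a strict 18-month doubling from 1997 to 2015 gives 12 doublings, roughly 4000x, so the quoted "1000 times" corresponds to the more conservative "over 10 billion" reading of the plot.

```python
import math


def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years` if a quantity doubles every period."""
    return 2 ** (years / doubling_period_years)


# Moore's Law as stated in the text: capacity doubles every 18 months.
moore = growth_factor(2015 - 1997, 1.5)
transistors_2015 = 10e6 * moore  # starting from about 10 million in 1997

# Doubling period that would give exactly a 1000x increase in 18 years.
implied_period = (2015 - 1997) / math.log2(1000)

# Haitz's Law: "flux per package" doubles every 18 to 24 months.
haitz_slow = growth_factor(10, 2.0)  # growth per decade at the slower rate
haitz_fast = growth_factor(10, 1.5)  # growth per decade at the faster rate

print(f"Moore, 1997-2015: {moore:.0f}x, {transistors_2015:.2e} transistors")
print(f"Doubling period implied by 1000x: {implied_period:.2f} years")
print(f"Haitz, per decade: {haitz_slow:.0f}x to {haitz_fast:.0f}x")
```

Either way, the point of the text stands: a fixed doubling period compounds into several orders of magnitude within a couple of decades.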
Assume we want to create an LED video display capable of displaying a 720x480 pixel video, where each pixel simply consists of one red, one green, and one blue LED. If each LED cluster has a width of just over 3/8 inch (10 millimeters) and a height of 3/8 inch, our display will be roughly 24 feet wide and 16 feet high. Furthermore, our display will contain over one million individual LEDs, because 720 * 480 = 345,600 pixels, and three LEDs per pixel results in 1,036,800 LEDs.

Controlling every LED using a single digital circuit would require millions of output pins and miles of wire to connect all of the LEDs. Instead, as depicted in Figure 7.44, an LED video display is constructed of smaller and smaller components. The LED display consists of an array of smaller components called panels, shown in Figure 7.44(a). The panels are large display components typically designed in a modular fashion such that display manufacturers can easily create custom-size video displays and repair broken components within a display simply by replacing individual panels. The LED display panels are further divided into LED modules that control the physical LEDs, shown in Figure 7.44(b). An LED module is the basic display component and, depending on the design of the module, can control anywhere from a few hundred to a couple thousand LEDs. For example, in designing a 720x480 pixel display, we may want to use an array of 6x6 panels, where each panel consists of an array of 5x5 LED modules. Each LED module would then need to control an array of 24x16 pixels, where each pixel is composed of three LEDs.

The LED video display functions by dividing the incoming video stream into separate streams for each panel. The panels further process the video stream by dividing the incoming video stream into even smaller streams for the LED modules. Finally, the LED modules display the video frames by controlling the LEDs to output the correct colors for each pixel, or LED cluster.

LED Module

The LED module controls the individual LEDs within the video display by turning the LEDs on and off at the proper times to create the final color images. Because each LED module can consist of thousands of LEDs, directly controlling each LED would require too many wires. Instead, as shown in Figure 7.45, the LEDs within the LED module are connected in a matrix with a single control wire for each row and three control wires for each column (one wire for each colored LED within the LED clusters). In the figure, the LED module controller controls an array of 2x3 pixels, where each pixel consists of three individual LEDs, for a total of 18 LEDs. But as shown, the controller uses only 9 wires to control those 18 LEDs. The wire savings using this row and column approach becomes even more significant for more pixels. An LED module with 24x16 pixels and three LEDs per pixel would have 24*16*3 = 1152 LEDs, but the controller would require only 16 wires (one per row) plus 24*3 wires (three per column), for a total of only 88 wires.

The largest LED display in 2004 was [...] feet wide by 26 feet tall, built using 10 large FPGAs, 323 moderate-size FPGAs, 333 flash memories, and 3800 PLDs. (Source: Xcell Journal, Winter 2004.)

Figure 7.44 LED video displays are designed hierarchically: (a) the LED display consists of several larger panels, which can be composed to create different sized displays and which can be individually replaced to repair broken panels, (b) each panel consists of several smaller LED modules, responsible for controlling the individual pixels, and (c) each pixel consists of a cluster of red, green, and blue LEDs.

The LED module controller displays a video image by sequentially scanning, or enabling, each row and displaying the pixel values for each column within the video image. Using this technique, only one row of LEDs is illuminated at any given time. However, the LED module scans the rows fast enough such that the human eye perceives all rows as being illuminated.

Figure 7.45 LED module circuit consisting of a matrix of red (R), green (G), and blue (B) LEDs controlled by the LED module controller. R1/R2/R3 are rows 1 through 3, and C1/C2 are columns 1 and 2; thus the matrix shown is 2x3 pixels, or 6 pixels total, with 18 LEDs total (3 LEDs per pixel).

The LED module must control the LEDs to create the desired color for each pixel. Each pixel within a video frame is typically represented using an RGB color space. An RGB (red/green/blue) color space is a method to create any color of light by adding specific intensities, or brightnesses, of red, green, and blue colors. Each pixel within a video frame may be represented as three 8-bit binary numbers, where each 8-bit number specifies the intensity of the red, green, or blue colors. Thus, for each color, the LED module must be able to provide 256 distinct brightness levels. However, an LED by itself only supports two values: on and off, or full intensity and no intensity.

To support 256 brightness levels, the LED module controller uses pulse width modulation. In pulse width modulation (also known as PWM), a controller drives a wire with a 1 value for a specific percentage of a time period. The signal being 1 is known as a pulse, the duration of the 1 is known as the pulse's width, and the percentage of the period spent at 1 is known as the duty cycle. When that pulse drives an LED, a wider pulse causes the LED to appear brighter to the human eye. Figure 7.46 illustrates how the LED module controller uses pulse width modulation to support various brightness levels for the LEDs.
LED with 1 for the entire period, as shown in Figure 7.46(a). To illuminate the LED at half brightness, the controller uses a pulse with a 50% duty cycle, as shown in Figure 7.46(b). For 25% brightness, the controller sets the pulse to 1 for 25% of the period, meaning a 25% duty cycle, as shown in Figure 7.46(c). For an LED video display, the LED module controller divides the length of time each row is scanned into 255 time segments, and controls the brightness of the LEDs by turning each LED on for 0 to 255 time segments, thereby supporting 256 levels of intensity.

[Figure 7.46: pulse waveforms over Periods 1 through 4 for panels (a), (b), and (c)]
Figure 7.46 Pulse width modulation can be used to create various LED brightness levels: (a) for full brightness, the LED is always on, (b) for half brightness, the LED is turned on 50% of the time, and (c) for quarter brightness, the LED is turned on 25% of the time.

Because an LED module controller must provide precisely timed signals at a fast rate, custom processors are commonly used rather than just microprocessors. FPGAs are a common choice for implementing those custom processor circuits in LED video displays, for several reasons. First, FPGAs are fast enough to support the required scan rates. Second, the circuit on the FPGAs can be easily changed, making it possible for the display manufacturer to fix bugs in the circuit, and even upgrade the circuit, without requiring the high cost of creating a new ASIC. Third, the displays themselves are fairly large, expensive, and consume much power, and therefore the larger size, higher cost, and higher power consumption of FPGAs compared to ASICs do not impact the overall display's size, cost, and power too significantly.

7.7 CHAPTER SUMMARY
In this chapter, we discussed (Section 7.1) the idea that we must map our circuits to a physical implementation so that those circuits can be inserted into a real system. We introduced (Section 7.2) some technologies that require that a new chip be fabricated to implement our circuit. Full-custom technology gives the most optimized implementation, but is expensive and time-consuming to design. Semicustom technologies give very good implementations while costing less and taking less time to design, through the predesigning of the gates or cells that will be used on the IC. We described (Section 7.3) the increasingly popular technology of FPGAs, and showed how a circuit could be mapped onto a set of programmable lookup tables and switch matrices. We highlighted (Section 7.4) several other technologies, including off-the-shelf SSI/MSI ICs, and programmable logic devices. We gave some data (Section 7.5) showing the relative popularity of the technologies described in the chapter.
An interesting trend in physical implementation is the trend toward programmable ICs (FPGAs in particular). Implementing functionality on an FPGA involves the task of downloading a bitstream into the FPGA IC device. One might notice the similarity of that task with the task of implementing functionality on a microprocessor, which also involves downloading bits into an IC device. Thus, the difference between software on a microprocessor and custom digital circuits continues to be blurred, especially when one considers that modern FPGAs can also include one or several microprocessors within the same IC. For more information on the blurring, see "The Softening of Hardware," F. Vahid, IEEE Computer, April 2003.

7.8 EXERCISES
SECTION 7.2: MANUFACTURED IC TECHNOLOGIES
7.1 Explain why gate array IC technology has a shorter production time than full-custom IC technology.
7.2 Explain why the use of NAND or NOR gates in a CMOS gate array circuit implementation is typically preferred over an AND/OR/NOT implementation of a circuit.
7.3 Draw a gate array IC having three rows, the first row having four 2-input AND gates, the second row having four 2-input OR gates, and the third row having four NOT gates. Show how to instantiate wires to the gate array to implement the function F(a,b,c) = abc + a'b'c'.
7.4 Assume a standard cell library has a 2-input AND gate, a 2-input OR gate, and a NOT gate. Use a drawing to show how to instantiate and place standard cells on an IC and wire them together to implement the function in Exercise 7.3. Draw your cells the same size as the gates in Exercise 7.3, and be sure your rows are of equal size.
7.5 Draw a gate array IC having three rows, the first row having four 2-input AND gates, the second row having four 2-input OR gates, and the third row having four NOT gates. Show how to instantiate wires to the gate array to implement the equation F(a,b,c,d) = a'b + cd + c'.
7.6 Assume a standard cell library has a 2-input AND gate, a 2-input OR gate, and a NOT gate. Use a drawing to show how to instantiate and place standard cells on an IC and wire them together to implement the function in Exercise 7.5. Be sure to draw your cells the same size as the gates in Exercise 7.5, and be sure your rows are of equal size.
7.7 Consider the implementations of a half-adder with a gate array in Figure 7.4 and with standard cells in Figure 7.6. Assume each gate or cell (including inverters) has a delay of 1 ns. Also assume that every inch of wire (for each inch in your drawing, not on an actual IC) has a delay of 3 ns (wires are relatively slow in the era of tiny fast transistors). Estimate the delay of the gate array and the standard cell circuits.
7.8 For your solutions to Exercises 7.3 and 7.4, assume that each gate and cell has a delay of 1 ns, and that every inch of wire (for each inch in your drawing, not on an actual IC) corresponds to a delay of 3 ns. Estimate the delays of the gate array and standard cell circuits.
7.9 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a,b,c) = [...] + a'bc + abc'. Place inversion bubbles on that circuit to convert the circuit to:
(a) NAND gates only.
(b) NOR gates only.
7.10 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a,b,c) = abc + a + b + c. Place inversion bubbles on that circuit to convert the circuit to:
(a) NAND gates only.
(b) NOR gates only.
7.11 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a,b,c) = (ab + c)(a' + d) + c'. Convert the circuit to a circuit using:
(a) NAND gates only.
(b) NOR gates only.
7.12 Draw a circuit using AND, OR, and NOT gates for the following equation: F(w,x,y,z) = (w + x)(y + z) + wy + xz. Convert the circuit to a circuit using:
(a) NAND gates only.
(b) NOR gates only.
7.13 Draw a circuit using AND, OR, and NOT gates for the following equation: F(a,b,c,d) = (ab)(b' + c) + (ad + c'). Convert the circuit to a circuit using:
(a) NAND gates only.
(b) NOR gates only.
7.14 Create a template for converting a 3-input AND gate to a circuit using only 3-input NAND gates.
7.15 Create a template for converting a 3-input OR gate to a circuit using only 3-input NAND gates.
7.16 Create a template for converting a NOT gate to a circuit using only 3-input NAND gates.
7.17 Assume a standard cell library consisting of 2-input and 3-input NAND gates with a delay of 1 ns each, 2-input and 3-input AND and OR gates with a delay of 1.8 ns each, and a NOT gate with a delay of 1 ns. Compare the number of transistors and the delay of an implementation using only AND/OR/NOT gates with an implementation using only NAND gates for the function: F(a,b,c) = ab'c + a'b. For calculating the size of an implementation, assume each gate input requires two transistors.
7.18 Assume a standard cell library consisting of 2-input AND and OR gates with a delay of 1 ns each, 3-input AND and OR gates with a delay of 1.5 ns each, and a NOT gate with a delay of 1 ns. Compare the number of transistors and the delay of an implementation using only 2-input AND/OR gates and NOT gates with an implementation using only 3-input AND/OR gates and NOT gates for the function: F(a,b,c) = abc + a'b'c + a'b'c'. For calculating the size of an implementation, assume each gate input requires two transistors.
7.19 Assume a standard cell library consisting of 2-input NAND and NOR gates with a delay of 1 ns each, and 3-input NAND and NOR gates with a delay of 1.5 ns each. Compare the number of transistors and the delay of an implementation using only 2-input NAND/NOR gates with an implementation using only 3-input NAND/NOR gates for the function: F(a,b,c) = a'bc + ab'c + abc'. For calculating the size of an implementation, assume each gate input requires two transistors.

SECTION 7.3: PROGRAMMABLE IC TECHNOLOGY-FPGA
7.20 Show how to implement on a 3-input 2-output lookup table the function F(a,b,c) = a + bc.
7.21 Show how to implement on two 3-input 2-output lookup tables the function F(a,b,c,d) = ab + cd. Assume you can connect the lookup tables in a custom manner (i.e., do not use a switch matrix, just directly connect your wires).
7.22 Show how to implement on two 3-input 2-output lookup tables the following function: F(a,b,c,d) = a'bd + b'cd'. Assume the two lookup tables are connected in the manner shown in Figure 7.47. You may not need to use every lookup table output.
7.23 Show how to implement on two 3-input 2-output lookup tables the following functions: F(x,y,z) = x'y + xyz' and G(w,x,y,z) = w'x'y + w'xyz'. Assume the two lookup tables are connected in the manner shown in Figure 7.47.

[Figure 7.47: two 8x2 memories with address inputs a2, a1, a0 and data outputs d1, d0]
Figure 7.47 Two 3-input 2-output lookup tables implemented using 8x2 memory.

7.24 Show how to implement on two 3-input 2-output lookup tables the following functions: F(a,b,c,d) = abc + d and G = a'. You must implement both F and G with only two lookup tables connected in the manner shown in Figure 7.47.
7.25 Implement a 2-bit comparator that compares two 2-bit numbers and has three outputs indicating greater-than, less-than, and equal-to, using any number of 3-input 2-output lookup tables and custom connections among the lookup tables.
7.26 Show how to implement a 4-bit carry-ripple adder using any number of 3-input 2-output lookup tables and custom connections among the lookup tables. Hint: map one full-adder to each lookup table.
7.27 Show how to implement a 4-bit carry-ripple adder using any number of 4-input 1-output lookup tables and custom connections among the lookup tables.
7.28 Show how to implement a comparator that compares two 8-bit numbers and has a single equal-to output, using any number of 4-input 1-output lookup tables and custom connections among the lookup tables.
7.29 Show the bit file necessary to program the FPGA fabric in Figure 7.29 to implement the function F(a,b,c,d) = ab + cd, where a, b, c, and d are external inputs.
7.30 Show the bit file necessary to program the FPGA fabric in Figure 7.29 to implement the function F(a,b,c,d) = abcd, where a, b, c, and d are external inputs.
7.31 Show the bit file necessary to program the FPGA fabric in Figure 7.29 to implement the function F(a,b,c,d) = a'b' + c'd, where a, b, c, and d are external inputs.

SECTION 7.4: OTHER TECHNOLOGIES
7.32 Use any combination of 7400 ICs listed in Table 7.1 to implement the function F(a,b,c,d) = ab + cd.
7.33 Use any combination of 7400 ICs listed in Table 7.1 to implement the function F(a,b,c,d) = abc + ab'c' + a'bd + a'b'd'.
7.34 By drawing Xs on the circuit, program the PLD of Figure 7.38(a) to implement a full-adder.
7.35 By drawing Xs on the circuit, program the PLD of Figure 7.38(a) to implement a 2-bit equality comparator. Assume the PLD has an additional I4 input.
7.36 *(a) Design a PLD device capable of supporting a 2-bit carry-ripple adder. By drawing Xs on your PLD circuit, program the PLD to implement the 2-bit carry-ripple adder.
(b) Using a CPLD device consisting of several PLDs from Figure 7.38 and assuming you can connect the PLDs in a custom manner, implement the 2-bit carry-ripple adder by drawing Xs on the PLDs.
(c) Compare the size of your PLD and the CPLD by determining the gates required for both designs (make sure you compare the number of gates within the PLD and CPLD and not the number of gates used for your implementation).

SECTION 7.5: IC TECHNOLOGY COMPARISONS
7.37 For each of the system constraints below, choose the most appropriate technology from among FPGA, standard cell, and full-custom IC technologies for implementing a given circuit. Justify your answers.
(a) The system must exist as a physical prototype by next week.
(b) The system should be as small and low-power as possible. Short design time and low cost are not priorities.
(c) The system should be reprogrammable even after the final product has been produced.
(d) The system should be as fast as possible and should consume as little power as possible, subject to being completely implemented in just a few months.
(e) Only five copies of the system will be produced and we have no more than $1000 to spend on all the ICs.
7.38 Which of the following implementations are not possible? (1) A custom processor on an FPGA. (2) A custom processor on an ASIC. (3) A custom processor on a full-custom IC. (4) A programmable processor on an FPGA. (5) A programmable processor on an ASIC. (6) A programmable processor on a full-custom IC. Explain your answer.

Programmable Processors

8.1 INTRODUCTION
Digital circuits designed to perform a single processing task, such as a seat belt warning light, a pacemaker, or an FIR filter, are indeed a very common class of digital circuits. We might refer to a circuit performing a single processing task as a single-purpose processor. Single-purpose processors represent a class of digital circuits enabling tremendously fast or power-efficient computation. However, another class of digital circuits, known as programmable processors, is also extremely popular, as well as being more widely known.
The programmable processor is largely responsible for the computing revolution that has
taken place in the past several decades, leading to what many call the information age. A
programmable processor, also known as a general-purpose processor, is a digital circuit
whose particular processing task is stored in a memory, rather than being built into the
circuit itself. The representation of that processing task in the memory is known as a program.
Figure 8.1 illustrates single-purpose versus general-purpose processors. We could
create a custom digital circuit for a seat belt warning light system (Chapter 2) or an FIR
filter system (Chapter 5), or instead we could program a general-purpose processor circuit
to implement those systems.
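
As a rough software analogy of this distinction (a sketch only: the function names and the tuple-based program format below are assumptions made for illustration, not something defined in this chapter), the first function has the seat belt task built in, while the second takes its task from a list that plays the role of a program stored in memory:

```python
def seat_belt_single_purpose(seated: bool, fastened: bool) -> bool:
    """Single-purpose: the warning-light logic is fixed in the 'circuit' itself."""
    return seated and not fastened


def run_general_purpose(program, values):
    """General-purpose: the task is whatever the stored program says.

    Each program entry is a hypothetical (operation, dest, source...) tuple.
    """
    values = dict(values)  # work on a copy of the named input values
    for op, dest, *srcs in program:
        if op == "not":
            values[dest] = not values[srcs[0]]
        elif op == "and":
            values[dest] = values[srcs[0]] and values[srcs[1]]
        else:
            raise ValueError(f"unknown operation: {op}")
    return values


# The same task, now stored as data rather than built into the code.
seat_belt_program = [
    ("not", "unfastened", "fastened"),
    ("and", "warn", "seated", "unfastened"),
]

result = run_general_purpose(seat_belt_program, {"seated": True, "fastened": False})
print(result["warn"])  # True
```

Swapping in a different program list changes what `run_general_purpose` computes without touching its code, which is the property the text attributes to a general-purpose processor.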

[Figure 8.1 labels: seat belt warning light single-purpose processor; 3-tap FIR filter single-purpose processor; general-purpose processor with 3-tap FIR filter program and other programs]
Figure 8.1 Single-purpose versus general-purpose processors.
Some programmable processors, like the well-known Intel Pentium processor or Sun's Sparc processor, are intended for use in desktop computers. Other programmable processors, like ARM, MIPS, 8051, and PIC processors (which are widely known in the design community but less known by the general public), are intended for embedded systems, like cellular telephones, automobiles, video games, or even tennis shoes with blinking lights. Some programmable processors, like the PowerPC, are intended for both desktop and embedded domains.
A benefit of a programmable processor is that its circuit can be mass-produced and then programmed to do almost anything. Thus, the same programmable desktop processor can run Windows 98, Windows XP, Linux, or whatever new operating system program comes about. Likewise, that same processor can run application programs like word processors, spreadsheets, video games, web browsers, etc. Furthermore, the same programmable embedded processor can be used in a cell phone, automobile, video game, or tennis shoe by programming the processor for the desired processing task. Mass-production results in low costs due to amortization of design costs (see "Why such cheap calculators" in Chapter 4 for a discussion of amortization).
Of course, because programmable processors are mass-produced and then used for a wide variety of applications, there aren't as many unique programmable processor designs as there are single-purpose processor designs. It follows then that there are far fewer programmable processor designers than there are single-purpose processor designers. Nevertheless, even though you may never design a programmable processor as part of your job, it is interesting and enlightening to understand how such a programmable processor works. Some people argue that people who understand how a processor works are even better software programmers. And technology trends have led to the situation of designers being able to create semicustom processors ("application-specific" processors) that have just the right architecture for one or a small number of applications, making knowledge of programmable processor designs important. Finally, there are indeed people who do design programmable processor architectures, and you never know if you might end up being one of them.
In this chapter, we show how to design a simple programmable processor using our previously-described digital design methods. Our purpose is mainly to demystify these devices and to provide an intuition of how programmable processors work. We point out that real mass-produced processors are designed using different methods, and their designs can be much more complex than the design described in this chapter; learning about those processors' designs is the subject of many textbooks on computer architecture.

8.2 BASIC ARCHITECTURE
A programmable processor consists of two main parts: a datapath and a control unit. We'll provide a general introduction to those two parts in this section, then we'll provide a more detailed look at those parts in a subsequent section.

Basic Datapath
We can view processing generally as:
Loading data, meaning reading the data on which we wish to work from some input locations,
Transforming that data, meaning performing some computations with that data that result in new data, and
Storing the new data, meaning writing the new data to some output locations.

For example, a seat belt warning system reads bit data from sensors representing whether a seat belt is fastened and whether a person is sitting in a seat, transforms that data by computing a new bit indicating whether to turn on a warning light, and writes that new data to a warning light. An FIR filter reads data representing the most recent set of input signal samples, transforms that data by performing multiplies and adds, and writes new data to an output representing the filtered signal.
A data memory holds all the data that a programmable processor can access, as input data or output data. For now, assume the words in that data memory are somehow connected to the outside world (e.g., to the seat belt sensors or to the FIR input and output signals). To process that data, a programmable processor needs to be able to load data from data memory into one of several registers (typically a register file) within the processor, needs to be able to feed data from some subset of registers through functional units that can perform all possible transformation operations (typically an ALU) we might consider, with results stored back into a register, and needs to be able to store data from a register back into data memory. Therefore, we see the need for a programmable processor to include the basic circuit shown in Figure 8.2, showing a data memory, register file, and ALU. That circuit is known as the programmable processor's datapath. The basic datapath shown in Figure 8.2 can perform the following possible datapath operations in a given clock cycle:

Load operation: This operation loads (reads) data from any location in the data memory into any register in the register file. A load operation is illustrated in Figure 8.3(a).
ALU operation: This operation transforms register data by passing any two registers through the ALU configured for any of the ALU's supported operations, and back into any register of the register file. An ALU operation is illustrated in Figure 8.3(b). Typical ALU operations include addition, subtraction, logical AND, logical OR, etc.
Store operation: This operation stores (writes) data from any register in the register file to any data memory location. A store operation is illustrated in Figure 8.3(c).

[Figure 8.2 label: data memory "somehow connected to the outside world"]
Figure 8.2 Basic datapath of a programmable processor.

These possible datapath operations are illustrated in Figure 8.3. Each such operation requires the appropriate setting of the control inputs of the data memory, mux, register file, and ALU; those control inputs will be shown shortly. For now, just familiarize
yourself with the basic datapath's abilities. Notice that the datapath in Figure 8.2 cannot directly operate on data memory locations with the ALU in one clock cycle, because the data must first be read into the register file, which itself requires a clock cycle, before the data can be operated on by the ALU. A datapath that requires all data to first pass through the register file before that data can be transformed by the ALU is known as a load-store architecture.

Figure 8.3 Basic datapath operations: (a) load (read), (b) ALU operation (transform), and (c) store (write).

EXAMPLE 8.1 Understanding datapath operations
Which of the following are valid single-clock-cycle datapath operations for the datapath of Figure 8.2?
1. Copy data from a data memory location into a register file location.
2. Read data from two data memory locations into two register file locations.
3. Add data from two data memory locations and store the result in a register file location.
4. Copy data from one register file location to another register file location.
5. Subtract data in a register file location from a data memory location, storing the result in a register file location.
(1) is a valid operation, known as a load operation. (2) is not a valid operation. We cannot read more than one data memory location during a datapath operation (for this datapath), and we cannot write to more than one register file location during an operation. (3) is not a valid operation. Not only can we not read from two data memory locations during one operation, but we cannot feed the read values directly into the ALU to perform the add; we must first perform operations that read the data items into register file locations. (4) is a valid operation. We can configure the ALU operation to simply pass one of its inputs through to the output (perhaps by adding 0) and store the result in the register file. (5) is not a valid operation. We cannot feed a read data memory location directly to the ALU; there is no such connection in the datapath. Values read from data memory must be loaded into the register file first.

Basic Control Unit
Suppose we want to use the basic datapath of Figure 8.2 to perform the simple processing task of adding data memory locations 0 and 1 together, and writing the result in data memory location 9; in other words, we want to compute D[9] = D[0] + D[1]. We can achieve this processing task by "instructing" the datapath to perform the following operations:
load data memory location 0 to register file register R0 (i.e., RF[0] = D[0]),
load data memory location 1 to register file register R1 (i.e., RF[1] = D[1]),
perform an ALU operation that adds R0 and R1 and writes the result back into R2 (i.e., RF[2] = RF[0] + RF[1]), and
store R2 into data memory location 9 (i.e., D[9] = RF[2]).
Note that we could have used any registers in the register file, rather than R0, R1, and R2. If D[0] contained the value 99 (in binary, of course), and D[1] contained the value 102, then after carrying out the above operations, D[9] would contain 201.
You might think that having to instruct the datapath to perform four distinct operations is a rather cumbersome way of adding two data items. If you could build your own custom digital circuit to implement D[9] = D[0] + D[1], you would likely just feed D[0] and D[1] through an adder whose output you would connect to D[9], thus avoiding the four operations involving the register file and ALU. We see the basic tradeoff of single-purpose versus programmable processors: programmable processors have the drawback of computation overhead because they have to be general, but they provide the benefits of a mass-produced processor that can be programmed to do almost anything.
Somehow we need to describe the sequence of operations (RF[0] = D[0], then RF[1] = D[1], then RF[2] = RF[0] + RF[1], then D[9] = RF[2]) that we desire to execute on the datapath. Such descriptions of desired processor operations are known as instructions, and a collection of instructions is known as a program. We will store the desired program as words in another memory, called the instruction memory. We'll describe how to represent those instructions later. For now, assume that the four instructions are somehow stored in locations 0, 1, 2, and 3 of the instruction memory I, as shown in Figure 8.4.

[Figure 8.4: control unit with instruction memory I holding the following]
0: RF[0]=D[0]
1: RF[1]=D[1]
2: RF[2]=RF[0]+RF[1]
3: D[9]=RF[2]
Figure 8.4 The control unit in a programmable processor.

Now is where the control unit plays a role. The control unit reads each instruction from instruction memory, and then executes that instruction on the datapath. To execute our simple program, the control unit would begin by performing the following tasks, known as stages, to carry out the first instruction:
8.2 Ba sic Architecture 427
l. Fetch : The con trol unit wo uld stan by reading l i D} into a loca l register, a task
stage
h fetching
d flOj' s co ntents, the 'Instruct ion RFIO}=OIO}. into fR . Figure 8.S(b) shows
know n a, fe tching . Thi s stage requ ires one clock cyc le.
tl ~dsecon state decoding the instruction and thus determining that the instruction is a
1. Decode : The control unit wou ld then determine what opcration this instruction is 0" II1structlon F,oure 8 5{c) h
. h ." . sows the controller executing the in lruclion by confio-
urlng t ~ ~atapath to read the value of 010} and storino that value into RF/O]. If D/OJ
requc,ting. a task known as decoding . Thi s stage also req uires one clock cycle.
3. Exewte: Seeing th at thi s inslructi on re luests the datapath operation RFIO} = contall1e 9. then RIO} wi ll contain 99 after completion"of the execute stage.
010}. the cOlll rol unit wo uld set the cOlllrol lines of the dalapath to read DIO}, After proces II1g the instruct ion in IIO}, the control unit would fetch the in lruction
pas; the read data th rough the 2x I mux in fron t of the reg ister fi le. and write that that IS 111 III J. decode that instruction, and exeCUle that in truclion {lhus executing
da ta int o RIO}. The task of carrying out the operation is know n as exeClilillg. Most RF/ I} ~ Of I J), requiring another three cycles. Next, the control unit would fetch the
operations arc datapath opermion s (such as a load operat ion. ALU operation, or II1strucllon that i in 1{21, decode that instruction, and execute that instruclion {thus
slOre operation). but not all operati ons require thc datapalh (an example is the executing RFI2} = RFIO} + RFlf Jl, requ iring anOlher three cycles. Finall y. the control
ju mp instructi on to be discussed later). Thi s stage requires one clock cycle. unit wo uld fetch the instruction that is in 113}, decode tltat in lruction . and execute that
Thus. the basic stages the control unit carries out for th m first instruction are: fe tch, IIlstructl on (thus executing 019} = RF/2J), requi ring anolher three cycles. The four
decode. and exeellle. requiring three clock cycles to compl ete just thm first instruction. IIl structl ons wou ld require 4*3 = 12 cycles 10 run 10 completion on the programmable
processor.
The local register in which the control unit IOres the fetched instruction is known as
Tlte control unit wi ll require a controller.
the illstructioll register. or fR . a shown in Figure 8"+. NOIice thm the cOl1lrol unit needs
like those de cribed in Chapter 3. that in this
10 keep track of the locati on in instruction memory from which to fetch the next instruc-
case repeated ly performs the fetch. decode. IR=I[PC]
tion. Since the in struction locations are usually in sequence. we can use a simple up-
counter 10 keep track of the currelll program instruction-such a counter is known as a and execute steps (after having initialized PC ~_~_/ PC=PC+t
10 O)-nOle that a controller appears inside the
program COli liter. or PC for sha n . The processor stans with PC=O, so the instruction in
control unit in Figure 8.4. An FSM for that
flOJ represents the fir t instructi on of the program.
Figure 8.5 illustrates the three stages of executing the instruction RFfOJ = DIO} controll er appears in Figure 8.6. Tlte con-
stored in flO}. Assuming PC was previously initi ali zed 10 O. Figure 8.5(a) shows the first troller increments the program counter after

------------------------------------------------1 fetchin g each instruction in state Fetch. so tltat


the next fetch state wi ll fetc h the next instruc-
:________ ~ _-_- __ -------1--------------------------: ti on (nOlice tltat PC gets incremented at the
Controller
I InstructIon memory I : end of tlte fetch stage in Figure 8.5(a)). We ll
0: R F[O]=D[O] ,_________ - - --- --- - -- __ L --- --- ----- - -- -- --- ------, describe the actions of the Decode and Figure 8.6 Basic controller states.
1: RF[1 ]=D[t] i Instruction memOlY I Execllte states later.
2. RF[2]=RF[0]+RFP] i 0: RF [O]=D[O] Thu , the basic pans of the control unit include the program counter Pc. the in auc-
3: D[9]=RF[2] : 1: RF[l]=D[I]
tion register fR , and a controller. as illustrated in Figure .-I. In previous hapters. our
: 2: RF[2]=RF[0]+RF[1 ]
nonprogrammable processors consisted onl y of a controller and a datapath. Notice that
j 3: D[9]=RF[2]
the programmable proce sor instead contains a control unit. which itself consi IS of some
regi sters and a controller.
To summarize, the control unit processes each instruction in three stages:
1. first fetching the instruction by loading the current instruction into IR and incrementing the PC for the next fetch,
2. next decoding the instruction to determine its operation, and
3. finally executing the operation by setting the appropriate control lines for the datapath, if applicable. If the operation is a datapath operation, the operation may be one of three possible types:
(a) loading a data memory location into a register file location,
(b) transforming data using an ALU operation on register file locations and writing results back to a register file location, or
(c) storing a register file location into a data memory location.

Figure 8.5 Three stages of processing one instruction: (a) fetch, (b) decode, (c) execute.
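The three stages summarized above can be sketched in software. The following Python fragment is only an illustration under stated assumptions: the tuple form of each instruction, the function name run, the 256-word data memory, and the rule that execution stops when the program runs out are all hypothetical choices made here, not part of the processor described in the text.

```python
# Symbolic instruction memory I holding the four-instruction example program.
# The tuple encoding (operation, operands...) is an assumption for illustration;
# the real processor stores instructions as bits.
I = [
    ("load", 0, 0),    # RF[0] = D[0]
    ("load", 1, 1),    # RF[1] = D[1]
    ("add", 2, 0, 1),  # RF[2] = RF[0] + RF[1]
    ("store", 9, 2),   # D[9] = RF[2]
]

def run(program, D, num_regs=16):
    """Process each instruction in three stages; return (D, cycles used)."""
    RF = [0] * num_regs
    PC = 0            # program counter, initialized to 0
    cycles = 0
    while PC < len(program):   # assumption: stop when the program ends
        # Stage 1: fetch the current instruction into IR and increment PC.
        IR = program[PC]
        PC += 1
        cycles += 1
        # Stage 2: decode the instruction to determine its operation.
        op, *operands = IR
        cycles += 1
        # Stage 3: execute the datapath operation.
        if op == "load":
            r, d = operands
            RF[r] = D[d]
        elif op == "store":
            d, r = operands
            D[d] = RF[r]
        elif op == "add":
            ra, rb, rc = operands
            RF[ra] = RF[rb] + RF[rc]
        else:
            raise ValueError(f"unknown operation: {op}")
        cycles += 1
    return D, cycles

D = [0] * 256
D[0], D[1] = 7, 5
D, cycles = run(I, D)
print(D[9], cycles)   # prints "12 12": the sum, and 4 instructions * 3 cycles
```

Each pass through the loop body corresponds to one instruction and three clock cycles, one per stage.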

EXAMPLE 8.2 Creating a simple sequence of instructions
Create a set of instructions for the processor in Figure 8.4 to compute D[3] = D[0] + D[1] + D[2]. Each instruction must represent a valid single-clock-cycle datapath operation.
We might start with three operations that read the data memory locations into register file locations:
0. R[3] = D[0]
1. R[4] = D[1]
2. R[2] = D[2]
Note that we intentionally chose arbitrary register locations, to make clear that we can use any registers.
Next, we need to add the three values and store the result in a register file location, say R[1]. In other words, we want to perform the following operation: R[1] = R[2] + R[3] + R[4]. However, the datapath of Figure 8.4 cannot add three register file locations in a single operation, but rather can only add two locations. Instead, we can describe the desired addition computation by dividing the computation into two datapath operations:
3. R[1] = R[2] + R[3]
4. R[1] = R[1] + R[4]
Finally, we write the result into D[3]:
5. D[3] = R[1]
Thus, our program consists of the six instructions appearing above, which we might store in instruction memory locations 0 through 5.

EXAMPLE 8.3 Evaluating the time to carry out a program
Determine the number of clock cycles required for the processor of Figure 8.4 to execute the six-instruction program of Example 8.2.
The processor requires 3 cycles to process each instruction: 1 cycle to fetch the instruction, 1 to decode the fetched instruction, and 1 to execute the instruction. At 3 cycles per instruction, the total cycles for 6 instructions is: 6 instr * 3 cycles/instr = 18 cycles.

8.3 A THREE-INSTRUCTION PROGRAMMABLE PROCESSOR
A First Instruction Set with Three Instructions
The way we represent instructions in the instruction memory, and the list of allowable instructions, are known as a programmable processor's instruction set. Let's assume that a processor uses 16-bit instructions, and that the instruction memory I is 16 bits wide. Instruction sets typically reserve a certain number of bits in the instruction to denote what operation to perform. The remaining bits specify any additional information needed to perform the operation, such as the source or destination registers. We define a simple, three-instruction set, with the most significant (meaning leftmost) 4 bits identifying the appropriate operation and the least significant 12 bits containing register file and data memory addresses, as follows:
Load instruction (0000 r3r2r1r0 d7d6d5d4d3d2d1d0): This instruction specifies a move of data from the data memory location whose address is specified by the bits d7d6d5d4d3d2d1d0 into the register file register whose location is specified by the bits r3r2r1r0. For example, the instruction "0000 0000 00000000" specifies a move of data memory location 0, or D[0], into register file location 0, or RF[0]; in other words, that instruction represents the operation RF[0] = D[0]. Likewise, "0000 0001 00101010" specifies RF[1] = D[42]. We've inserted spaces between some bits for ease of reading; those spaces have no other significance and would not exist in the instruction memory.
Store instruction (0001 r3r2r1r0 d7d6d5d4d3d2d1d0): This instruction specifies a move of data in the opposite direction as the instruction above, meaning a move from the register file to the data memory. So "0001 0000 00001001" specifies D[9] = RF[0].
Add instruction (0010 ra3ra2ra1ra0 rb3rb2rb1rb0 rc3rc2rc1rc0): This instruction specifies an addition of two register file registers specified by rb3rb2rb1rb0 and rc3rc2rc1rc0, with the result stored in the register file register specified by ra3ra2ra1ra0. For example, "0010 0010 0000 0001" specifies the instruction RF[2] = RF[0] + RF[1]. Note that add is an ALU operation.
None of these instructions modifies the contents of the instructions' source operands. In other words, the load instruction copies the contents of the data memory location to the specified register, but leaves the data memory location itself unchanged. Likewise, the store instruction copies the specified register to data memory, but leaves the register's contents unchanged. The add instruction reads its b and c registers without changing them.
Using this instruction set, we would describe our earlier program that computes D[9] = D[0] + D[1] as shown in Figure 8.7.
Notice that the first four bits of each instruction are a binary code that indicates the instruction's operation. Those bits are known as the instruction's operation code, or opcode for short. "0000" means a move from data memory to register file, "0001" means a move from register file to data memory, and "0010" means an add of two registers, based on the instruction set defined in the bulleted list above. The remaining bits of the instruction represent operands, which indicate what data to operate on.

Desired program (computes D[9] = D[0] + D[1]):
0: RF[0]=D[0]
1: RF[1]=D[1]
2: RF[2]=RF[0]+RF[1]
3: D[9]=RF[2]
Instruction memory I:
0: 0000 0000 00000000
1: 0000 0001 00000001
2: 0010 0010 0000 0001
3: 0001 0010 00001001
Figure 8.7 A program that computes D[9]=D[0]+D[1], using a given instruction set. We've inserted spaces between the instruction memory's bits for readability only; those spaces do not exist in the memory.
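The bit layout of the three instructions can be checked with a few lines of Python. This is a minimal sketch: the helper names (encode_load, encode_store, encode_add, fields) are hypothetical and introduced only for illustration, and the code assumes the field positions given in the instruction definitions above (opcode in bits 15..12, register address in bits 11..8, data memory address in bits 7..0).

```python
# Opcodes of the three-instruction instruction set.
OPCODES = {"load": 0b0000, "store": 0b0001, "add": 0b0010}

def encode_load(r, d):
    """RF[r] = D[d]: opcode 0000, 4-bit register, 8-bit data address."""
    return (OPCODES["load"] << 12) | (r << 8) | d

def encode_store(d, r):
    """D[d] = RF[r]: opcode 0001, 4-bit register, 8-bit data address."""
    return (OPCODES["store"] << 12) | (r << 8) | d

def encode_add(ra, rb, rc):
    """RF[ra] = RF[rb] + RF[rc]: opcode 0010, three 4-bit registers."""
    return (OPCODES["add"] << 12) | (ra << 8) | (rb << 4) | rc

def fields(word):
    """Split a 16-bit instruction into its opcode and operand fields."""
    return {
        "op": (word >> 12) & 0xF,
        "ra": (word >> 8) & 0xF,
        "rb": (word >> 4) & 0xF,
        "rc": word & 0xF,
        "d": word & 0xFF,
    }

# The program that computes D[9] = D[0] + D[1].
program = [
    encode_load(0, 0),     # RF[0] = D[0]
    encode_load(1, 1),     # RF[1] = D[1]
    encode_add(2, 0, 1),   # RF[2] = RF[0] + RF[1]
    encode_store(9, 2),    # D[9] = RF[2]
]
for address, word in enumerate(program):
    print(f"{address}: {word:016b}")
```

The printed words are the same 16-bit patterns as the instruction memory contents of Figure 8.7, without the readability spaces.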


We could write a different program using the same three-instruction instruction set. For example, we could write a program that computes D[5] = D[5] + D[6] + D[7]. We must perform that computation using instructions chosen from the three-instruction instruction set. We might write the program as shown in Figure 8.8. The number before the colon represents the instruction's address in the instruction memory I. The text following the two forward slashes (//) represents comments, and is not part of the instructions.

0: 0000 0000 00000101   // RF[0] = D[5]
1: 0000 0001 00000110   // RF[1] = D[6]
2: 0000 0010 00000111   // RF[2] = D[7]
3: 0010 0000 0000 0001  // RF[0] = RF[0] + RF[1]
                        // which is D[5]+D[6]
4: 0010 0000 0000 0010  // RF[0] = RF[0] + RF[2]
                        // now D[5]+D[6]+D[7]
5: 0001 0000 00000101   // D[5] = RF[0]
Figure 8.8 A program to compute D[5]=D[5]+D[6]+D[7] using the three-instruction instruction set.

Note how that program ultimately computes the desired sum. This might be the first time that you have had to think of computations in terms of low-level programmable processor instructions. Thinking in terms of such register-level operations can be difficult at first, but becomes easier as you see and develop more programs at that level.

Turning on a personal computer causes the operating system to load, a process known as "booting" the computer. The computer executes instructions beginning at address 0, which usually has an instruction that jumps to a built-in small program that loads the operating system (the small program is often called the basic input/output system, or BIOS). Most computing dictionaries state that the term "boot" originates from the popular expression "to pull oneself up by one's bootstraps," which means to pick yourself up without any help, though obviously you can't do this by grabbing onto your own bootstraps and pulling, hence the cleverness of the expression. Since the computer loads its own operating system, the computer is in a sense picking itself up without any help. The term bootstrap eventually got shortened to boot. A colleague of mine who has been around computing a long time claims a different origin. One way of loading a program into the instruction memory of early computers was to create a ribbon with rows of holes. Each row might have enough room for say 16 holes, thus each row would represent a 16-bit machine instruction: a hole meant a 0, no hole a 1 (or vice versa). A programmer would punch holes in the ribbon to store the program on the ribbon (using a special hole-punching machine), and then feed the ribbon into a computer's ribbon reader, which would read the rows of 0s and 1s and load those 0s and 1s into the computer's instruction memory. Those ribbons might have been several feet long, and looked a lot like the straps of a boot, hence the term bootstrap, shortened to boot. Whichever is the actual origin, we can be fairly sure the term "boot" comes from the bootstraps on the boots we wear on our feet.

Machine Code versus Assembly Code
As you have seen, the instructions of a program exist in instruction memory as 0s and 1s. A program represented as 0s and 1s is known as machine code. Writing and reading programs represented as 0s and 1s are tasks that humans are not particularly good at. We humans can't understand those 0s and 1s easily, and thus will likely make plenty of mistakes when writing such programs. Thus, early computer programmers developed a tool, known as an assembler (which itself is just another program), to help humans write other programs. An assembler allows us to write instructions using mnemonics, or symbols, that the assembler automatically translates to machine code. Thus, an assembler may tell us that we can write instructions from our three-instruction instruction set using the following mnemonics:
Load instruction (MOV Ra, d): specifies the operation RF[a]=D[d]. a must be 0, 1, ..., or 15, so R0 means RF[0], R1 means RF[1], etc. d must be 0, 1, ..., 255.
Store instruction (MOV d, Ra): specifies the operation D[d]=RF[a].
Add instruction (ADD Ra, Rb, Rc): specifies the operation RF[a]=RF[b]+RF[c].
Using those mnemonics, we could rewrite the program D[9]=D[0]+D[1] as follows:
0: MOV R0, 0
1: MOV R1, 1
2: ADD R2, R0, R1
3: MOV 9, R2
That program is much easier to understand than the 0s and 1s in Figure 8.7. A program written using mnemonics that will be translated to machine code by an assembler is known as assembly code. Hardly anybody writes machine code directly these days. An assembler would automatically translate the above assembly program to the machine code shown in Figure 8.7.
You might be wondering how the assembler can distinguish between the load and store instructions above, when the mnemonic for both instructions is the same, "MOV." The assembler distinguishes those two types of instruction by looking at the first character after the mnemonic "MOV": if the first character is an "R," then that operand is a register, and thus that instruction must be a load instruction.

COMPUTERS WITH BLINKING LIGHTS.
Big computers shown in the movies often have many rows of small blinking lights. In the early days of computing, computer programmers programmed using machine code, and they entered that code into the instruction memory by flipping switches up and down to represent 0s and 1s. To enable debugging of the program, as well as to show the computed data, those early computers used rows of lights: on lights meant 1s, off lights meant 0s. Today, nobody in their right mind would try writing or debugging a program by using machine code. So computers today look like big boxes, with no rows of lights. But big plain boxes don't make for interesting backgrounds in movies, so movie makers continue to use movie props with lots of blinking lights to represent computers; lights that are useless, but entertaining.

Control Unit and Datapath for the Three-Instruction Processor
From the definition of the three-instruction instruction set and an understanding of the basic control unit and datapath architecture of a programmable processor as shown in Figure 8.4, we can design a complete digital circuit for a three-instruction programmable processor. The design process is actually very similar to the RTL design process in Chapter 5.

We begin with a high-level state machine description of the system, shown in Figure 8.9. Assume that op is shorthand for IR[15..12], meaning the leftmost four bits of the instruction register. Likewise, assume that ra is shorthand for IR[11..8], rb is shorthand for IR[7..4], rc is shorthand for IR[3..0], and d is shorthand for IR[7..0].

Figure 8.9 High-level state machine description of a three-instruction programmable processor.

Recall that the next step in the RTL design process was to create the datapath. We already created the datapath in Figure 8.4, which we refine to show every control signal from the controller, as shown in Figure 8.10. The refined datapath has control signals for each read and write port of the register file (see Chapter 4 for information on register files). The register file has 16 registers because the instructions have only 4 bits with which to address registers. The datapath has a control signal to the ALU called alu_s0; we'll assume the simple ALU adds its inputs when alu_s0=1, and just passes input A when alu_s0=0. The datapath has a select line for the 2x1 mux in front of the register file's write data port. Finally, we have also included the control signals for the data memory, which we assume has a single address port, and can thus support only a read or a write, but not both simultaneously. The data memory has 256 words, since the instructions have only 8 bits with which to address the data memory.
The datapath is now able to carry out all of the load/store operations and arithmetic operations that we need for the high-level state machine of Figure 8.9. Thus, we can proceed to the third step of the RTL design process of connecting the datapath with a controller. Figure 8.10 shows those connections, as well as the connections of the controller to the PC and IR registers in the control unit, and to the instruction memory I.
The last step of the RTL design process is to derive the controller's FSM. We can do this straightforwardly by replacing the high-level actions of the state machine in Figure 8.9 by Boolean operations on the controller's input and output lines, as shown in Figure 8.11. (Remember that op, d, ra, rb, and rc are shorthand notations for IR[15..12], IR[7..0], IR[11..8], IR[7..4], and IR[3..0], respectively.) We could then finish the controller's design by converting the FSM to a state register and combinational logic, using the methods from Chapter 3.

Figure 8.11 FSM for the three-instruction processor's controller. The instruction states assign the following control signals:
Load:  D_addr=d, D_rd=1, RF_s=1, RF_W_addr=ra, RF_W_wr=1
Store: D_addr=d, D_wr=1, RF_s=X, RF_Rp_addr=ra, RF_Rp_rd=1
Add:   RF_Rp_addr=rb, RF_Rp_rd=1, RF_s=0, RF_Rq_addr=rc, RF_Rq_rd=1, RF_W_addr=ra, RF_W_wr=1, alu_s0=1

We would have thus designed a programmable processor.
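To make the controller's state sequence concrete, the following Python sketch steps through the states Init, Fetch, Decode, Load, Store, and Add, one state per clock cycle. It is an illustration only, under assumptions that are not in the text: the function name simulate is hypothetical, the simulation stops when the PC runs past the end of the program (the real processor has no such stopping rule), and the ALU result is assumed to wrap to 16 bits.

```python
def simulate(I, D):
    """Step the controller FSM over program I; return (D, visited states)."""
    RF = [0] * 16            # 16x16 register file
    PC = IR = 0
    state = "Init"
    visited = []
    while True:
        if state == "Fetch" and PC >= len(I):
            break            # assumption: stop when the program runs out
        visited.append(state)
        if state == "Init":
            PC = 0                           # clear the program counter
            state = "Fetch"
        elif state == "Fetch":
            IR = I[PC]                       # load the instruction into IR
            PC = PC + 1                      # and increment the PC
            state = "Decode"
        elif state == "Decode":
            op = (IR >> 12) & 0xF            # op is IR[15..12]
            state = {0b0000: "Load", 0b0001: "Store", 0b0010: "Add"}[op]
        else:
            ra = (IR >> 8) & 0xF             # ra is IR[11..8]
            rb = (IR >> 4) & 0xF             # rb is IR[7..4]
            rc = IR & 0xF                    # rc is IR[3..0]
            d = IR & 0xFF                    # d is IR[7..0]
            if state == "Load":
                RF[ra] = D[d]
            elif state == "Store":
                D[d] = RF[ra]
            else:                            # Add
                RF[ra] = (RF[rb] + RF[rc]) & 0xFFFF  # assumed 16-bit wrap
            state = "Fetch"
    return D, visited

D = [0] * 256
D[0], D[1] = 3, 4
# Machine code for D[9] = D[0] + D[1].
D, visited = simulate([0x0000, 0x0101, 0x2201, 0x1209], D)
print(D[9], visited[:4])   # prints: 7 ['Init', 'Fetch', 'Decode', 'Load']
```

Decode is a separate state here for the same reason as in the FSM: the opcode is read from IR only in the cycle after Fetch has loaded it.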
Let's trace through the controller's FSM behavior to see how a program would execute on the three-instruction processor. As a reminder, remember that we follow the FSM conventions that all transitions are implicitly ANDed with a rising clock edge, and that any control signal not explicitly assigned a value in a state is implicitly assigned a 0.
The FSM initially starts in state Init, which sets PC_clr=1, causing the PC register to be cleared to 0.
The FSM on the next clock cycle enters the Fetch state, in which the FSM reads the instruction memory at address 0 (because PC is 0) and loads the read value into IR; that read value will be the instruction that was stored in I[0]. At the same time, the FSM increments the PC's value.
The FSM on the next clock cycle enters the Decode state, which has no actions but which branches on the next clock cycle to one of three states, Load, Store, or Add, depending on the value of the highest four bits of the IR register (the current instruction's opcode).

Figure 8.10 Refined datapath and control unit for the three-instruction processor.
In the Load state, the FSM sets the data memory address lines to the low eight bits of the IR and sets the data memory read enable to 1, sets the 2x1 mux's select line to pass the data memory output to the register file, and sets the register file write address to IR[11..8] and the write enable to 1, causing whatever gets read from the data memory to be loaded into the appropriate register in the register file.
Likewise, the Store and Add states set the control lines as needed for the store and add operations.
Finally, the FSM returns to the Fetch state, and begins fetching the next instruction.
Notice that because the Store state does not write to the register file, the value of the register file's mux select line doesn't matter, so we've assigned signal RF_s=X in that state, meaning the signal's value does not matter. Using such don't care values (see Section 6.2) can help us to minimize logic in the controller.
You may wonder why the Decode state is necessary when that state contains no actions; could we not have just had Decode's transitions originate instead from state Fetch? Recall from Section 5.3 that register updates listed in a state do not actually occur until the next clock edge, meaning that transitions originating from a state use the previous register values. Thus, we could not have originated Decode's transitions from the Fetch state, because those transitions would have been using the old opcode in the instruction register IR, not the new value read during the Fetch state.

8.4 A SIX-INSTRUCTION PROGRAMMABLE PROCESSOR
Extending the Instruction Set
Clearly, having only a three-instruction instruction set limits the behavior of the programs that we can write. All we can do with those instructions is add numbers. A real programmable processor will support many more instructions, perhaps 100 or more, so that a wider variety of programs can be written.
Let's extend our programmable processor's instruction set with a few more instructions, in order to give you a slightly better idea of how a programmable processor with a full instruction set would look.
We'll begin by introducing an instruction able to load a constant value into a register file register. For example, suppose we wanted to compute RF[0] = RF[1] + 5. The 5 is a constant. A constant is a value that is part of our program, not something to be found in data memory. We need an instruction that allows us to load a constant into a register, after which we could add that register to RF[1] using the ADD instruction. Thus, we introduce a new instruction with the following machine and assembly code representations:
Load-constant instruction (0011 r3r2r1r0 c7c6c5c4c3c2c1c0): specifies that the binary number represented by the bits c7c6c5c4c3c2c1c0 should be loaded into the register specified by r3r2r1r0. The binary number being loaded is known as a constant. The mnemonic for this instruction is:
MOV Ra, #c: specifies the operation RF[a]=c
a can be 0, 1, ..., or 15. Assuming two's complement representation (see Section 4.8), c can be -128, -127, ..., 0, ..., 126, 127. The "#" enables the assembler to distinguish this instruction from a regular load instruction.
We continue by introducing an instruction for performing subtraction of two registers, similar to addition of two registers, having the following machine and assembly code representations:
Subtract instruction (0100 ra3ra2ra1ra0 rb3rb2rb1rb0 rc3rc2rc1rc0): specifies subtraction of two register file registers specified by rb3rb2rb1rb0 and rc3rc2rc1rc0, with the result stored in the register file register specified by ra3ra2ra1ra0. For example, "0100 0010 0000 0001" specifies the instruction RF[2] = RF[0] - RF[1]. The mnemonic for this instruction is:
SUB Ra, Rb, Rc: specifies the operation RF[a]=RF[b] - RF[c]
Let's also introduce an instruction that allows us to jump to other parts of a program:
Jump-if-zero instruction (0101 ra3ra2ra1ra0 o7o6o5o4o3o2o1o0): specifies that if the contents of the register specified by ra3ra2ra1ra0 is 0, we should load the PC with the current value of PC plus o7o6o5o4o3o2o1o0, which is an 8-bit number in two's complement form representing a positive or negative offset amount. The mnemonic is:
JMPZ Ra, offset: specifies the operation PC = PC + offset if RF[a] is 0.
By using two's complement for the jump offset, which allows representation of positive or negative numbers, the program can jump backwards in the program, thus implementing a loop. With an 8-bit offset, the instruction can specify a jump forward by 127 addresses, or backward by 128 addresses (-128 to +127).
Table 8.1 summarizes the six-instruction instruction set. A programmable processor typically comes with a databook that lists the processor's instructions, and the meaning of each instruction, using a format similar to the format of Table 8.1. Typical programmable processors have dozens, even hundreds, of instructions.

TABLE 8.1 Six-instruction instruction set.
Instruction        Meaning
MOV Ra, d          RF[a] = D[d]
MOV d, Ra          D[d] = RF[a]
ADD Ra, Rb, Rc     RF[a] = RF[b]+RF[c]
MOV Ra, #C         RF[a] = C
SUB Ra, Rb, Rc     RF[a] = RF[b]-RF[c]
JMPZ Ra, offset    PC=PC+offset if RF[a]=0

Extending the Control Unit and Datapath
The three new instructions require some extensions to our control unit and datapath of Figure 8.10, with those extensions shown in Figure 8.12. First, the load constant instruction requires that the register file be able to load data from IR[7..0], in addition to data from data memory or the ALU output. Thus, we widen the register file's multiplexer from 2x1 to 3x1, add another mux control signal, and also create a new signal coming from the controller labeled RF_W_data, which will connect with IR[7..0]; these changes are highlighted by the dashed circle labeled "1" in Figure 8.12. Second, the subtract

instruction requires that we use an ALU capable of subtraction, so we add another ALU control signal, highlighted by the dashed circle labeled "2" in the figure. Third, the jump-if-zero instruction requires that we be able to detect if a register is zero, and that we be able to add IR[7..0] to the PC. Thus, we insert a datapath component to detect if the register file's Rp read port is all zeros (that component would just be a NOR gate), labeled as dashed-circle "3a" in the figure. We also upgrade the PC register so it can be loaded with PC plus IR[7..0], labeled as "3b" in the figure. The adder used for this also subtracts 1 from the sum, to compensate for the fact that the Fetch state already added 1 to the PC.

s1 s0  ALU operation
0  0   pass A through
0  1   A+B
1  0   A-B
Figure 8.12 Control unit and datapath for the six-instruction processor.

We also need to extend the FSM for the controller within the control unit to handle the three additional instructions. Figure 8.13 shows the extended FSM. The Init and Fetch states stay the same. We added three new transitions from the Decode state for the three new instruction opcodes. We made a minor revision to the Load, Store, and Add states' actions (the new actions are italicized) since the register file's mux now has two select lines instead of just one. Likewise, we revised the Add state actions to configure the ALU with two control lines instead of one. We added four new states, Load-constant, Subtract, Jump-if-zero, and Jump-if-zero-jmp, for the three new instructions. The new instruction states perform the following functions on the datapath:
In the Load-constant state, we configure the register file mux to pass the RF_W_data signal, and we configure the register file to write to the address specified by ra (which is IR[11..8]).
In the Subtract state, we perform the same actions as in the Add state, except that we configure the ALU for subtraction instead of addition.
In the Jump-if-zero state, we configure the register file to read the register specified by ra onto read port Rp. If the value of the read register Rp is all 0s, RF_Rp_zero will become 1 (and 0 otherwise). Thus, we include two transitions from the Jump-if-zero state. One transition will be taken if RF_Rp_zero is 0, meaning the read register was not all 0s; that transition takes the FSM back to the Fetch state, meaning no actual jump occurs. The other transition will be taken if RF_Rp_zero is 1, meaning the read register was all 0s. That transition goes to another state, Jump-if-zero-jmp, which should actually carry out the jump. That state carries out the jump simply by setting the load line of the PC.

Figure 8.13 Control unit and datapath for the six-instruction processor.

Notice that with the addition of a Jump-if-zero instruction, the processor may take up to four cycles to complete an instruction. Namely, when the ra register of a Jump-if-zero instruction is all 0s, then an extra state is needed to load the PC with the address of the instruction to which to jump.
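The PC update for a jump-if-zero instruction combines three details from the discussion above: the 8-bit offset is a two's complement number, the Fetch state has already incremented the PC, and the adder subtracts 1 to compensate. The Python sketch below illustrates that arithmetic; the function names are hypothetical, and the code ignores the width of the PC register, which the text does not state.

```python
def sign_extend_8(bits):
    """Interpret the low 8 bits as a two's complement value in -128..127."""
    bits &= 0xFF
    return bits - 256 if bits & 0x80 else bits

def next_pc_after_jmpz(jmpz_address, offset_bits, rp_value):
    """Return the PC value after a JMPZ stored at jmpz_address completes."""
    pc = jmpz_address + 1               # Fetch already added 1 to the PC
    rf_rp_zero = int(rp_value == 0)     # zero detector on read port Rp
    if rf_rp_zero:
        # Jump-if-zero-jmp state: load PC with PC + offset - 1.
        pc = pc + sign_extend_8(offset_bits) - 1
    return pc

def cycles_for_jmpz(rp_value):
    """Fetch, Decode, Jump-if-zero, plus Jump-if-zero-jmp when taken."""
    return 4 if rp_value == 0 else 3

print(next_pc_after_jmpz(3, 0b00000010, 0))   # taken, offset +2: prints 5
print(next_pc_after_jmpz(3, 0b00000010, 7))   # not taken: prints 4
print(next_pc_after_jmpz(6, 0b11111101, 0))   # taken, offset -3: prints 3
```

With this convention the offset is measured from the address of the JMPZ instruction itself, so an offset of -3 at address 6 jumps back to address 3.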

8.5 EXAMPLE ASSEMBLY AND MACHINE PROGRAMS lhose labels, such as "ski I" .. .. .. ,... ..,
live labels lhal hel p and done. or Fred and George. It s best, lhough. 10 u e descrip-
A P people readll1g lhe a sembly code 10 undersland the program.
Usi ng the ,i x-in, lrucli on inslruc lion sel o f lhe previo u, TABlEB.2 Instruction opcodes. n assembler wou ld .
Figure 8.14(b). For aUlomallcally convert lhe assembly code 10 the machine code shown in
$eclion. we no\\ prov ide an ex ample of a~st! lllbl y- lan ­
Instruction Opeode type by I k' each as~embly IIlstrucuon. the as~embler detenmnes the speCific instruction
guuge programm ing lI ~ ing the six-inslructio n proce ~so r prime 0 ~ 1n ~. at the mn~monic as well all the operands if necessary. and thcn outpu ts the appro-
to perform a pa rti cul ar tas k. and we show how the MOV Ra. d 0000
assen blpe e Ils (four blls) for lhat inslruclion lype. as defined in Table 8.2. For example the
" " cmbly code wo uld be conve rted 10 machine code by 10V d. Ra 000 1 1 er would look al lh fi ' . .
leuers "MOV" lhat " e rSl II1s1rucllon "MOV RD. #0" and lhu know from the fi rst lhree
an a"e mble r. n a»c mbler wo uld make use of lhe tab le A DD Ra. Rb. Rc 00 10 oper d . thiS IS one of the data movement in st.ructions~ the assembler wo uld look at the
shown in Tab k 8,1. which maps inslruc li ons 10 opcodes. fina l!an s,
h and
. seeing
. "RO" wouId know thiS . .IS either
. a regular load or a load-constant ins truction;
MOV Ra. #C 0011 putt" y. t ~ assem bler would see the "#" and conclude this is a load-constant instru cti on, thus out-
SU B Ra. Rb. Rc 0 100 . St mg. t e Opcode " 00 11" for a load-cans ram instruction. as shown in the first machi ne
EXAMPLE 8.4 Asse mbly and machine programs for a simple program In ruCllon of the figu re.
\Vritc a program that COll n t.;; the number of words that arc n OI JMPZ Ro. offsel 0101 "00;;° :ssen~ble~, co~~ertS lhe operands 10 bilS also, converting "RO" of the firsl instruction 10
equal to 0 in daw, mt!l11ol) l oc~\l iom, ...j. and 5. and that stores the . and #0 10 00000000 ," as shown in lhe firsl machine inslruction of the figure
result in da ta memor) location 9. Thu ~. the possible res ult s that if The
' JMPZ ins ' requ .ires some extra handling. The assembler rccogni zes this as a Jump-
lruellon .
\\Qu ld be ~(ored in location 9 arc Lero. one. or two. ~.~~~ Ins~ruclion ~nd lhus OUlPUlS the opeode "0101 . " The assembler converts lhe firsl operand.
U\lI1g the in-,tfuction ... et of Table 8.2. we can write an a~:-.c mbl y program as shown in Figure . 10 00 10 . The as emblcr then reaches lhe second operand. "lab I." and does not know
.I..l(a). The progrJm maintain.., the count in register RO. which the progra m initializes to O. The ~7 hat ~llS to output. since the assemb ler does n't yet know the address of the instruction labeled
program mil} need to add I to lh b register latcr. so the program loads the value I into register RI. labl. ' as lhe assembler hasn' l reached lhal instruclion yel in lhe program. To solve lhi problem
The program nex t load, data mem ry locmion 4 inlO regi:-tcr R2. The program then jumps 10 the many assemblers ac tuall y make I WO passes over the assembly code: durin g th e fi~t pass. the assem~
in,tTUClion labeled ~~ "lab I" if the \'alue of R2 is zero. If R2 i:-. not l Cro. the program will ex.ec ute b~er creales a lable of all labels and lheir addresses. and lhen on lhe second pass the as embler
an add imlruetiOn that add ... one to register RD. and will th en proceed to the instruction labe led ? (PUlS. machine code. Such an assembler would th erefore kn ow durin g the second pas thal th
"Iabl" ... inee that in"'lruclio n i~ the nex t instructi on. The instruction labeled "Iab l" loads data II1S1rucllon label:.d "la.~ I.' · is al ~n add ress lWO addresses beyond lhe first JMPZ in truclion-specif~
I c~ lIy. ~hat the lab l instructi on IS at address 5. whlie the JMPZ III truclion is at address 3
(assuming lhal lhe firsl inslruclion is al address O. nOl I). Thus. lhe assembler would amp t
MOV AO, #0; /I initialize resuil lo 0 0011 0000 00000000
off,sel of 2 10 jump forward 2 addresses. I alice lhal lhe labels "Iab l" and "lab?" do nOl ap u an
MOV Rl , #1 ; II constant 1 for incrementing result 0011 0001 00000001 lhe mac h'me cod e-thcy are merely a convenience construct thai the assembler - provides pear 111
fo r the
MOV R2, 4; /I get data memory location 4 0000 0010 00000100 asscmb l y~langu uge programmer. ~
JMPZ A2, labl ; II if zero. skip next inslruction 0101 001000000010
ADD AO. AO, Al ; /I not zero, so increment result 0010 0000 0000 0001
tabl :MOV A2, 5; /I get data memory location 5 0000 001000000101 ~ 8.6 FURTHER EXTENSIONS TO THE PROGRAMMABLE PROCESSOR
JMPZ A2, lab2; /I if zero, skIp next inslruction 0101 001000000010
ADD RO, AO, A1; /lnol zero, so incremenl ,esull 0010 0000 0000 0001 Instruction Set Extensions
lab2:MOV 9, AO; /I store result in dala memory location 9 0001 0000 00001001 EXlending the instruclion sel wilh further instructions would require similar ty pes f
(b ) eX le nSlons and mod ifications 10 the control unil. datapath. and FSM . A prog:ramm b~
~Q\e ~at:
(a)
processor mighl contain dozens more dolo movemelll instructions. which
Figure 8.14 A program 10 counl lhe nu mber of nonlero numbe" in Df.Jj and Dj5 j. sloring lhe
between data memory and lhe reg isler file. or belween regislers. Fo r example. a processor
r",uil 10 Dj9j: (a, ""embly code. and (b) corre_ponding mac hine code generaled by an assembler.
1llIghi have in lruclions for copying lhe contenlS of one regisler 10 another (e .g .. !\IOV
The "pace~ In Ihc machine code\ 16-bi t i n ~ lruc ti on '" nre lherc for your cOlwcniencc as you read this
RO. R I. which would copy RJ' contents into RO). a nd would ca rry out that instru tio n
hook; actual machine code h a~ no , lIch \ p3Ce\,
uSlIlg a tale thal reads lhe source reg ISler. pas es .the read \'alue through the AL
memory locallon 5 1010 reg"lcr 112. The program jump' 10 lhc in' lruclion labe led "lab2" if Rl is unchanged. and wriles the ALU oulPUl 10 the desllnatlon reglsler. A s ano ther exam Ie
fero. II R2 1\ not fCro. thc program execute' an add in"truclio ll that add, one L O rcgi ter RO. and I:rocesso~ mighl ha ve inslruclions lhm would use lhe COnlenlS of a regi ' ler as the a:~
lhen proceed' to lhe neXl '"'lrucllOn. "hich i, the in' lruelion labeled "lab2." ThaI ilm ruclion SIOres from wh Ich to read data memory. known as IIld,reC( addre' slIlg.
the conlcnl\ 01 rcgl\ICr RO to data memory l oc~'lIon 9. programmab le processor would also conlain dozens I' arilhmeticflogic in
In ,""rltlng the .' ......emhl)' program. we arbllr~m l y cho ... e tIll! regl\ tcf' th aLwe used to ~tore the lions. whi h perf nn arilhmelic and logi operalions on registe rs in the register file~
rc'\ull. the e,:un ... t,lOt I. ~nd the u'ila memory locatinn COI)Y. We coul d h;,ve u... cd any registers for example. a processor mighl include nol ju ' t add, and sublracl instru ti ns . \.luI a lso i~ :
thoCi.C purpn\C,. r'or example. we could huvc u\cd r~g l ... tcr 1<7 10 ho ld the rc\uit . meani ng all occur· menl, complemenl. decremenl. AND. OR. XOR. shIft left. shift right. and o the r
renl:C" of RO In the codc would In\lcnd h.lve neen }(7 h n1 hcnTIorc. In writing the assembly
insllllctions that could be carried OUI by an AL .
p"'gr .. m "c ,lfhll,"n ly eh .. ," lhe label' "Iabl" "nd "1,lh2," We cou ld have pIcked olher nu,"c\ for
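The two-pass assembly of Figure 8.14's program can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's assembler: the function name assemble and its helpers are hypothetical, the opcodes are the ones visible in Figure 8.14(b), and only the instruction forms used in that figure are handled. The first pass builds the label table; the second pass emits machine code and computes each JMPZ offset as the label's address minus the JMPZ instruction's address, as in the text.

```python
def assemble(lines):
    """Two-pass assembler sketch for the instruction forms used in Figure 8.14."""
    # Pass 1: strip comments and record the address of every labeled instruction.
    labels, instrs = {}, []
    for line in lines:
        line = line.split("//")[0].strip().rstrip(";").strip()
        if not line:
            continue
        if ":" in line:
            name, line = line.split(":", 1)
            labels[name.strip()] = len(instrs)  # first instruction is at address 0
        instrs.append(line.strip())

    def reg(name):    # "R2" -> "0010"
        return format(int(name[1:]), "04b")

    def byte(value):  # 8-bit field for an address, constant, or offset
        return format(value & 0xFF, "08b")

    # Pass 2: emit 16-bit machine code now that all label addresses are known.
    code = []
    for addr, text in enumerate(instrs):
        op, rest = text.split(None, 1)
        args = [a.strip() for a in rest.split(",")]
        if op == "MOV" and args[1].startswith("#"):    # MOV Ra, #c
            word = "0011" + reg(args[0]) + byte(int(args[1][1:]))
        elif op == "MOV" and args[0].startswith("R"):  # MOV Ra, d
            word = "0000" + reg(args[0]) + byte(int(args[1]))
        elif op == "MOV":                              # MOV d, Ra
            word = "0001" + reg(args[1]) + byte(int(args[0]))
        elif op == "ADD":                              # ADD Ra, Rb, Rc
            word = "0010" + reg(args[0]) + reg(args[1]) + reg(args[2])
        elif op == "JMPZ":                             # JMPZ Ra, label
            word = "0101" + reg(args[0]) + byte(labels[args[1]] - addr)
        else:
            raise ValueError("unsupported instruction: " + text)
        code.append(word)
    return code


program = [
    "MOV R0, #0;", "MOV R1, #1;", "MOV R2, 4;", "JMPZ R2, lab1;",
    "ADD R0, R0, R1;", "lab1: MOV R2, 5;", "JMPZ R2, lab2;",
    "ADD R0, R0, R1;", "lab2: MOV 9, R0;",
]
machine_code = assemble(program)
```

A single-pass version would fail at the first JMPZ, because "lab1" is not yet in the label table when that instruction is reached; the separate first pass is what removes that problem.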

A programmable processor would furthermore contain several flow-of-control instructions, which determine the next value of the PC. For example, a processor might include not just a jump-if-zero instruction, but also a jump-if-not-zero, an unconditional jump, an indirect jump, and perhaps even jump-if-negative and similar such instructions. Furthermore, a processor may include instructions that can jump farther than just a small offset from the current PC, and perhaps even to an absolute address rather than an offset address.

Input/Output Extensions
Section 1.3 introduced a basic microprocessor having eight inputs I0, I1, ..., I7, and eight outputs P0, P1, ..., P7. We can extend the basic programmable processor of Figure 8.12 to implement such external inputs and outputs. One method for such an extension would utilize a specially designed data memory. In that data memory, we might replace the last 16 words of the memory by direct connections to the input and output pins, as illustrated in Figure 8.15. The data memory stores locations 0 through 239 in a normal RAM. Location 240, however, is actually a special word whose high 15 bits are all 0s, and whose lowest bit comes from a flip-flop loaded every cycle with the value on external input pin I0. Thus, reading location 240 will result in either 00...01 (integer 1), or 00...00 (integer 0), depending on the value appearing at I0. Likewise, location 241 is connected to pin I1, location 242 to I2, and so on, with location 247 connected to I7. Locations 248 through 255 are connected to pins P0 through P7, except the pins are connected to those locations' flip-flop outputs rather than inputs. For example, writing to location 255 writes the flip-flop with either 0 or 1 (only the low-order bit matters during the write), and that flip-flop drives external output pin P7.

Figure 8.15 Connecting to external pins. (256x16 data memory D: locations 0 through 239 are normal RAM, locations 240 through 247 read input pins I0 through I7, and locations 248 through 255 drive output pins P0 through P7.)

Thus, an assembly-language programmer can read or write a microprocessor's external data pins simply by reading or writing particular data memory locations.

Performance Extensions
One difference between real processors and the basic processor architecture in this chapter is that many real processors are pipelined (see Section 6.5 for an introduction to pipelining). The basic, three-instruction architecture utilized a controller with three stages: fetch, decode, and execute. By inserting appropriate pipeline registers throughout the design and modifying the controller appropriately, we could pipeline the fetch, decode, and execute stages. In other words, as the control unit decodes instruction 1, the control unit could be simultaneously fetching instruction 2. Next, as the control unit executes instruction 1, the control unit could be decoding instruction 2, and fetching instruction 3. Thus, rather than processing one instruction every 3 cycles, the control unit could be processing one instruction every cycle. Each instruction still takes 3 cycles to process (3-cycle latency), but the pipelining results in single-cycle throughput. The net result would be that programs would execute about three times faster.
Another extension involves creating deeper pipelines. Thus, rather than just three stages (fetch, decode, execute), we might break the stages down to stages of even finer granularity (e.g., fetch, decode, read operands, execute, store results). Creating finer-grained stages may shorten the longest register-to-register delay, which enables a faster clock frequency. The net result would again be faster program execution.
Another extension involves having multiple ALUs in the datapath. The control unit may then perform multiple ALU operations simultaneously in the datapath. One form of this extension involves a processor whose instruction set uses instructions with multiple opcodes and associated operands in a single instruction, known as a Very Long Instruction Word (VLIW) processor. Another form uses a processor with a control unit that reads in multiple instructions simultaneously and then assigns those instructions to execute simultaneously on available ALUs, known as a superscalar processor. A high-end desktop processor may support perhaps 5 simultaneous instructions, with perhaps 10 stages of pipelining. Thus, at any moment, such a processor may be in the middle of processing 5*10 = 50 different instructions. Needless to say, modern processor architectures can become quite complex.
This chapter described the basic idea of how a programmable processor's design works and how the design could be extended to support a fuller instruction set. We leave the role of describing a complete processor, as well as modern processor design techniques for improved performance (such as pipelining, caching, etc.), to textbooks on computer architecture.
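The memory-mapped pin arrangement of Figure 8.15 can be modeled in software to see how ordinary reads and writes reach the external pins. The following Python sketch is illustrative only: the class PinMappedMemory and the function motion_in_dark are hypothetical names, reading back an output location is assumed to return the output flip-flop's value (the text does not say), and the NOT and AND instructions are assumed to exist, as they are in the motion-in-the-dark detector example that computes P0 = I0 && !I1.

```python
class PinMappedMemory:
    """Sketch of a 256x16 data memory with external pins mapped at 240..255."""

    def __init__(self):
        self.ram = [0] * 240      # ordinary RAM words, locations 0..239
        self.inputs = [0] * 8     # values on external input pins I0..I7
        self.outputs = [0] * 8    # flip-flops driving output pins P0..P7

    def read(self, addr):
        if addr < 240:
            return self.ram[addr]
        if addr < 248:
            return self.inputs[addr - 240] & 1  # high 15 bits read as 0
        return self.outputs[addr - 248]         # assumption: outputs read back

    def write(self, addr, value):
        if addr < 240:
            self.ram[addr] = value & 0xFFFF
        elif addr >= 248:
            self.outputs[addr - 248] = value & 1  # only the low-order bit matters
        # Writes to the input locations 240..247 are ignored in this sketch.


def motion_in_dark(mem):
    """Mirror a five-instruction program computing P0 = I0 && !I1."""
    r0 = mem.read(240)     # MOV R0, 240   (value at pin I0)
    r1 = mem.read(241)     # MOV R1, 241   (value at pin I1)
    r1 = ~r1 & 0xFFFF      # NOT R1, R1    (16-bit bitwise complement)
    r0 = r0 & r1           # AND R0, R0, R1
    mem.write(248, r0)     # MOV 248, R0   (drives pin P0)


mem = PinMappedMemory()
mem.inputs[0], mem.inputs[1] = 1, 0   # motion sensed, no light
motion_in_dark(mem)
```

Because only the low-order bit of an input word can be 1, the bitwise complement and AND behave like the logical operators of the C expression for these particular values.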
EXAMPLE 8.5 Motion-in-the-dark detector in assembly language
Section 1.3 included an example, illustrated in Figure 1.13, that utilized a microprocessor to implement a motion-in-the-dark detector. That section utilized C code to compute the expression P0 = I0 && !I1. In this example, we show the underlying assembly code that would implement that C expression. Assuming that the microprocessor's external pins I0..I7 and P0..P7 are mapped to data memory locations as in Figure 8.15, we can program the expression in assembly as follows:

MOV R0, 240      // move D[240], which is the value at pin I0, into R0
MOV R1, 241      // move D[241], which is the value at pin I1, into R1
NOT R1, R1       // compute !I1, assuming existence of a complement instruction
AND R0, R0, R1   // compute I0 && !I1, assuming an AND instruction
MOV 248, R0      // move result to D[248], which is pin P0

8.7 CHAPTER SUMMARY
In this chapter, we stated (Section 8.1) that programmable processors are widely used for implementing a system's desired functionality, due in part to their easy availability and short design time (namely, writing software). We provided (Section 8.2) the basic architecture of a programmable processor, consisting of a general-purpose datapath having a register file and ALU; a control unit having a controller, PC, and IR; and memories for storing the program and the data. The control unit would fetch the next instruction from program memory, decode the instruction, and then execute the instruction by configuring the datapath to carry out the instruction's specified operation. We then designed (Section 8.3) a simple three-instruction programmable processor, and showed how a program would be represented as 0s and 1s (machine code) in the processor's program memory. We went further to design (Section 8.4) a six-instruction processor, and discussed how further extensions could be made to add more instructions and hence achieve a more reasonable processor architecture. We provided (Section 8.5) an example of assembly and machine code for the six-instruction processor. We discussed a few extensions to the programmable processor architecture (Section 8.6).
Programmable processors are typically produced in huge quantities (numbering in the tens of millions, or even billions), and so tremendous attention is given to their design. Readers should realize that the programmable processor designs in this chapter are extremely simplistic and used for illustration purposes only. Yet, seeing even the simplistic designs, you hopefully now have an understanding of the principle of how a programmable processor works. Modern commercial processors are based on the same principles: instructions are stored as machine code in program memory, control units fetch, decode, and execute the instructions, and datapaths support the operations of the instructions using register files and ALUs. Modern processors just do a much better job, using concurrency, pipelining, and many other techniques to obtain high clock frequencies and fast program execution.

8.8 EXERCISES

SECTION 8.2: BASIC ARCHITECTURE
8.1 If a processor's program counter is 20 bits wide, up to how many words can the processor's instruction memory hold (ignoring any special tricks to expand the instruction memory size)?
8.2 Which of the following are legal single-cycle datapath operations for the datapath in Figure 8.2? Explain your answer.
(a) Copy data from a memory location into another memory location.
(b) Copy two register locations into two memory locations.
(c) Add data from a register file location and a memory location, storing the result in a memory location.
8.3 Which of the following are legal single-cycle datapath operations for the datapath in Figure 8.2? Explain your answer.
(a) Copy data from a register file location into a memory location.
(b) Subtract data from two memory locations and store the result in another memory location.
(c) Add data from a register file location and a memory location, storing the result in the same memory location.
8.4 Assume we are using a dual-port memory from which we can read two locations simultaneously. Modify the datapath of the programmable processor of Figure 8.2 to support an instruction that performs an ALU operation on any two memory locations and stores the result in a register file location. Trace through the execution of this operation, as illustrated in Figure 8.3.
8.5 Determine the operations required to instruct the datapath of Figure 8.2 to perform the operation: D[8] = (D[4] + D[5]) - D[7], where D represents the data memory.

SECTION 8.3: A THREE-INSTRUCTION PROGRAMMABLE PROCESSOR
8.6 If a processor's instruction has 4 bits for the opcode, how many possible instructions can the processor support?
8.7 What does the following assembly program, which uses the three-instruction instruction set of this chapter, compute? MOV R5, 19; ADD R5, R5, R5; MOV 20, R5.
8.8 What does the following assembly program, which uses the three-instruction instruction set of this chapter, compute? MOV R4, 20; MOV R9, 18; ADD R4, R4, R9; MOV R5, 30; ADD R9, R4, R5; MOV 20, R9.
8.9 Using the three-instruction instruction set of this chapter, write an assembly program that updates the data memory D as follows: D[0] = D[0] + D[1].
8.10 Using the three-instruction instruction set of this chapter, write an assembly program that updates the data memory D as follows: D[4] = D[1]*2 + D[2].
8.11 Convert the following assembly program to machine code based on the three-instruction instruction set of this chapter: MOV R5, 19; ADD R5, R5, R5; MOV 20, R5.
8.12 List the basic register/memory transfers and operations that occur during each clock cycle for the following program, based on the three-instruction instruction set of this chapter: MOV R0, 1; MOV R1, 9; ADD R0, R0, R1.

SECTION 8.4: A SIX-INSTRUCTION PROGRAMMABLE PROCESSOR
8.13 List the basic register/memory transfers and operations that occur during each clock cycle for the following program, based on the six-instruction instruction set of this chapter, assuming that the content of D[9] is 0: MOV R6, #1; MOV R5, 9; JMPZ R5, label1; ADD R5, R5, R6; label1: ADD R5, R5, R6. What is the value in R5 after the program completes?
8.14 Add a new instruction to the six-instruction instruction set of this chapter that performs a bitwise AND of two registers and stores the result in a third register. Extend the datapath, control unit, and the controller's FSM as needed.
8.15 Add a new instruction to the six-instruction instruction set of this chapter that performs an unconditional jump (jumps always) to a location specified by a 12-bit offset. Extend the datapath, control unit, and the controller's FSM as needed.
8.16 Add a new instruction to the six-instruction instruction set of this chapter that performs a jump, if two registers are equal, to a location specified by a 12-bit offset. Extend the datapath, control unit, and the controller's FSM as needed.
8.17 Using the six-instruction instruction set of this chapter, write an assembly program for the following C code, which computes the sum of the first N numbers, where N is another name for D[9]. Hint: Use a register to first store N.

i = 1;
sum = 0;
while (i != N) {
   sum = sum + i;
   i = i + 1;
}

8.18 Using the extended instruction set you designed in Exercise 8.16, write an assembly program for the C code in Exercise 8.17.
SECTION 8.5: EXAMPLE ASSEMBLY AND MACHINE PROGRAMS
8.19 Define two new data movement instructions for the six-instruction instruction set of this chapter. Extend the datapath, control unit, and the controller's FSM as needed.
8.20 Define two new arithmetic/logic instructions for the six-instruction instruction set of this chapter. Extend the datapath, control unit, and the controller's FSM as needed.
8.21 Define two new flow-of-control instructions for the six-instruction instruction set of this chapter. Extend the datapath, control unit, and the controller's FSM as needed.
8.22 Assuming that the microprocessor's external pins I0..I7 and P0..P7 are mapped to data memory locations as in Figure 8.15 and an AND instruction has been added to the six-instruction instruction set of this chapter, create an assembly program that will output 0 on P4 if all eight inputs I0..I7 are 1s.

Carole grew up in a country where the best students went to engineering school, as engineering was highly respected. "I was good in school, so engineering seemed like a natural option. I was also very interested in building things, and very curious about how one builds new things, so I was attracted to engineering at an early age, around 10 years of age."
Carole has worked at Intel for 15 years. She was one of the original architects of the popular MMX (Multimedia Extension of the Intel Architecture) part of Pentium processors. "It was fascinating to learn the algorithms used to compress video and audio, and to invent new instructions for the Intel Architecture to run these applications efficiently. It is not always easy for processor architects to quantify the benefits of new features, and to motivate the expense in silicon area (or chip die size) for new instructions. In the case of multimedia applications, the benefits are well understood: running a video clip at a few frames per second, or running it in real time (about 30 frames per second) makes a huge, visible difference to everyone." As is the case with so many engineers, she is very proud of what she accomplished: "When the first Pentium processor with MMX came up, it was really rewarding to think that a small piece of my mind was in all of these machines running video real time popping up everywhere."
Carole was also one of the architects on the Intel Hewlett-Packard team that defined the Itanium computer architecture. "This was a unique opportunity to define a processor 'from scratch.' Technically this was a very challenging project, and working with so many top notch architects was very enriching. But I also learned what it takes to build something big, involving a very large team, and two large companies. The two companies had different cultures, different methodologies, and reconciling the differences was sometimes more challenging than solving the technical problems. But this is all part of 'building things,' and this was a great lesson in leadership."
What Carole likes most about her career is "the constant change. After 22 years as a computer architect, I am still doing new things every day. Computer science is a work in progress, and it offers new opportunities that one has to grab, and run with. This is where the fun is."
Asked to give some advice to students, Carole suggests two things:
"Stay at school as long as possible. Get a PhD if you can. To be able to adapt to constant change, you will need a very robust, and theoretical foundation. Only learning how to do things is not enough; it will get you a job for 2 years, but then your skills will be obsolete."
"Be open for change. It is important to build an in-depth expertise in one area, in my case, it is computer architecture. But one has to be ready to use this expertise in many different projects, with different people, and more and more in different parts of the world. Fifteen years ago multimedia applications were the focus of many computer architects. Today it is bioinformatics and data mining. Change requires a lot of work to learn new domains, but not adapting to change is not an option."

9 Hardware Description Languages

9.1 INTRODUCTION 1
1 Substantial content of this chapter was contributed by Roman Lysecky.

In this book, we have been drawing the circuits that we design. For example, in Chapter 2, we designed an automatic door opener circuit and drew the circuit shown in Figure 9.1. A drawing has more information than is really necessary to describe the circuit. In particular, the drawing gives information about the location of the inputs and outputs: in the drawing of Figure 9.1, the inputs are on the left, the output on the right, and the c input is on the top, the h input in the middle, and the p input on the bottom. The drawing also gives information about the size and location of the components in the circuit: the inverter is at the top, the OR gate below the inverter, the AND gate is on the right, and each component is about a half inch by a half inch. The drawing gives information about the wires too: the wire from the inverter goes to the right, then down, then to the right again, for example. However, all that information about the drawing is really irrelevant, and has nothing to do with how the design will be physically implemented. We had to draw the circuit somehow, so we chose to draw the circuit in the manner shown in the figure. But we could have drawn the circuit many other ways too. A drawing of a circuit is commonly referred to as a circuit schematic.

Figure 9.1 Drawn circuit (DoorOpener).

A problem with drawing all our circuits arises when we deal with large circuits. Does the schematic in Figure 9.2 mean anything to you? That schematic has just a couple dozen components; what if there were a couple thousand components, as is quite common? Drawing a large circuit would require tremendous effort on our part to figure out where to place each component in the drawing, and how to route wires among the components. And if a tool generated the circuit, the tool would have to spend compute time to figure out a visually-appealing way to draw the circuit (rather than a spaghetti-like mess), and such computation is time-consuming and still may not result in a good drawing. Furthermore, the files used to store such schematics would be very large, as those files would contain all that extra information about the precise location and size of every component. All that extra
effort, file size, and time, would be needed for something that is really not very useful: humans can't comprehend circuit drawings of more than perhaps a hundred or so gates, so what's the point of drawing such circuits? What we really want is a way to just describe the circuit itself: what are the inputs and outputs, what components exist, and what are the connections? Ideally, we would do this description in a textual language, so that we humans could type such descriptions with a computer keyboard, just like we type email messages and C programs.

Figure 9.2 Schematics become hard to read beyond a dozen or so components; the graphical information becomes a nuisance rather than an aid.

We could therefore describe the circuit in Figure 9.3(a) using the textual language of English as shown in Figure 9.3(b). We've given names to each gate in the circuit and to the internal wires in Figure 9.3(a).

(a) [schematic of the DoorOpener circuit]

(b)
We'll now describe a circuit whose name is DoorOpener.
The external inputs are c, h and p, which are bits.
The external output is f, which is a bit.

We assume you know the behavior of these components:
   An inverter, which has a bit input x, and bit output F.
   A 2-input OR gate, which has inputs x and y, and bit output F.
   A 2-input AND gate, which has bit inputs x and y, and bit output F.

The circuit has internal wires n1 and n2, both bits.
The DoorOpener circuit internally consists of:
   An inverter named Inv_1, whose input x connects to external input c, and whose output connects to n1.
   A 2-input OR gate named OR2_1, whose inputs connect to external inputs h and p, and whose output connects to n2.
   A 2-input AND gate named AND2_1, whose inputs connect to n1 and n2, and whose output connects to external output f.
That's all.

Figure 9.3 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language.

Of course, English is not a good language if you want to use a computer tool to read in the description; a computer tool requires a language with a precise syntax and precise meaning for every language construct. Computer-readable languages thus evolved in the 1970s and 1980s for describing hardware circuits. Such languages became known as hardware description languages, or HDLs. Hardware description languages not only enable us to describe the structural interconnections of components, but also include methods for us to describe the behavior of components themselves. Modern digital design relies heavily on the use of HDLs at all stages of design.
We'll provide a brief introduction to the most popular hardware description languages (VHDL, Verilog, and SystemC) in this chapter, but to really learn each language, one may want to consult textbooks specifically dedicated to each language. Each section of this chapter can be covered immediately after corresponding earlier chapters (Section 9.2 after Chapter 2, Section 9.3 after Chapter 3, Section 9.4 after Chapter 4, and Section 9.5 after Chapter 5), or these sections may be covered all at once after completing those earlier chapters. Furthermore, each section has three parts, one for VHDL, one for Verilog, and one for SystemC. Each of these parts is independent of the other parts of the section, so a reader interested only in one of the HDLs, say Verilog, can read only the Verilog parts of each section, skipping the VHDL or SystemC parts.
A reader interested in comparing the three HDLs may read the sections of all three HDLs. In doing so, you may notice that the HDLs have similar capabilities, differing primarily in their syntax. Thus, after learning one HDL thoroughly, a designer can likely learn other HDLs quickly.

9.2 COMBINATIONAL LOGIC DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES

Structure
This chapter's introduction sought to describe a circuit using a textual language. We now show how some different HDLs describe a circuit. The term structure is sometimes used to refer to a circuit, with structure meaning an interconnection of components.

VHDL
Figure 9.4(c) shows a VHDL description of the DoorOpener circuit of Figure 9.4(a). For convenience, we've also shown the English description in Figure 9.4(b), and the correspondence between the English description and the VHDL description.
The description begins with an entity declaration, which defines the design's name and the design's inputs and outputs, known as ports. An entity declaration says nothing about the internals of the design, just the design's name and interface. The description lists the port names and defines their type, which in this case is type std_logic. That type essentially means a bit, but isn't built into VHDL (the predefined bit type in VHDL is too limited, for reasons beyond our scope here). To use std_logic, we actually must include the statements "library ieee; use ieee.std_logic_1164.all;" at the top of the file.
The description continues with an architecture definition, which describes the internals of the design. We named the architecture Circuit, but we could have named it anything we wanted: DoorOpenerCircuit, DoorOpenerStructure, Structure, or even Fred, although we want a name that is helpful in understanding the architecture. The architecture starts by declaring what components the design will be using; those components must be defined elsewhere, perhaps earlier in the description's file, or perhaps in another file. We'll discuss those components' definitions later; for now, assume they are somehow already defined. Each component declaration must define the inputs and outputs of each component, and those inputs and outputs must match the component's entity declaration (found elsewhere) exactly.
(a) [schematic of the DoorOpener circuit]

(b) [English description, identical to Figure 9.3(b)]

(c)
library ieee;
use ieee.std_logic_1164.all;

entity DoorOpener is
   port ( c, h, p: in std_logic;
          f: out std_logic
        );
end DoorOpener;

architecture Circuit of DoorOpener is
   component Inv
      port (x: in std_logic;
            F: out std_logic);
   end component;
   component OR2
      port (x, y: in std_logic;
            F: out std_logic);
   end component;
   component AND2
      port (x, y: in std_logic;
            F: out std_logic);
   end component;
   signal n1, n2: std_logic; -- internal wires
begin
   Inv_1: Inv port map (x=>c, F=>n1);
   OR2_1: OR2 port map (x=>h, y=>p, F=>n2);
   AND2_1: AND2 port map (x=>n1, y=>n2, F=>f);
end Circuit;

Figure 9.4 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language, (c) textual description in the VHDL language. Bolded words are reserved words in VHDL.

The description then includes a declaration of the design's internal signals, which are essentially internal wires. Next to that declaration, the description includes an example of a VHDL comment: "-- internal wires". Comments start with "--" followed by any text we want on the rest of the line. That text is ignored by VHDL tools, but is useful to us humans who must read the descriptions.
Finally, the description instantiates the circuit's components and defines those components' connections. For example, the description instantiates a component named Inv_1, which is a component of type Inv (which we declared earlier in the VHDL description), and indicates that Inv_1's input x connects to c, which is an external input. An alternate, more concise port map notation omits the port names. Using this notation, we could instantiate our inverter by writing "Inv_1: Inv port map (c, n1);". The

The bold words in the description represent reserved words, also known as keywords in VHDL. We cannot use reserved words for names of entities, architectures, signals, instantiated components, etc., as those words have special meaning that guides VHDL tools in understanding our descriptions.
Summarizing, the VHDL structural description has an entity that describes the design's name, inputs, and outputs; a declaration of what components will be used; a declaration of internal signals; and finally, an instantiation of all components, along with their interconnections.
The entity that we've just defined could then be used as a component in another entity.

Verilog
Figure 9.5(c) shows a Verilog description of the DoorOpener circuit of Figure 9.5(a). For convenience, we've also shown the English description in Figure 9.5(b), and the correspondence between the English description and the Verilog description.

(a) [schematic of the DoorOpener circuit]

(b) [English description, identical to Figure 9.3(b)]

(c)
module Inv(x, F);
   input x;
   output F;
   // details not shown
endmodule
module OR2(x, y, F);
   input x, y;
   output F;
   // details not shown
endmodule
module AND2(x, y, F);
   input x, y;
   output F;
   // details not shown
endmodule

module DoorOpener(c, h, p, f);
   input c, h, p;
   output f;
   wire n1, n2;
   Inv Inv_1(c, n1);
   OR2 OR2_1(h, p, n2);
   AND2 AND2_1(n1, n2, f);
endmodule

Figure 9.5 Describing a circuit using a textual language rather than a graphical drawing:
order of the signal s in the port map of IlIv corresponds to the order of the ports in the (a) schematic. (b) textual description in the English language, (c) textual description in the
component definition of Illv. We wi ll use this alternate notation in subsequent examples. Yerilog language. Bold words are reserved words in Yerilog .
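As a rough software analogy to the structural descriptions in Figures 9.4(c) and 9.5(c), the sketch below models each gate as a small Python function and wires them together through the internal values n1 and n2. It is not HDL and not something an HDL tool would accept; the function names (inv, or2, and2, door_opener_structural) are our own, chosen only to mirror the component names in the figures, and bits are assumed to be the integers 0 and 1.

```python
# Structural model of DoorOpener: three gate "components" wired through
# internal values n1 and n2, mirroring the port-map connections in the text.
# All names here are illustrative, not part of any HDL.

def inv(x):
    """Inverter: F = not x (bits are the integers 0 and 1)."""
    return 1 - x

def or2(x, y):
    """2-input OR gate."""
    return x | y

def and2(x, y):
    """2-input AND gate."""
    return x & y

def door_opener_structural(c, h, p):
    n1 = inv(c)          # Inv_1:  x => c,  F => n1
    n2 = or2(h, p)       # OR2_1:  x => h,  y => p,  F => n2
    return and2(n1, n2)  # AND2_1: x => n1, y => n2, F => f

if __name__ == "__main__":
    # Print the full truth table, one input combination per line.
    for c in (0, 1):
        for h in (0, 1):
            for p in (0, 1):
                print(c, h, p, "->", door_opener_structural(c, h, p))
```

Passing arguments by position, as in or2(h, p), corresponds to the concise port map notation, where the order of the signals must match the order of the ports.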


The description begins by defining modules for an inverter Inv, a 2-input OR gate OR2, and a 2-input AND gate AND2. We'll skip discussion of those modules, and begin our discussion with the definition of the fourth module DoorOpener.

The description declares a module named DoorOpener. The module declaration defines a design's name and the names of that design's inputs and outputs, known as ports. The module declaration says nothing about the internals of the design or the ports, just the design's name and interface.

The description then defines the type of each port, assigning the types input and output in this example.

The description then includes a declaration of the design's internal wires, named n1 and n2.

Finally, the description instantiates the circuit's components and defines those components' connections. In the DoorOpener module, the description instantiates a component named Inv_1, which is a component of type Inv. The connections to the inputs and outputs of the instantiated components are specified in the order in which the component's modules declare the inputs and outputs. In the instantiation of Inv_1, the input c is connected to the input x of the Inv component. In Verilog, the module does not need to specify the interface of a component within the module instantiating the component. For example, the DoorOpener module does not include a declaration of which components it will instantiate or any information regarding those components. The components, of course, must be defined elsewhere, perhaps earlier in the same file as shown in Figure 9.5(c), or perhaps in another file. For reference purposes, the example shown here provides incomplete specifications for the Inv, AND2, and OR2 components in order to clearly show the ports and interface for these components. In place of specifying the internal behavior of these components, we simply included an example of a Verilog comment. Comments start with "//" and then any text we want on the rest of the line.

The bold words in the description represent reserved words, also known as keywords, in Verilog. We cannot use reserved words for names of modules, ports, wires, instantiated components, etc., as those words have special meaning that guides Verilog tools to understand our descriptions.

Summarizing, the Verilog structural description has a module that describes the design name, lists the module's inputs and outputs, and specifies the type for each input and output; a declaration of internal wires; and finally, an instantiation of all components, along with their interconnections.

SystemC
Figure 9.6(c) shows a SystemC description of the DoorOpener circuit of Figure 9.6(a). For convenience, we've also shown the English description in Figure 9.6(b), and the correspondence between the English description and the SystemC description. The SystemC language is built on top of the C++ programming language, but it is not necessary to be an expert C++ programmer to use SystemC. However, it is important to keep in mind that certain restrictions exist as a result, such as not using C++ keywords to name modules, ports, signals, etc.

(a) [schematic of the DoorOpener circuit]

We'll now describe a circuit whose name is DoorOpener.
The external inputs are c, h and p, which are bits.
The external output is f, which is a bit.
We assume you know the behavior of these components:
   An inverter, which has a bit input x, and bit output F.
   A 2-input OR gate, which has inputs x and y, and bit output F.
   A 2-input AND gate, which has bit inputs x and y, and bit output F.
The circuit has internal wires n1 and n2, both bits.
The DoorOpener circuit internally consists of:
   An inverter named Inv_1, whose input x connects to external input c, and whose output connects to n1.
   A 2-input OR gate named OR2_1, whose inputs connect to external inputs h and p, and whose output connects to n2.
   A 2-input AND gate named AND2_1, whose inputs connect to n1 and n2, and whose output connects to external output f.
That's all.
(b)

   #include "systemc.h"
   #include "inv.h"
   #include "or2.h"
   #include "and2.h"

   SC_MODULE(DoorOpener)
   {
      sc_in<sc_logic> c, h, p;
      sc_out<sc_logic> f;

      // internal wires
      sc_signal<sc_logic> n1, n2;

      // component declarations
      Inv Inv_1;
      OR2 OR2_1;
      AND2 AND2_1;

      // component instantiations
      SC_CTOR(DoorOpener): Inv_1("Inv_1"),
         OR2_1("OR2_1"), AND2_1("AND2_1")
      {
         Inv_1.x(c);
         Inv_1.F(n1);
         OR2_1.x(h);
         OR2_1.y(p);
         OR2_1.F(n2);
         AND2_1.x(n1);
         AND2_1.y(n2);
         AND2_1.F(f);
      }
   };
(c)

Figure 9.6 Describing a circuit using a textual language rather than a graphical drawing: (a) schematic, (b) textual description in the English language, (c) textual description in the SystemC language. Bold words are reserved words in SystemC.

Before defining the circuit behavior, we must include the statement "#include "systemc.h"" at the top of each SystemC file. The description begins with an SC_MODULE declaration, which defines the design's name, in this case DoorOpener. The module declaration says nothing about the internals of the design, just the design's name. Within the module description, the input and output ports of the design are specified, using the sc_in<> and sc_out<> statements respectively. The description lists the port names and defines their types, which in this case is type sc_logic, which specifies a single bit.

The description then includes a declaration of the design's internal signals, specified as sc_signal, which are essentially internal wires. Next to that declaration, the description includes an example of a SystemC comment: "// internal wires". Comments start with "//" and then consist of any text we want on the rest of the line.

The module then declares what components the design will be using. The SystemC module does not need to specify the interface of the component, but rather just the type of component as well as a unique name for each component within the design.

The module defines a constructor function SC_CTOR that is responsible for instantiating and connecting the components within our SystemC design. The constructor function


takes as an argument the name of the current SystemC module, which is in this case DoorOpener. Following the SC_CTOR statement after the colon is a list of component instantiations. The SystemC module's instantiations are used to call the constructor functions of each component being instantiated. However, we point out that the connections between the individual components are not specified at this point. Instead, the statements within the constructor finally define the connections between the components. For example, the inverter Inv_1's input x is connected to c, which is an external input. In SystemC, the module does not need to specify the interface of a component within the module. The components, of course, must be completely defined elsewhere, perhaps earlier in the same file, or perhaps in another file. In our SystemC DoorOpener description, the descriptions for the Inv, AND2, and OR2 components are specified in other SystemC files. In order to use those components, we must include a statement at the beginning of the current file indicating where we can find this description. For example, our DoorOpener description includes the statement "#include "inv.h"", and the description of the component Inv can be found within this file.

The bolded words in the description represent reserved words, also known as keywords, in SystemC and C++. We cannot use reserved words for names of modules, ports, signals, instantiated components, etc., as those words have special meaning that guides SystemC and C++ tools to understand our descriptions.

Summarizing, the SystemC structural description has: a module that defines the design name; a list of inputs and outputs of the module specifying their types; a declaration of internal signals; a declaration of components providing the name for each component; a constructor function instantiating the module's components; and finally, the components' interconnections.

Combinational Behavior
HDLs typically support the ability to describe the internals of a design as behavior rather than as a circuit. This ability enables us to describe the bottom-level building-block components that we use in a design, such as the behavior of an AND gate or OR gate.

VHDL
Figure 9.7 contains a behavioral description of a 2-input OR gate, which you'll recall we used as a component in Figure 9.4(c). The description begins with the declarations necessary to use std_logic. It then declares the entity with the name OR2 as having two input ports x and y, and having output port F, all of type std_logic, which means bit. The description then defines an architecture named behavior for OR2. That architecture consists of a process, which is the VHDL construct that describes behavior. The process declaration here is "process(x, y)", which means the process should execute from beginning to end whenever there's a change on x or y; in other words, the process is sensitive to x and y. A process body (the part between the process's begin and end) can contain sequential statements, just like sequential statements in C, but with a different syntax. The process shown has only one such statement, assigning the value of "x or y" to F. "or" happens to be a built-in operator in VHDL, making the internal description of the OR gate simple.

   library ieee;
   use ieee.std_logic_1164.all;

   entity OR2 is
      port (x, y: in std_logic;
            F: out std_logic
      );
   end OR2;

   architecture behavior of OR2 is
   begin
      process (x, y)
      begin
         F <= x or y;
      end process;
   end behavior;

Figure 9.7 Behavioral VHDL description of an OR gate.

As another example of a behavioral description, let's revisit our DoorOpener example from Figure 9.4(c), for which we created an architecture having a structural description. We can alternatively create an architecture having a behavioral description; a VHDL entity may have multiple architecture descriptions for that same entity. Assuming the same entity declaration as in Figure 9.4(c), we show an alternative architecture definition in Figure 9.8. The behavior consists of a process that is sensitive to inputs c, h, and p. When the process executes (which is whenever c, h, or p changes), then the process executes its one statement, which updates the value of f.

   architecture beh of DoorOpener is
   begin
      process (c, h, p)
      begin
         f <= not(c) and (h or p);
      end process;
   end beh;

Figure 9.8 Behavioral VHDL description of the DoorOpener design.

In designing the DoorOpener circuit, we might start with the behavioral description, and run a simulation to verify correct behavior. We might then create a structural description, and run simulation again to verify that the circuit has the same functionality as the behavior. In fact, tools exist that automatically convert such behavior to a circuit.

When writing a VHDL process describing a combinational circuit's behavior, care must be taken to include all the circuit's inputs in the process's sensitivity list. Omitting an input is not a VHDL error, but such omission results in different behavior than combinational behavior: with an input omitted, the output does not change when that input changes, meaning there must be some storage in the circuit.

Verilog

   module OR2(x, y, F);
      input x, y;
      output F;
      reg F;

      always @(x or y)
      begin
         F <= x | y;
      end
   endmodule

Figure 9.9 Behavioral Verilog description of an OR gate.

Figure 9.9 contains a behavioral description of a 2-input OR gate, which you'll recall we used as a component in Figure 9.5. The description begins by declaring the module named OR2 and specifying that the module has three ports named x, y, and F. The description then defines that the ports x and y are both inputs and the port F is an output. The description then defines the output F to be a reg output. In Verilog, all ports are by default assumed to be wires, which do not store values. Instead, wires can only create connections between components. If we want to assign a value to an output port, we must define the port to be a reg, which indicates the output port stores the value we assign to the port. The Verilog code for our design continues with an always procedure that
defines a block of code that will be repeatedly executed whenever a change occurs on an input in the block's input list. The always procedure declaration is "always @(x or y)", which means the procedure should execute from beginning to end whenever there is a change on x or y; in other words, the procedure is sensitive to x and y. The always procedure's statements (the part between the procedure's begin and end statements) can contain sequential statements, just like sequential statements in C, but with a different syntax. The block shown has only one such statement, assigning the value of "x | y" to F, where | is a built-in Verilog operation to compute an OR.

As another example of a behavioral description, let's revisit our DoorOpener example from Figure 9.5(c), for which we created a structural Verilog description. We can alternatively create a behavioral description. Figure 9.10 presents a behavioral Verilog description of the DoorOpener circuit. The module declaration is similar to the structural description of Figure 9.5(c), but in the behavioral description we need to declare the output f as a reg. The behavior consists of an always procedure sensitive to inputs c, h, and p. When the procedure executes (which is whenever c, h, or p changes), then the procedure executes a single statement that updates the value of f, by assigning the value "(~c) & (h | p)", where ~, &, and | perform the invert, AND, and OR operations, respectively.

   module DoorOpener(c, h, p, f);
      input c, h, p;
      output f;
      reg f;

      always @(c or h or p)
      begin
         f <= (~c) & (h | p);
      end
   endmodule

Figure 9.10 Behavioral Verilog description of the DoorOpener design.

In designing the DoorOpener circuit, we might start with the behavioral description, and run a simulation to verify correct behavior. We might then create a structural description, and run a simulation again to verify that the circuit has the same functionality as the behavior. In fact, tools exist that automatically convert such behavior to a circuit.

SystemC
Figure 9.11 contains a SystemC behavioral description of a 2-input OR gate, which you'll recall we used as a component in Figure 9.6(c). The SystemC description declares the module with the name OR2 and has two input ports x and y and one output port F, all of type sc_logic, indicating each input and output is an individual bit. The module defines the constructor function SC_CTOR that consists of a single process named comblogic defined as an SC_METHOD. SC_METHOD is one SystemC construct that describes behavior. The process declaration here is "SC_METHOD(comblogic); sensitive << x << y;", which means the process will execute the circuit behavior described in the function comblogic whenever there is a change on x or y. In other words, the process is sensitive to x and y. The process body is defined in the function comblogic and is declared as "void comblogic()". The process function (the part between the open brace "{" and close brace "}") can contain sequential statements, just like sequential statements in C or C++, but sometimes requires different syntax. The process shown has only one such statement, writing the value of "x.read() | y.read()" to F, where | executes an OR operation. In SystemC, one can read the current value of an input port using the read() function and can write a value to an output port using the write() function. While we can use other methods of accessing the input and output ports, the read() and write() functions are recommended.

   #include "systemc.h"

   SC_MODULE(OR2)
   {
      sc_in<sc_logic> x, y;
      sc_out<sc_logic> F;

      SC_CTOR(OR2)
      {
         SC_METHOD(comblogic);
         sensitive << x << y;
      }

      void comblogic()
      {
         F.write(x.read() | y.read());
      }
   };

Figure 9.11 Behavioral SystemC description of an OR gate.

As another example of a behavioral description, let's revisit our DoorOpener example from Figure 9.6(c), for which we created a structural SystemC description. We can alternatively create a behavioral description. Figure 9.12 presents a behavioral SystemC description of the DoorOpener circuit. The module declaration is the same as the structural description of Figure 9.6(c). The behavior consists of a single process, named comblogic, that is sensitive to inputs c, h, and p. When the process executes (which is whenever c, h, or p changes), then the process executes its one statement, which updates the value of f by assigning the value "(~c.read()) & (h.read() | p.read())", where ~ performs an invert operation, & performs an AND operation, and | performs an OR operation.

   #include "systemc.h"

   SC_MODULE(DoorOpener)
   {
      sc_in<sc_logic> c, h, p;
      sc_out<sc_logic> f;

      SC_CTOR(DoorOpener)
      {
         SC_METHOD(comblogic);
         sensitive << c << h << p;
      }

      void comblogic()
      {
         f.write((~c.read()) & (h.read() | p.read()));
      }
   };

Figure 9.12 Behavioral SystemC description of the DoorOpener design.

In designing the DoorOpener circuit, we might start with the behavioral description, and run a simulation to verify correct behavior. We might then create a structural description, and run simulation again to verify that the circuit has the same functionality as the behavior. In fact, tools exist that automatically convert such behavior to a circuit.

Testbenches
One of the main uses of an HDL is that of simulating a new design to ensure that the design is correct. To simulate a design, we need to set the design's inputs to certain values, and then check that the design's output values are what we expect them to be. A system that sets input values and checks output values is known as a testbench. We now show how to create an HDL testbench to test our DoorOpener circuit.
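Before turning to HDL syntax, the idea of a testbench, and of checking that a structural description has the same functionality as a behavioral one, can be sketched in plain Python. This is only an illustration under our own assumptions: bits are the integers 0 and 1, and the names door_opener_behavioral, door_opener_structural, and run_testbench are hypothetical helpers, not part of any HDL or simulator.

```python
from itertools import product

def door_opener_behavioral(c, h, p):
    """Behavioral form: f = (not c) and (h or p), with bits as 0/1."""
    return (1 - c) & (h | p)

def door_opener_structural(c, h, p):
    """Structural form: inverter and OR gate feeding an AND gate."""
    n1 = 1 - c      # inverter output (internal wire n1)
    n2 = h | p      # OR gate output (internal wire n2)
    return n1 & n2  # AND gate output drives f

def run_testbench(design, expected):
    """Set every input combination and check the output.

    Returns the list of (c, h, p) cases where design and expected differ.
    """
    failures = []
    for c, h, p in product((0, 1), repeat=3):  # all eight cases
        if design(c, h, p) != expected(c, h, p):
            failures.append((c, h, p))
    return failures

if __name__ == "__main__":
    # An empty list means the two descriptions agree on every case.
    print(run_testbench(door_opener_structural, door_opener_behavioral))
```

With only three inputs, trying all eight cases is practical; the HDL testbenches that follow take the same exhaustive approach.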

VHDL
Figure 9.13 shows a VHDL testbench for the DoorOpener design of Figure 9.4(c). Notice that the entity, named Testbench, has no ports; the entity is self-contained, requiring no inputs and generating no outputs. The architecture declares the component that we plan to test, namely, the DoorOpener component. The architecture instantiates one instance of the DoorOpener component, which we named DoorOpener1. A single process in the architecture sets the inputs of the component and checks for correct output. This testbench tries all possible cases of the three inputs, of which there are eight cases. Many components have too many inputs to try all possible cases; in that situation, we might try border cases (e.g., all 0s, all 1s) and then some random cases.

Each case sets the three inputs of the component to a particular input combination, and waits for those values to propagate through the component. We arbitrarily wait for 1 ns of simulated time, but could have picked any time, since we didn't actually create a time delay within the component. But we do have to wait for some time, as VHDL simulation is defined such that no signal is updated instantaneously, but rather after an infinitely small period of simulated time. After waiting, each case checks for the correct value on the output f, using an assert statement. If the condition of the assert statement evaluates to true, simulation proceeds to the next statement. But if the condition evaluates to false, the corresponding error message will be reported and the simulation will terminate.

   library ieee;
   use ieee.std_logic_1164.all;

   entity Testbench is
   end Testbench;

   architecture behavior of Testbench is
      component DoorOpener
         port ( c, h, p: in std_logic;
                f: out std_logic
         );
      end component;
      signal c, h, p, f: std_logic;
   begin
      DoorOpener1: DoorOpener port map (c, h, p, f);
      process
      begin
         -- case 0
         c <= '0'; h <= '0'; p <= '0';
         wait for 1 ns;
         assert (f='0') report "Case 0 failed";

         -- case 1
         c <= '0'; h <= '0'; p <= '1';
         wait for 1 ns;
         assert (f='1') report "Case 1 failed";

         -- (cases 2-6 omitted from figure)

         -- case 7
         c <= '1'; h <= '1'; p <= '1';
         wait for 1 ns;
         assert (f='0') report "Case 7 failed";

         wait; -- process does not wake up again
      end process;
   end behavior;

Figure 9.13 Behavioral VHDL description of DoorOpener testbench.

Verilog
Figure 9.14 shows a Verilog testbench for the DoorOpener design of Figure 9.5(c). Notice that the module, named Testbench, has no ports; the module is self-contained, requiring no inputs and generating no outputs. The module first declares three registered signals c, h, and p and a single wire f. The signals c, h, and p are declared as reg because we must assign values to the signals that will be connected to the inputs of the design we are testing. However, because we do not need to assign a value to the output we are monitoring, the signal f is declared as a wire. The testbench then instantiates one instance of the DoorOpener component, named DoorOpener1, and connects the inputs and outputs of the component to our internal signals. The testbench then contains an initial procedure that defines a block of code that will be executed only once when execution of the testbench begins. The initial procedure sets the inputs of the DoorOpener component and displays the resulting value of the component's output. This testbench tries all possible cases of the three inputs, of which there are eight cases. Many components have too many inputs to try all possible cases; in that situation, we might try border cases (e.g., all 0s, all 1s) and then some random cases.

Each case sets the three inputs of the component to a particular input combination, and waits for those values to propagate through the component. We arbitrarily wait for 1 unit of simulated time using the delay control statement "#1", but we could have picked any length of time, since we didn't actually create a time delay within the component. The Verilog language does not define standard time units, such as nanoseconds, but instead simply defines time in terms of time units, which a designer can use within a simulation environment. We do have to wait for some time, as the assignments within the testbench are nonblocking statements that are not updated until the current simulation time completes. After waiting, each case outputs the value of the output f using a $display statement. The statement $display("f = %b", f) outputs the value of f in binary. For example, if the value of f is 1, then the display statement will output "f = 1". The display statement consists of a format string followed by a comma-separated list of wires, registers, or ports. Within the format string of our display statement, the '%b' indicates that the value of the signal specified after the format string will be displayed in binary. After simulation has completed, we can compare the values output during simulation to the expected values to determine if our circuit is working correctly.

   module Testbench;
      reg c, h, p;
      wire f;

      DoorOpener DoorOpener1(c, h, p, f);

      initial
      begin
         // case 0
         c <= 0; h <= 0; p <= 0;
         #1 $display("f = %b", f);
         // case 1
         c <= 0; h <= 0; p <= 1;
         #1 $display("f = %b", f);
         // (cases 2-6 omitted from figure)
         // case 7
         c <= 1; h <= 1; p <= 1;
         #1 $display("f = %b", f);
      end
   endmodule

Figure 9.14 Behavioral Verilog description of DoorOpener testbench.
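For a component with too many inputs to test exhaustively, the text suggests border cases followed by some random cases. The following Python sketch shows one way such a list of test vectors might be built. It is an assumption-laden illustration, not an HDL technique: make_test_vectors and check are hypothetical helpers, and wide_or is a made-up 16-input design used only so the sketch has something to test.

```python
import random

def make_test_vectors(num_inputs, num_random, seed=0):
    """Border cases (all 0s, all 1s) followed by num_random random vectors."""
    rng = random.Random(seed)  # fixed seed so a failing run can be repeated
    vectors = [(0,) * num_inputs, (1,) * num_inputs]
    for _ in range(num_random):
        vectors.append(tuple(rng.randint(0, 1) for _ in range(num_inputs)))
    return vectors

def wide_or(bits):
    """Hypothetical design under test: OR of all inputs."""
    result = 0
    for b in bits:
        result |= b
    return result

def check(design, reference, vectors):
    """Return the vectors on which design and reference disagree."""
    return [v for v in vectors if design(v) != reference(v)]

if __name__ == "__main__":
    vectors = make_test_vectors(num_inputs=16, num_random=20)
    failures = check(wide_or, lambda bits: int(any(bits)), vectors)
    print(len(vectors), "vectors,", len(failures), "failures")
```

A 16-input component has 65,536 input combinations; the sketch applies only 22 of them, so passing gives some confidence rather than a guarantee of correctness.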

SystemC
Figure 9.15 shows a SystemC testbench for the DoorOpener design of Figure 9.6(c). Notice that the module, named Testbench, has three output ports, c_t, h_t, and p_t, and one input port f_t. In SystemC, we design the testbench circuit as a separate module that connects to the design we are testing. Therefore, for every input port on the circuit we are testing, our testbench will have a corresponding output port. Likewise, for every output port on the circuit we are testing, our testbench will have a corresponding input port. The testbench module defines a single process named testbench_proc. The testbench process is defined as an SC_THREAD, which is similar to an SC_METHOD process except that the SC_THREAD allows us to use the wait() function within the process body to control the timing behavior of the process. In contrast, SystemC does not allow us to use the wait() function within an SC_METHOD process. The testbench process controls the inputs of the circuit we are testing and checks for correct output. This testbench tries all possible cases of the DoorOpener's three inputs, of which there are eight cases. Many components have too many inputs to try all possible cases; in that situation, we might try border cases (e.g., all 0s, all 1s) and then some random cases.

Each case sets the three inputs of the DoorOpener circuit to a particular input combination, and waits for those values to propagate through the component. We arbitrarily wait for 1 ns of simulated time, but could have picked any time, since we didn't actually create a time delay within the component. But we do have to wait for some time, as SystemC simulation is defined such that no signal or port is updated instantaneously, but rather after an infinitely small period of simulated time. After waiting, each case checks for the correct output by reading the port f_t using an assert statement. If the condition of the assert statement evaluates to true, simulation proceeds to the next statement. But if the condition evaluates to false, simulation will stop and the corresponding error message will be reported.

In SystemC, values such as 0 and 1 are integer values and not logic values. Instead, SystemC defines the values SC_LOGIC_0 and SC_LOGIC_1 that correspond to the logic values of 0 and 1, respectively, which we used in the description.

   #include "systemc.h"

   SC_MODULE(Testbench)
   {
      sc_out<sc_logic> c_t, h_t, p_t;
      sc_in<sc_logic> f_t;

      SC_CTOR(Testbench)
      {
         SC_THREAD(testbench_proc);
      }

      void testbench_proc()
      {
         // case 0
         c_t.write(SC_LOGIC_0);
         h_t.write(SC_LOGIC_0);
         p_t.write(SC_LOGIC_0);
         wait(1, SC_NS);
         assert( f_t.read() == SC_LOGIC_0 );

         // case 1
         c_t.write(SC_LOGIC_0);
         h_t.write(SC_LOGIC_0);
         p_t.write(SC_LOGIC_1);
         wait(1, SC_NS);
         assert( f_t.read() == SC_LOGIC_1 );

         // (cases 2-6 omitted from figure)

         // case 7
         c_t.write(SC_LOGIC_1);
         h_t.write(SC_LOGIC_1);
         p_t.write(SC_LOGIC_1);
         wait(1, SC_NS);
         assert( f_t.read() == SC_LOGIC_0 );
      }
   };

Figure 9.15 Behavioral SystemC description of DoorOpener testbench.

9.3 SEQUENTIAL LOGIC DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES

Register
The most basic component in sequential logic is a register. We now show how to model a basic register in HDLs.

VHDL
Figure 9.16 shows a basic 4-bit register in VHDL. The register is identical to that described in Figure 3.30. The entity defines the data input I and the data output Q, as well as the clock input clk. The input I and output Q of this design correspond to 4-bit values. Instead of using eight individual std_logic inputs and outputs, the entity's I and Q ports are defined as std_logic_vector. A std_logic_vector is a vector, or array, of multiple std_logic elements. For example, the type declaration "std_logic_vector(3 downto 0)" defines a 4-bit vector of std_logic elements, where the bit positions within the vector are numbered from 3 to 0. The downto statement defines the ordering of the elements within the vector, indicating that element 3 is located in the leftmost position. The statement "I <= "1000"" would thus assign the value '1' to position 3 of the vector I and the value '0' to the remaining three positions. When assigning a value to a std_logic_vector, the vector's value must be specified within double quotations. For example, the decimal value 5 would be specified as a 4-bit std_logic_vector as "0101".

   library ieee;
   use ieee.std_logic_1164.all;

   entity Reg4 is
      port ( I: in std_logic_vector(3 downto 0);
             Q: out std_logic_vector(3 downto 0);
             clk: in std_logic
      );
   end Reg4;

   architecture behavior of Reg4 is
   begin
      process(clk)
      begin
         if (clk='1' and clk'event) then
            Q <= I;
         end if;
      end process;
   end behavior;

Figure 9.16 Behavioral VHDL description of a 4-bit register.

The architecture describes the register behaviorally, using a process statement. The process is sensitive to its clk input only; because the process should only update its output during a rising clock edge, the process need not execute if input I changes. If clk changes, the process begins executing its statements. The first statement checks if the process began executing due to a rising clock edge (0 to 1), as opposed to a falling clock edge (1 to 0). The statement checks for a rising edge by checking if the clk input just changed


(clk'event) and that change was to a 1 (clk='1'). If the process began executing due to a rising clock edge, then the process updates the register's contents using the statement "Q <= I". For a falling clock edge, the process will begin executing, check the if statement condition, and then reach the end of the process and hence stop executing, without updating Q. Ideally, VHDL would have a way to begin executing a process only on a rising clock edge, but VHDL has no such feature.

In VHDL, output ports are a type of signal, and signals have memory in simulation. Thus, assigning I to Q causes Q to retain the new value, even when the process stops executing, thus implementing the storage part of the register.

Verilog

Figure 9.17 shows a basic 4-bit register in Verilog. The register is identical to that described in Figure 3.30. The module defines the data input I and the data output Q, as well as the clock input clk. The input I and output Q of this design correspond to a 4-bit value. Instead of using eight individual inputs and outputs, the module's I and Q ports are defined as vectors. For example, the type declaration "input [3:0] I" defines a 4-bit input vector where the bit positions within the vector are numbered from 3 to 0. The [3:0] defines the ordering of the elements within the vector, indicating that element 3 is located in the leftmost position. The statement "I <= 4'b1000" would thus assign the value 1 to position 3 of the vector I and the value 0 to the remaining three positions. When assigning a value to a vector, we must specify the number of bits within the value we are assigning, the base in which we are specifying the value, and the value itself. For example, the decimal value 5 would be specified as the 4-bit binary value 4'b0101.

The module describes the register behaviorally, using an always procedure. The procedure block is sensitive to the positive edge of the clk input, specified using the posedge keyword. Because the module should only update its output during a rising clock edge, the always procedure need not execute if I changes. On the positive edge of the clock, the procedure updates the register's contents using the statement "Q <= I". Because we defined the output Q as a reg, assigning I to Q causes Q to retain the new value, even when the procedure is done executing, thus implementing the storage part of the register.

   module Reg4(I, Q, clk);
      input [3:0] I;
      input clk;
      output [3:0] Q;
      reg [3:0] Q;

      always @(posedge clk)
      begin
         Q <= I;
      end
   endmodule

Figure 9.17 Behavioral Verilog description of a 4-bit register.

SystemC

Figure 9.18 shows a basic 4-bit register in SystemC. The register is identical to that described in Figure 3.30. The module defines the data input I and the data output Q, as well as the clock input clk. The input I and output Q of this design correspond to a 4-bit value. Instead of using eight individual sc_logic inputs and outputs, the module's I and Q ports are defined as sc_lv logic vectors. An sc_lv is a vector of multiple sc_logic elements. For example, the type declaration "sc_lv<4>" defines a 4-bit vector of sc_logic elements where the bit positions within the vector are numbered from 3 to 0. In SystemC, the ordering of the elements within the vector is defined such that the leftmost position is the most significant bit. For example, the statement "I <= "1000"" would thus assign the value 1 to position 3 of the vector I and the value 0 to the remaining three positions. When assigning a value to an sc_lv, the vector's value must be specified within double quotations. For example, the decimal value 5 would be specified as a 4-bit sc_lv as "0101". Notice that in defining the input port for I, we included a space between the two closing angle brackets, >, the space being required in SystemC.

The module consists of a single process, named seq_logic, that is sensitive to the positive edge of the clk input, specified using the sensitive_pos statement for defining the sensitivity list. Because the module should only update its output during a rising clock edge, the seq_logic process need not wake up if I changes. On the positive edge of the clock, the process updates the register's contents using the statement "Q.write(I.read())".

In SystemC, output ports are a type of signal, and signals have memory. Thus, assigning I to Q causes Q to retain the new value, even when the process is done executing, thus implementing the storage part of the register.

   #include "systemc.h"

   SC_MODULE(Reg4)
   {
      sc_in<sc_lv<4> > I;
      sc_out<sc_lv<4> > Q;
      sc_in<sc_logic> clk;

      SC_CTOR(Reg4)
      {
         SC_METHOD(seq_logic);
         sensitive_pos << clk;
      }

      void seq_logic()
      {
         Q.write(I.read());
      }
   };

Figure 9.18 Behavioral SystemC description of a 4-bit register.

Oscillator

VHDL

The register presented in Figure 9.16 has a clock input. We thus need to define an oscillator component that generates a clock signal. Figure 9.19 illustrates an oscillator described in VHDL. The entity defines one output, clk. The architecture consists of a process, but notice that the process does not have a sensitivity list. By default, such a process executes its statements as if they were enclosed in an infinite loop. So the process sets the clock to 0, sleeps until 10 ns of simulated time passes, sets the clock to 1, sleeps another 10 ns of simulated time, goes back to the first statement in the process that sets the clock to 0, and so on. The output waveform for such an oscillator will be identical to the waveform shown in Figure 3.17.

The wait for statement in VHDL tells the simulator the amount of simulated time that the process should not execute. A process without a sensitivity list must have at least one wait statement, otherwise the simulator will never finish simulating that process (because the process is in an implicit infinite loop), and thus the simulator will never get a chance to update outputs or to simulate other processes. On the other hand, a process with a sensitivity list cannot include wait statements, because by definition, the sensitivity list defines when the process should execute.

   library ieee;
   use ieee.std_logic_1164.all;

   entity Osc is
      port ( clk: out std_logic );
   end Osc;

   architecture behavior of Osc is
   begin
      process
      begin
         clk <= '0';
         wait for 10 ns;
         clk <= '1';
         wait for 10 ns;
      end process;
   end behavior;

Figure 9.19 VHDL oscillator description.
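To make the timing behavior concrete, the oscillator and the register it clocks can be sketched in plain C++, without any HDL simulator. This is only an illustrative model under stated assumptions: the names osc_level and Reg4Model are hypothetical and do not appear in the figures, time is counted in whole nanoseconds, and the register is evaluated by calling eval() by hand rather than by a simulation kernel.

```cpp
#include <cstdint>

// Clock level at simulated time t_ns for an oscillator that holds 0 for
// 10 ns, then 1 for 10 ns, and repeats. (Hypothetical helper, not from the figures.)
inline int osc_level(std::uint64_t t_ns) {
    const std::uint64_t half_period_ns = 10;
    return static_cast<int>((t_ns / half_period_ns) % 2);
}

// Hypothetical software stand-in for the 4-bit register: Q changes only
// when clk goes from 0 to 1; at all other times Q keeps its stored value.
struct Reg4Model {
    std::uint8_t q = 0;  // stored 4-bit value
    int prev_clk = 0;    // last clock level seen, used for edge detection

    void eval(int clk, std::uint8_t i) {
        if (prev_clk == 0 && clk == 1) {
            q = i & 0xF;  // load I on the rising edge only
        }
        prev_clk = clk;
    }
};
```

Calling eval() with the clock level sampled at 0 ns, 10 ns, 20 ns, and so on shows the register loading only at 10 ns, 30 ns, 50 ns, which are the rising edges of this clock.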
Verilog

The register presented in Figure 9.17 has a clock input. We thus need to define an oscillator component that generates a clock signal. Figure 9.20 illustrates an oscillator described in Verilog. The module defines one output, clk. The module consists of an always procedure, but notice that the always procedure does not have a sensitivity list. By default, such a procedure executes its statements as if they were enclosed in an infinite loop. Assuming we are using a time scale of nanoseconds, the always procedure sets the clock to 0, delays for 10 ns of simulated time, sets the clock to 1, delays for another 10 ns of simulated time, goes back to the first statement in the procedure that sets the clock to 0, and so on. The output waveform for such an oscillator will be identical to the waveform shown in Figure 3.17.

The delay control statement, specified with the # character, tells the simulator the amount of simulated time that the procedure should not execute. A procedure without a sensitivity list must have at least one delay control statement, otherwise the simulator will never finish simulating that procedure (because the procedure is in an implicit infinite loop), and thus the simulator will never get the chance to update outputs or to simulate other procedures. On the other hand, a procedure with a sensitivity list cannot include delay control statements, because by definition the sensitivity list defines when the procedure should awake.

   module Osc(clk);
      output clk;
      reg clk;

      always
      begin
         clk <= 0;
         #10;
         clk <= 1;
         #10;
      end
   endmodule

Figure 9.20 Verilog oscillator description.

SystemC

The register presented in Figure 9.18 has a clock input. We thus need to define an oscillator component that generates a clock signal. Figure 9.21 illustrates an oscillator described in SystemC. The module defines one output, clk. The module consists of a single process, named seq_logic, implemented as an SC_THREAD. By default, an SC_THREAD process is only executed once. In order to ensure the process executes continuously, we enclose the statements within the process in an infinite loop, implemented using the statement "while (true)". Thus, the loop will execute the statements included within the braces forever. During execution, the process sets the clock to 0, suspends execution for 10 ns of simulated time, sets the clock to 1, sleeps another 10 ns of simulated time, sets the clock to 0, and so on. The output waveform for such an oscillator will be identical to the waveform in Figure 3.17.

The wait() function in SystemC tells the simulator the amount of simulated time that the process should not execute. For example, the statement "wait(10, SC_NS);" will suspend the execution of the process for 10 ns. An SC_THREAD process explicitly implementing an infinite loop must have at least one wait statement, otherwise the simulator will never finish simulating that process (because the process is in an infinite loop), and thus the simulator cannot update outputs or simulate other processes.

   #include "systemc.h"

   SC_MODULE(Osc)
   {
      sc_out<sc_logic> clk;

      SC_CTOR(Osc)
      {
         SC_THREAD(seq_logic);
      }

      void seq_logic()
      {
         while (true) {
            clk.write(SC_LOGIC_0);
            wait(10, SC_NS);
            clk.write(SC_LOGIC_1);
            wait(10, SC_NS);
         }
      }
   };

Figure 9.21 SystemC oscillator description.

Controllers

Recall that a common type of sequential circuit is a controller, which implements a finite-state machine. The controller consists of a state register and combinational logic.

VHDL

Figure 9.22 shows one way to model a controller in VHDL. The controller modeled is described by the FSM shown in Figures 3.38 and 3.39. The VHDL entity, named LaserTimer, defines the controller's inputs and outputs.

   library ieee;
   use ieee.std_logic_1164.all;

   entity LaserTimer is
      port (b: in std_logic;
            x: out std_logic;
            clk, rst: in std_logic
      );
   end LaserTimer;

   architecture behavior of LaserTimer is
      type statetype is
         (S_Off, S_On1, S_On2, S_On3);
      signal currentstate, nextstate:
         statetype;
   begin
      statereg: process(clk, rst)
      begin
         if (rst='1') then -- initial state
            currentstate <= S_Off;
         elsif (clk='1' and clk'event) then
            currentstate <= nextstate;
         end if;
      end process;

      comblogic: process(currentstate, b)
      begin
         case currentstate is
            when S_Off =>
               x <= '0'; -- laser off
               if (b='0') then
                  nextstate <= S_Off;
               else
                  nextstate <= S_On1;
               end if;
            when S_On1 =>
               x <= '1'; -- laser on
               nextstate <= S_On2;
            when S_On2 =>
               x <= '1'; -- laser still on
               nextstate <= S_On3;
            when S_On3 =>
               x <= '1'; -- laser still on
               nextstate <= S_Off;
         end case;
      end process;
   end behavior;

Figure 9.22 Behavioral VHDL description of the LaserTimer controller.

The VHDL architecture describes the behavior of the entity. The architecture consists of two processes, one modeling the state register, the other modeling the combinational logic, that form the standard controller architecture from Figure 3.47.

The first process describes the controller's state register. That process, named statereg, is sensitive to inputs clk and rst. If the rst input is enabled, then the process asynchronously sets the currentstate signal to the FSM's initial state, S_Off. Otherwise, if the clock is rising, the process updates the state register with the next state.

The currentstate and nextstate signals are defined as a user-defined type, named statetype. The statetype is defined by the type statement and specifies the possible values a signal of that type can represent. In specifying statetype, which represents the states of an FSM, the type declaration consists of the names of all the states in our controller, specifically S_Off, S_On1, S_On2, and S_On3.
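The priority between the asynchronous reset and the clock edge in the state register process can be sketched as a small plain C++ function. This is an illustrative model only, not part of the figure: the function name state_register and the clk_event flag (standing in for VHDL's clk'event) are assumptions introduced here.

```cpp
// The four states of the LaserTimer FSM, mirroring the user-defined statetype.
enum class StateType { S_Off, S_On1, S_On2, S_On3 };

// One evaluation of the state register (hypothetical helper).
// rst is checked first, so it takes effect regardless of the clock, which is
// what makes the reset asynchronous. Otherwise only a rising clock edge
// (clk just changed and is now 1) loads nextstate into the register.
inline StateType state_register(StateType currentstate, StateType nextstate,
                                bool rst, bool clk, bool clk_event) {
    if (rst) {
        return StateType::S_Off;  // initial state
    }
    if (clk && clk_event) {
        return nextstate;         // rising clock edge
    }
    return currentstate;          // falling edge or no edge: keep the stored state
}
```

Note that a falling edge (clk_event true, clk false) leaves the stored state unchanged, just as the VHDL process reaches its end without assigning currentstate.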
The second process describes the controller's combinational logic. That process, named comblogic, is sensitive to the inputs to the combinational logic of Figure 3.47, namely, the external inputs (in this case, b), and the state register outputs (currentstate). When either of those items changes, the process sets the FSM's outputs, in this case x, with the appropriate value for the current state. The process also determines what the next state should be, based on the current state and the values of inputs (i.e., the conditions on the FSM transitions). The next state will be loaded into the state register by the state register process on the next rising clock edge.

Notice that the architecture declares two signals, currentstate and nextstate. Signals are visible across all processes in an architecture. The currentstate signal represents the actual storage of the state register. The nextstate signal represents the value coming from the combinational logic and going to the state register. Notice also that the architecture declares those signals as type statetype, defined in the architecture as a type whose value can be either S_Off, S_On1, S_On2, or S_On3.

Verilog

Figure 9.23 shows one way to model a controller in Verilog. The controller modeled is described by the FSM shown in Figures 3.38 and 3.39. The Verilog module, named LaserTimer, defines the controller's inputs and outputs.

The module consists of two procedures, one modeling the state register, the other modeling the combinational logic, that together form the standard controller architecture from Figure 3.47.

The state register procedure is sensitive to the positive edge of the rst input and the positive edge of the clk input. The state register has an asynchronous reset signal, and in order to model the asynchronous reset, the state register procedure must be sensitive to the positive edge of the rst input. On the positive edge of the rst input, the procedure will wake asynchronously and set the currentstate signal to the FSM's initial state, S_Off. On the rising edge of the clock input, clk, if the reset input is not enabled, the procedure updates the state register with the nextstate value determined by the combinational logic procedure.

In Verilog, we must explicitly specify the size of the state registers as well as define the values associated with each state within the FSM. Within the LaserTimer module we declare four parameter values, namely, S_Off, S_On1, S_On2, and S_On3, which specify the values assigned to each state within the FSM. For example, "S_Off = 2'b00" defines the state name S_Off and assigns the 2-bit value "00" to this state. We can then refer to this state throughout the module using S_Off instead of using specific bit values. While not required to define a state machine, using parameters increases the readability of our design and makes revisions to the FSM much easier. As the LaserTimer's FSM has four states, we need a 2-bit state register, and we therefore declare the currentstate and nextstate signals as 2-bit registers.

The second procedure is the combinational procedure implementing the control logic of the FSM. That procedure is sensitive to the inputs to the combinational logic of Figure 3.47, namely, the external inputs (in this case, b), and the state register outputs (currentstate). When either of those items changes, the procedure sets the FSM's outputs, in this case x, with the appropriate value for the current state. The procedure also determines what the next state should be, based on the current state and the values of inputs (i.e., the conditions on the FSM transitions). The next state will be loaded into the state register by the state register procedure on the next positive clock edge.

Notice that the module declares two signals, currentstate and nextstate. Signals are visible across all procedures in a module. The currentstate signal represents the actual storage of the state register. The nextstate signal represents the value coming from the combinational logic and going to the state register.

   module LaserTimer(b, x, clk, rst);
      input b, clk, rst;
      output x;
      reg x;

      parameter S_Off = 2'b00,
                S_On1 = 2'b01,
                S_On2 = 2'b10,
                S_On3 = 2'b11;

      reg [1:0] currentstate;
      reg [1:0] nextstate;

      // state register procedure
      always @(posedge rst or posedge clk)
      begin
         if (rst==1) // initial state
            currentstate <= S_Off;
         else
            currentstate <= nextstate;
      end

      // combinational logic procedure
      always @(currentstate or b)
      begin
         case (currentstate)
            S_Off: begin
               x <= 0; // laser off
               if (b==0)
                  nextstate <= S_Off;
               else
                  nextstate <= S_On1;
            end
            S_On1: begin
               x <= 1; // laser on
               nextstate <= S_On2;
            end
            S_On2: begin
               x <= 1; // laser still on
               nextstate <= S_On3;
            end
            S_On3: begin
               x <= 1; // laser still on
               nextstate <= S_Off;
            end
         endcase
      end
   endmodule

Figure 9.23 Behavioral Verilog description of the LaserTimer controller.

SystemC

Figure 9.24 shows one way to model a controller in SystemC. The controller modeled is described by the FSM shown in Figure 3.38 and Figure 3.39. The module, named LaserTimer, defines the controller's inputs and outputs.

The module consists of two processes, one modeling the state register, named statereg, the other modeling the combinational logic, named comblogic, that together form the standard controller architecture from Figure 3.47.

The state register process is sensitive to the positive edge of the rst input and the positive edge of the clk input. The state register has an asynchronous reset signal. In order to model the asynchronous reset, the state register process is sensitive to the positive edge of the rst input. On the positive edge of the rst input, the process will wake asynchronously and set the currentstate signal to the FSM's initial state, S_Off. On the rising edge of the clock input, clk, if the reset input is not enabled, the process updates the state register with the nextstate value determined by the combinational logic process.

The currentstate and nextstate signals are defined as a user-defined type, named statetype. statetype is defined by the enum statement, specifying the possible values a signal of that type can represent. In specifying statetype, which represents the states of an FSM, the enum declaration consists of the names of all the states in our controller, specifically S_Off, S_On1, S_On2, and S_On3.

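The split between combinational logic and state register can also be traced by hand in ordinary C++. The sketch below is an assumption-laden illustration rather than a rendering of any figure: the names CombOut, comblogic_model, and run_laser_timer are hypothetical, reset is omitted, and each character of the input string is taken as the value of b during one clock cycle.

```cpp
#include <string>

enum State { S_Off, S_On1, S_On2, S_On3 };

struct CombOut {
    int x;            // FSM output: 1 means the laser is on
    State nextstate;  // value headed to the state register
};

// Combinational logic: output and next state from the current state and b.
inline CombOut comblogic_model(State currentstate, int b) {
    switch (currentstate) {
        case S_Off: return {0, b == 0 ? S_Off : S_On1};
        case S_On1: return {1, S_On2};
        case S_On2: return {1, S_On3};
        case S_On3: return {1, S_Off};
    }
    return {0, S_Off};  // not reached for valid states
}

// Apply one b value per clock cycle, starting in S_Off, and record the value
// of x during each cycle. The state register loads nextstate at the end of
// every cycle, modeling the rising clock edge.
inline std::string run_laser_timer(const std::string& b_per_cycle) {
    State currentstate = S_Off;
    std::string x_trace;
    for (char c : b_per_cycle) {
        CombOut out = comblogic_model(currentstate, c == '1' ? 1 : 0);
        x_trace += out.x ? '1' : '0';
        currentstate = out.nextstate;
    }
    return x_trace;
}
```

For the input sequence "010000" (button pressed during the second cycle only), the recorded output is "001110": x stays 1 for exactly three cycles and then returns to 0.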
The second process, named comblogic, is sensitive to the inputs to the combinational logic of Figure 3.47, namely, the external inputs and the state register outputs. When either of those items changes, the process sets the FSM's outputs, in this case x, with the appropriate value for the current state. The process also determines what the next state should be, based on the current state and the values of inputs (i.e., the conditions on the FSM transitions). The next state will be loaded into the state register by the state register process on the next rising clock edge. Within the first state, we determine the next state depending on the value of input b by performing the comparison "b.read() == SC_LOGIC_0". Note that the comparison for equality uses the syntax "==". If we instead accidentally used the syntax "=", which is a valid statement, our design would function incorrectly.

Notice that the module declares two sc_signals, currentstate and nextstate. Signals are visible across all processes in a module. The currentstate signal represents the actual storage of the state register. The nextstate signal represents the value coming from the combinational logic and going to the state register. Notice also that the module declares those signals as type statetype, defined as a type whose value can be either S_Off, S_On1, S_On2, or S_On3.

   #include "systemc.h"

   enum statetype { S_Off, S_On1, S_On2, S_On3 };

   SC_MODULE(LaserTimer)
   {
      sc_in<sc_logic> b, clk, rst;
      sc_out<sc_logic> x;
      sc_signal<statetype> currentstate, nextstate;

      SC_CTOR(LaserTimer) {
         SC_METHOD(statereg);
         sensitive_pos << rst << clk;
         SC_METHOD(comblogic);
         sensitive << currentstate << b;
      }

      void statereg() {
         if ( rst.read() == SC_LOGIC_1 )
            currentstate = S_Off; // initial state
         else
            currentstate = nextstate;
      }

      void comblogic() {
         switch (currentstate) {
            case S_Off:
               x.write(SC_LOGIC_0); // laser off
               if ( b.read() == SC_LOGIC_0 )
                  nextstate = S_Off;
               else
                  nextstate = S_On1;
               break;
            case S_On1:
               x.write(SC_LOGIC_1); // laser on
               nextstate = S_On2;
               break;
            case S_On2:
               x.write(SC_LOGIC_1); // laser still on
               nextstate = S_On3;
               break;
            case S_On3:
               x.write(SC_LOGIC_1); // laser still on
               nextstate = S_Off;
               break;
         }
      }
   };

Figure 9.24 Behavioral SystemC description of the LaserTimer controller.

9.4 DATAPATH COMPONENT DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES

Full-Adders

Recall that a full-adder is a combinational circuit that adds three bits (a, b, and ci) and generates a sum (s) and a carry-out (co) bit. This section shows how to describe a full-adder behaviorally in an HDL.

VHDL

Figure 9.25 shows a full-adder described behaviorally in VHDL. The full-adder design corresponds to the full-adder described in Figure 4.31. The VHDL entity, named FullAdder, defines the full-adder's three inputs a, b, and ci and two outputs s and co.

The architecture describes the behavior of the full-adder. The architecture consists of a single process describing the combinational behavior of the full-adder. The process is sensitive to all three inputs (a, b, and ci) of the full-adder. When any of the inputs changes, the process executes its two statements, updating the values for the sum (s) and carry-out (co).

   library ieee;
   use ieee.std_logic_1164.all;

   entity FullAdder is
      port ( a, b, ci: in std_logic;
             s, co: out std_logic
      );
   end FullAdder;

   architecture behavior of FullAdder is
   begin
      process (a, b, ci)
      begin
         s <= a xor b xor ci;
         co <= (b and ci) or (a and ci) or (a and b);
      end process;
   end behavior;

Figure 9.25 Behavioral VHDL description of a full-adder.

Verilog

Figure 9.26 shows a full-adder described behaviorally in Verilog. The full-adder design corresponds to the full-adder described in Figure 4.31. The Verilog module, named FullAdder, defines the full-adder's three inputs a, b, and ci and two outputs s and co.

The module describes the behavior of the full-adder and consists of a single always procedure describing the combinational behavior of the full-adder. The procedure is sensitive to all three inputs (a, b, and ci) of the full-adder. When any of the inputs changes, the procedure executes its two statements, updating the values for the sum (s) and carry-out (co).

   module FullAdder(a, b, ci, s, co);
      input a, b, ci;
      output s, co;
      reg s, co;

      always @(a or b or ci)
      begin
         s <= a ^ b ^ ci;
         co <= (b & ci) | (a & ci) | (a & b);
      end
   endmodule

Figure 9.26 Behavioral Verilog description of a full-adder.

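The two full-adder equations can be checked against ordinary integer addition with a few lines of plain C++. This is a sketch for illustration only: the names FullAdderOut, full_adder_model, and full_adder_matches_arithmetic are hypothetical, and bits are represented as int values restricted to 0 or 1.

```cpp
struct FullAdderOut {
    int s;   // sum bit
    int co;  // carry-out bit
};

// The same two equations used in the behavioral descriptions:
// s = a xor b xor ci, co = bci + aci + ab.
inline FullAdderOut full_adder_model(int a, int b, int ci) {
    FullAdderOut out;
    out.s  = a ^ b ^ ci;
    out.co = (b & ci) | (a & ci) | (a & b);
    return out;
}

// Tries all eight input combinations and returns true if the two-bit
// result (co, s) always equals the arithmetic sum a + b + ci.
inline bool full_adder_matches_arithmetic() {
    for (int v = 0; v < 8; ++v) {
        int a = (v >> 2) & 1;
        int b = (v >> 1) & 1;
        int ci = v & 1;
        FullAdderOut out = full_adder_model(a, b, ci);
        if (2 * out.co + out.s != a + b + ci) {
            return false;
        }
    }
    return true;
}
```

With only three inputs, trying every combination is cheap, which is the same exhaustive strategy a testbench can use for small components.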
SystemC

Figure 9.27 shows a full-adder described behaviorally in SystemC. The full-adder design corresponds to the full-adder described in Figure 4.31. The SystemC module, named FullAdder, defines the full-adder's three inputs a, b, and ci and two outputs s and co.

The module describes the behavior of the full-adder and consists of a single process, named comblogic, describing the combinational behavior of the full-adder. The process is sensitive to all three inputs (a, b, and ci) of the full-adder. When any of the inputs changes, the process executes its two statements, updating the values for the sum (s) and carry-out (co).

   #include "systemc.h"

   SC_MODULE(FullAdder)
   {
      sc_in<sc_logic> a, b, ci;
      sc_out<sc_logic> s, co;

      SC_CTOR(FullAdder)
      {
         SC_METHOD(comblogic);
         sensitive << a << b << ci;
      }

      void comblogic()
      {
         s.write(a.read() ^ b.read() ^ ci.read());
         co.write((b.read() & ci.read()) |
                  (a.read() & ci.read()) |
                  (a.read() & b.read()));
      }
   };

Figure 9.27 Behavioral SystemC description of a full-adder.

Carry-Ripple Adders

We now show how to structurally describe a 4-bit carry-ripple adder using the full-adder we designed in the previous section.

VHDL

Figure 9.28 is a VHDL description of a 4-bit carry-ripple adder with a carry-in, as appeared in Figure 4.33. The VHDL entity, named CarryRippleAdder4, has two 4-bit inputs, a and b, and a carry-in input, ci. The carry-ripple adder outputs a 4-bit sum, s, and a final carry-out co.

The architecture structurally describes the carry-ripple adder composed of four full-adders. The architecture begins by declaring the component FullAdder, which was described in the previous section. The design has three internal signals, co1, co2, and co3, that are used for internal connection between the full-adders. The architecture then instantiates four FullAdder components. In VHDL, each instantiated component must have a unique name. The four FullAdder components in this design are uniquely identified by the names FullAdder1, FullAdder2, FullAdder3, and FullAdder4.

In VHDL, the std_logic_vector type provides a convenient method of specifying ports or signals consisting of multiple bits. However, a design may need to access the individual bits of these vectors. The individual bits of a std_logic_vector can be accessed by specifying the desired bit position within parentheses after the vector's name. For example, to access bit 0 of the 4-bit input a of this design, one would use the syntax "a(0)". In defining the connections to the instantiated components in the carry-ripple adder, individual bits of the inputs a and b and output s are accessed using this syntax. The first full-adder, FullAdder1, connects bit 0 of the inputs a and b as well as the carry-ripple adder's carry-in, ci, to the full-adder's three inputs. The s output of FullAdder1 is connected to bit 0 of the 4-bit adder's sum output, s, represented as s(0). The design then connects the carry-out bit of FullAdder1 to the internal signal co1, which is subsequently connected to the carry-in input of the next full-adder, FullAdder2. The remaining three full-adders are connected in a similar fashion, with the exception of the last full-adder in the carry-ripple chain. The carry-out from that last full-adder, FullAdder4, is connected to the carry-out output (co) of the carry-ripple adder.

   library ieee;
   use ieee.std_logic_1164.all;

   entity CarryRippleAdder4 is
      port (a: in std_logic_vector(3 downto 0);
            b: in std_logic_vector(3 downto 0);
            ci: in std_logic;
            s: out std_logic_vector(3 downto 0);
            co: out std_logic
      );
   end CarryRippleAdder4;

   architecture structure of CarryRippleAdder4 is
      component FullAdder
         port ( a, b, ci: in std_logic;
                s, co: out std_logic
         );
      end component;
      signal co1, co2, co3: std_logic;
   begin
      FullAdder1: FullAdder
         port map (a(0), b(0), ci, s(0), co1);
      FullAdder2: FullAdder
         port map (a(1), b(1), co1, s(1), co2);
      FullAdder3: FullAdder
         port map (a(2), b(2), co2, s(2), co3);
      FullAdder4: FullAdder
         port map (a(3), b(3), co3, s(3), co);
   end structure;

Figure 9.28 Structural VHDL description of a 4-bit carry-ripple adder.

Verilog

Figure 9.29 is a Verilog description of a 4-bit carry-ripple adder with a carry-in, as appeared in Figure 4.33. The Verilog module, named CarryRippleAdder4, has two 4-bit inputs, a and b, and a carry-in input, ci. The carry-ripple adder outputs a 4-bit sum, s, and a final carry-out co.

The module structurally describes the carry-ripple adder composed of four full-adders. The design has three internal wires, co1, co2, and co3, that are used for internal connection between the full-adders. The module instantiates four FullAdder components. In Verilog, each instantiated component must have a unique name. The four FullAdder components in this design are uniquely identified by the names FullAdder1, FullAdder2, FullAdder3, and FullAdder4.

   module CarryRippleAdder4(a, b, ci, s, co);
      input [3:0] a;
      input [3:0] b;
      input ci;
      output [3:0] s;
      output co;

      wire co1, co2, co3;

      FullAdder FullAdder1(a[0], b[0], ci, s[0], co1);
      FullAdder FullAdder2(a[1], b[1], co1, s[1], co2);
      FullAdder FullAdder3(a[2], b[2], co2, s[2], co3);
      FullAdder FullAdder4(a[3], b[3], co3, s[3], co);
   endmodule

Figure 9.29 Structural Verilog description of a 4-bit carry-ripple adder.

In Verilog, vectors provide a convenient method of specifying ports or signals consisting of multiple bits. However, a design may need to access the individual bits of these vectors. The individual bits of a vector can be accessed by specifying the desired bit position within brackets after the vector's name. For example, to access bit 0 of the 4-bit input a of this design, one would use the syntax "a[0]". In defining the connections to the

instantiated components in the carry-ripple adder, individual bits of the inputs a and b and output s are accessed using this syntax. The first full-adder, FullAdder1, connects bit 0 of the inputs a and b as well as the carry-ripple adder's carry-in, ci, to the full-adder's three inputs. The s output of FullAdder1 is connected to bit 0 of the 4-bit adder's sum output, s, represented as s[0]. The design then connects the carry-out bit of FullAdder1 to the internal signal co1, which is subsequently connected to the carry-in input of the next full-adder, FullAdder2. The remaining three full-adders are connected in a similar fashion, with the exception of the last full-adder in the carry-ripple chain. The carry-out from the last full-adder, FullAdder4, is connected to the carry-out output (co) of the carry-ripple adder.

SystemC

Figure 9.30 is a SystemC description of a 4-bit carry-ripple adder with a carry-in, as appeared in Figure 4.33. The SystemC module, named CarryRippleAdder4, has two 4-bit inputs, a and b, and a carry-in input, ci. The carry-ripple adder outputs a 4-bit sum, s, and a final carry-out co.

   #include "systemc.h"
   #include "fulladder.h"

   SC_MODULE(CarryRippleAdder4)
   {
      sc_in<sc_logic> a[4];
      sc_in<sc_logic> b[4];
      sc_in<sc_logic> ci;
      sc_out<sc_logic> s[4];
      sc_out<sc_logic> co;

The module structurally describes the carry-ripple adder composed of four full-adders. The design has three internal signals, co1, co2, and co3, that are used for internal connection between the full-adders. The module first instantiates four FullAdder components. In SystemC, each instantiated component must have a unique name. The four FullAdder components in this design are uniquely identified by the names FullAdder_1, FullAdder_2, FullAdder_3, and FullAdder_4.

Previously, we defined multiple-bit inputs as an input vector using the sc_lv type. However, SystemC does not support connecting individual bits within a signal or port of type sc_lv in a structural description. In our CarryRippleAdder4 design, we instead defined the inputs and outputs, a, b, and s, as arrays of sc_logic with four elements each, rather than using type sc_lv. The individual bits of the array can be accessed by specifying the desired bit position within brackets after the array's name. For example, to access bit 0 of the 4-element input array a of this design, one would use the syntax "a[0]". In defining the connections to the instantiated components in the carry-ripple adder, individual bits of the inputs a and b and output s are accessed using this syntax. The first full-adder, FullAdder_1, connects bit 0 of the inputs a and b as well as the carry-ripple adder's carry-in, ci, to the full-adder's three inputs. The s output of FullAdder_1 is connected to bit 0 of the 4-bit adder's sum output, s, represented as s[0]. The design then connects the carry-out bit of FullAdder_1 to the internal signal co1, which is subsequently connected to the carry-in input of the next full-adder, FullAdder_2. The remaining three full-adders are connected in a similar fashion, with the exception of the last full-adder in the carry-ripple chain. The carry-out from the last full-adder, FullAdder_4, is connected to the carry-out output (co) of the carry-ripple adder.

FullAdder FullAdder_l: Up-Counter


FullAdder FullAdder_2;
FullAdder FullAdder_3 ; VHDL
FullAdder FullAdder_4; Figure 9.31 is a VHDL de cription of a 4-bit up-counter, a appeared in Figure 4.48. The
VHDL entity, name UpColllller, defines the counter's inputs and outputs. consisting of a
SC CTOR (CarryRipple4) :
FullAddecl I "FullAdder_I") , clock input elk, a counl enable control input CIII, the 4-bit count value C. and a terminal
FullAdder_2(MFullAdder_2-), count OUlput tc.
FullAdder_31"FullAdder_3"), The UpCOIlllter's architec tu re slructurally describes the de ign consisting of three com-
FullAdder_4 I " FullAdder_4" )
pone nts, namely Reg4, IlIc4. and AND4. Reg4 is a 4-bi t parallel load register with a load
FullAdder_l.alaIO]): FullAdder_l.bl bIOI): contro l inpul Id. IIIc4 is a 4-bit incrementer. AND4 is a fo ur- input AND gate u:at will output
FullAdder_l.ci(ci): FullAdder_l.s(s[O] );
1 if and onl y if all four inputs are I. The archi tectures furthe~ speCifies two signal . tempC
FullAdder_l . co(col) ;
and incC, used as internal wires within the structural deSCription .
FullAdder_2 . a (a II] ): FullAdder_2. blbll]):
FullAdder_2 . ci Icol); FullAdder_2 . sis II] );
FullAdder_2 .co(co2);

FullAdder_3.a la (21); FullAdder_3 . blb(21);


FullAdder_3.cilco2): FullAdder_3.sls12]):
FullAdder_3. co (co) ;

FullAdder_4 . ala(31); FullAdder_4 . blbI3 1);


FullAdder_4.cilco3); FullAdder_4.sls13]);
Figure 9.30 Siructural SY' teme FullAdder_4. co (co) ;
de;cri ption of a 4-bi l carry- )
"pple adder );

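To make the up-counter's structure concrete before reading the HDL listings, the following is a minimal behavioral sketch in Python rather than in an HDL. It assumes the wiring that the surrounding text describes: the register output feeds the incrementer and the AND gate, and the count enable drives the register's load input. The class and function names are illustrative only, and the register's power-up value of 0 is an assumption (an HDL simulator would start it undefined).

```python
def inc4(a):
    """4-bit incrementer (Inc4): returns a + 1, wrapping from 15 to 0."""
    return (a + 1) & 0xF


def and4(w, x, y, z):
    """Four-input AND gate (AND4): 1 only if all four inputs are 1."""
    return w & x & y & z


class Reg4:
    """4-bit parallel-load register: loads I on a rising clock edge when ld is 1."""

    def __init__(self):
        self.Q = 0  # assumed initial value, for illustration only

    def rising_edge(self, I, ld):
        if ld:
            self.Q = I & 0xF


class UpCounter:
    """Structural composition of Reg4, Inc4, and AND4."""

    def __init__(self):
        self.reg = Reg4()

    @property
    def C(self):
        # The internal wire tempC (the register output) drives output C.
        return self.reg.Q

    @property
    def tc(self):
        # Terminal count: AND of the four bits of the current count.
        t = self.reg.Q
        return and4((t >> 3) & 1, (t >> 2) & 1, (t >> 1) & 1, t & 1)

    def rising_edge(self, cnt):
        incC = inc4(self.reg.Q)          # incrementer output, wire incC
        self.reg.rising_edge(incC, cnt)  # cnt acts as the load enable
```

Because the count enable is wired to the register's load input, holding cnt at 0 simply leaves the stored count unchanged, which is the behavior the structural descriptions rely on.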
   library ieee;
   use ieee.std_logic_1164.all;

   entity UpCounter is
      port ( clk: in std_logic;
             cnt: in std_logic;
             C: out std_logic_vector(3 downto 0);
             tc: out std_logic
      );
   end UpCounter;

   architecture structure of UpCounter is
      component Reg4
         port ( I: in std_logic_vector(3 downto 0);
                Q: out std_logic_vector(3 downto 0);
                clk, ld: in std_logic
         );
      end component;
      component Inc4
         port ( a: in std_logic_vector(3 downto 0);
                s: out std_logic_vector(3 downto 0)
         );
      end component;
      component AND4
         port ( w,x,y,z: in std_logic;
                F: out std_logic
         );
      end component;
      signal tempC: std_logic_vector(3 downto 0);
      signal incC: std_logic_vector(3 downto 0);
   begin
      Reg4_1: Reg4 port map (incC, tempC, clk, cnt);
      Inc4_1: Inc4 port map (tempC, incC);
      AND4_1: AND4 port map (tempC(3), tempC(2),
                             tempC(1), tempC(0), tc);

      outputC: process (tempC)
      begin
         C <= tempC;
      end process;
   end structure;

Figure 9.31 Structural VHDL description of 4-bit up-counter.

The architecture instantiates each of the three components and specifies the connections between them. Reg4 is the only sequential component within the up-counter and thus the clk input only needs to be connected to the clock input of the register. We control the up-counter's counting by connecting the count enable input, cnt, to the load enable, ld, of the register. The output Q of Reg4_1 is connected to the internal signal tempC, which connects the register's output to both the Inc4_1 and AND4_1 components. Inc4_1 receives the current count from the tempC connection and outputs the incremented count on its output s, which is connected to the other internal signal incC. The incC signal connects the incremented count from Inc4_1 to the parallel load input I of Reg4_1. The current count is also connected to the four inputs of the AND4_1 component. The AND4_1's output F is then connected to the counter's terminal count output tc.

In the UpCounter design, we need to connect the output of the 4-bit register to the incrementer, the AND gate, and the counter's output port C. VHDL does not allow us to connect multiple signals or ports within the port map of an instantiated component. Therefore, the architecture uses the tempC signal to connect Reg4_1's output to both the AND4_1 and Inc4_1 components. We still need to connect the register's output to the output port C. The architecture makes this connection by specifying a process, named outputC, that is used to connect the output of the register to the output port C. The outputC process is sensitive to the signal tempC, previously used as an internal wire between the three components. Whenever tempC changes, which corresponds to a change in the up-counter's stored count, the outputC process assigns the new count to the output port C.

Verilog
Figure 9.32 is a Verilog description of a 4-bit up-counter, as appeared in Figure 4.48. The Verilog module, named UpCounter, defines the counter's inputs and outputs, consisting of a clock input clk, a count enable control input cnt, the 4-bit count value C, and a terminal count output tc.

The UpCounter's module structurally describes the design consisting of three components, namely Reg4, Inc4, and AND4. Reg4 is a 4-bit parallel load register with a load control input ld. Inc4 is a 4-bit incrementer. AND4 is a four-input AND gate that will output 1 if and only if all four inputs are 1. The module further specifies two 4-bit wires, tempC and incC, used as internal wires within the structural description.

   module Reg4(I, Q, clk, ld);
      input [3:0] I;
      input clk, ld;
      output [3:0] Q;
      // details not shown
   endmodule

   module Inc4(a, s);
      input [3:0] a;
      output [3:0] s;
      // details not shown
   endmodule

   module AND4(w,x,y,z,F);
      input w, x, y, z;
      output F;
      // details not shown
   endmodule

   module UpCounter(clk, cnt, C, tc);
      input clk, cnt;
      output [3:0] C;
      reg [3:0] C;
      output tc;

      wire [3:0] tempC;
      wire [3:0] incC;

      Reg4 Reg4_1(incC, tempC, clk, cnt);
      Inc4 Inc4_1(tempC, incC);
      AND4 AND4_1(tempC[3], tempC[2],
                  tempC[1], tempC[0], tc);

      always @(tempC)
      begin
         C <= tempC;
      end
   endmodule

Figure 9.32 Structural Verilog description of 4-bit up-counter.

The module instantiates each of the three components and specifies the connections between them. Reg4 is the only sequential component within the up-counter and thus the clk input only needs to be connected to the clock input of the register. We control the up-counter's counting by connecting the count enable input, cnt, to the load enable, ld, of the register. The output Q of Reg4_1 is

connected to the internal signal tempC, which connects the register's output to both the Inc4_1 and AND4_1 components. Inc4_1 receives the current count from the tempC connection and outputs the incremented count on its output s, which is connected to the other internal signal incC. The incC signal connects the incremented count from Inc4_1 to the parallel load input I of Reg4_1. The current count is also connected to the four inputs of the AND4_1 component. The AND4_1's output F is then connected to the counter's terminal count output tc.

In the UpCounter design, we need to connect the output of the 4-bit register to the incrementer, the AND gate, and the counter's output port C. Therefore, the module uses the tempC signal to connect Reg4_1's output to both the AND4_1 and Inc4_1 components. We still need to connect the register's output to the output port C. The module makes this connection by specifying a procedure that is used to connect the output of the register to the output port C. The procedure is sensitive to the signal tempC, previously used as an internal wire between the three components. Whenever tempC changes, which corresponds to a change in the up-counter's stored count, the procedure assigns the new count to the output port C.

SystemC
Figure 9.33 is a SystemC description of a 4-bit up-counter, as appeared in Figure 4.48. The SystemC module, named UpCounter, defines the counter's inputs and outputs, consisting of a clock input clk, a count enable control input cnt, the 4-bit count value C, and a terminal count output tc.

The UpCounter's module structurally describes the design consisting of three components, namely, Reg4, Inc4, and AND4. Reg4 is a 4-bit parallel load register with a load control input ld. Inc4 is a 4-bit incrementer. AND4 is a four-input AND gate that will output 1 if and only if all four inputs are 1. The module further specifies two 4-bit signals, tempC and incC, used as internal wires within the structural description. Additionally, the module defines a four-element array of sc_logic signals, named tempC_b, used to access the individual bits within the 4-bit vector tempC.

   #include "systemc.h"
   #include "reg4.h"
   #include "inc4.h"
   #include "and4.h"

   SC_MODULE(UpCounter)
   {
      sc_in<sc_logic> clk, cnt;
      sc_out<sc_lv<4> > C;
      sc_out<sc_logic> tc;

      sc_signal<sc_lv<4> > tempC, incC;
      sc_signal<sc_logic> tempC_b[4];

      Reg4 Reg4_1;
      Inc4 Inc4_1;
      AND4 AND4_1;

      SC_CTOR(UpCounter) : Reg4_1("Reg4_1"),
                           Inc4_1("Inc4_1"),
                           AND4_1("AND4_1")
      {
         Reg4_1.I(incC); Reg4_1.Q(tempC);
         Reg4_1.clk(clk); Reg4_1.ld(cnt);

         Inc4_1.a(tempC); Inc4_1.s(incC);

         AND4_1.w(tempC_b[0]); AND4_1.x(tempC_b[1]);
         AND4_1.y(tempC_b[2]); AND4_1.z(tempC_b[3]);
         AND4_1.F(tc);

         SC_METHOD(comblogic);
         sensitive << tempC;
      }

      void comblogic()
      {
         tempC_b[0] = tempC.read()[0];
         tempC_b[1] = tempC.read()[1];
         tempC_b[2] = tempC.read()[2];
         tempC_b[3] = tempC.read()[3];
         C.write(tempC);
      }
   };

Figure 9.33 Structural SystemC description of 4-bit up-counter.

The module first instantiates each of the three components and then specifies the connections between them. Reg4 is the only sequential component within the up-counter and thus the clk input only needs to be connected to the clock input of the register. We control the up-counter's counting by connecting the count enable input, cnt, to the load enable, ld, of the register. The output Q of Reg4_1 is connected to the internal signal tempC, which connects the register's output to Inc4_1. Inc4_1 receives the current count from the tempC connection and outputs the incremented count on its output s, which is connected to the internal signal incC. The incC signal connects the incremented count from Inc4_1 to the parallel load input I of Reg4_1. The current count is also connected to the four inputs of the AND4_1 component using the tempC_b array to access the individual bits. The AND4_1's output F is then connected to the counter's terminal count output tc.

In the UpCounter design, we need to connect the output of the 4-bit register to the incrementer, the AND gate, and the counter's output port C. Therefore, the module uses the tempC signal to connect Reg4_1's output to the Inc4_1 component and uses the tempC_b array to connect Reg4_1's output to the AND4_1 component. Thus, we still need to connect the register's output to the output port C and assign the individual bits of the register's output to the tempC_b array. The module makes these connections by defining a process, named comblogic, that is sensitive to the signal tempC. Whenever tempC changes, which corresponds to a change in the up-counter's stored count, the comblogic process assigns the new count to the output port C. Additionally, the process assigns the bits within vector tempC to the individual sc_logic signals within the tempC_b array. In order to access the individual bits of the vector signal tempC, we use the syntax "tempC.read()[0]".

9.5 RTL DESIGN USING HARDWARE DESCRIPTION LANGUAGES
We now show how to create RTL descriptions using HDLs. We will show HDL descriptions of the starting point of RTL design, namely, high-level state machines, and of the ending point of RTL design, namely, connected controllers and datapaths. RTL designers will commonly create a testbench to test the high-level state machine description, and then use that same testbench for the controller/datapath description, thus helping to verify that the designer created a correct controller/datapath implementation.

High-Level State Machine of the Laser-Based Distance Measurer

VHDL
Figures 9.34 and 9.35 present a VHDL description of a high-level state machine for the laser-based distance measurer shown in Figure 5.15. The entity, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured.
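As a cross-check on the HDL listings, the high-level state machine can also be modeled in a few lines of Python. This is a minimal sketch, not one of the book's descriptions: the class and method names are hypothetical, and the state behavior is taken from the figures and prose of this section (S0 clears the outputs, S1 clears the count and waits for B, S2 turns the laser on, S3 counts until S is seen, S4 stores the count shifted right by one). Each call to rising_edge reads the values held before the edge, mirroring how signal assignments in a clocked process take effect.

```python
class LaserDistMeasurerHLSM:
    """Behavioral sketch of the laser-based distance measurer's HLSM."""

    MASK16 = 0xFFFF  # Dctr and D are 16-bit values

    def __init__(self):
        self.reset()

    def reset(self):
        """Asynchronous reset: default outputs and initial state S0."""
        self.L = 0
        self.D = 0
        self.Dctr = 0
        self.state = "S0"

    def rising_edge(self, B, S):
        """Advance one clock edge using the values held before the edge."""
        state, dctr = self.state, self.Dctr
        if state == "S0":
            self.L = 0                     # laser off
            self.D = 0                     # clear D
            self.state = "S1"
        elif state == "S1":
            self.Dctr = 0                  # reset count
            self.state = "S2" if B else "S1"
        elif state == "S2":
            self.L = 1                     # laser on
            self.state = "S3"
        elif state == "S3":
            self.L = 0                     # laser off
            self.Dctr = (dctr + 1) & self.MASK16
            self.state = "S4" if S else "S3"
        elif state == "S4":
            self.D = dctr >> 1             # divide count by 2 via right shift
            self.state = "S1"
```

A model like this can also serve as the reference against which a testbench compares both the high-level state machine description and the later controller/datapath description.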
   library ieee;
   use ieee.std_logic_1164.all;
   use ieee.std_logic_arith.all;

   entity LaserDistMeasurer is
      port ( clk, rst: in std_logic;
             B, S: in std_logic;
             L: out std_logic;
             D: out unsigned(15 downto 0)
      );
   end LaserDistMeasurer;

   architecture behavior of LaserDistMeasurer is
      type statetype is (S0, S1, S2, S3, S4);
      signal state: statetype;
      signal Dctr: unsigned(15 downto 0);
      constant U_ZERO:
         unsigned(15 downto 0) := "0000000000000000";
      constant U_ONE: unsigned(0 downto 0) := "1";
   begin
      statemachine: process (clk, rst)
      begin
         if (rst='1') then
            L <= '0';
            D <= U_ZERO;
            Dctr <= U_ZERO;
            state <= S0; -- initial state
         elsif (clk='1' and clk'event) then
            case state is
               when S0 =>
                  L <= '0';       -- laser off
                  D <= U_ZERO;    -- clear D
                  state <= S1;
   (continued in Figure 9.35)

Figure 9.34 Behavioral VHDL description of a high-level state machine of the laser-based distance measurer.

   (continued from Figure 9.34)
               when S1 =>
                  Dctr <= U_ZERO; -- reset count
                  if (B='1') then
                     state <= S2;
                  else
                     state <= S1;
                  end if;
               when S2 =>
                  L <= '1';       -- laser on
                  state <= S3;
               when S3 =>
                  L <= '0';       -- laser off
                  Dctr <= Dctr + 1;
                  if (S='1') then
                     state <= S4;
                  else
                     state <= S3;
                  end if;
               when S4 =>
                  D <= SHR(Dctr, U_ONE);
                  state <= S1;
            end case;
         end if;
      end process;
   end behavior;

Figure 9.35 Behavioral VHDL description of a high-level state machine of the laser-based distance measurer (continued).

Instead of using a 16-bit std_logic_vector, we defined the output D as unsigned. For logic operations, an unsigned behaves the same as a std_logic_vector. However, we can also perform arithmetic operations on unsigned values. Whenever using unsigned, we must include the statement "use ieee.std_logic_arith.all;" at the top of our VHDL description. The use statement specifies which package we will use within our design. The package ieee.std_logic_arith defines the unsigned type as well as a set of operations and functions we can perform on unsigned values.

The entity also defines a clock input clk and reset input rst. We assume that the clock input is 300 MHz, as was assumed in the laser-based distance measurer design shown in Figure 5.19. We omit details of generating the 300 MHz clock (see Section 9.3 for an example of describing an oscillator).

The VHDL architecture describes the behavior of the entity. Instead of using two processes as shown in Figure 9.22, the architecture consists of a single process describing the behavior of our high-level state machine. The high-level state machine process, named statemachine, is sensitive to inputs clk and rst. If the rst is 1, then the process asynchronously sets the state signal to the state machine's initial state, S0, and initializes the outputs, L and D, and the internal counter signal, Dctr, to their default values. The default values should correspond to the values assigned to the signals within the initial state of our high-level state machine. Notice that we defined a constant, named U_ZERO, corresponding to the 16-bit unsigned value of zero. When the rst is not enabled, on the rising clock, the process evaluates the current state, assigns the appropriate outputs for the current state, determines the next state, and updates the state register signal, state. In our high-level state machine description, we only need a single state register signal to model the behavior of our state machine, instead of the two signals currentstate and nextstate we previously used in the controller design shown in Figure 9.22.

The high-level state machine for the laser-based distance measurer performs two arithmetic operations, addition and shifting. Because Dctr is of the unsigned type, we can increment the counter signal Dctr in state S3 using the syntax "Dctr <= Dctr + 1;". This statement will add one to the current value of Dctr and store the result in Dctr. In state S4, we calculate the distance, D, by dividing the value of Dctr by 2. However, we perform this division using a right-shift-by-one operation. To perform the shift and assign the value to the output D, we use the statement "D <= SHR(Dctr, U_ONE);". The function SHR(), defined within the ieee.std_logic_arith package, shifts the first parameter, Dctr, by the amount specified by the second parameter, U_ONE, where U_ONE is a constant we defined earlier in the architecture.

Verilog
Figures 9.36 and 9.37 present a Verilog description of a high-level state machine for the laser-based distance measurer shown in Figure 5.15. The module, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a
laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured.

The module also defines a clock input clk and reset input rst. We assume that the clock input is 300 MHz, as was assumed in the laser-based distance measurer design shown in Figure 5.19. We omit details of generating the 300 MHz clock (see Section 9.3 for an example of describing an oscillator).

The Verilog module behaviorally describes the LaserDistMeasurer's high-level state machine. Instead of using two procedures as shown in Figure 9.23, the module consists of a single procedure describing the behavior of our high-level state machine. The high-level state machine procedure is sensitive to the positive edge of the reset input, rst, and the positive edge of the clock input, clk. If the rst is enabled, then the procedure asynchronously sets the state register, state, to the state machine's initial state, S0, and initializes the outputs, L and D, and the internal counter register, Dctr, to their default values. The default values should correspond to the values assigned to the signals within the initial state of our high-level state machine. When the rst is not enabled, on the rising clock, the procedure evaluates the current state, assigns the appropriate outputs for the current state, determines the next state, and updates the state register. In our high-level state machine description, we only need a single state register signal to model the behavior of our state machine, instead of the two register signals currentstate and nextstate we previously used in the controller design shown in Figure 9.23.

   module LaserDistMeasurer(clk, rst, B, S, L, D);
      input clk, rst, B, S;
      output L;
      output [15:0] D;
      reg L;
      reg [15:0] D;

      parameter S0 = 3'b000,
                S1 = 3'b001,
                S2 = 3'b010,
                S3 = 3'b011,
                S4 = 3'b100;

      reg [2:0] state;
      reg [15:0] Dctr;

      always @(posedge rst or posedge clk)
      begin
         if (rst==1) begin
            L <= 0;
            D <= 0;
            Dctr <= 0;
            state <= S0; // initial state
         end
         else begin
            case (state)
               S0: begin
                  L <= 0;       // laser off
                  D <= 0;       // clear D
                  state <= S1;
               end
               S1: begin
                  Dctr <= 0;    // reset count
                  if (B==1)
                     state <= S2;
                  else
                     state <= S1;
               end
               S2: begin
                  L <= 1;       // laser on
                  state <= S3;
               end
   (continued in Figure 9.37)

Figure 9.36 Behavioral Verilog description of a high-level state machine of the laser-based distance measurer.

The high-level state machine for the laser-based distance measurer performs two arithmetic operations, addition and shifting. To increment the counter Dctr in state S3, we use the syntax "Dctr <= Dctr + 1;". This statement will add one to the current value of Dctr and store the result in Dctr. In state S4, we calculate the distance, D, by dividing the value of Dctr by 2. However, we perform this division using a right-shift-by-one operation. To perform the shift and assign the value to the output D, we use the statement "D <= Dctr >> 1;", where >> performs a right shift operation.

Figure 9.37 Behavioral Verilog description of a high-level state machine of the laser-based distance measurer (continued). [Listing illegible in the source.]

SystemC
Figures 9.38 and 9.39 present a SystemC description of a high-level state machine for the laser-based distance measurer shown in Figure 5.15. The module, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured.

The module also defines a clock input clk and reset input rst. We assume that the clock input is 300 MHz, as was assumed in the laser-based distance measurer design shown in Figure 5.19. We omit details of generating the 300 MHz clock (see Section 9.3 for an example of describing an oscillator).

Figure 9.38 Behavioral SystemC description of a high-level state machine of the laser-based distance measurer. [Listing illegible in the source; continued in Figure 9.39.]
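The divide-by-two step in state S4 and the choice of a 300 MHz clock fit together, which a short Python calculation can illustrate. This sketch rests on an assumption not stated in this section: that the laser pulse travels at roughly 3 x 10^8 m/s, so each clock cycle corresponds to about one meter of travel and half the round-trip cycle count approximates the one-way distance in meters. The constant and function names are hypothetical.

```python
C_LIGHT_M_PER_S = 3.0e8   # assumed approximate speed of light (not from this section)
CLOCK_HZ = 300e6          # 300 MHz clock, as the text assumes


def meters_per_cycle():
    """Distance the pulse travels during one clock cycle."""
    return C_LIGHT_M_PER_S / CLOCK_HZ


def distance_from_count(dctr):
    """One-way distance from a 16-bit round-trip cycle count.

    Dividing by 2 is done with a right shift by one, as in state S4.
    """
    return (dctr & 0xFFFF) >> 1
```

For every 16-bit count the right shift gives the same result as integer division by 2, discarding the remainder, so an odd count such as 7 yields 3.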
The SystemC module behaviorally describes the LaserDistMeasurer's high-level state machine. Instead of using two processes as shown in Figure 9.24, the module consists of a single process describing the behavior of our high-level state machine. The high-level state machine process, named statemachine, is sensitive to the positive edge of the reset input, rst, and the positive edge of the clock input, clk. If the rst is enabled, then the process asynchronously sets the state signal to the state machine's initial state, S0, and initializes the outputs, L and D, and the internal counter signal, Dctr, to their default values. The default values should correspond to the values assigned to the signals within the initial state of our high-level state machine. When the rst is not enabled, on the rising clock, the process evaluates the current state, assigns the appropriate outputs for the current state, determines the next state, and updates the state register signal, state. In our high-level state machine description, we only need a single state register signal to model the behavior of our state machine, instead of the two signals currentstate and nextstate we previously used in the controller design shown in Figure 9.24.

The high-level state machine for the laser-based distance measurer performs two arithmetic operations, addition and shifting. To increment the counter Dctr in state S3, we use the syntax "Dctr = Dctr.read() + 1;". This statement will add one to the current value of Dctr and store the result in Dctr. In state S4, we calculate the distance, D, by dividing the value of Dctr by 2. However, we perform this division using a right-shift-by-one operation. To perform the shift and assign the value to the output D, we use the statement "D.write(Dctr.read()>>1);", where >> performs a right shift operation.

Figure 9.39 Behavioral SystemC description of a high-level state machine of the laser-based distance measurer (continued). [Listing illegible in the source.]

Controller and Datapath of the Laser-Based Distance Measurer

VHDL
Figure 9.40 is a VHDL description of the laser-based distance measurer shown in Figure 5.19. The entity, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured. The entity also defines a 300 MHz clock input clk and reset input rst for the design's controller.

Figure 9.40 Structural top-level VHDL description of the laser-based distance measurer. [Listing missing from the source.]

The LaserDistMeasurer's architecture structurally describes the connections of the controller and datapath components. The architecture instantiates two components. LDM_Controller_1 is the controller for the laser-based distance measurer and LDM_Datapath_1 is the datapath for this design. The architecture connects the entity's clk, rst, B, and S inputs to the inputs of LDM_Controller_1 and connects the controller's laser control output to the corresponding output port L. Additionally, the four signals, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, connect the controller's four control signals to the four inputs of LDM_Datapath_1. The LaserDistMeasurer datapath has a single output D, providing the distance measured, that is connected to the output port D of the entity.

Figure 9.41 is a VHDL description of the LaserDistMeasurer's datapath component shown in Figure 5.17. The entity, named LDM_Datapath, defines a clock input clk, four control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, and a 16-bit distance output D.

The architecture defines three components, a 16-bit up-counter, a 16-bit register, and a 16-bit right shifter that shifts right by one position. UpCounter16 is a 16-bit up-counter with a count control input cnt and a count clear input clr. Reg16 is a 16-bit
parallel load register with a register load control signal ld and a register clear signal clr. ShiftRightOne16 is a 16-bit right shifter that shifts the input I right by one position and assigns the shifted value to the output S. The architecture instantiates an UpCounter16 component named Dctr, a Reg16 component named Dreg, and a ShiftRightOne16 component named ShiftRight. Dctr's instantiation connects the datapath's Dctr_clr and Dctr_cnt inputs to Dctr's clear and count control inputs. Dctr's count output C is then connected to the architecture's internal signal tempC, which connects the count value to the ShiftRight shifter's input. The shifted count is then connected to the input of the Dreg register using the internal signal shiftC. The instantiation of the Dreg register connects the register's clear and load control inputs to the datapath's Dreg_clr and Dreg_ld input ports. Finally, the register's data output Q is connected to LDM_Datapath's measured distance output D.

   library ieee;
   use ieee.std_logic_1164.all;

   entity LDM_Datapath is
      port ( clk: in std_logic;
             Dreg_clr, Dreg_ld: in std_logic;
             Dctr_clr, Dctr_cnt: in std_logic;
             D: out std_logic_vector(15 downto 0)
      );
   end LDM_Datapath;

   architecture structure of LDM_Datapath is
      component UpCounter16
         port ( clk: in std_logic;
                clr, cnt: in std_logic;
                C: out std_logic_vector(15 downto 0)
         );
      end component;
      component Reg16
         port ( I: in std_logic_vector(15 downto 0);
                Q: out std_logic_vector(15 downto 0);
                clk, clr, ld: in std_logic
         );
      end component;
      component ShiftRightOne16
         port ( I: in std_logic_vector(15 downto 0);
                S: out std_logic_vector(15 downto 0)
         );
      end component;
      signal tempC: std_logic_vector(15 downto 0);
      signal shiftC: std_logic_vector(15 downto 0);
   begin
      Dctr: UpCounter16
         port map (clk, Dctr_clr, Dctr_cnt, tempC);
      ShiftRight: ShiftRightOne16
         port map (tempC, shiftC);
      Dreg: Reg16
         port map (shiftC, D, clk, Dreg_clr, Dreg_ld);
   end structure;

Figure 9.41 Structural VHDL description of the laser-based distance measurer's datapath.

Figure 9.42 and Figure 9.43 are the VHDL description of the laser-based distance measurer's FSM controller described in Figure 5.21. The entity, named LDM_Controller, defines a clock input clk, a reset signal rst, a user-pressed button input B, a laser sensor input S, and five output control signals, L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt. The output L is used to turn the laser on and off, where if L is 1, the laser is on. The four other output signals are used to control the RTL design's datapath components.

The VHDL architecture describes the behavior of the entity. Similar to the controller design shown in Figure 9.22, the architecture consists of two processes, one modeling the state register, the other modeling the combinational logic. The state register process, named statereg, is sensitive to inputs clk and rst. If the rst is enabled, then the process asynchronously sets the currentstate signal to the FSM's initial state, S0. Otherwise, if the clock is rising, the process updates the state register with the next state.

   library ieee;
   use ieee.std_logic_1164.all;

   entity LDM_Controller is
      port ( clk, rst: in std_logic;
             B, S: in std_logic;
             L: out std_logic;
             Dreg_clr, Dreg_ld: out std_logic;
             Dctr_clr, Dctr_cnt: out std_logic
      );
   end LDM_Controller;

   architecture behavior of LDM_Controller is
      type statetype is (S0, S1, S2, S3, S4);
      signal currentstate, nextstate: statetype;
   begin
      statereg: process (clk, rst)
      begin
         if (rst='1') then
            currentstate <= S0; -- initial state
         elsif (clk='1' and clk'event) then
            currentstate <= nextstate;
         end if;
      end process;

      comblogic: process (currentstate, B, S)
      begin
         L <= '0';
         Dreg_clr <= '0';
         Dreg_ld <= '0';
         Dctr_clr <= '0';
         Dctr_cnt <= '0';
         case currentstate is
            when S0 =>
               L <= '0';         -- laser off
               Dreg_clr <= '1';  -- clear Dreg
               nextstate <= S1;
            when S1 =>
               Dctr_clr <= '1';  -- clear count
               if (B='1') then
                  nextstate <= S2;
               else
                  nextstate <= S1;
               end if;
   (continued in Figure 9.43)

Figure 9.42 Behavioral VHDL description of laser-based distance measurer's controller.

The second process, named comblogic, is sensitive to the inputs to the combinational logic of Figure 5.21, namely, the external inputs B and S, and the state register output currentstate. When any of those items changes, the process sets the FSM's outputs, in this case L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, with the appropriate value for the current state. In the controller example of Figure 9.22, the FSM's output x was defined within the case statement for all possible states. With five outputs that must be defined in the LDM_Controller and five possible states, assigning the values to all outputs in each state would be cumbersome. Furthermore, finding a mistake and making
484 Hardware Descriplion Languages 95 RTL Des'gn USing Herdware Descnpllon Languagos -ISS

correcli on, or modificalion, 10 Ihe responding OUlpUI port L. ddilionnlly. Ihe rour imcnwl wire,. Ol\'g_clr. D"",~ _ltI.
(contmued from Figure 9 42)
com roller would become very di r- when 52 => Dcrr_elr. and DcrrJ ill. connc I the cOntroller' ~ ur cOlllrtJl ,ignnl' 10 Ihe rour inpulS or
fkul! in a larger FSM con,iqing L <::: '1'; -- laser on LDM_Dawpmh_l . Thc ulserDislA/PtU'If"l'r dillupmh hu, (I 'inglc OU lpUI D. providing Ihe
or more Slale, and having many nextstate <= 53; dislance mea ured. Ihal is connecled 10 the OUlpUI port I or Ihe modulc.
when 53 :::;>
more oulpul ,. The comblflgic L <= '0'; laser off Figure 9.45 i a
proces, u,e, a dirrerenl approach Decr_cnt <= . 1 • ; count up Vcrilog descriplion or Ihe
if (5.'1') then lIOdulo IIpCounter16 (elk, eIr, cnt, e):
in which a derau ll va lue for Ihe LaserDislMecwlfu·s dUla- iopu,t elk. clr. cnt:

_1.
nextstate S4;
OUIPUIS is Or>! as,igned and only .1 ••
palh componcOl shown in out""t 115,01 C;
Ihe deviali ons from Ihe deraul!, nextstate <= 53; Figure 5. 17. TI,e modu le. If detalla not Ih '"
end it ; named LDM_D{//lIJI{l/h.
arc a. , igned laler. The comblogic whe n 54 =>
process fir" as,ign, a deraull Dreo_ld ·1' ; load Dreg defi nes a clock inpul elk, _ 1 . RegI6(I, 0, elk, elr. Idl,
value or 0 10 all five OUIPUI . The Detr_cnt <= • 0' ; stop countino rour control inpuL~ DregJlr. i"""t (15,01 1/
nextstate <= 51;
iJ>put elk, elr. ld,
prates, Ihen eva luale Ihe currenl Dreg_ld. DClrJ lr, llnd out""t 115;01 0,
end ca •• ;
~ lalC and a~, i gn' Ihe.! value... to DClr_CIII. and a 16-bil di~­ I detail. not .hewn
end proce •• ;
Ihe OUlpU!> only when Ihc OUIPUI e nd behavior; lance OUIPUI D. .1IdIIodul.
, hould be 1. The prace" aha The dmapalh consist; lIOdul. ShHtRlghtOne16 (I, 51,
as,igns Ihe va luc 0 10 ,eve",1 Figure 9.43 Beha, 101111 VHDL de,cnpliOIl or la'er-based or three componenls. a iJ>put 115;01 I;
"ignal" wilhin Ihe "," ell "WIC-
Ihc~c a~~ign-
di,tn ncc I11Ctll.,UrCr', controller (cofllillll('d). 16-bil Up-couOler. a 16-bil
regislcr. and a 16-bil righl _1.out""t 115;01 5;
/1 detaill not Ihewn
men IS. howc;ver.
melli' arc included on ly 10 clearly indica le Ihe behavior or Ihe cOlllrolier (Ihey arc redun-
dalli. bU I help make Ihe descriplion easier 10 unde"wnd).
shiner Iha! hin, righl by
one position. Up Olllller-
16 is a 16-bi l up-counler
_1. LOll_De apath(elk, Oreq elr, Dug ld,
Dc r_cl r, Detr cn . 0):
Thc process 01,0 delcnnincs whal Ihe nexl "ale shou ld be. based on Ihe currenl slme i"""t elk;
and Ihe v:llues or inlulS Band S. The neXI 'laIC will be loaded in lo Ihe slmc regisler by wilh a counl control inpul i"""t Dreg_cl r , Dr.g_Id,
laput Octr_clr, Delc, cnt j
Ihe slalC regi, lcr process on Ihe neX I ri sing clock edge. elll and a counl clear inpul out""t (15;01 0,
elr. Reg l6 is a 16-bil par-
allel load regislcr wilh a wire (15,01 tempC, ahlttC,
Vcri log module LaserDistMeasurer(clk. rst. B. S. L. D);
Figure 9.4-1 is a Veri log regisler load control "ignal UpCounter16 Dctr(clk. Detr_cle. Dc c_cnt,
input clk. rst. B, S;
output L; Id and a regiSler clear tempCl,
descriplion of" Ihe laser-ba cd ShiftRlllhtOnel6 ShiftRlqht (tempC, ahlL LCI :
output 115,01 0; signal elr. ShiftRighlOlle-
di lance measurer shown in Reg16 Dreg(shiltC, 0, elk, Orell_elr, Dreg_ld),
16 is a 16-bi l righl shifter eDdIIo4ul.
Figure 5.19. The module. wire Dreg_clr. Dreg_Id;
wire Detr_clr. Detr_cnt ; thaI hi rls the inpul 1 righl
named ulSerDisrNfeasllrer.
by one poSlI,on and Figure 9.45 Slruclural Vtrilog de<criplion or Ihe laser-based
defines Ihe inpuls and oul- LD~Controller assigns Ihe shifted va lue distance measurer". dUlap.lh.
pUIS. including a user-pressed LD!-l_Controller_l (clk. rst. S, S. L.
Dreg_clr, Dreg_ld.
10 Ihe oUlpUI S. The data-
bUllon inpu l B. a lasc r cnsor path module inslanliates an UpCOlllllerl6 componeO! n ~~ed .Delr, a Reg l 6 componenl
Dctr_clr. Dctr_cnt);
inpul S. a laser control OUlpul LDM_Da tapa th named Dreg, and a ShiflRighlOnel6 cmnponenl n ame~ ShiflR'ghl. The module co.nncels
L. and a 16-bil oulPUI 0 for LD1-COa tapa th_l (clk. Dreg_clr. Dre9_1d. Ihe dalapalh ·s DClrJ lr and DClr_clIl mpulS 10 DClr S clear and eouO! .control mpu.lS.
the dislance measured. The Dctr_clr. Detr_cnt, 0);
respeclively. The counlers counl OUlpUI C IS then connecled 10 Ihe 16-bll mleroal wIre
module also defines a 300 endmodule lempC Ihat connecls Ihe count value 10 the ShiflRighl shiner's input. The shined count is
MH z clock inpul elk and Figure 9.44 Slruclura l descriplion of lop-level Veri log Ihen connecled 10 Ihe inpul or Ihe Dreg regisler usi ng Ihe 100eroai 16-bi l wi re shiflC. The
resel inpu l !"SI ror Ihe design·s descri ption of laser-bnsed distance meilsurer. module conneclS Ihe Dreg regislers clear and I ~ad ~o O!ro l inpuls 10 lhe dalapath's
con troller. Dreg_elr and Dreg_ld inpul port . Finally, the reglsler S dala OUlpUI Q IS con nected to
The LaserDislMeas/lrer structurally describes Ihe conneclions or Ihe controller and LDM_dalCtpalh·s measured dislance OUlpUI D.
dalapmh componenls. The mod ul e inslantiales IIVO componenls. LDM_Colllroller_ 1 is
Ihe cOnlrollcr ror Ihe laser-based dislance measu rer and LDM_Dataparh_ 1 is the dalapath
for Ih is design. The archileclure con neCIS the module·s elk. r SI . B. and S inpuls 10 the
inpuls of LDM_Collfroller_ I and con neCIS the con lroll er's laser control OUIPUIIO the cor-
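The datapath wiring described for LDM_Datapath can be sketched as a small cycle-based software model. This is only an illustrative C++ sketch, not part of the HDL descriptions: the names LdmDatapathModel and posedge are hypothetical, and because the figures mark the internals of UpCounter16 and Reg16 as "details not shown", the sketch assumes that both update on the rising clock edge and that clear takes priority over count and load.

```cpp
#include <cstdint>

// Hypothetical cycle-based model of LDM_Datapath (names are illustrative).
struct LdmDatapathModel {
    uint16_t dctr = 0;  // UpCounter16 contents (output C, signal tempC)
    uint16_t dreg = 0;  // Reg16 contents (output Q, datapath output D)

    // ShiftRightOne16 is combinational, so shiftC always follows tempC.
    uint16_t shiftC() const { return static_cast<uint16_t>(dctr >> 1); }

    // One rising clock edge. Both registers sample their inputs
    // before either one changes, as flip-flops would.
    void posedge(bool dreg_clr, bool dreg_ld, bool dctr_clr, bool dctr_cnt) {
        const uint16_t shifted = shiftC();  // value at Dreg's input before the edge

        // Assumption: clear has priority over count.
        if (dctr_clr) {
            dctr = 0;
        } else if (dctr_cnt) {
            dctr = static_cast<uint16_t>(dctr + 1);
        }

        // Assumption: clear has priority over load.
        if (dreg_clr) {
            dreg = 0;
        } else if (dreg_ld) {
            dreg = shifted;
        }
    }

    uint16_t D() const { return dreg; }
};
```

Under these assumptions, clearing both registers, counting for ten clock edges, and then asserting Dreg_ld for one edge leaves D equal to 5, the count shifted right by one position.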
Figures 9.46 and 9.47 are the Verilog description of the laser-based distance measurer's FSM controller described in Figure 5.21. The module, named LDM_Controller, defines a clock input clk, a reset signal rst, a user-pressed button input B, a laser sensor input S, and five output control signals, L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt. The output L is used to turn the laser on and off, where if L is 1, the laser is on. The four other output signals are used to control the RTL design's datapath components.

    module LDM_Controller(clk, rst, B, S, L, Dreg_clr,
                          Dreg_ld, Dctr_clr,
                          Dctr_cnt);
       input clk, rst, B, S;
       output L;
       output Dreg_clr, Dreg_ld;
       output Dctr_clr, Dctr_cnt;
       reg L;
       reg Dreg_clr, Dreg_ld;
       reg Dctr_clr, Dctr_cnt;

       parameter S0 = 3'b000,
                 S1 = 3'b001,
                 S2 = 3'b010,
                 S3 = 3'b011,
                 S4 = 3'b100;

       reg [2:0] currentstate;
       reg [2:0] nextstate;

       always @(posedge rst or posedge clk)
       begin
          if (rst==1)
             currentstate <= S0; // initial state
          else
             currentstate <= nextstate;
       end

       always @(currentstate or B or S)
       begin
          L <= 0;
          Dreg_clr <= 0;
          Dreg_ld <= 0;
          Dctr_clr <= 0;
          Dctr_cnt <= 0;
          case (currentstate)
             S0: begin
                L <= 0;        // laser off
                Dreg_clr <= 1; // clear Dreg
                nextstate <= S1;
             end
    (continued in Figure 9.47)

Figure 9.46 Behavioral Verilog description of laser-based distance measurer's controller.

    (continued from Figure 9.46)
             S1: begin
                Dctr_clr <= 1; // clear count
                if (B==1)
                   nextstate <= S2;
                else
                   nextstate <= S1;
             end
             S2: begin
                L <= 1; // laser on
                nextstate <= S3;
             end
             S3: begin
                L <= 0;        // laser off
                Dctr_cnt <= 1; // count up
                if (S==1)
                   nextstate <= S4;
                else
                   nextstate <= S3;
             end
             S4: begin
                Dreg_ld <= 1;  // load Dreg
                Dctr_cnt <= 0; // stop counting
                nextstate <= S1;
             end
          endcase
       end
    endmodule

Figure 9.47 Behavioral Verilog description of laser-based distance measurer's controller (continued).

The Verilog module behaviorally describes the LaserDistMeasurer's FSM. Similar to the controller design shown in Figure 9.23, the module consists of two procedures, one modeling the state register, the other modeling the FSM's control logic. The state register procedure is sensitive to the positive edge of the reset input, rst, and the positive edge of the clock input, clk. If the rst input is enabled, then the procedure asynchronously sets the currentstate signal to the FSM's initial state, S0. Otherwise, on the rising edge of the clock, the procedure updates the state register with the next state.

The second procedure is sensitive to the inputs to the combinational logic of Figure 5.21, namely, the external inputs B and S, and the state register output currentstate. When any of those items changes, the procedure sets the FSM's outputs, in this case L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, to the appropriate values for the current state. In the controller example of Figure 9.22, the FSM's output x was defined within the case statement for all possible states. With five outputs that must be defined in the LDM_Controller and five possible states, assigning the values to all outputs in each state would be cumbersome. Furthermore, finding a mistake and making corrections or modifications to the controller would become very difficult in a larger FSM consisting of more states and having many more outputs. Instead, the procedure uses a different approach in which a default value for all the outputs is first assigned and only the deviations from the defaults are assigned later. The procedure first assigns a default value of 0 to all five outputs. The procedure then evaluates the current state and assigns the values to the outputs only when the output should be 1. The procedure also assigns the value 0 to several signals within the case statements; however, these assignments are included only to clearly indicate the behavior of the controller (they are redundant, but help make the description easier to understand).

The procedure also determines what the next state should be, based on the current state and the values of inputs B and S. The next state will be loaded into the state register by the state register procedure on the next positive clock edge.
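The default-then-deviate style used by the controller's combinational procedure can be illustrated in software. The following is a minimal C++ sketch under stated assumptions, not a translation of the book's figures: the struct ControllerOutputs and the function comblogic are hypothetical names, outputs are modeled as bool, and the redundant assignments of 0 are omitted because the defaults already cover them.

```cpp
enum State { S0, S1, S2, S3, S4 };

// Hypothetical bundle of the five FSM outputs plus the next state.
struct ControllerOutputs {
    bool L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt;
    State nextstate;
};

// Software counterpart of the combinational procedure: a pure function
// of the current state and the inputs B and S.
ControllerOutputs comblogic(State currentstate, bool B, bool S) {
    // Defaults: every output is 0; the state holds unless overridden below.
    ControllerOutputs out = {false, false, false, false, false, currentstate};

    // Only the deviations from the defaults are listed per state.
    switch (currentstate) {
        case S0: out.Dreg_clr = true; out.nextstate = S1;          break;
        case S1: out.Dctr_clr = true; out.nextstate = B ? S2 : S1; break;
        case S2: out.L = true;        out.nextstate = S3;          break;
        case S3: out.Dctr_cnt = true; out.nextstate = S ? S4 : S3; break;
        case S4: out.Dreg_ld = true;  out.nextstate = S1;          break;
    }
    return out;
}
```

Each state lists one or two assignments instead of five, so adding an output later means adding one default line plus the states where it is 1.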

SystemC

Figure 9.48 is a SystemC description of the laser-based distance measurer shown in Figure 5.19. The module, named LaserDistMeasurer, defines the inputs and outputs, including a user-pressed button input B, a laser sensor input S, a laser control output L, and a 16-bit output D for the distance measured. The module also defines a 300 MHz clock input clk and reset input rst for the design's controller.

    #include "systemc.h"
    #include "LDM_Controller.h"
    #include "LDM_Datapath.h"

    SC_MODULE(LaserDistMeasurer)
    {
       sc_in<sc_logic> clk, rst;
       sc_in<sc_logic> B, S;
       sc_out<sc_logic> L;
       sc_out<sc_lv<16> > D;

       sc_signal<sc_logic> Dreg_clr, Dreg_ld;
       sc_signal<sc_logic> Dctr_clr, Dctr_cnt;

       LDM_Controller LDM_Controller_1;
       LDM_Datapath LDM_Datapath_1;

       SC_CTOR(LaserDistMeasurer) :
          LDM_Controller_1("LDM_Controller_1"),
          LDM_Datapath_1("LDM_Datapath_1")
       {
          LDM_Controller_1.clk(clk);
          LDM_Controller_1.rst(rst);
          LDM_Controller_1.B(B);
          LDM_Controller_1.S(S);
          LDM_Controller_1.L(L);
          LDM_Controller_1.Dreg_clr(Dreg_clr);
          LDM_Controller_1.Dreg_ld(Dreg_ld);
          LDM_Controller_1.Dctr_clr(Dctr_clr);
          LDM_Controller_1.Dctr_cnt(Dctr_cnt);

          LDM_Datapath_1.clk(clk);
          LDM_Datapath_1.Dreg_clr(Dreg_clr);
          LDM_Datapath_1.Dreg_ld(Dreg_ld);
          LDM_Datapath_1.Dctr_clr(Dctr_clr);
          LDM_Datapath_1.Dctr_cnt(Dctr_cnt);
          LDM_Datapath_1.D(D);
       }
    };

Figure 9.48 Top-level structural SystemC description of the laser-based distance measurer.

The LaserDistMeasurer structurally describes the connections of the controller and datapath components. The module instantiates two components. LDM_Controller_1 is the controller for the laser-based distance measurer and LDM_Datapath_1 is the datapath for this design. The module connects its clk, rst, B, and S inputs to the inputs of LDM_Controller_1 and connects the controller's laser control output to the corresponding output port L. Additionally, the four internal signals, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, connect the controller's four control signals to the four inputs of LDM_Datapath_1. The LaserDistMeasurer datapath has a single output D, providing the distance measured, that is connected to the output port D of the module.

Figure 9.49 is a SystemC description of the LaserDistMeasurer's datapath component shown in Figure 5.17. The module, named LDM_Datapath, defines a clock input clk, four control inputs Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, and a 16-bit distance output D.

    #include "systemc.h"
    #include "upcounter16.h"
    #include "reg16.h"
    #include "shiftrightone16.h"

    SC_MODULE(LDM_Datapath)
    {
       sc_in<sc_logic> clk;
       sc_in<sc_logic> Dreg_clr, Dreg_ld;
       sc_in<sc_logic> Dctr_clr, Dctr_cnt;
       sc_out<sc_lv<16> > D;

       sc_signal<sc_lv<16> > tempC;
       sc_signal<sc_lv<16> > shiftC;

       UpCounter16 Dctr;
       Reg16 Dreg;
       ShiftRightOne16 ShiftRight;

       SC_CTOR(LDM_Datapath) :
          Dctr("Dctr"), Dreg("Dreg"),
          ShiftRight("ShiftRight")
       {
          Dctr.clk(clk);
          Dctr.clr(Dctr_clr);
          Dctr.cnt(Dctr_cnt);
          Dctr.C(tempC);

          ShiftRight.I(tempC);
          ShiftRight.S(shiftC);

          Dreg.I(shiftC);
          Dreg.Q(D);
          Dreg.clk(clk);
          Dreg.clr(Dreg_clr);
          Dreg.ld(Dreg_ld);
       }
    };

Figure 9.49 Structural SystemC description of the laser-based distance measurer's datapath.

The datapath consists of three components, a 16-bit up-counter, a 16-bit register, and a 16-bit right shifter that shifts right by one position. UpCounter16 is a 16-bit up-counter with a count control input cnt and a count clear input clr. Reg16 is a 16-bit parallel load register with a register load control signal ld and a register clear signal clr. ShiftRightOne16 is a 16-bit right shifter that shifts the input I right by one position and assigns the shifted value to the output S. The datapath module instantiates an UpCounter16 component named Dctr, a Reg16 component named Dreg, and a ShiftRightOne16 component named ShiftRight. The module connects the datapath's Dctr_clr and Dctr_cnt inputs to Dctr's clear and count control inputs, respectively. The counter's count output C is then connected to the 16-bit internal signal tempC that connects the count value to the ShiftRight shifter's input. The shifted count value is then connected to the input of the Dreg register using the internal signal shiftC. The module connects the Dreg register's clear and load control inputs to the datapath's Dreg_clr and Dreg_ld input ports. Finally, the register's data output Q is connected to LDM_Datapath's measured distance output D.
Figures 9.50 and 9.51 are the SystemC description of the laser-based distance measurer's FSM controller described in Figure 5.21. The module, named LDM_Controller, has a clock input clk, a reset signal rst, a user-pressed button input B, a laser sensor input S, and five output control signals, L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt. The output L is used to turn the laser on and off; if L is 1, the laser is on. The four other output signals are used to control the RTL design's datapath components.

    #include "systemc.h"

    enum statetype { S0, S1, S2, S3, S4 };

    SC_MODULE(LDM_Controller)
    {
       sc_in<sc_logic> clk, rst, B, S;
       sc_out<sc_logic> L;
       sc_out<sc_logic> Dreg_clr, Dreg_ld;
       sc_out<sc_logic> Dctr_clr, Dctr_cnt;

       sc_signal<statetype> currentstate, nextstate;

       SC_CTOR(LDM_Controller)
       {
          SC_METHOD(statereg);
          sensitive_pos << rst << clk;
          SC_METHOD(comblogic);
          sensitive << currentstate << B << S;
       }

       void statereg() {
          if ( rst.read() == SC_LOGIC_1 )
             currentstate = S0; // initial state
          else
             currentstate = nextstate;
       }

       void comblogic() {
          L.write(SC_LOGIC_0);
          Dreg_clr.write(SC_LOGIC_0);
          Dreg_ld.write(SC_LOGIC_0);
          Dctr_clr.write(SC_LOGIC_0);
          Dctr_cnt.write(SC_LOGIC_0);

          switch (currentstate) {
             case S0:
                L.write(SC_LOGIC_0);        // laser off
                Dreg_clr.write(SC_LOGIC_1); // clear Dreg
                nextstate = S1;
                break;
    (continued in Figure 9.51)

Figure 9.50 Behavioral SystemC description of laser-based distance measurer's controller.

    (continued from Figure 9.50)
             case S1:
                Dctr_clr.write(SC_LOGIC_1); // clear count
                if (B.read() == SC_LOGIC_1)
                   nextstate = S2;
                else
                   nextstate = S1;
                break;
             case S2:
                L.write(SC_LOGIC_1); // laser on
                nextstate = S3;
                break;
             case S3:
                L.write(SC_LOGIC_0);        // laser off
                Dctr_cnt.write(SC_LOGIC_1); // count up
                if (S.read() == SC_LOGIC_1)
                   nextstate = S4;
                else
                   nextstate = S3;
                break;
             case S4:
                Dreg_ld.write(SC_LOGIC_1);  // load Dreg
                Dctr_cnt.write(SC_LOGIC_0); // stop counting
                nextstate = S1;
                break;
          }
       }
    };

Figure 9.51 Behavioral SystemC description of laser-based distance measurer's controller (continued).

The SystemC module behaviorally describes the LaserDistMeasurer's FSM. Similar to the controller design shown in Figure 9.24, the module consists of two processes, one modeling the state register, the other modeling the FSM's control logic. The state register process, named statereg, is sensitive to the positive edge of the reset input, rst, and the positive edge of the clock input, clk. If the rst is enabled, then the process asynchronously sets the currentstate to the FSM's initial state, S0. Otherwise, on the rising edge of the clock, the process updates the state register with the nextstate.

The second process, named comblogic, is sensitive to the inputs to the combinational logic of Figure 5.21, namely, the external inputs B and S, and the state register output currentstate. When any of those signals changes, the process sets the FSM's outputs, in this case L, Dreg_clr, Dreg_ld, Dctr_clr, and Dctr_cnt, to the appropriate values for the current state. In the controller example of Figure 9.24, the FSM's output x was defined within the case statement for all possible states. With five outputs that we must define in the LDM_Controller and five possible states, assigning the values to all outputs in each state would be cumbersome. Furthermore, finding a mistake and making corrections or modifications to the controller would become very difficult in a larger FSM consisting of more states and having many more outputs. Instead, the process uses a different approach in which a default value for all the outputs is first assigned and only the deviations from the defaults are assigned later. The process first assigns a default value of 0 to all five outputs. The process then evaluates the current state and assigns the values to the outputs only when the output should be 1. The process also assigns the value 0 to several signals within the case statements; however, these assignments are included only to clearly indicate the behavior of the controller (they are redundant, but help make the description easier to understand).

The process also determines what the next state should be, based on the current state and the values of inputs B and S. The next state will be loaded into the state register by the state register process on the next positive clock edge.
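To see how the controller and datapath cooperate over several clock cycles, the whole design can be approximated by one cycle-based software model. The sketch below is an illustration under stated assumptions rather than a SystemC description: the struct LaserDistMeasurerModel and its posedge method are hypothetical names, the model starts in the reset state S0, the counter and register internals (which the figures do not show) are assumed to give clear priority over count and load, and the laser output L is derived from the current state.

```cpp
#include <cstdint>

enum State { S0, S1, S2, S3, S4 };

// Hypothetical whole-system model: controller state plus datapath registers.
struct LaserDistMeasurerModel {
    State currentstate = S0;  // controller state register (reset state)
    uint16_t dctr = 0;        // datapath up-counter Dctr
    uint16_t dreg = 0;        // datapath register Dreg, drives output D
    bool L = false;           // laser control output for the current state

    // One rising clock edge with button input B and sensor input S.
    void posedge(bool B, bool S) {
        // Controller combinational logic: defaults, then per-state deviations.
        bool dreg_clr = false, dreg_ld = false;
        bool dctr_clr = false, dctr_cnt = false;
        State nextstate = currentstate;
        switch (currentstate) {
            case S0: dreg_clr = true; nextstate = S1;          break;
            case S1: dctr_clr = true; nextstate = B ? S2 : S1; break;
            case S2:                  nextstate = S3;          break;
            case S3: dctr_cnt = true; nextstate = S ? S4 : S3; break;
            case S4: dreg_ld = true;  nextstate = S1;          break;
        }

        // Datapath: registers sample their inputs before any of them changes.
        const uint16_t shiftC = static_cast<uint16_t>(dctr >> 1);
        if (dctr_clr) {
            dctr = 0;
        } else if (dctr_cnt) {
            dctr = static_cast<uint16_t>(dctr + 1);
        }
        if (dreg_clr) {
            dreg = 0;
        } else if (dreg_ld) {
            dreg = shiftC;
        }

        // State register update; L is 1 only in state S2.
        currentstate = nextstate;
        L = (currentstate == S2);
    }
};
```

In this model the counter also increments on the edge where S is sampled as 1, because Dctr_cnt is an output of state S3 and does not depend on S. If the sensor input arrives on the eighth edge spent in S3, the counter reaches 8 and the following edge in S4 loads 4 into the register.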
9.6 CHAPTER SUMMARY

In this chapter, we stated that hardware description languages (HDLs) are widely used in modern digital design. We provided brief introductions to several widely used HDLs, namely, VHDL, Verilog, and SystemC. We introduced those HDLs primarily through the use of examples, illustrating how each HDL might be used to describe combinational logic, sequential logic, datapath components, as well as RTL behavior and structure. To become proficient at the use of HDLs, a more thorough study of a particular HDL might be helpful. This chapter also illustrates the point that different HDLs have several commonalities.

9.7 EXERCISES

The following exercises can be completed using any of the HDLs described in this chapter.

SECTION 9.2: COMBINATIONAL LOGIC DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES

9.1 Create a structural HDL description of the binary number to seven-segment display described in Example 2.23, consisting of the simple logic gates INV, AND2, and OR2. Be sure to include combinational behavioral descriptions of the simple logic gates.

9.2 Create combinational behavioral HDL descriptions for each of the following two-input logic gates, where each logic gate has two inputs, a and b, and a single output F:
    (a) NAND2,
    (b) NOR2,
    (c) XOR2,
    (d) XNOR2.

9.3 (a) Create a combinational behavioral HDL description of the three 1s pattern detector of Example 2.24.
    (b) Create a testbench that checks that your description works properly.

9.4 (a) Create a combinational behavioral HDL description of the Number-of-1s counter shown in Figure 2.41, by describing the combinational behavior of both outputs x and y in sum-of-minterms form.
    (b) Create a testbench that checks that your description works properly.

9.5 Create an HDL description of the 2x4 decoder shown in Figure 2.50, as:
    (a) combinational behavior.
    (b) structure.
    (c) Create a testbench to test either description (the same testbench can test either description).

9.6 Create an HDL description of the 4x1 multiplexer described in Figure 2.55, as:
    (a) combinational behavior.
    (b) structure.
    (c) Create a testbench to test either description (the same testbench can test either description).

9.7 Create a behavioral HDL description of a 2x1 multiplexor described in Figure 2.54. Then, create a structural HDL description that combines three 2x1 multiplexors to create a 4x1 multiplexor as shown in Figure 9.52.

Figure 9.52 4x1 multiplexor composed of three 2x1 multiplexors.

9.8 Create a combinational behavioral HDL description of an 8-bit 4x1 multiplexor. Be sure to specify the design input and output ports using a multiple bit data type.

9.9 Clearly explain the difference between a structural HDL description and a behavioral HDL description. Explain the benefits of using both kinds of descriptions.

9.10 Explain why a combinational behavioral HDL description must include all the combinational circuit's inputs in a sensitivity list. In particular, explain why omitting an input actually describes a sequential circuit.

9.11 Create a behavioral HDL description of a 16x4 priority encoder. The priority encoder has 16 inputs, d15, d14, ..., d1, d0, and four outputs e3, e2, e1, e0. The priority encoder outputs a 4-bit binary number indicating which of the 16 inputs is a 1. If more than one input is a 1, the priority encoder will output the binary number for the highest numbered input.

SECTION 9.3: SEQUENTIAL LOGIC DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES

9.12 (a) Create a behavioral HDL description of a 32-bit parallel load register.
     (b) Create a testbench to test the description.

9.13 (a) Create a behavioral HDL description of the FSM controller for the improved code detector described in Figure 3.46.
     (b) Create a testbench to test the description.

9.14 (a) Create a behavioral HDL description of the button press synchronizer described in Figure 3.53.
     (b) Create a testbench to test the description.

9.15 (a) Create a behavioral HDL description of the secure car key controller described in Figures 3.57 and 3.58.
     (b) Create a testbench to test the description.

SECTION 9.4: DATAPATH COMPONENT DESCRIPTION USING HARDWARE DESCRIPTION LANGUAGES

9.16 (a) Create a behavioral HDL description of an 8-bit parallel load register with register clear input clr.
     (b) Create a testbench to test the description.

9.17 (a) Create a behavioral HDL description of an 8-bit parallel load register with a clear low input clr_l and a set high input set_h. When the clr_l input is 1, the register contents should be cleared to "00000000". When the set_h input is 1, the register contents should be set to "11111111". If both inputs are 1, the clear low input has priority.
     (b) Create a testbench to test the description.
9.18 Create a behavioral HDL description of an 8-bit register with two control inputs s0 and s1 with the control behavior described in Figure 9.53.

    s1  s0  Operation
    0   0   Maintain present value
    0   1   Parallel load
    1   0   Shift right
    1   1   Rotate right

Figure 9.53 Operation table of the 8-bit register for Exercise 9.18.

9.19 Create a structural HDL description of a half-adder.

9.20 Create a structural HDL description of a 4-bit carry-ripple adder without a carry input. First create a behavioral description of a full-adder, and then use the full-adder component in your carry-ripple adder description.

9.21 Create a behavioral HDL description of the approximate Celsius-to-Fahrenheit converter described in Figure 4.40.

9.22 Create a behavioral HDL description of an approximate Fahrenheit-to-Celsius converter using the following approximation for the conversion: C = (F - 32) / 2.

9.23 (a) Create a behavioral HDL description of a 1-bit comparator.
     (b) Create a structural description of a 4-bit comparator, using the 1-bit comparators.

9.24 Create a behavioral HDL description of a 32-bit equality comparator with three 8-bit inputs a, b, and c.

9.25 Create a structural HDL description of the up-down-counter circuit described in Figure 4.55. Be sure to first create a behavioral HDL description of each component used in your structural HDL design.

9.26 Create a structural HDL description of a 4-bit down-counter with parallel load. Be sure to first create a behavioral HDL description of each component used in your structural HDL design.

9.27 Create a structural HDL description of the RGB to CMYK converter described in Figure 4.68. Be sure to first create a behavioral HDL description of each component used in your structural HDL design.

9.28 Create a structural HDL description of a CMYK to RGB converter. Hint: Use the information presented in Example 4.20 describing the RGB to CMYK converter to assist in designing the CMYK to RGB converter.

9.29 Create a structural HDL description of a 4-bit adder/subtractor circuit. Be sure to first create a behavioral HDL description of each component used in your structural HDL design.

SECTION 9.5: RTL DESIGN USING HARDWARE DESCRIPTION LANGUAGES

9.30 Create a behavioral HDL description of the high-level state machine for the simple bus interface shown in Figure 5.24.

9.31 Create a structural HDL description of the controller/datapath for the simple bus interface shown in Figure 5.26.

9.32 Create a behavioral HDL description of the high-level state machine for the sum-of-absolute-differences component shown in Figure 5.29.

9.33 Create a structural HDL description of the controller/datapath design of the sum-of-absolute-differences component shown in Figure 5.30.

9.34 Create an RTL design of a reaction timer circuit that measures the time elapsed between the illumination of a light and the pressing of a button by the user. The reaction timer has three inputs, a clock input clk, a reset input rst, and a button input B, and three outputs, a light enable output len, a 10-bit reaction time output rtime, and a slow output indicating the user was not fast enough. The reaction timer works as follows. On reset, the reaction timer waits for 2 seconds before illuminating the light by setting len to 1. The reaction timer then measures the length of time in milliseconds before the user presses the button B, outputting the time as a 10-bit binary number on rtime. If the user did not press the button within 1 second (1000 milliseconds), the reaction timer will set the output slow to 1 and output 1000 on rtime. Assume a clock frequency of [value illegible in source]. (a) Start by capturing the design using a high-level state machine in an HDL. (b) Convert the high-level state machine to a controller/datapath description in an HDL.

9.35 Starting from the C description shown in Figure 9.54, create an RTL design of a Greatest Common Divisor (GCD) calculator that has two data inputs a and b, an enable input go, and a 16-bit output D. When go is '1', the GCD calculator will compute the greatest common divisor and output the GCD on the output D. Start with a high-level state machine in an HDL, and then create an HDL implementation with a datapath, controller, and all their internal components.

    uint GCD(uint a, uint b) // not quite C syntax
    {
       while ( a != b ) {
          if ( a > b ) {
             a = a - b;
          }
          else {
             b = b - a;
          }
       }
       return(a);
    }

Figure 9.54 C program description of a greatest common divisor calculator.
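When writing a testbench for the GCD calculator of Exercise 9.35, a compilable software reference model of Figure 9.54 can supply expected values. The following C++ sketch is one possible version; the function name gcd_reference and the choice of 16-bit unsigned operands are assumptions made for illustration, and the subtraction loop only terminates when both inputs are nonzero.

```cpp
#include <cstdint>

// Subtraction-based GCD following the algorithm of Figure 9.54.
// Assumption: a and b are both nonzero; with a zero input the loop never ends.
uint16_t gcd_reference(uint16_t a, uint16_t b) {
    while (a != b) {
        if (a > b) {
            a = static_cast<uint16_t>(a - b);  // shrink the larger operand
        } else {
            b = static_cast<uint16_t>(b - a);
        }
    }
    return a;  // both operands are equal here, and equal to the GCD
}
```

For example, gcd_reference(12, 18) steps through (12, 6) and (6, 6) and returns 6.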
Appendix A: Boolean Algebras

This appendix is reproduced with permission from the textbook "Introduction to Digital Systems" by Ercegovac, Lang, and Moreno, ISBN 0-471-52799-8, John Wiley and Sons publishers, 1999.

Boolean algebras are an important class of algebras that has been studied and used extensively for many purposes (see Section A.5). The switching algebra, used in the description of switching expressions discussed in Section 2.4, is an instance (an element) of the class of Boolean algebras. Consequently, theorems developed for Boolean algebras are also applicable to switching algebra, so they can be used for the transformation of switching expressions. Moreover, certain identities from Boolean algebra are the basis for the graphical and tabular techniques used for the minimization of switching expressions.

In this appendix, we present the definition of Boolean algebras as well as theorems that are useful for the transformation of Boolean expressions. We also show the relationship among Boolean and switching algebras; in particular, we show that the switching algebra satisfies the postulates of a Boolean algebra. We also sketch other examples of Boolean algebras, which are helpful to further understand the properties of this class of algebras.

A.1 BOOLEAN ALGEBRA

A Boolean algebra is a tuple {B, +, ·}, where
    B is a set of elements;
    + and · are binary operations applied over the elements of B,
satisfying the following postulates:

P1: If a, b ∈ B, then
    (i) a + b = b + a
    (ii) a · b = b · a
    That is, + and · are commutative.

P2: If a, b, c ∈ B, then
    (i) a + (b · c) = (a + b) · (a + c)
    (ii) a · (b + c) = (a · b) + (a · c)

P3: The set B has two distinct identity elements, denoted as 0 and 1, such that for every element in B
    (i) 0 + a = a + 0 = a
    (ii) 1 · a = a · 1 = a
    The elements 0 and 1 are called the additive identity element and the multiplicative identity element, respectively. (These elements should not be confused with the integers 0 and 1.)

P4: For every element a ∈ B there exists an element a', called the complement of a, such that
    (i) a + a' = 1
    (ii) a · a' = 0

The symbols + and · should not be confused with the arithmetic addition and multiplication symbols. However, for convenience + and · are often called "plus" and "times," and the expressions a + b and a · b are called "sum" and "product," respectively. Moreover, + and · are also called "OR" and "AND," respectively.

The elements of the set B are called constants. Symbols representing arbitrary elements of B are variables. The symbols a, b, and c in the postulates above are variables, whereas 0 and 1 are constants.

A precedence ordering is defined on the operators: · has precedence over +. Therefore, parentheses can be eliminated from products. Moreover, whenever single symbols are used for variables, the symbol · can be eliminated in products. For example,

    a + (b · c) can be written as a + bc

A.2 SWITCHING ALGEBRA

Switching algebra is an algebraic system used to describe switching functions by means of switching expressions. In this sense, a switching algebra serves the same role for switching functions as the ordinary algebra does for arithmetic functions.

The switching algebra consists of the set of two elements B = {0, 1}, and two operations AND and OR defined as follows:

    AND | 0  1        OR | 0  1
    ----+------       ---+------
     0  | 0  0         0 | 0  1
     1  | 0  1         1 | 1  1

These operations are used to evaluate switching expressions, as indicated in Section 2.4.

Theorem 1
The switching algebra is a Boolean algebra.

Proof We show that the switching algebra satisfies the postulates of a Boolean algebra.
-'98 A Boolean Algebras
-
A.3 Important Theorems in Boolean Algebra • 499
P1: Commutativity of (+), (·). This is shown by inspection of the operation tables. The commutativity property holds if a table is symmetric about the main diagonal.

P2: Distributivity of (+) and (·). Shown by perfect induction, that is, by considering all possible values for the elements a, b, and c. Consider the following table:

    abc | a + bc | (a + b)(a + c)
    ----+--------+---------------
    000 |   0    |       0
    001 |   0    |       0
    010 |   0    |       0
    011 |   1    |       1
    100 |   1    |       1
    101 |   1    |       1
    110 |   1    |       1
    111 |   1    |       1

Because a + bc = (a + b)(a + c) for all cases, P2(i) is satisfied. A similar proof shows that P2(ii) is also satisfied.

P3: Existence of additive and multiplicative identity element. From the operation tables
    0 + 1 = 1 + 0 = 1
Therefore, 0 is the additive identity. Similarly
    0 · 1 = 1 · 0 = 0
so that 1 is the multiplicative identity.

P4: Existence of the complement. By perfect induction:

    a | a' | a + a' | a · a'
    --+----+--------+-------
    1 | 0  |   1    |   0
    0 | 1  |   1    |   0

Consequently, 1 is the complement of 0 and 0 is the complement of 1.

Because all postulates are satisfied, the switching algebra is a Boolean algebra. As a result, all theorems true for Boolean algebras are also true for the switching algebra.

A.3 IMPORTANT THEOREMS IN BOOLEAN ALGEBRA

We now present some important theorems in Boolean algebra; these theorems can be applied to the transformation of switching expressions.

Theorem 2 Principle of Duality
Every algebraic identity deducible from the postulates of a Boolean algebra remains valid if
    the operations + and · are interchanged throughout; and
    the identity elements 0 and 1 are also interchanged throughout.

Proof The proof follows at once from the fact that for each of the postulates there is another one (the dual) that is obtained by interchanging + and · as well as 0 and 1.

This theorem is useful because it reduces the number of different theorems that must be proven: every theorem has its dual.

Theorem 3
Every element in B has a unique complement.

Proof Let a ∈ B; let us assume that a'_1 and a'_2 are both complements of a. Then, using the postulates we can perform the following transformations:

    a'_1 = a'_1 · 1                  by P3(ii) (identity)
         = a'_1 · (a + a'_2)         by hypothesis (a'_2 is the complement of a)
         = a'_1 · a + a'_1 · a'_2    by P2(ii) (distributivity)
         = a · a'_1 + a'_1 · a'_2    by P1(ii) (commutativity)
         = 0 + a'_1 · a'_2           by hypothesis (a'_1 is the complement of a)
         = a'_1 · a'_2               by P3(i) (identity)

Changing the index 1 for 2 and vice versa, and repeating all steps for a'_2, we get

    a'_2 = a'_2 · a'_1
         = a'_1 · a'_2               by P1(ii)

and therefore a'_2 = a'_1.

The uniqueness of the complement of an element allows considering ' as a unary operation called complementation.

Theorem 4
For any a ∈ B:
    1. a + 1 = 1
    2. a · 0 = 0

Proof Using the postulates, we can perform the following transformations:
Case (1):
    a + 1 = 1 · (a + 1)          P3(ii)
          = (a + a') · (a + 1)   P4(i)
          = a + (a' · 1)         P2(i)
          = a + a'               P3(ii)
          = 1                    P4(i)

Case (2):
    a · 0 = 0 + (a · 0)          P3(i)
          = (a · a') + (a · 0)   P4(ii)
          = a · (a' + 0)         P2(ii)
          = a · a'               P3(i)
          = 0                    P4(ii)

Case (2) can also be proven by means of case (1) and the principle of duality.

Theorem 5
The complement of the element 1 is 0, and vice versa. That is,
    1. 0' = 1
    2. 1' = 0

Proof By Theorem 4,
    0 + 1 = 1 and
    0 · 1 = 0
Because, by Theorem 3, the complement of an element is unique, Theorem 5 follows.

Theorem 6 Idempotent Law
For every a ∈ B
    1. a + a = a
    2. a · a = a

Proof
(1):
    a + a = (a + a) · 1          P3(ii)
          = (a + a) · (a + a')   P4(i)
          = a + (a · a')         P2(i)
          = a + 0                P4(ii)
          = a                    P3(i)
(2): duality

Theorem 7 Involution Law
For every a ∈ B,
    (a')' = a

Proof From the definition of complement, (a')' and a are both complements of a'. But, by Theorem 3, the complement of an element is unique, which proves the theorem.

Theorem 8 Absorption Law
For every pair of elements a, b ∈ B,
    1. a + a · b = a
    2. a · (a + b) = a

Proof
(1):
    a + ab = a · 1 + ab          P3(ii)
           = a(1 + b)            P2(ii)
           = a(b + 1)            P1(i)
           = a · 1               Theorem 4(1)
           = a                   P3(ii)
(2): duality

Theorem 9
For every pair of elements a, b ∈ B,
    1. a + a'b = a + b
    2. a(a' + b) = ab

Proof
(1):
    a + a'b = (a + a')(a + b)    P2(i)
            = 1 · (a + b)        P4(i)
            = a + b              P3(ii)
(2): duality

Theorem 10
In a Boolean algebra, each of the binary operations (+) and (·) is associative. That is, for every a, b, c ∈ B,
    1. a + (b + c) = (a + b) + c
    2. a(bc) = (ab)c

The proof of this theorem is quite lengthy. The interested reader should consult the further readings suggested at the end of this appendix.
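The perfect-induction argument used for Theorem 1 can also be run mechanically. The following Python sketch checks the postulates and Theorems 4 and 6 through 10 for the two-element switching algebra only; it says nothing about Boolean algebras in general. The helper names (OR, AND, NOT, holds, IDENTITIES) are illustrative, and the wording of P2 through P4 is an assumption inferred from how the proofs above cite them.

```python
from itertools import product

def OR(a, b):
    return a | b

def AND(a, b):
    return a & b

def NOT(a):
    return 1 - a

def holds(identity, nvars):
    """Perfect induction: test the identity for every combination of 0s and 1s."""
    return all(identity(*values) for values in product((0, 1), repeat=nvars))

# name -> (identity as a function returning True/False, number of variables)
# The statements of P2-P4 are assumed from the way the proofs cite them.
IDENTITIES = {
    "P1(i)  a + b = b + a": (lambda a, b: OR(a, b) == OR(b, a), 2),
    "P1(ii) a . b = b . a": (lambda a, b: AND(a, b) == AND(b, a), 2),
    "P2(i)  a + bc = (a + b)(a + c)":
        (lambda a, b, c: OR(a, AND(b, c)) == AND(OR(a, b), OR(a, c)), 3),
    "P2(ii) a(b + c) = ab + ac":
        (lambda a, b, c: AND(a, OR(b, c)) == OR(AND(a, b), AND(a, c)), 3),
    "P3(i)  0 + a = a": (lambda a: OR(0, a) == a, 1),
    "P3(ii) 1 . a = a": (lambda a: AND(1, a) == a, 1),
    "P4(i)  a + a' = 1": (lambda a: OR(a, NOT(a)) == 1, 1),
    "P4(ii) a . a' = 0": (lambda a: AND(a, NOT(a)) == 0, 1),
    "T4(1)  a + 1 = 1": (lambda a: OR(a, 1) == 1, 1),
    "T4(2)  a . 0 = 0": (lambda a: AND(a, 0) == 0, 1),
    "T6(1)  a + a = a": (lambda a: OR(a, a) == a, 1),
    "T6(2)  a . a = a": (lambda a: AND(a, a) == a, 1),
    "T7     (a')' = a": (lambda a: NOT(NOT(a)) == a, 1),
    "T8(1)  a + ab = a": (lambda a, b: OR(a, AND(a, b)) == a, 2),
    "T8(2)  a(a + b) = a": (lambda a, b: AND(a, OR(a, b)) == a, 2),
    "T9(1)  a + a'b = a + b": (lambda a, b: OR(a, AND(NOT(a), b)) == OR(a, b), 2),
    "T9(2)  a(a' + b) = ab": (lambda a, b: AND(a, OR(NOT(a), b)) == AND(a, b), 2),
    "T10(1) a + (b + c) = (a + b) + c":
        (lambda a, b, c: OR(a, OR(b, c)) == OR(OR(a, b), c), 3),
    "T10(2) a(bc) = (ab)c":
        (lambda a, b, c: AND(a, AND(b, c)) == AND(AND(a, b), c), 3),
}

results = {name: holds(fn, n) for name, (fn, n) in IDENTITIES.items()}
for name, ok in results.items():
    print(f"{name:34s} {'holds' if ok else 'FAILS'}")
```

Exhaustive checking works here because B has only two elements; for a general Boolean algebra the algebraic proofs given above are still required.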
B Additional Topics in Binary Number Systems

EXAMPLE B.1
[...] point to represent the number in [...] a finite number of bits available in [...] need to be truncated and the binary [...]

B.3 FIXED POINT ARITHMETIC

If we fix the binary point of a real number in a certain position in the number (e.g., after the 4th bit), we can add or subtract binary real numbers by treating the numbers as integers and adding or subtracting normally. In the resulting sum or difference, we maintain the binary point's position. For example, assume we are working with 8-bit numbers with half of the bits used to represent the fractional part of the number. If we wanted to add 1001.0010 (9.125) and 0011.1111 (3.9375), we can simply add the two numbers as if they were integers. The sum, shown in Figure B.4, can be converted back to a real number by maintaining the binary point's position within the sum. Converting the sum to decimal verifies that the calculation was correct: 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 0*2^-2 + 0*2^-3 + 1*2^-4 = 8 + 4 + 1 + 0.0625 = 13.0625.

Figure B.4 Adding two fixed point numbers.

Multiplying binary real numbers is also straightforward and does not require that the binary point be fixed. We first multiply the two numbers as if they were integers. Second, we place a binary point in the product such that the precision of the product is the sum of the precisions of the multiplicand and multiplier (the two numbers being multiplied), just like what is done when we multiply two decimal numbers together. Figure B.5 shows how we might multiply the binary numbers 01.10 (1.5) and 11.01 (3.25) using the partial product method described in Section 4.7. After we calculate the product of the two numbers, we place a binary point in the appropriate location. Both the multiplier and multiplicand feature two bits of precision, therefore the product must have four bits of precision, and we insert a binary point to reflect this. Converting the product to decimal verifies that the calculation was correct: 0*2^3 + 1*2^2 + 0*2^1 + 0*2^0 + 1*2^-1 + 1*2^-2 + 1*2^-3 + 0*2^-4 = 4 + 0.5 + 0.25 + 0.125 = 4.875.

Figure B.5 Multiplying two fixed point numbers.

The previous example was convenient in that we never had to add four 1s together in a column when we summed up the partial products. To make the calculations simpler and to allow for the partial product summation to be implemented using full-adders, which can only add three 1s at a time, we add the partial products incrementally instead of all at once. For example, let's multiply 1110.1 (14.5) by 0111.1 (7.5). As seen in Figure B.6, we begin by generating partial products as we did earlier. However, we add partial products immediately into partial product sums, labeled pps in the figure. Eventually, we find that the product is 01101100.11, which corresponds to the correct answer, 108.75. You may want to try adding the five partial products together at once instead of using the intermediate partial product sums to see why this method is useful.

Figure B.6 Multiplying two fixed point numbers using intermediate partial products.

Before proceeding to binary real number division, we will introduce binary integer division, which was not discussed in previous chapters.

We can use the familiar process of long division to divide two binary integers. For example, consider the binary division of 101100 (44) by 10 (2). The full calculation is shown in Figure B.7. Notice how the procedure is exactly the same as decimal long division except that the numbers are now in binary.

Figure B.7 Dividing two binary integers using long division.

Dividing binary real numbers, like multiplication, also does not require that the binary point be fixed. However, to simplify the calculation, we shift both the dividend and divisor's binary point right until the divisor no longer has a fractional part. For example, consider the division of 1.01₂ (1.25) by 0.1₂ (0.5). The divisor, 0.1₂, has one digit in its fractional part, therefore we shift the dividend and divisor binary points right by one digit, changing our problem to 10.1₂ divided by 1₂. We now treat the numbers as integers (ignoring the binary point) and can divide them using the long division approach. Trivially, 101₂ / 1₂ is 101₂. We then restore the binary point to where it was in the dividend, giving us the answer 10.1₂ or 2.5.

Why does shifting the binary point not change the answer? In general, shifting the radix point right by one digit is the same as multiplying the number by its base. For binary numbers, shifting the binary point right is equivalent to multiplying the number by 2. Dividing two numbers will give you the ratio of the two numbers to each other. Multiplying the two numbers by the same number (by means of shifting the binary point) will not affect that ratio, since doing so is equivalent to multiplying the ratio by 1.

Fixed point numbers are simple to work with, but are limited in the range of numbers that they can represent. For a fixed number of bits, increasing the precision of a number comes at the expense of the range of whole numbers that we can use, and vice versa. Fixed point numbers are suitable for a variety of applications, such as a digital thermometer, but more demanding applications need greater flexibility and range in their real number representation.

B.4 FLOATING POINT REPRESENTATION

When working with decimal numbers, we often represent very large or very small numbers by using scientific notation. Rather than writing a googol as a 1 with a hundred 0s after it, we could write 1.0*10^100. Instead of 299,792,458 m/s, we could write the speed of light as 3.0*10^8 m/s, as 2.998*10^8, or even 299.8*10^6.

If such notation could be translated into binary, we would be able to store a much greater range of numbers than if the binary point were fixed. What features of this notation need to be captured in a binary representation?
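The fixed point procedures of Section B.3 (treat the bit patterns as integers, then put the binary point back) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: numbers are unsigned, a number is held as a pair (integer value, number of fraction bits), and the function names are hypothetical rather than taken from the text.

```python
def parse_fixed(bits):
    """'1001.0010' -> (146, 4): the digits read as an integer, plus the fraction bit count."""
    whole, _, frac = bits.partition(".")
    return int(whole + frac, 2), len(frac)

def to_decimal(value, frac_bits):
    """Restore the binary point: divide by 2 once per fraction bit."""
    return value / 2 ** frac_bits

def add_fixed(x, y):
    (xv, xf), (yv, yf) = x, y
    if xf != yf:
        raise ValueError("binary points must be in the same position")
    return xv + yv, xf              # the point stays where it was

def mul_fixed(x, y):
    """Multiply as integers, accumulating partial products one at a time."""
    (xv, xf), (yv, yf) = x, y
    running, sums = 0, []
    for i in range(yv.bit_length()):
        if (yv >> i) & 1:
            running += xv << i      # partial product: multiplicand shifted left i places
        sums.append(running)        # running partial product sum
    return (running, xf + yf), sums # precisions of the two operands add

def div_fixed(dividend, divisor):
    """Shift both points right until the divisor is an integer, then divide as integers."""
    (n, nf), (d, df) = dividend, divisor
    if nf < df:                     # pad the dividend with zeros so it can shift df places
        n <<= df - nf
        nf = df
    quotient, remainder = divmod(n, d)
    return (quotient, nf - df), remainder

total = add_fixed(parse_fixed("1001.0010"), parse_fixed("0011.1111"))
print(to_decimal(*total))                       # 13.0625

product, sums = mul_fixed(parse_fixed("1110.1"), parse_fixed("0111.1"))
print(to_decimal(*product), sums)               # 108.75 [29, 87, 203, 435]

quotient, remainder = div_fixed(parse_fixed("1.01"), parse_fixed("0.1"))
print(to_decimal(*quotient), remainder)         # 2.5 0
```

The leading 0 of the multiplier 0111.1 is dropped by the integer conversion, so the sketch forms four partial products rather than five; the fifth would be all zeros and would not change the sum.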
First is the whole and fractional part of the number being multiplied by a power of 10, which is called the mantissa (or significand), as shown in Figure B.8. We do not need to store the whole part of the number if we make sure the number is in a certain form. We call a number written in scientific notation normalized if the whole part of the number is greater than 0 but less than the base. In the previous speed of light examples, 3.0*10^8 and 2.998*10^8 are normalized since 3 and 2, respectively, are greater than zero but less than 10. The number 299.8*10^6, on the other hand, is not normalized. If a binary real number is normalized, then the whole part of the mantissa can only be a 1. To save bits, we can assume that the whole part of the significand is 1 and store only the fractional part.

Figure B.8 Parts of a number in scientific notation: sign, mantissa, base, and exponent of +3.0 * 10^8.

Second is the base (sometimes referred to as the radix) and the exponent by which the mantissa is multiplied, shown in Figure B.8. Calling 10 the base is no accident: the number is the same as the base of the entire number. In binary, the base is naturally 2. Knowing this, we do not need to store the 2. We can simply assume that 2 is the base and store the exponent.

Third, we must capture the sign of the number.

The IEEE 754-1985 Standard

The Institute of Electrical and Electronic Engineers (IEEE) 754-1985 standard specifies a way in which the three values described above can be represented in a 32-bit or a 64-bit binary number, referred to as single and double precision, respectively. Though there are other ways to represent real numbers, the IEEE standard is by far the most widely used. We refer to these numbers as floating point numbers.

The IEEE standard assigns a certain range of bits for each of the three values. For 32-bit numbers, the first (most significant) bit specifies the sign, followed by 8 bits for the exponent, and the remaining 23 bits are used for the mantissa. This arrangement is pictured in Figure B.9.

Figure B.9 Bit arrangement in a 32-bit floating point number: sign (bit 31), exponent (bits 30 to 23), mantissa (bits 22 to 0).

The sign bit is set to 0 if the number is positive, and the bit is set to 1 if the number is negative. The mantissa bits are set to the fractional part of the mantissa in the original number. For example, if the mantissa is 1.1011, we would store 1011 followed by 19 zeroes in bits 22 to 0. As part of the standard, we add 127 to the exponent we store in the exponent bits. Therefore, if a floating point number's exponent is 3, we would store 130 in the exponent bits. If the exponent was -30, we would store 97 in the exponent bits. The adjusted number is called a biased exponent. Exponent bits containing all 0s or all 1s have special meanings and cannot be used. Under these conditions, the range of biased exponents we can write in the exponent bits is 1 to 254, meaning the range of unbiased exponents is -126 to 127. Why don't we simply store the exponent as a signed, two's complement number (discussed in Section 4.8)? Because it turns out that biasing the exponent results in simpler circuitry for comparing the magnitude (absolute value) of two floating point numbers.

The IEEE standard defines certain special values if the contents of the exponent bits are uniform. When the exponent bits are all 0s, two possibilities occur:
1. If the mantissa bits are all 0s, then the entire number evaluates to zero.
2. If the mantissa bits are nonzero, then the number is not normalized. That is, the whole part of the mantissa is a binary zero and not a one.

When the exponent bits are all 1s, two possibilities occur:
1. If the mantissa bits are all 0s, then the entire number evaluates to + or - infinity, depending on the sign bit.
2. If the mantissa bits are nonzero, then the entire "number" is classified as not a number (NaN).

There are also specific classes of NaNs, beyond the scope of this appendix, that are used in computations involving NaNs.

With this information, we can convert decimal real numbers to floating point numbers. Assuming the decimal number to be converted is not a special value in floating point notation, Table B.2 describes how to perform the conversion.

TABLE B.2 Method for converting real decimal numbers to floating point

    Step                                          Description
    1  Convert the number from base 10            Use the method described in Section B.2.
       to base 2.
    2  Convert the number to normalized           Initially multiply the number by 2^0. Adjust the binary point
       scientific notation.                       and exponent so that the whole part of the number is 1₂.
    3  Fill in the bit fields.                    Set the sign, biased exponent, and mantissa bits appropriately.

EXAMPLE B.2 Converting decimal real numbers to floating point

Convert the following numbers from decimal to IEEE 754 32-bit floating point: 9.5, infinity, and -52406.25 * 10^-2.

Let's follow the procedure in Table B.2 to convert 9.5 to floating point. In step 1, we convert 9.5 to binary. Using the subtraction method, we find that 9.5 is 1001.1 in binary. To convert the number to scientific notation per step 2, we multiply the number by 2^0, giving 1001.1 * 2^0 (for readability purposes, we write the 2's part in base 10). To normalize the number, we must shift the binary point left by three digits. In order to not change the value of the number after moving the binary point, we change the 2's exponent to 3. After step 2, our number becomes 1.0011 * 2^3.

In step 3, we put everything together into the properly formatted sequence of bits. The sign bit is set to 0, indicating a positive number. The exponent bits are set to 3 + 127 = 130 (we must bias the exponent) in binary, and the mantissa bits are set to 0011..., which is the fractional part of the mantissa. Remember that the 1 to the left of the binary point is implied since the number is normalized. The properly encoded number is shown in Figure B.10.
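The three steps of Table B.2 can be sketched in Python as follows. This is a minimal illustration, not a full IEEE 754 encoder: the function name encode_float32 is hypothetical, the sketch assumes a finite, nonzero value whose exponent falls in the normal range, and it truncates rather than rounds any mantissa bits beyond the 23 available. The helper float32_bits uses Python's standard struct module to show the bits the machine itself produces, for comparison.

```python
import struct
from fractions import Fraction

def encode_float32(x):
    """Table B.2 by hand; returns 'sign exponent mantissa' as bit strings."""
    value = Fraction(x)                  # exact value of the input
    if value == 0:
        raise ValueError("zero is a special value; Table B.2 does not apply")
    sign = 1 if value < 0 else 0
    magnitude = abs(value)

    # Step 2: normalize so that 1 <= magnitude < 2, adjusting the exponent of 2.
    exponent = 0
    while magnitude >= 2:
        magnitude /= 2                   # binary point moves left
        exponent += 1
    while magnitude < 1:
        magnitude *= 2                   # binary point moves right
        exponent -= 1

    # Step 3: bias the exponent and keep the 23 fraction bits (implied leading 1 dropped).
    biased = exponent + 127
    if not 1 <= biased <= 254:
        raise ValueError("exponent outside the normal range")
    fraction = int((magnitude - 1) * (1 << 23))   # truncates extra bits (assumption)
    return f"{sign} {biased:08b} {fraction:023b}"

def float32_bits(x):
    """Bits of x as a single-precision float, split into the three fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    s = f"{raw:032b}"
    return f"{s[0]} {s[1:9]} {s[9:]}"

print(encode_float32(9.5))               # 0 10000010 00110000000000000000000
print(float32_bits(9.5))                 # same bits, produced by struct
```

Using exact fractions avoids any rounding inside the sketch itself, so the only approximation is the deliberate truncation to 23 fraction bits.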
Figure B.10 Representing 9.5 as a 32-bit floating point number, most significant bit first.
    Step 1: Convert to binary: 9.5₁₀ <=> 1001.1₂
    Step 2: Convert to normalized scientific notation: 1001.1 <=> 1001.1 * 2^0 <=> 1.0011 * 2^3 (to normalize, move the binary point 3 digits left and add 3 to the exponent)
    Step 3: Convert to IEEE 754 floating point: 0 10000010 00110000000000000000000 (sign, biased exponent, mantissa)

Now let's convert infinity to a floating point number. Since infinity is a special value, we cannot employ the method we used to convert 9.5 to floating point. Rather, we fill in the three bit fields with special values indicating that the number is infinity. From the discussion of special values above, we know that the exponent bits should be all 1s and the mantissa bits should be all 0s. The sign bit should be 0 since infinity is positive. Therefore, the equivalent floating point number is 0 11111111 00000000000000000000000.

Converting -52406.25 * 10^-2 to floating point is straightforward using the method in Table B.2. For step 1, we convert the number to binary. Recall that we represent the sign of the number using a single bit and not using two's complement representation, so we only need to convert 52406.25 * 10^-2 to binary and set the sign bit to indicate that the number is negative. The number 52406.25 * 10^-2 evaluates to 524.0625. Using the subtraction or divide-by-2 method we know that 524 is 1000001100 in binary. The fractional part, 0.0625, is conveniently 2^-4. Thus 524.0625 is 1000001100.0001 in binary. In step 2, we write the number in scientific notation: 1000001100.0001 * 2^0. We must also normalize the number by shifting the binary point left by 9 digits and compensating for this shift in the exponent: 1.0000011000001 * 2^9. Finally, we combine the sign (1 since the original number is negative), biased exponent (9+127=136), and fractional part of the mantissa into a floating point number: 1 10001000 00000110000010000000000.

EXAMPLE B.3 Converting floating point numbers to decimal

Convert the number 11001011101010100000000000000000 from IEEE 754 32-bit floating point to decimal.

To perform this conversion, we first split the number into its sign, exponent, and mantissa parts: 1 10010111 01010100000000000000000. We can immediately see from the sign bit that the number is negative.

Next, we convert the 8-bit exponent and 23-bit mantissa from binary to decimal. We find that 10010111 is 151. We unbias the exponent by subtracting 127 from 151, giving an unbiased exponent of 24. Recall that the mantissa in the pattern of bits represents the fractional part of the mantissa and is stored without the leading 1 from the whole part of the mantissa (assuming the original number was normalized). Restoring the 1 and adding a binary point gives us the number 1.01010100000000000000000, which is the same number as 1.010101. By applying weights to each digit, we see that 1.010101 = 1*2^0 + 0*2^-1 + 1*2^-2 + 0*2^-3 + 1*2^-4 + 0*2^-5 + 1*2^-6 = 1.328125.

With the original sign, exponent, and mantissa extracted, we can combine them into a single number: -1.328125 * 2^24. We can multiply the number out to -22,282,240, which is equivalent to -2.228224 * 10^7.

The format for double precision (64-bit) floating point numbers is similar, with three fields having a defined number of bits. The first (most significant) bit represents the sign of the number. The next 11 bits hold the biased exponent, and the remaining 52 bits hold the fractional part of the mantissa. Additionally, we add 1023 to the exponent instead of 127 to form the biased exponent. This arrangement is pictured in Figure B.11.

Figure B.11 Bit arrangement in a 64-bit floating point number: sign (bit 63), exponent (bits 62 to 52), mantissa (bits 51 to 0).

Floating Point Arithmetic

Floating point arithmetic is beyond the scope of this text, but we'll provide a brief overview of the concept.

Floating point addition and subtraction must be performed by first aligning the two floating point numbers so that their exponents are equal. For example, consider adding the two decimal numbers 2.52*10^2 + 1.44*10^4. Since the exponents differ, we can change 2.52*10^2 to 0.0252*10^4. Adding 0.0252*10^4 and 1.44*10^4 gives us the answer 1.4652*10^4. Similarly, we could have changed 1.44*10^4 to 144*10^2. Adding 144*10^2 and 2.52*10^2 gives us the sum 146.52*10^2, which is the same number as our first set of calculations. An analogous situation occurs when we work with floating point numbers. Typically, hardware that performs floating point arithmetic, often referred to as a floating point unit, will adjust the mantissa of the number with the smaller exponent before adding or subtracting the mantissas (with their implied 1s restored) together and preserving the common exponent. Notice that before the addition or subtraction is performed, the exponents of the two numbers are compared. This comparison is facilitated through the use of the sign bit and the biased exponent as opposed to representing the exponent in two's complement form.

Multiplication and division in floating point require no such alignments. Like in decimal multiplication and division of numbers in scientific notation, we multiply or divide the mantissas and add or subtract the two exponents, depending on the operation. When multiplying, we add exponents. For example, let's multiply 6.44*10^7 by 5.0*10^-3. Instead of trying to multiply 64,400,000 by 0.005, we multiply the two mantissas together and add the exponents. 6.44*5.0 is 32.2 and 7+(-3) is 4. Thus the answer is 32.2*10^4. When dividing, we subtract the exponent of the divisor from the dividend's exponent. For example, let's divide 31.5*10^-4 (dividend) by 2.0*10^-12 (divisor). Dividing 31.5 by 2.0 gives us 15.75. Subtracting the divisor's exponent from the dividend's gives us -4-(-12)=8. Thus the answer is 15.75*10^8. Floating point division defines results for several boundary cases such as dividing by 0, which evaluates to positive or negative infinity, depending on the sign of the dividend. Dividing a nonzero number by infinity is defined as 0, otherwise dividing by infinity is [...]
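The decoding procedure of Example B.3 (split the fields, unbias the exponent, restore the implied 1) can be sketched in Python as follows. The function name decode_float32 is hypothetical, and the sketch assumes a normalized number, so it rejects the special all-0s and all-1s exponent patterns instead of interpreting them. The last lines use Python's standard struct module as an independent check on the hand decoding.

```python
import struct
from fractions import Fraction

def decode_float32(bits):
    """Return (sign, unbiased exponent, mantissa, value) for a normalized 32-bit pattern."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    sign = int(bits[0])
    biased = int(bits[1:9], 2)           # 8-bit biased exponent
    fraction = int(bits[9:], 2)          # 23 stored fraction bits
    if biased in (0, 255):
        raise ValueError("special value (zero, unnormalized, infinity, or NaN)")
    exponent = biased - 127              # remove the bias
    mantissa = 1 + Fraction(fraction, 1 << 23)   # restore the implied leading 1
    value = mantissa * Fraction(2) ** exponent
    return sign, exponent, mantissa, (-value if sign else value)

pattern = "1 10010111 01010100000000000000000"   # the number from Example B.3
sign, exponent, mantissa, value = decode_float32(pattern)
print(sign, exponent, float(mantissa), float(value))

# Independent check: let struct reinterpret the same 32 bits as a float.
raw = int(pattern.replace(" ", ""), 2)
(machine_value,) = struct.unpack(">f", struct.pack(">I", raw))
print(machine_value == float(value))
```

Exact fractions are used so that the mantissa and the final value are computed without any rounding of their own.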
B.5 EXERCISES

SECTION B.2: REAL NUMBER REPRESENTATION

1. Convert the following numbers from decimal to binary:
   (a) 1.5
   (b) 3.125
   (c) 8.25
   (d) 7.75

2. Convert the following numbers from decimal to binary:
   (a) 9.375
   (b) 2.4375
   (c) 5.65625
   (d) 15.5703125

SECTION B.3: FIXED POINT ARITHMETIC

3. Add the following two unsigned binary numbers using binary addition and convert the result to decimal:
   (a) 10111.001 + 1010.110
   (b) 01101.100 + 10100.101
   (c) 10110.1 + 110.011
   (d) 1101.111 + 10011.0111

SECTION B.4: FLOATING POINT REPRESENTATION

4. Convert the following decimal numbers to 32-bit floating point:
   (a) -50.208
   (b) ~2A27523 · 10'
   (c) - 24.55 1.152 · 10'"
   (d) 0

5. Convert the following 32-bit floating point numbers to decimal:
   (a) 01001100010110110101100001011000
   (b) 01001100010110110101001000000000
   (c) 01111111111000110000000000000000
   (d) 01001101000110101000101000000000

C Extended RTL Design Example

C.1 INTRODUCTION

In Chapter 5, we performed RTL design of a soda dispenser processor. We started with a high-level state machine, created the datapath's structure, and then described the controller using a finite-state machine. We did not further design the controller to structure, as such design was the subject of Chapter 3, and we did not wish to clutter Chapter 5's RTL design discussion with too many details of previously learned material. In this appendix, we'll complete the RTL design by designing the controller's FSM down to a state register and gates, resulting in a complete custom-processor implementation of a controller and a datapath. We'll then trace through the behavior of the complete implementation. The purpose of demonstrating this complete design is to give the reader a clear understanding of how the controller and datapath work together.

The block symbol for the soda dispenser processor appears in Figure C.1. Recall that the soda dispenser features three inputs, c, s, and a. The 8-bit input s represents the cost of each bottle of soda. The 1-bit input c is 1 for one clock cycle when a coin is inserted. Additionally, the value on 8-bit input a indicates the value of the coin that was inserted. The soda dispenser features one output, d, used to indicate when soda should be dispensed. The 1-bit output d is 1 for one clock cycle after the value of the coins inserted into the soda dispenser is greater than or equal to s. The soda dispenser does not give change.

Figure C.1 Soda dispenser block symbol.

In Chapter 5, we developed the high-level state machine seen in Figure C.2. We subsequently decomposed the high-level state machine into a controller (represented behaviorally as an FSM) and datapath, shown in Figure C.3. The datapath supports the data operations necessitated by the high-level state machine, including resetting the value of tot (tot = 0 in the Init state), comparing if tot is less than s (for the transitions from the Wait state), and adding tot and a (in the Add state). The controller FSM is similar to the
516 Extended RTL Design Example
C2 DeSigning 1/10 SOds Olspln 01 Controller SI7
hi gh-Ieve! qate machine. but " Input c (bfls). a (8 bfls). s (8 bits) Encode the tate .
OurputS' d (bll)
. 0 slr:ughtfoN nrd en,ndl
modified to control the d.lla- frill : O. IInit: 0 I. dd. 10. and DIJp: II nil 0' the ""'.1 .11'1 'II'a\ hlllr 't.IIC, "
Local reg/siers lot (8 bits)
path and accept ,wtu, Input
from the datapath (I e. Create tire tate Table F .•
. tOnl Ule controller 'h .
to t i t , ) rather than per- we kn 0\\ thm the Itnte table UI\ ttC<tu~ lie d"" '"cd III " " ,'.11 Ilrr 'tel'
mu t 3 COunt for ~ .
fonlllng d:lw opcraW)ll' outputsC.d o Id.O clr nl '"PU"( . a 1 . 1.. ln" SO) llnu~
t1irectly. The controller and
4
2 = 16 tOw (Figure .5). ' . nnd nO) \\tth ~ 1111 ut'. the 't.lle 1.lhle \\IIIIII 'I"de
dawpath arc ,hown In h gu re
'.3. Figure C2 Soda dl'pcn",r Illputl
0u\pU1I
hlgh - Ic~cl ... t..lte mtichlOc d=\ .1 sO c toI ~_.
d tOl Id tot clr nt nO
0 0 0 0 0 0 1 0 \
0 0 0 \
Inputs c. tOI It s (bit) ~ 0 0 \
0 0 t 0 t
OutputS' d. tol Id. tot clr (bit) 0 0 0 \ 0 1
0 0 1 1 0 0 1 0 \
0 \ 0 0 0 0 0 \ 1
0
'"
~ 0
1
1
0
\
1 0 0 0 0 \
0 0 0 0 \ 0
0 1 1 1 0 0 0 1 0
1 0 0 0 0 1 0 0 1
Controller :s
<0:
1
1
0
0
0
\
1
0
0 1 0 0 \
0 1 0 0 1
(a) (b ) 1 0 1 1 0 \ 0 0 \
Figure CJ Suda tlI'pcn,cr; (a) controller (de,,,,bed beh.l\ lora lly) and (b) datapalh ("ru ture) . 1 \ 0 0 1 0 0 0 0
1

C.2 DESIGNING THE SODA DISPENSER CONTROLLER


! 1
1
1
0
1
1
0
1
1
0
0
0
0
0
0
0
0
1 1 1 1 1 0 0 0 0
U,ing Ihe tilc -~ t cp controller de;ign procc" Introduced in hapter 3. we can complete
the de, ign of the controller. The five steps are as follows : Figure C.5 The soda d"pcnscr conlroller" >tate wblc .

By examining the outputs pecified '" the


Captll re the FSM. The F I for the soda tnpul. c. lot It • (bot)
comroller FSM. duplicated for convenience to OutpulS' d. tot Id, lot clr (b.t)
displ..'n:-.cr"s controller \Va, crea.ted during
Figure C.6, we fill in the corre~pond tog d.
step ~ of the RTL dc,ign method . The con- Combinational toUd tot_I d, and tot_CI r columns in the state table.
troller's FS~I is shown in Figure C.3(a). logic
d For example, in Figure C.6, we see that when the
controller FSM is in the filii state, d-O,
Captllre the A rchitecture. As indicated
tot_CI r-1. and tot_ld is implicitly O. Thus.
by the controller's F M. the tate
for rows in the state table that correspond to the
machine's architectu re require at least 2
fil i i state - namely, the four rows where
inputs (C and tot _ I s) and J ou tputs (d.
sls0-00 si nce we chose "00" as the encodi ng
to I d. and to . CI r). Additionally. we
for the In i/ state - we set the d column to O. the
will usc two bits 10 represent the con-
t ot_CI r column to I, and the to Id column Figur. C.6 Soda dispen,,:r conlroller
troller's states. which adds an additional 10 0 . - FSM WII/1 ' tate encoclins-,.
two in puts (the current state sls0) and two
We fill in the nex t state columns. nl and nO.
outputs (the next state n 1 n 0) 10 the con-
based on the the transitions specified in the controller FSM and the stale encoding we
troller architecture. The corresponding
chose in an earlier step. For example. con ider the Wait state. As indicated in Figure .6.
controller architecture is shown in Figure Figure C.4 Standard controller architecture
for the soda di penser. the FSM transition to the Add state when coL Thu ~. for rows where s !sOc-Oll
CA.

maw
5 18 C Extended RTL Design Example C2 Deslgnmg the Soda Dlspensol ContIolior 519
(s1s0 = 01 corresponds to the Wait state), we set the n1 column to 1 and the n0 column
to 0 (n1n0 = 10 corresponds to the Add state). When c = 0, the FSM transitions to the Disp
state if tot_lt_s = 0 or remains in the Wait state if tot_lt_s = 1. We represent the
transition from Wait to Disp in the state table by setting n1 to 1 and n0 to 1 (Disp) in the
row where s1s0 = 01 (Wait), c = 0, and tot_lt_s = 0. Similarly, we represent the transition
from Wait back to Wait by writing n1n0 = 01 where s1s0 = 01, c = 0, and
tot_lt_s = 1. We then examine the remaining transitions in a similar way, filling in the
appropriate values for n1 and n0 until all transitions are accounted for. The completed
state table is shown in Figure C.5.

Implement the Combinational Logic. For each of the state table's outputs, we write the
corresponding Boolean equation. From the state table we obtain the following equations:

d = s1s0
tot_ld = s1s0'
tot_clr = s1's0'
n1 = s1's0c'tot_lt_s' + s1's0c
n0 = s1's0' + s1's0c' + s1s0'

Note that the first four equations derived from the state table are already minimized.
The fifth equation, corresponding to n0, can be minimized to n0 = s0' + s1's0c' through
algebraic methods, or by using a K-map as shown in Figure C.7. K-maps are discussed in
Section 6.2.

Figure C.7 K-map for the initial equation for n0.

Using the techniques discussed in earlier chapters, the Boolean equations are converted
into an equivalent gate-level circuit. This conversion is straightforward. The resulting
sequential controller circuit, together with the datapath, is shown in Figure C.8.

Figure C.8 Final implementation of the soda machine controller, with datapath.
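The state-table construction and the equations derived from it can be cross-checked mechanically. The following is a minimal sketch, not part of the original design flow: it assumes the encoding used in the text (Init = 00, Wait = 01, Add = 10, Disp = 11), and the function and variable names (fsm_row, equations, n0_minimized, state_table) are illustrative only. It rebuilds all 16 rows of the state table from the FSM and compares them with the five Boolean equations and with the minimized form of n0.

```python
from itertools import product

# State encoding used in the text: s1s0 = 00 Init, 01 Wait, 10 Add, 11 Disp.
INIT, WAIT, ADD, DISP = (0, 0), (0, 1), (1, 0), (1, 1)


def fsm_row(s1, s0, c, tot_lt_s):
    """Row of the state table read directly from the FSM.

    Returns (d, tot_ld, tot_clr, n1, n0).
    """
    state = (s1, s0)
    if state == INIT:
        return (0, 0, 1) + WAIT          # clear tot, then go to Wait
    if state == WAIT:
        if c:
            return (0, 0, 0) + ADD       # coin detected
        if tot_lt_s:
            return (0, 0, 0) + WAIT      # not enough money yet
        return (0, 0, 0) + DISP          # total has reached the price
    if state == ADD:
        return (0, 1, 0) + WAIT          # load tot, then back to Wait
    return (1, 0, 0) + INIT              # Disp: dispense, then restart


def equations(s1, s0, c, tot_lt_s):
    """The five Boolean equations from the text (x' is written as 1 - x)."""
    d = s1 & s0
    tot_ld = s1 & (1 - s0)
    tot_clr = (1 - s1) & (1 - s0)
    n1 = ((1 - s1) & s0 & (1 - c) & (1 - tot_lt_s)) | ((1 - s1) & s0 & c)
    n0 = ((1 - s1) & (1 - s0)) | ((1 - s1) & s0 & (1 - c)) | (s1 & (1 - s0))
    return (d, tot_ld, tot_clr, n1, n0)


def n0_minimized(s1, s0, c):
    """Minimized form given in the text: n0 = s0' + s1's0c'."""
    return (1 - s0) | ((1 - s1) & s0 & (1 - c))


# Build all 16 rows, indexed by (s1, s0, c, tot_lt_s).
state_table = {row: fsm_row(*row) for row in product((0, 1), repeat=4)}

mismatches = [row for row in state_table if equations(*row) != state_table[row]]
n0_agrees = all(
    n0_minimized(s1, s0, c) == state_table[(s1, s0, c, lt)][4]
    for (s1, s0, c, lt) in state_table
)
print("rows where the equations disagree with the FSM:", mismatches)
print("minimized n0 matches the table:", n0_agrees)
```

An exhaustive comparison is practical here because the controller has only four input bits; for larger controllers the same idea is usually applied through simulation test benches instead.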
C.3 UNDERSTANDING THE BEHAVIOR OF THE
SODA DISPENSER CONTROLLER AND DATAPATH
In this section, we will look closely at how the controller and datapath we designed for
the soda dispenser interact to form a working implementation of our initial high-level
state machine.
Figure C.9 illustrates the behavior of the soda dispenser controller and datapath,
including initialization and how the soda dispenser behaves when the user inserts a
quarter into the system. The 5 clock cycles shown are labeled 1 through 5 in the figure.
We'll assume that the cost of a soda can is 60 cents and that the soda dispenser's
controller is in the Init state during the first clock cycle. Let's examine what occurs in
each clock cycle:
Initially, in clock cycle 1, the controller is in the Init state, shown in Figure C.9(b).
When in state Init, the controller sets d to 0, tot_ld to 0, and tot_clr to 1.
Additionally, the controller sets the next state signals n1n0 to 01, corresponding
to the Wait state. In the datapath, the values of tot and tot+a are unknown, denoted
by "??". Notice that even though the controller sets tot_clr to 1 during this
clock cycle, the tot register will not be cleared immediately (asynchronously).
Rather, tot will be cleared shortly after the next clock cycle begins, a synchronous
behavior. Finally, notice that the price of the soda, s, is set to 60 cents and the
coin input signals, c and a, are initially 0 and 0, respectively.
Figure C.9(c) shows the soda dispenser in clock cycle 2. The controller is now in
the Wait state. Accordingly, the controller sets d, tot_ld, and tot_clr to 0.
The value of tot is cleared, and shortly afterwards, two signals, tot_lt_s and
tot+a, take a known value. The datapath's comparator sets tot_lt_s to 1 since
the total, 0, is less than the price of soda, 60. The datapath's adder sets
intermediate signal tot+a to 0 since tot and a are now known. The next state signals
remain set to 01 (Wait) since c is 0 and tot_lt_s is 1.
Figure C.9(d) shows the soda dispenser in clock cycle 3. During the third clock
cycle, the user inserts a quarter into the soda dispenser, as indicated by c
becoming 1 and a becoming 25. Shortly after a changes, the adder's output tot+a
changes to 25, the sum of tot and a. Since c is 1, the controller sets the next state
to 10 (Add). The values of d, tot_ld, and tot_clr remain the same since the
controller's state has not changed since the previous (2nd) clock cycle.
In clock cycle 4, shown in Figure C.9(e), the controller is in the Add state and sets
tot_ld to 1 while keeping d and tot_clr at 0. As was the case with tot_clr
during the Init state, tot will not be updated until the next clock cycle. The
controller will unconditionally return to state Wait, setting n1n0 to 01 (Wait).

Figure C.9 Soda dispenser operation from initialization to inserting a quarter: (a) timing
diagram, (b)-(e) signal values during clock cycles 1-4.
In clock cycle 5, shown in Figure C.10, the controller sets d, tot_ld, and
tot_clr to 0 since the controller is in the Wait state. The tot register loads the
value of tot+a, storing 25. Shortly afterwards, tot+a changes to 50 to reflect the
new value of tot; however, 50 is not loaded into tot, as tot will only perform a load
synchronous to the rising edge of the clock signal.
The addition procedure demonstrated in clock cycles 3 through 5 is repeated for each
coin inserted until enough change has been inserted to cover the cost of a soda, as
indicated by input signal s.

Figure C.10 Operation of the controller and datapath: clock cycle 5 from Figure C.9(a).
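The cycle-by-cycle walkthrough of clock cycles 1 through 5 can be reproduced with a small behavioral model. This is a sketch under stated assumptions rather than the book's implementation: the names controller, simulate, and PRICE are illustrative, None stands for the unknown value "??", and the a input is assumed to stay at 25 during cycles 4 and 5 (which is what makes tot+a read 50 in cycle 5). Registers are updated only at the end of each loop iteration, which models the rising clock edge.

```python
PRICE = 60  # cents; the soda price s assumed in the text


def controller(state, c, tot_lt_s):
    """Return (d, tot_ld, tot_clr, next_state) for the current state."""
    if state == "Init":
        return 0, 0, 1, "Wait"
    if state == "Wait":
        if c:
            return 0, 0, 0, "Add"
        return 0, 0, 0, ("Wait" if tot_lt_s else "Disp")
    if state == "Add":
        return 0, 1, 0, "Wait"
    return 1, 0, 0, "Init"  # Disp


def simulate(inputs, price=PRICE):
    """inputs: one (c, a) pair per clock cycle; the controller starts in Init."""
    state, tot = "Init", None  # None models the unknown value "??"
    trace = []
    for cycle, (c, a) in enumerate(inputs, start=1):
        # Combinational datapath signals settle during the cycle.
        tot_plus_a = None if tot is None else tot + a
        tot_lt_s = None if tot is None else int(tot < price)
        d, tot_ld, tot_clr, nxt = controller(state, c, tot_lt_s)
        trace.append({
            "cycle": cycle, "state": state, "tot": tot,
            "tot+a": tot_plus_a, "tot_lt_s": tot_lt_s,
            "tot_ld": tot_ld, "tot_clr": tot_clr, "next": nxt,
        })
        # Registers change only on the rising edge that ends the cycle.
        if tot_clr:
            tot = 0
        elif tot_ld:
            tot = tot_plus_a
        state = nxt
    return trace


# Cycles 1-5: idle, idle, quarter inserted (c = 1, a = 25), then c back to 0.
trace = simulate([(0, 0), (0, 0), (1, 25), (0, 25), (0, 25)])
for row in trace:
    print(row)
```

Note that tot still reads 0 during cycle 4 even though tot_ld is 1, and only becomes 25 in cycle 5; this is the synchronous load behavior the text describes.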
Figure C.11 details the behavior of the soda dispenser when the user has inserted
enough change into the machine to merit a soda being dispensed. In the timing diagram
shown in Figure C.11(a), we duplicate clock cycle 5 from Figure C.9(a) as a point of
reference. During the next few dozen clock cycles, we assume that the user has inserted a
nickel followed by a quarter. As a result, the register tot will contain the value 55
(25 + 5 + 25 cents). Let's examine the behavior of the soda dispenser when the user inserts
a dime into the machine:
In Figure C.11(b), corresponding to clock cycle 100, the soda dispenser's
controller is in the Wait state. Assuming the user inserts a dime into the soda
dispenser, the c input will become high for one clock cycle and the a input will
change to 10, the value of a dime. Shortly after a changes, the intermediate signal
tot+a changes to 65 (55 + 10). With c asserted, the next state signals n1n0 become
10 (Add).
In clock cycle 101, shown in Figure C.11(c), the controller is in the Add state and
asserts tot_ld. The register tot will not load a new total until the rising edge of
the next clock cycle. The controller unconditionally sets the next state to 01
(Wait).
Figure C.11(d) shows the status of the soda dispenser in clock cycle 102, where
the controller is in the Wait state. As indicated by the arrows in Figure C.11(a),
tot_ld being asserted on the rising edge of the clock causes tot to load the value
on its input, which is 65. Shortly after tot loads a new value, the comparator's
output tot_lt_s changes from 1 to 0 to reflect the fact that tot (65) is not less
than s (60). Since the controller is in the Wait state, and since both c and
tot_lt_s are 0, the controller sets the next state signals to 11 (Disp). Notice
that prior to the next state signals settling on the Disp state, the next state was Wait
for a brief period of time. Depending on the time required for signals to propagate
through the datapath and controller, certain signals may initially contain
unexpected values, but these signals will eventually settle to their expected values. We
can avoid any problems associated with this period of uncertainty by selecting a
clock period that is long enough to allow our circuit's intermediate signals to
settle into a stable state and stay stable long enough to comply with any setup
times required by our circuit's sequential components.
In Figure C.11(e), the controller is in the Disp state. The controller asserts d,
indicating to some outside component that a soda should be dispensed. The controller
will unconditionally transition to the Init state, where the initialization procedure
shown in Figure C.9 is repeated (partially shown in clock cycle 104 of Figure
C.11(a)).
We see that the controller and datapath work together to implement the behavior of
the original high-level state machine.
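The dime scenario of clock cycles 100 through 104 can also be traced at the bit level, using the controller's Boolean equations rather than state names. This is an illustrative sketch only: clock_cycle, NAMES, and the input list are assumptions made for the example, the equations are those derived in Section C.2 (with the minimized form of n0), and the a input is assumed to remain at 10 after the dime is inserted.

```python
PRICE = 60  # cents; the soda price s assumed in the text


def clock_cycle(s1, s0, tot, c, a, price=PRICE):
    """Evaluate one clock cycle; return (d, tot_ld, n1, n0, next_tot)."""
    tot_lt_s = int(tot < price)                  # datapath comparator
    d = s1 & s0
    tot_ld = s1 & (1 - s0)
    tot_clr = (1 - s1) & (1 - s0)
    n1 = ((1 - s1) & s0 & (1 - c) & (1 - tot_lt_s)) | ((1 - s1) & s0 & c)
    n0 = (1 - s0) | ((1 - s1) & s0 & (1 - c))
    if tot_clr:
        next_tot = 0
    elif tot_ld:
        next_tot = tot + a                       # adder output tot+a
    else:
        next_tot = tot
    return d, tot_ld, n1, n0, next_tot


NAMES = {(0, 0): "Init", (0, 1): "Wait", (1, 0): "Add", (1, 1): "Disp"}

# Start of cycle 100: controller in Wait (s1s0 = 01), tot holds 55 cents.
s1, s0, tot = 0, 1, 55
coin_inputs = [(1, 10), (0, 10), (0, 10), (0, 10), (0, 10)]  # cycles 100-104

log = []
for cycle, (c, a) in enumerate(coin_inputs, start=100):
    d, tot_ld, n1, n0, next_tot = clock_cycle(s1, s0, tot, c, a)
    log.append((cycle, NAMES[(s1, s0)], tot, d))
    # Rising clock edge: both registers take their new values together.
    s1, s0, tot = n1, n0, next_tot

for entry in log:
    print(entry)
```

The model evaluates each cycle only after all signals have settled, so it does not show the brief interval in cycle 102 during which the next state still reads Wait.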

Figure C.11 Soda dispenser operation when sufficient change has been inserted: (a) timing
diagram, (b)-(e) signal values during clock cycles 100-103.
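The requirement from clock cycle 102, that the clock period be long enough for intermediate signals to settle and to satisfy setup times, can be turned into a simple calculation. The sketch below uses the standard register-to-register timing condition (clock-to-output delay plus the slowest combinational path plus setup time). Every delay value and path name here is hypothetical and chosen only for illustration; none of them comes from the text.

```python
def min_clock_period(clk_to_q, comb_path_delays, setup):
    """Shortest safe clock period, in the same unit as the inputs.

    A value launched by a register needs clk_to_q to appear at its output,
    must then pass through the slowest combinational path, and must be
    stable for the setup time before the next rising edge.
    """
    return clk_to_q + max(comb_path_delays.values()) + setup


# Hypothetical delays in picoseconds (assumptions, not data from the text).
CLK_TO_Q = 500
SETUP = 300
paths = {
    "tot -> comparator -> next-state logic -> state register": 4000 + 2000,
    "tot -> adder -> tot register": 5000,
    "state register -> next-state logic -> state register": 2000,
}

period_ps = min_clock_period(CLK_TO_Q, paths, SETUP)
max_freq_mhz = 1_000_000 / period_ps  # a 1 ps period corresponds to 1,000,000 MHz
print("minimum clock period (ps):", period_ps)
print("maximum clock frequency (MHz):", round(max_freq_mhz, 1))
```

With these assumed numbers the comparator path sets the limit, which matches the text's observation that tot_lt_s, and therefore the next state, settles some time after tot loads its new value.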
Index ~
ASCII. 10
ASICs, see Applica ti o n-S pec ifi c Integrated Circuits BOOlean algebra, 38, 47-55 496-504
ASM s (a lgorithmic sta te mac hines). 233 e.valualing expressions in '48-49
I nd ex Assemble r progra ms, 430
Asse mbl y code. 431
hterals in. 50
operators in. 38-39, 48-49
.

product terms in, 50


assert (term ). 137
Properties in. 50-55
asse rt stateme nt s, 456, 458-459
sum-of-products in, 50
Assoc iative propert y. 50. 50 I
SWllchmg, 497-498
of logic, 504 Asychronous circ uits. 102
=. :!~I terminology, 49-50
of sets. 504 Asyc hronous inputs. 133
SdlSpla) , tatement. ~5 7 theorems in. 498-503
switching , 496. 497 Asyc hronous reset inputs, 135
6 HC II microprocessor. 2 1 Variables in, 49
Algebraic meth ods, in two-level logic size optimi zation . Asychro no us se t in pu ts, 135
7~F subserie- ICs. ~03 Boolean functio ns, 55-{i7
296-298 Atria (of heart), 138, 139
7~ H C subseries ICs. ~0 3 canonical form , 63-{i5
Algorith ms: Audio, dig it ized. 6-8
7~LS ,ubseries ICs. ~03 ClrCUlls fO.r re~resenting, 56
Espresso too l in. 3 15 Audi o recordi ng, 5-7
~OOO series ICs. ~ 03 and cO~bmatlOnal circuits, 65
exac t. 308 Automation conversIon of. 58-{j()
7 ~OO series ICs. ~02-1().1
selec tion o f. 356-357 with Quine- McClus key method, 3 11 -3 12 defined, 55
8051 microprocessor. 21. 422
fo r state red uct io n. 3 19 of two-leve l log ic size optimization, 308-3 15
equations for representing, 56
A Algo rithmic state machin es (AS Ms), 233
B truth tables for represe nting, 56-58, 62-{i3
Abe." compone nt (A L-extender). 203 Ahemalive minimum-bidwidth binary encoding, 323-324 BOOlean logIC gates, see Gate(s)
Bardeen. Jo hn. 33
Abo\e-mirro r display (exa mple ): ALUs. see Arithmetic-logic units
Bases tati o ns (cell pho nes), 279-28 1
~oolean ?perators , see Operator(s)
with 16 32·bi l registers. 20-+-207 always procedure, 453-454 Boollng ' computers, 43 1
Base ten. 11 - 12 Brattain. Walter, 33
"ith 16,32 regi ster file. 208 A mpe res, 3 1
Basic input/ou tpu t system (BIOS), 431 Buffers, 206, 272
\\ Ith parallel-load registers. 155- 156 Analog ci rcui ts. 5
Ba ic SR latch. 97-99 Bus (i n registe r files). 206
with shift registers. 159. 160 Analog phenomena, encoding of. 9
Beamforrners. 210-213 Bus interface, 238-241
with up-counters. 183 Analog signals, 4
princi ple of. 210-2 11 Bus protOCol, 239
Absorption Law. 501 Analog-la-digital converte r, 9
in ultrasound machines. 2 12-2 15
Abstraction (in RTL design). 276 AND gates. 43-44, 404-407 Button press synchronizer (example). 123-124
Behav ioral-leve l design. 254-258 Button sensor. 10
Access time (RA M). 263 AN D operato r, 38-40
Bell. A lexander G raham. 8
Active-high input. 136
ACllve-Iol> input. 137
Application-S pecific Integrated Circ uits (AS ICs).
38G-388
Bell Labora tori es. 33 c
Bell Telephone. 8 C (program langu age), 19-20, 254-258.388
Actuator. 9 cell arrays. 383 BeltWam c irc uit. 387-388 C++ (program language), 254. 258. 388
Adaptive cruise comro!. 237 FPGAs vs., 40 1 B-frames. see Bidirectio nal predi cted frames Calculators, 200
Adder(,). 165- 173. 197 gate arrays, 38 1-382 Biased expone nt. 5 10 Calculus. propositional. 504
building a SUblI3clor using. 197-200 implemen ting. usin g NO R gates. 386-388 Bidi rectional predicted frames ( B-frames), 363-364. 369 Cameras. digital. 22-23
carry-lookahead. 33.\-342 impleme ntin g, using o nl y AND gates, 384-386 Binary numbers. 11- 17.505 CAN (controller area network). 160
carry-ripple. 166-173.339-340.468-471 standard ce lls. 382-383 Binary num be r systems. 05-5 13 Canonical form (Boolean functions). 63-65
carry-;elect. 3~2-3~3 structured. 383. 408 fi xed poin t ari thmetic in. 50 -509 Capture (step in combinational logic desiga). 67-{i9 ..
creallng faster. 333-343 Architec ture. 447 Roating po int represcntation in. 509-513 Carry-lookahead adders. 334-342
deSIgn examples using. 171-173 Arithmetic: real num ber represe nt ati o n in. 505-507 efficient example. 336-339
.+-bit carry-ri pple. 169-171 fixed point. 508-509 Binary poi nt. 506 half-adders in. 337-339
full-. 168-169 Roati ng point. 5 13 Binary rcprc c nl3li ons. 4 hierarchical. 339-3~2
h.lf-. 167-168 A rithmeti c/logic ex te nder (AL-exte nde r). 202-203 Binary sear h. 357 inefficient example. 335-336
"·bu. 165-166 A rithmeti c/logic instru cti o ns. 439 BIOS (basic input/ou tput ,y tem). 431 Carry-ripple adders. 166-173
t"'o-Ievel logIC. 334 A rithmeti c-logic units (A LUs). 20 1-203 Bit. 4 in dntapath component description. ~ -171
Adder tree. 215 multi -fun ction calculator using. 203 Bit file, 399 8-bit. 173
add ,",tructlon. ~19-131. 434 operati o n. 423-424 Bit storage. 96. ec also <pecific types. e.g.: R Intches -l-bit. 169-17_
Addlllve Identu} elcment. 497 A RM microprocessor. 2 1. 422 Bitwise opcrn tion. ~O I full·adders. 168-169
Addll1ve '<lund. 211 Arrays. Sec a lso Field programmable ga te arrays (FPGAs) Blinking li ght- (10 computcrs). ~)O half-adders. 167-168
Addre" (for reg"ter). 205 ce ll ,383 Block sy mbol. 152 and hierarchical arry-lookahead adders. 339-~
"'L-extender. ;ee Anthmellcflogic extender gate, 38 1-382. 389 Board game,. computcn/cd. 157 Carry-ripple style magnitude comparator. 17 -I 0
Algebr""1 programmable logic. 407 Boole. George. ), ClUT)-sclc t adders. 34l-3~3
528 Ind ex
Index 529
test benches in, 455-459 Contro ller(,). III. 11 9- 130. 135- 140
Cas.elle '"pes. 5- 6 behavi or of. in soda machine di spenser example, 519- rui e control, adapt;'c, 237
us ing hardware languages. 447-459
Cell arrays. 383 525 Crystals. pielOClectnc, 210
Combinati onal logic design, 67-72, 168- 169 CUlT'<nt (teon ), I
Ce lls (cell phone region<). 279 co mmon pitfalls with, 128- 129
Combin ati onal logic optimizati on, 296-3 17 Currentstate signal, 46 66
Cells. stand ard (AS IC), 382- 383 connec ti on of dot apath to. in RTL design . 236
multilevel logic optimizati on, 3 15-3 17 Custom digital circult<, .1 - 22
ellul ar telephones. 7. 279- 284 defined. III
two- level logic-size optimizati on. 296-3 15 Cyc\c, clock, 10
co mpone nts of. 28 1- 284 deri va ti on of FSM for. 237. 238
Combining le nn s 10 eliminate a vari able, 297
voice qua lity on. 25 1
Cc b ius, 175
Combl ogic process, 455, 464-466 de ign examples using, 116-117, 120-1 2 1. 123- 127 o
Communication: design o f. in soda machine di spenser example. 516-5 18 0313 communi Ali n. 161
han nc b (in transduce rs). 2 10 design process for, 120, 126
serial , 160 Data-dominated design, 247- 250
Chec kerboard. co mput cri led (ex ampl e). 156- 158 and implementation of FSMs. 122
wireless. 161 defined, 247
Chips. ~cc Silicon chips Commutative propeny, 50. 498 initial state of. 135- 136 example using, 248- 250
Cincx I componenl (AL -cxlcndcr). 20~ in laser-based di stance measurer example, 480-491 Data input, 150
Co mparator(s). 177- 18 1
C irc u i t ~ : in LED module, 4 14-4 16 Data memory, 423
equalit y, 177- 178
analog. 5 exampl e using, 180- 18 1 negative logic in. 136-137 Data movement in\ lructiOIl\, 439
asychronous. 102 mag nitude, 178- 180 output glitches in. 136 Datapath, 423 24
and Boolean functi on" 56 Compensating wei ght scale (exa mple), 173 in pace makers. 138- 140 conn~cti on ~f COntroller to, In RTL de' i~ n , 236
building. usin g gmcs. 44-l7 Co mple ment (s).4 . 194-1 97, 497 in equential logic description. 463-466 ~reauon of, In RTL de'ign, 2 236
c lock divider. 187 defined, 195 tandard architecture for, 119 In laser-based di stance measurer (example), 480-49 1
co mbinati onal. 30, 65. 85. 95 ex iste nce o f, 499 Controller area network (CA ), 160 fo r programmable procc«o". 422 24
crit ical path in. 252- 25-1 unique, 499 Control unit. 424-428 :n Six-instruction progrnmmable proce,IO"', 435 37
defi ned. 22 Compl e mentation. 499 in six- instruction programmable processors, 435-437 r" soda machine di ' penICr (example), 519- 525
digital. 4-5.2 1- 22. 38-10. 2 13- 2 15 Compleme nt propeny. 51 for three-instruction programmable processors, 432-434 or three-instruction programmable proce"or , 431-4
integrated. 33-35 Compl ex it y, managing (RTL design), 275 Conversion(s). 58 Datapath component description:
mathemati cal formali sms in design. 130 Co mplex programmable logic devi ce (CPLD), 407-408 am ong Boolean functions. 58- 60 and carry-ripple adde"" 468-47 1
and notati on simplification. 69- 72 from any base to any ot her base, 15- 16, 60 and full-adders, 467-468
FPGAs vs .. 407-408
panitioning. among lookup tables. 390-394 from binary to decimal, 12 up-Counters in, 471-475
SPLDs vs .. 407-40
sense amp lifier. 26 1 from circui ts to equati ons. 58- 59 usi ng hardware lunguages in, 467-475
Component allocation, 349- 350
from c ircuits to truth tables, 60 Datapath components, 15 1
seque ntial. 30. 85- 86. 95 Compre sion. 7
decimal to binary, 13- 15 and faster adders tradeoff, 333- 343
simplifying drawings of. 130 and computation of ratios in video, 364, 367, 368
from eq uations to truth tabl es. 59 and smaller multipliers tradeoff, 343- 345
state of, 95 in digital video. 363- 369
as step in combi national logic de sign, 67- 69, 72 Datapath operntions, 423-424
sy nchronous. 102 quantization in, 366-367 , 369
from truth tables to circui ts, 60 OCT. see Discrete cosine lransfonn
CLBs. see Configurablc logic bloc ks and transforming to frequency domain. 364-366 Detr. sec Local registers
Clear in puts. 134 Compute rs, 4 from truth tables to equations, 60
Debugging, 33
C lock di vider. 187 with blinking lights, 430 Convener(s):
Decimal point, 506
Clock frequen cy, 103.25 1- 254 booting. 43 1 analog-to-digital. 9
Decimal to binary conversion:
Cloc k gating, 358-360 Compu terized board games (example), 156-158 digital -to-ana log, 9
di vide-by-2 method, 14-15
Cloc k signal. 102- 105 Computer monitors. 192 of FSMs to circui ts , see Controller(s)
subtraction method, 13-14
Co ncurrency (i n RTL design), 348-349 RGB to CMYK (example), 192-194
Cloc k skew. 359 Declaration(s):
CMOS transistors. 35-37, 41. 42. 357-358 Concurrent computat ion, 354-355 Core, 41 I enum, 465
CMY color space. 192- 194 Conductors, 36 Cosine waves, 364-366 process, 452-453
CMYK color space. 194 Configurable logic blocks (C LB s) Counters, 181 - 188 type, 463
Codecs.409 grid of. in FPGAs, 398- 399 down , 18 1, 183 Decoders, 77-79, 395
output configuration memory in, 399 exa mpl es usi ng, 183, 184, 186- 188 Decoding stage, 426-427
Code detector (examp le). 11 7- 11 8, 129- 130
Color pace co nvener-- RGB to CMYK (example). 192- as progra mmable ICs, 396-398 N-bit, 18 1 Decrement (in counters), 181
Configuration (in RTL desig n), 245 parallel load, 185-187 Decrementer, 183
194
Configuration memory, 398, 399 as timers, 187 Deep Blue (computer), 157
Co mbin ati onal c ircui ts, 30. 85
Congestion, 204 up, 181-183 Delay (i n gates), 85
multiple-output, 65
Constants, 434, 497 up/down, 184 Delay circuits. 213-214
output of, 95
Constructor functions, 451-452 Cover (term). 309 DeMorgan's law, 52, 502, 503
Combin ational logic descripti on:
Control-dom inated design, 247 CPLD. sec Complex programmable logic device DemUltiplexers, 85
gate behav ior in. 452-455
Critical path (in circuits), 252-254, 3 17, 333 Dequeue, 272
stru ctu re in. 447-452 Control input, 3 1, 32. 150. See a lso Gate(s)
530 Index
Indo 53 1
Dc,igner proli le,. 29. 9-1. 22-1. 293. 377- 378.444 D lalch. 103- 106 cnum stiltcmcnl. -t65
maSler. 105- 106 derivulion of ~
Dc~ i gning cOl11bin~ltional logic. 67-72 EPRO I. see Erasable PROM d . . Or Com roller '>7 '18
,"rvanl. 105- 106 EqualilY comparator. 177- 178 ~,gn ,,-<ample, u' '"g. Ils-i 'ls' i -,
and circuli notal ion". 69- 72 Don'l care inpul combinations. 305- 307 enly Iype. 32 _ 3 . - 110
Equalions. 56
,Iep' in. 67- 69. 72 Down-counlers. 181. 183 ~ loore Iype. 32 ). J
Equivalenl slales. 318
DC!'Iign proce ..... : dowTlto \;\tatcment. 459 n.ondclcnninililic. 128
Erasable PROM (EPROM). 267-26 'linplirying n 101' ,
ror cOlllroller>. 120. 126 Drain (OUlpUI). 35. 36 Espresso (heurislic (001). 315 FIR fiI : ~on.'or. 11 5-1 16. 130
for rcgi~lcrs. 163 DRAM. see Dynamic random access memory ICrs. sec FlOlie 1I1l.pul •
E semial prime implicanl. 309-3 10 Firsl-in firsl-oul (FIFO) 'e rc'llOl"C lille"
Detector ~ys le l1l ,,/app l il.:~Hl on'. 17-19.21 Driver>. 206 Exacl algorilhm. 308 Firsl-in fi"l- . . 272
Dcterior:lIion. 6 DSP. ,ee Digilal signal processing/proces.ors Excalibur plalrorm (A llera). 409 Fi"'l . , OUI (FiFO) queue,. 272
D nip·liop,. 103- 109 Dual lnline PaCkage (DIP) swilch. 171- 172.402 . pll« (slale redUCllon) 1 0 )11
Execuling slage. 426-427
edge-Iriggered. 10 107 DualilY. principle or. 499 ~lXed-poill! llrilhmeli .508- 50<) --
Exi tence: Hush memory. 269 .
-I-bil. 109 Dual -poned regi'ler filc, 208 or add ili ve idenlit y elemenl. 498
and It!vcl-!"cn:,itivc D latch. 103- 1().l DVD<. sec Di gi lal video di,cs F1lghl
I~ ' Ii 'lIIend'''"
,a c 11 -bUllon (cxnm ple) 10K
or complemenl . 499 Ip- 0ps. 96-111.130-135 •
Digital camcm .... 22- 23 Dymunic microphone. 5 of mulliplicalive idenlilYelemenl. 497
Digi lal circuit,. -1--5. 2 1- 22. 38--10. 213-2 15 Dynamic power. 358 clock signnl, '". 102- 10)
Expanding (Ierm). 309 D. 103- 109
Digital filter. 2-l8. See ;l lso Finite impulse rc"pon'\c Dynamic random access memory (DRAM). 262-263.271 Expand operal ion. 3 13 and D latche.. 103- 104
fihers (FIR ) Exponenl. biased. 5 10
Digil:ll phenomena. encoding of. 9- 10 E and reedback in bil Slomgc 96-97
lK.131 ' .
Digilal , ignal procc;<ing/proccsso" (DSP). 213. 28-1 EchoDelay circuilS. 214 F IDlche. vs.. 107
Digiwl ~igna IJo, . +-7 Economy or scale. 200 Fabricalion planl (rab). 380
EDA (eleclronic desig n automalion). 409 non- ide~1 behavior in, 131- 134
Digilal sound recorder (exampk). 26+-265 Fahrenheil. 175
Edge-triggered D liip-nop . 10-1--107 and r:glSlers in bil Slomge. 109-1 II
Digi lal ,yslems. 4. 17- 18 Falling edge-Iriggered flip-fl ops. 107 resel mpul' in. 134-135
Digital telephone an!o.wcring machine (example). 270-27 1 defined. 105 Fanoul. 204 sel inpUIS in. 135
Digital thermometer converter (example). 175 musler/servanl design. 105- 106 Fa. I Fourier Transrorm (FFT). 364 SR. 108. 131
EEPROM. sec Eleclrically erasable PROM Feedback. 96-97
Digital -Io-analog converter. 9 and SR Inlches. 97- 10 1
8-bi l carry-ripple adders, 173 Felchng slage, 426-427
Digil~ll video. 2-l-l T. 131
Electricall y erasable PROM (EEPROM ). 268-269. 27 1 FFT (Fasl Fourier Transrorm). 364
Di gital video discs (DVDs). 36 1-363 F1oOling-poin! arilhmelic. 513
Electronul.gnctism. 5 Field programmable gale arrays (FPGAs). 377. 388-40 1
Digilal video player/recorders. 36 1- 370 FioOl~ng poin! numbers. 510
Eleclronics. 31 archlleclure or. 398-40 I
compression in. 363-369 FloDl~ng poinl rcpresen!a!ion. 509- 513
Electronic design automalion (EDA). 409 AS ICs vs .. 40 1
discrele cosine Iransrorm in . 36+-367. 369 Fioallng poinl unit. 513
Eleclronic focusing (or sound). 2 1I. 2 12 configurable logic blocks with, 396-398
and DVDs. 36 1-363 Flops, 108. Sec also F1ip-lIop,
Embedded syslems. 4 CPLDs vs .. 407-408
and hurrman coding. 367-369 Flow-Of-conlrol inslruclion" 440
Enable (decoders). 77 lookup tables wilh. 389-394
MPEG-2 encoding and. 363-366. 369-370 Focusll1g (of sound). 21 1, 212
Enable inpul. 101 microprocessors vs .. 40 I
Digiti zed audio. 6-8 Encoders. 85-86 programming or. 399-400
4-bil carry-ripple adders, 169-172
Digililcd pictun.:s. 8 4-bH D liip-Hops. 109. Sec also Regisler(s)
Encoding. 9- 13 SPLDs vs .. 407-408
Digiti zed video. FPGAs. see Field programmable gale arrays
of anal7,g phenomena. 9 swilch malrices wi lh. 394-396
DIP, see Dual Inline Package switch Frames. 241.361.363-364.369
or digilal phenomena. 9- 10 FIFO (firsl-in firsl-oul). 272 Frequency:
DIP-switch-based calculator (examples): emropy. 368 FIFO que ues. 272 clock. 103. 251-254
adding. 171 - 172 huITman. 367-369 Fillering (in digi lal signal processing). 282 sound waves. 210
add ing/sublracling. 191-192. 198 minimum-bilwidth binary, 323-324 Finile impulse response fillers (FIR). 282-284 FSMs. see Finile-slale machines
multi-runction wi thout using ALU s. 20 1 MPEG-2.363-366 wilh clock galing. 359-360 Full-adders. 16&-169.467-468
using ALU. 203 of numbers. 1(}-13 example using. 248-250 Full-cuslom ICs. 379-380
Discrele cosine transrorm (DCT). 364-367. 369 one-hOI. 324-326 and pipelining. 347 Fuse-based programmable ROM . sec One-lime
Discrete transistors. 33 OUlpUI. 327-328 using operalor scheduling. 352-354 programmable (CYrP) ROM
Display Slalemenls. 457 run-Ienglh. 367. 369 Finile induclion, 503
Disp stale. 330 in sequenlial logic opl imizat ion. 323- 328 Finile-slate mac hines (FSMs), 11 3- 119. 128-130 G
Distance measurer. laser-b3sed. see Laser-based E lAC (compuler). 33 behavior in, 11 8- 119 GAL (generic array logic). 407
distance measurer Enqueue. 272 comroller archi lecture ror. 11 9 Games, compulerized board. 157
Distribuli ve propeny. 50. 498 emily declaration. 447 convening circuillo, 126- 127 Gale(s), 35. 36.41-44.73-76
Divide-by-2 melhod. 14-15.505 Entropy encoding. 368 wi lh data (FSMD). 230 A D.43-44
Divide-by-n melhod. 15- 16 enum declaration, 465 defined. 11 4 building circuilS u ing.44-47

b
532 Index
Hexadecimal numbers (hex). 16- ) 7 Index . 533
Instructions. 425-428. See also specific instruCtions
G"le(~) «('Olilirllled)
and combinational behavior. -t52~S5
Hierarchical carry-)ookahead adders. 339- 342 arithmellcfloglc. 439 L
Hierarchy (in RTL design). 275- 278 data movement. 439
d~lay~ with. 85 Lands (on DVDs), 362
High-definition TV (HDTV). 94 now-of-control. 440 Laser(s):
and FPGA,. 400--l01
High impedance. 239 Instruction memory. 425
10\\ -power. on no ncritical paths. 360 for surgery. 112
High-Ieve) state machine(s). 229- 233 Instruction reg ister (lR ). 426
NA D.73-75 m three-cycles h" h .
NOR. 73-75
in laser-based distance measurer (example). 475-480 Instruction set: 120- 122, 3z,;, ~g26tl mer (example). 111 - 11 2, 11 5,
and Moore vs. Meal y. 354 ~ n six- in~ lru c li o~ programmable processors, 434-435
NOT. 42 Highway speed measuri ng system (example). 187- 188 Laser-based distanc
number of possible. 76 m three~m s trucllo n p~ogramma ble processors. 428-431 230-238 e measurer (example),
Hold time (in flip-fl op inputs). 131. 132 Instrucllon set extenSIOns (programmable processo )
OR. 42-13 connecting the data ath
Hu ffman coding. 367-369 428. 439-440 rs . COntroller in, 480-4~ 1 to a COntroller in. 236
unhersal. 75
XNOR. 74. 75 Hz (hertz). 103 Insulators. 36 datapath in, 234-236. 480-491
XOR. 74. 75 In-system programmable EPROMs. 268 den vat IOn of COnlI II .
Gate arrays. 3 1-382 .389. See also Field programmable Integrated circuits (lCs). 33-35 high-level state 0 er s FSM in, 237. 238
ICs, see Integrated circuits
gate arrays (FPGAs)
Idempotent Law. 52. 500
fu ll-custom. 379-380 LatChes, 97-101 , ~;~~;~~~O;29-233. 475-480
Gating. clock. 35 -360 semicustom (AS ICs), 380-388
General-purpose processors. -1.21 . See also Progra mmable Ide ntity comparator. see N-bi t equality comparator basICSR. 97-99
Integrated circuit (lC) technology(-ies). 379-412 flip-flops vs. , 107
processors Identity elements, 497
CPLDs as. 407-408
Generate (in carry-lookahcad adders). 338. 340-34 1 Identity propeny. 50 level-sensitive D. 103- 106
FPGA as, 388-40 I
Generator(s): I-frames, see Intracoded frame s level-sensitive SR, 99-101
FPGA-to-AS IC conversion as. 408 Latency (in pipe
. I'Ine registers). 347
I Hz pulse generator (example). 183 . 186-187 If- then-else statements. 255-256 La
manufactured. 379-388
sequence generator (example). 124- 125. 327-328 If-then statements, 255 yOUl (of transistors on chips) 380
and Moore's Law, 4 12
Generator. sequence. see Sequence generator Impedance. high. 239
off-the-shelf SSI ICs as. 40 1-404 ~~~s (Liquid Crystal on SilicO~) chip. 94
Generic array logic (GAL). 407 Im plementation(s): , see Llght-enuUtng diode
physical , see Physical implementation and proces or varieties. 410-41 I
Generic variables. 503 Level-sensitive D latch. 103-104
as step in combinational logic design. 67-69, 72 programmable. 388
GHz (gigahenz). 103 Level-sensitive SR latch, 99-10 I
Giant video display (product profil e). 4 12-4 16 two-level logic. 67 re lative popu larity of, 409
Lights. blinkin •. 430
Gigahenz (GHz). 103 Im plicant (term), 309 SOCs as. 408-409
SPLDs as. 404-407 LLighh t-emitting diode (LED). 171-172 41?-416
Glitcheslglitching. 100. 136 Im plication tables, 31 8-322 Ig t sensor, 10 . -
Google. II Improvement, iterative, 3 12 tradeoffs among. 409-410
Intel,2 1 Light sequencer (example). 184
Inc rement (counters). 181
H Intracoded fram es (I-frames). 363-364, 369 Lmear search, 356
Incrementer. 182- 183
Haitz's law. 413 Inverse. 48 Liquid Crystal on Silicon (LCoS) chip 94
Inductance. 188
Half·adders. 167- 168 Inverters. 42 Luera/s. 50, 296-298 .
induction :
in carry-lookahead scheme. 337-339 Involution Law. 52, 50 I load-constant instruction. 43-1-435, 437
finite, 503
tmplementing on a gate array (example). 382 lR (instruction register). 426 Loadmg (data). 151
perfect. 498
Implementing urn circuit using NAND gates load instruction. 428-131. 434
Inducti ve loop. 188 Irredundant operations. 3 15
(example). 385 Load operation . 423-124
Initial state (controller ), 135- 136 Iterate (term). 3 13
Implementing sum circuit using NOR gates Load/shift registers. 160-163
Init state. 330 Iterative improvement. 31 2
(example). 386-387 Load-store architecture. 424
Implementing using standard cells (example). 3 3-384 In put(s):
ac ti ve-high. 136 J Local registers (Dctr), 232-233
Hardware description languages (HDLs). 446-447 Logic:
ac ti ve-low, 137 Java (program language). 254. 258
Hardware languages:
asynchronous. 133, 135 JK Hip-nops. 13 1 next-state. 329
in combi national logic description. 447-459
in datapath component descrip tion. 467-475 clear, 134 jump-if-zero instntction. 435-437 output. 329
In reg"ter-transfer level (RTL) design. 475-49 1
in combinational logic description. 450 Logic block. configurnble (CLB). 396-39
In ,equential logic description. 459-466
conditions. 11 4 K logIC gates. see Gate( 1
HDLs. -.e hardware description languages control . 150 Keys. secure Cnr (example), 11 6-1 17. 125- 126 Logic Ie. 40_
HDTV (h,gh-definilion TV). 94 data. 150 Keyboands. computer. 7 1 Lochhead (in omputer games). 157
Hean. human. 138 enable. 101 Kilohertz, 2 10 Lockup t. bles. 3 9-394
Heru (HZ), )03 reset. 134- 135 K-maps: "an'ples using. 392-394
Heumllc,. 308. 3) 3-3) 5 synchronous, 134-135 four-variable. 302-303 parritioning a cin:uit among. 390-394
E'prc"o too) In. 3 15 Input/output extensions (progra mmable processors), 440 three-vnriable. 298- _99 Lo\\ -PO" er gat . 360
Ilerallve. 312 Instantiation (i n RTL dc;ign). 234 and two-leve l logic ,ilo optimi7atiOll. 19 306 LT 1000 ' .ntil.tor. 2. 3

t
534 Index

Monitor(s): Network rouler, 92 Index SJS


M
MAC (multiply-accumulate) unit. 353 RGB.I92 New Year's Eve counldown display (exam Ie) 18 in BOOlean al eb
in ultrasound machines. 213 NOT, 3S-40 g ra, 38--39, 4S-49
Nexpena plalform (Ph ilips), 409 p , 6
Machine code, 430
Moore, Gordon, 34 Next-state logic, 329 OR, 38-40
Magnetic RAM (MAGRAM), 271
Moore FSMs, 328-333 nextstate signal, 463-466 Operator binding, 350-351
Magnitude comparators, 178-180 high-level state machines, 354 Operator scheduling, 351-354
MAGRAM (magnetic RAM), 271 NMOS transistors, 35, 36, 42-44, 73 Opticom system, 188
with Mealy FSMs, 332-333 Noncritical paths, 360 Optimal solution, 308
Mantissa, 510 Moore's Law, 34, 35, 412
Manufactured integrated circuits (ICs), 379-388 Nondeterministic FSM, 128
MOS (term), 37 Non-ideal behavior (in flip-flops), 131-134 Optimization(s), 294-296. See also Tradeoff(s)
ASICs, 380-388 Motion-in-the-dark detector application, 17-19, 21, 440 Nonrecurring engineering (NRE), 200, 380 and algorithm selection, 356
full-custom ICs, 379-380 Motion sensor, 9 combinational logic, 296-317
Nonvolatile memory, 265
Mark II (computer), 33 Motorola, 21 criteria for, 295
Nonvolatile RAM (NVRAM), 271
Mars Climate Orbiter, 175 MP3 format, 7 defined, 294, 295
NOR gates, 73-75, 386-388
Mask-programmed ROM, 266
Master latch, 105-106
Maxterm. M
MPEG-1, 363
MPEG-2 encoding, 363-366, 369-370
Normalized numbers, 510
Notation(s):
at higher vs. lower design levels, 355
multilevel logic, 315-317
power, 357-360
MSI (medium-scale integration), 34 in Boolean algebra, 48-49
Mealy FSMs, 328-333 MTBF (mean time between failures), 134 simplifying circuit, 69-70 in RTL design, 345-354
example using, 331 Multifunction registers, 160-163 sequential logic, 317-333
simplifying for FSMs, 115-116, 130
high-level state machines, 354 Multilevel carry-lookahead adders, 342 two-level logic size, 296-315
NOT gates, 42 OR gates, 42-43
with Moore FSMs, 332-333 Multilevel logic, 360
NOT operator, 38-40
timing issues in, 331-332 Multilevel logic optimization, 315-317 NRE, see Nonrecurring engineering OR operator, 38-40
Mean time between failures (MTBF), 134 Multiple bit storage, 109-111 Orthogonal implementation features, 410-411
ns (nanosecond), 100 Oscillation, 99-100
Medium-scale integration (MSI), 34 Multiple-output combinational circuits, 65 Null elements, 52
Megahertz (MHz), 103, 210 Multiplexers (muxes), 79-83 Oscillators:
Numbers:
Memory, 111. See also Sequential circuits internal design of, 79-80 defined, 102
binary, 11-17
configuration, 398 N-bit Mx1, 81-82 quartz, 102
encoding of, 10-13
data, 423 Multiplicative identity element, 497 in sequential logic description, 461-463
hexadecimal, 16-17 OTP ROM, see One-time programmable ROM
flash, 269 Multipliers: octal, 17
instruction, 425 OutDelay, 213-214
in beamformers, 215 representing negative, 194-197
MxN, 258 in binary numbers, 189-190 Output(s), 31, 32
subtractors for positive, 190-191
nonvolatile, 265 sequential, 343-345 in combinational logic description, 450
NVRAM (nonvolatile RAM), 271 reading, 246
random access (RAM), 259-265 Multiply-accumulate (MAC) unit, 353 NxN multipliers, 189
read-only (ROM), 265-271 Multi-ported register file, 208 reg, 453, 454, 460
in RTL design, 258-271
volatile, 265
Muxes, see Multiplexers O Output encoding, 327-328
Output glitches, 136
MxN memory, 258 Octal numbers, 17
Metastability, 131-134 MxN register file, 204 Output logic, 329
Off-set, 308
Metastable state, 132 Off-the-shelf logic (SSI) IC, 401-404 Overclocking (in PCs), 253
Meucci, Antonio, 8 N Ohm's Law, 31 Overflow detection, 198-200
MHz (megahertz), 103, 210 NAND gates, 73-75, 384-386 1 Hz pulse generator (example). I 3. I 6--1 7 P
Microphones, 5, 210 Nanosecond (ns), 100 One-hot encoding, 324-326
Microprocessors: Nanowatts, 360 Pacemakers, 137-139
One's complement, 196
defined, 18 N-bit adders, 165-166 PAL (programmable array logic), 407
One-time programmable (OTP) ROM, 267, 405
digital circuits in, 4-5 N-bit arithmetic-logic units, 201 Parallel load counters, 185-187
On-set, 308
FPGAs vs., 401 N-bit barrel shifters, 176 Opcode, 429 Parallel load registers, 151-152, 160-161
rdluUomng. _3 -
software in, 18-21 N-bit counters, 181 Operands, 429
Minimum-bitwidth binary encoding, 323-324 N-bit equality comparator, 177-178 PC (program counter), 426
Operation(s): PCI, see Peripheral component interface
Minterm, 63, 308 N-bit magnitude comparators, 178-180 bitwise, 201
N-bit registers, 151 Pentium microprocessors, _1
MIPS microprocessor, 21, 422 expand, 313 Perfect induction, 49
Mnemonic instructions, 430 N-bit shifters, 174 irredundant, 315
N-bit subtractors, 190-191 Performance (in digital systems), 295
Module: reduce, 315
Performance extensions (programmable processors), 441
in combinational logic description, 450 Negative edge-triggered flip-flops, 107 Operation code, 429 Period (clock signal), 103
in LEDs, 414-416 Negative logic, 136-137 Operator(s): Peripherals, ~3
SC, 450-452 Negative numbers, representing, 194-197 AND, 38-40
Peripheral component interface (PCI). 1~. 141
Product term, 50 Read port, 205
P-frames, see Predicted frames using programming languages in, 254-258
Physical design, 387 Program, 421, 425 Read time, 263 reg output, 453, 457, 460
Programmable array logic (PAL), 407 Real numbers, 505-507 Relays, 32
Physical implementation, 379-417 Recording, audio, 5-7
alternative technologies for, 401-409 Program counter (PC), 426 Reset inputs, 134-135
Programmable integrated circuit (IC) technology, see Reduce operation, 315 Resetting, 98
comparing technologies for, 409-412
Field programmable gate arrays (FPGAs) Register(s), 109-111, 151-165 Resistance, 31
of giant video display, 412-416
Programmable interconnects, 394-396 design process for, 163 Resource sharing, 351
and manufactured IC technologies, 379-388
Programmable logic arrays (PLAs), 407 examples using, 152-160, 164-165
and programmable IC technologies, 388-401 Respin (in IC fabrication), 380
Programmable logic device (PLD), 404-407 local (Dctr), 232-233
PIC microprocessor, 21, 422 Reverse engineering 1?6
Programmable processors, 421-442 multifunction, 160-163
Pictures, digitized, 8 RGB color space, 192-194
control unit for, 424-428 in multiple bit storage, 109-111 RGB monitors, 192
Piezoelectric crystals, 210 N-bit, 151
Pipeline registers, 346 datapath for, 422-424 Rising edge-triggered flip-flops, 107
input/output extensions to, 440 parallel load, 151-152 Rolling over, 181
Pipelining, 345-347 with parallel load and shift, 160-163
Pixels, 192, 361 instruction set extensions to, 439-440 ROMs, see Read-Only Memory
performance extensions to, 441 pipeline, 346 Rotate registers, 159-160
Placement (in chip components), 387
rotate, 159-160
PLAs (programmable logic arrays), 407 six-instruction, 434-439 Routing (in chips), 387
three-instruction, 428-434 in sequential logic description, 459-461
Platform SOCs, 408-409 5-6000 SP processor 157
Programmable ROM, 267 shift, 158, 159 RTL components, 151
PLD, see Programmable logic device
Programmers (ROM), 267 updating of, 245-246 RTL design, see Register-transfer level design
PMOS transistors, 37, 42-44, 73
Pop (in queues), 272 Programming languages, 254-258 Registered data outputs, 246-247 Run-length encoding, 367, 3~~
PROM, see Programmable ROM Register files, 204-208
Port(s):
in combinational logic description, 447 Propagate (in carry-lookahead adders), 338, 340-341 dual-ported, 208 S
Propagation, 104
multi-ported, 208 SAD, see Sum-of-absolute-differences
read, 205
Propositional calculus, 504 MxN, 204 Sampling, 6
write, 205
Pulse width modulation (PWM), 415 single-ported, 208
Positive edge-triggered flip-flops, 107 Scale, compensating weight (example), 173
Push (in queues), 272 Register-transfer level (RTL) components, 151 Scan chain, 399
Positive numbers, subtractors for, 190-194
PWM (pulse width modulation), 415 Register-transfer level (RTL) design, 225-285 Scan converter, 213
Power:
abstraction in, 276 SC_CTOR statement, 451-452, 454
in digital systems, 295
Q behavioral-level, 254-258 Scheduling, operator, 351-354
dynamic, 358
Quantization (in video compression), 366-367, 369 clock frequency, determination of, 251-254 Schematic, 445
Power optimization, 357-360
Power PC programmable processor, 422 Quartz, 102 component allocation in, 349-350 Schematic capture tool (use in circuits), 84
Quartz oscillators, 102 concurrency in, 348-349 sc_in<> statement, 451
Precharging (RAM bit storage), 261
Predicted frames (P-frames), 363-364, 369 Queues, 271-272 connection of datapath to controller, 236 SC_METHOD, 454-455, 458
Preset (asynchronous set), 135 Queuing, 271-274 controller's FSM, derivation of, 237, 238 SC_module, 450-452
Prime (term), 48 Quine-McCluskey method, 311-312 data-dominated, 247-248 sc_out<> statement, 451
Prime implicant, 309 QWERTY keyboard, 71 datapath, creation of, 234-236 sc_signal statement, 451
Printers, 192-194 examples of, 238-244, 248-250, 269-271, 279-284
SC_THREAD testbench process, 458, 462-463
Priority encoders, 86 R hierarchy in, 275-278 Search(es):
Process declaration, 452-453 Race condition, 100 high-level state machine, creation of, 229-233 binary, 257
Processor(s): Radix, 510 managing complexity in, 275 linear, 56
defined, 225 Random access memory (RAMs): memory components in, 258-271 Seat belt warning light (example):
digital signal, 213 bit storage in, 260-261 method, 226-238 extended, on an FPGA, 395-396
single-purpose, 421 dynamic (DRAM), 262-263 operator binding in, 350-351 implementing, with a lookup table, 390
superscalar, 441 example using, 264-265 operator scheduling in, 351-354 using OR-based gate array, 3 -3
Very Large Instruction Word (VLIW), 441 in RTL design, 259-265, 271 optimizations and tradeoffs in, 345-354 using off-the-shelf 7400 ICs, 403-404
Product, 48 static, 261-262 pipelining in, 345-347 using simple PLD, 406
Product-of-maxterms form, 64 read() function, 455 pitfalls in, 245-246 Second pass (state reduction), 3_1, 321
Product profiles: Reading (data), 151 queuing in, 271-274 Secure car key (example), 116-117, 125-126
cell phones, 279-284 Read-Only Memory (ROMs): RAMs in, 259-265, 271 Selectors, 9. See also Multiplexers (muxes)
digital video players/recorders, 361-370 examples using, 269-271 and registered data outputs, 246-247 Semiconductors, 36
giant video display, 412-416 in RTL design, 265-271 ROMs in, 265-271 Semicustom ICs, see Application-specific integrated
pacemakers, 137-139 types of, 266-269 scope of, 225-226 circuits (ASICs)
ultrasound machines, 209-216 Read-only memory programming, 265 using hardware languages in, 475-491 Sense amplifier, 261
Simple programmable logic device (SPLD), 404-407
State minimization, 317-323 Systems:
Sensitive processes, 453, 455 CPLDs vs., 407-408 State reduction:
Sensitivity lists, 461-462 detector, 17-19, 21
FPGAs vs., 407-408 algorithm for, 319
Sensor(s), 9-10 digital, 17-18
Simulation (in circuits), 84 example for, using implication table, 321-322
button, 10 embedded, 4
Simulator, 84 implication tables, 318-320
light, 10 SystemC, 450-452, 454-455, 458-463, 465-466, 468,
470-471, 474-475, 479-480, 488-491
Single-ported register file, 208 in sequential logic optimization, 317-323
traffic light, 188 Single-purpose processor, 421
Sequence generator (example), 124-125, 327-328 steps in, 320-321 System-on-a-chip (SOC), 408-409
Sequencer, light (example), 184
Six-instruction programmable processors, 434-439 State signal, 476 T
control unit in, 435-437 Statetype, 463-466
Sequential circuits, 30, 85-86, 95, 126-127. See also
datapath in, 435-437
Finite-state machines (FSMs) Static random access memory (SRAM), 261-262, 271 Tables, implication, 318-319
instruction set in, 434-435 Steering (of sound), 211, 212 Tabular method, 311-312
controllers, 111, 119-130, 135-140
Size (in digital systems), 295
converting to FSM (example), 126-127 Stereo speaker, 210 Talking doll (example), 269-271
Small-scale integration, see SSI Tap (as mathematical term), 282
flip-flops, 96-111, 130-135 store instruction, 429-431, 434
SOC, see System-on-a-chip Technology mapping, 387
Sequential logic description: Store operations, 423-424
Soda machine dispenser (example), 227-229, 515-525 Telephones, 8
controllers in, 463-466 Structure (in combinational logic description), 447-452
controller, design of, 516-518 Temperature averager (example), 175
oscillators in, 461-463 Structured ASICs, 383, 408
understanding behavior of controller and datapath, Temperature history display (example), 109-111,
registers in, 459-461 Subsetting (in program language), 258 154-155
519-525
using hardware languages in, 459-466 subtract instruction, 435-437 Terms:
Software, 18
Sequential logic optimization, 317-333 Subtraction (using addition), 195-196
Solid-state transistors, 33 combining, to eliminate a variable, 297
and Moore vs. Mealy FSMs, 328-333 Subtraction method, 13-14, 505 product, 50
Sound, 210-212
state encoding as, 323-328 Subtractor(s), 190-200
Sound generation circuits, 213-214 Terminal count (counter output), 181
state reduction as, 317-323 detecting overflow in, 198-200 Testbenches, 455-459
Sound waves, 210, 212
Sequential multipliers, 343-345 examples using, 191-194, 198
Source input, 31, 32, 35, 36 T flip-flops, 131
Serial communication, 160 for positive numbers, 190-194
SPG blocks, 339 Three-cycles-high laser timer (example):
Serial computation, 354-355
Spin (in IC fabrication), 380 using adder to build a, 197-200 alternative binary encoding for, 324
Serializing (in computations), 352
SPLD, see Simple programmable logic device Sum, 48 controller for, 120-122
Servant 0 latch, 105-106 Summation circuits, 214-215
Set inputs, synchronous/asynchronous, 135 Spurious values, 171 first design, poorly done, 111-112
SRAM, see Static random access memory Sum-of-absolute-differences (SAD): FSM for, 115
Setting (in latches), 99 with concurrency (example), 348-349
SR flip-flops, 108, 131 using one-hot encoding, 326
Setup time (in flip-flop inputs), 131, 132 design example, 241-244
SR latches, 97-101 3-D images (ultrasound), 216
Shannon, Claude, 40 examples using C code, 254-258
basic, 97-99 Three-instruction programmable processors, 428-434
Shifters, 173-176 Sum-of-minterms form, 63-65 control unit for, 432-434
level-sensitive, 99-101
barrel, 176 Sum-of-products, 50
SSI (small-scale integration), 34, 401-402 datapath for, 431-433
examples using, 175
Stages: Superscalar processor, 441 first instruction set in, 428-431
simple, 174 Three-state driver, 206
pipeline registers, 346 Switch(es), 31-35
Shift registers, 158, 159 Throughput (in pipeline registers), 347
programmable processors, 425-428 and discrete transistors, 33
Shockley, William, 33 Timer(s):
Standard architecture (for controllers), 119 Dual Inline Package, 171-172
SHR() function, 477
Standard cells, 382-383 and integrated circuits, 33-34 as counter type, 187-188
Signal(s), 448
Standard representation, 62 relays in, 32 three-cycles-high laser, 111-112, 115, 120-122
currentstate, 463-466 Timing analysis, 254
digital, 4-7 State(s): sliding (example), 306-307
of circuits, 85-86, 95, 111 and vacuum tubes, 32-33 Timing diagrams, 20
nextstate, 463-466
equivalency between, 318 Switching algebra, 496, 497 Timing issues, with Mealy FSMs, 331-332
state, 476
State diagram, 114 Switch matrices, 394-396, 398-399. See also Tradeoff(s). See also Optimization(s)
Signal processor, 213
State encoding: Programmable interconnects and algorithm selection, 356
Sign bit, 196 among IC technologies, 409-410
Signed-magnitude, 194 alternative minimum-bitwidth binary, 323-324 Synchronizer, button press (example), 123-124
datapath component, 333-345
Significand, 510 one-hot, 324-326 Synchronous circuit, 102
defined, 29-
Silicon (element), 37 output, 327-328 Synchronous clear, 164
in sequential logic optimization, 323-328 at higher vs. lower design levels, 355
Silicon chips, 33-35. See also Integrated circuits (ICs) Synchronous clearing, 184
in RTL design, 345-354
and economy of scale, 200 Statements. See also specific statements Synchronous reset inputs, 134-135
assert, 456, 458-459 between serial and concurrent computation, 354-355
fabrication of, 380 Synchronous set, 164 Traffic light sensors, 188
Silicon Valley (California), 37 display, 457 Synchronous set inputs, 135 Transducers, 9, 210


Transformation operations, 423-424 V


Transistors: Vacuum tubes, 32-33
CMOS, 35-37, ~1, 42 Variable(s), 49, ~97
discrete, 33 combining terms to eliminate a, 297
nMOS, 35, 36, 42-44, 73 generic, 503
pMOS, 37, 42-44, 73 Verilog (hardware description language), 254, 258,
Transitions, 114 449-450, 453-454, 457, 460, 462, 464-465, 467,
Transparent latch, see Level-sensitive SR latch 469-470, 473-474, 477-479, 484-487
Truth table(s), 42 Very Large Instruction Word (VLIW) processor, 441
and Boolean functions, 56-58 Very-large scale integration (VLSI), 34
as Boolean function standard representation, 62-63 VHDL (hardware description language), 254, 258,
defined, 56 447-449, 452-453, 456, 459-461, 463-464, 467-469,
Tubes, vacuum, 32-33 471-473, 475-477, 480-484
Two-level logic adders, 334 Video, digitized, 8, 244
Two-level logic implementations, 67 Video compression (examples):
Two-level logic size optimization, 296-315 using C code, 254-256
using sum-of-absolute differences (SAD) design,
automation of, 308-315
241-244
and don't care input combinations, 305-307
Video display, giant, 412-416
and K-maps, 298-306
Virtex II Pro platform (Xilinx), 409
using algebraic methods, 296-298
VLIW (Very Large Instruction Word) processor, 441
Two's complement, 196-197 VLSI (very-large scale integration), 34
building a subtractor using adders and, 197-200 Volatile memory, 265
defined, 196 Voltage, 31
detecting overflow using, 199-200
type declaration, 463 W
type statement, 463 wait for statement, 461
Typewriters, 71 wait() function, 458, 462
Wait state, 330
U Watt (unit), 357
Ultrasound (term), 210 Waves, cosine, 364-366
Ultrasound imaging, 210 Waveform (of inputs), 84
Ultrasound machines, 209-216 Weight sampler (example), 153-154
beamformer in, 210-213 Western Union, 8
digital circuits in, 213-215 While loop statements, 256
future challenges with, 216 Wireless communication, 161
monitor in, 213 Wire signal, 457
scan converter in, 213 wires output, 450, 453
signal processor in, 213 Word (data item), 258
transducer in, 210 Wrapping around (counters), 181
Unique complement, 499 Wristwatch, beeping (example):
Uniting theorem, 297 using combined Moore/Mealy machine, 333
using Mealy machine, 331
Universal gates, 75, 384
write() function, 455
Universal Serial Bus (USB), 161
Write port, 205
Up-counters, 181-183, 471-475
Up/down counters, 184 X
USB (Universal Serial Bus), 161 XNOR gates, 74, 75
use statement, 476 XOR gates, 74, 75
