Please read this disclaimer before proceeding:
This document is confidential and intended solely for the educational purpose
of RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only to the respective group /
learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and
delete this document from your system. If you are not the intended recipient
you are notified that disclosing, copying, distributing or taking any action in
reliance on the contents of this information is strictly prohibited.
R.M.D ENGINEERING COLLEGE
DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING
Batch/Year :2021-2025/III
Created by :Ms.P.Santhoshini
:Ms.S.Gayathri Priya
Date :23.09.2023
TABLE OF CONTENTS
S.No   Contents
1      Course Objectives
2      Pre Requisites
3      Syllabus
4      Course Outcomes
5      CO-PO/PSO Mapping
6      Unit IV - DESIGN OF ARITHMETIC BUILDING BLOCKS AND SUBSYSTEM
6.1    Lecture Plan
       Adders
       Ripple Carry Adder
       The Mirror Adder
       Transmission Gate Based Adder
       Carry Look Ahead Adder
       High Speed Adders
       Multiplier
       Barrel Shifter
       ALU
       Designing Memory and Array Structures
6.4    Assignments
6.5    Part A Q & A
6.6    Part B Questions
6.7    Supportive Online Certification Courses
6.8    Real Time Applications in Day to Day Life and to Industry
6.9    Contents Beyond the Syllabus
7      Assessment Schedule
1. COURSE OBJECTIVE
OBJECTIVES:
2. PRE REQUISITES
3. SYLLABUS
LIST OF EXPERIMENTS
1. Design of inverter using LT-SPICE
2. Layout verification of CMOS inverter, NOR and NAND gates
LIST OF EXPERIMENTS
1. Design of adder and subtractor
2. Design of multiplexer and demultiplexer
LIST OF EXPERIMENTS
1. Design of Flip-flops
2. Design of counter
3. Design of universal shift register
4. Design of Mealy and Moore State Machines
5. Design of Random Access Memory
UNIT IV DESIGN OF ARITHMETIC BUILDING BLOCKS AND
SUBSYSTEM 15
Arithmetic Building Blocks: Data Paths, Adders, Multipliers, Shifters, ALUs, power
and speed tradeoffs, Designing Memory and Array structures: Memory Architectures
and Building Blocks, Memory Core, Memory Peripheral Circuitry.
LIST OF EXPERIMENTS
1. Design of Arithmetic Logic Unit
2. Design of Ripple Carry Adder
3. Design of Carry Select Adder
4. Design of Multiplier
4. COURSE OUTCOMES
Course Outcomes (with Highest Cognitive Level)
CO1  Understand the fundamental principles of VLSI circuit design in digital domain  K2
Program Outcomes (PO)
Engineering Graduates will be able to:
PO1  Engineering Knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
PO4  Conduct Investigations of Complex Problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
PO8  Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO9  Individual and Team Work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO12 Lifelong Learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes (PSO)
Electronics and Communication Engineering Graduates will be able to:
PSO2  To apply design principles and best practices for developing quality products for scientific and business applications.
5. CO- PO/PSO Mapping
Columns: Course Outcome (CO), Level of CO, Program Outcomes PO-1 to PO-12, Program Specific Outcomes PSO-1 to PSO-3
CO Level PO-1 PO-2 PO-3 PO-4 PO-5 PO-6 PO-7 PO-8 PO-9 PO-10 PO-11 PO-12 PSO-1 PSO-2 PSO-3
- - K3 K4 K4 K5 K3,K5,K6 A3 A2 A3 A3 A3 A3 A2 K6 K5 K3
CO1 K2 2 1 1 - - - - - - - - - - 1 1
CO2 K3 1 2 - - - - - - - - - - - 2 1
CO3 K3 2 1 2 - - - - - - - - - - 1 1
CO4 K3 1 2 1 - - - - - - - - - - 1 2
CO5 K3 1 2 - - - - - - - - - - - 1 2
CO6 K2 2 1 1 - - - - - - - - - - 1 1
UNIT IV DESIGN OF ARITHMETIC BUILDING BLOCKS AND SUBSYSTEM
Arithmetic Building Blocks: Data Paths, Adders, Multipliers, Shifters, ALUs, power
and speed tradeoffs, Designing Memory and Array structures: Memory Architectures
and Building Blocks, Memory Core, Memory Peripheral Circuitry.
LECTURE PLAN
UNIT IV -DESIGN OF ARITHMETIC BUILDING BLOCKS
AND SUBSYSTEM
S.No | Topic | No. of Periods | Proposed Date | Actual Date | Pertaining CO | Taxonomy Level | Mode of Delivery | Reason for Deviation
5 | Designing Memory and Array structures | 1 | | | CO3 | K3 Apply | Chalk and talk | -
6 | Memory Architectures and | 1 | | | CO3 | K3 Apply | Chalk and talk | -
8 | Memory Core | 1 | | | CO3 | K2 Understand | Chalk and talk | -
9 | Memory Peripheral Circuitry | 1 | | | CO3 | K1 Remember | Chalk and talk | -
1. Crossword Puzzles:
Students reviewed the basic concepts of memory, multipliers and adders, followed
by a short discussion based on the crossword puzzles mentioned below.
2. Roleplay :
6.3 LECTURE NOTES
UNIT IV DESIGN OF ARITHMETIC BUILDING
BLOCKS AND SUBSYSTEM
Fig 4.2 Bit sliced data paths
Adders:
Addition forms the basis for many processing operations, from ALUs to address
generation to multiplication to filtering. As a result, adder circuits that add two binary
numbers are of great interest to digital system designers.
Single-Bit Addition
The half adder of Figure 4.3 (a) adds two single-bit inputs, A and B. The result is 0, 1,
or 2, so two bits are required to represent the value; they are called the sum S and
carry-out C. The carry-out is equivalent to a carry-in to the next more significant
column of a multibit adder, so it can be described as having double the weight of the
other bits. If multiple adders are to be cascaded, each must be able to receive the
carry-in. Such a full adder, as shown in Figure 4.3 (b), has a third input called C or Cin.
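The single-bit behaviour described above can be sketched in a few lines of Python. This is only an illustrative behavioural model, not a circuit description; the function names `half_adder` and `full_adder` are chosen here for illustration, and the full adder is assumed to be built from two half adders and an OR gate.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two single-bit inputs."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two bits plus a carry-in.

    Built from two half adders; the two partial carries are ORed.
    """
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2
```

In every case the outputs satisfy S + 2·Cout = A + B + Cin, which reflects the carry-out having double the weight of the sum bit.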
Express Sum and Carry as a function of P, G, D
Define three new variables which depend ONLY on A and B:
Generate (G) = AB    Propagate (P) = A ⊕ B    Delete (D) = A'B'
We can rewrite S and Co as functions of P and G (or D):
Co(G,P) = G + P·Ci
S(G,P) = P ⊕ Ci
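As a quick sanity check, the generate/propagate/delete formulation can be compared exhaustively against ordinary binary addition. The sketch below is illustrative only; the helper names `pgd`, `sum_carry_from_pg` and `check_all` are assumptions made for this example.

```python
from itertools import product


def pgd(a: int, b: int) -> tuple[int, int, int]:
    """Return (propagate, generate, delete) for one bit pair."""
    p = a ^ b
    g = a & b
    d = (1 - a) & (1 - b)
    return p, g, d


def sum_carry_from_pg(a: int, b: int, ci: int) -> tuple[int, int]:
    """Compute (S, Co) using Co = G + P*Ci and S = P xor Ci."""
    p, g, _ = pgd(a, b)
    return p ^ ci, g | (p & ci)


def check_all() -> bool:
    """True if the P/G form matches A + B + Ci for all eight input patterns."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = sum_carry_from_pg(a, b, ci)
        p, g, d = pgd(a, b)
        # Exactly one of P, G, D is true for any bit pair.
        if s + 2 * co != a + b + ci or p + g + d != 1:
            return False
    return True
```

The check also shows that P, G and D are mutually exclusive, which is why the carry can be described by G alone, or by D alone, together with P.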
The propagation delay of such a structure (also called the critical path) is
defined as the worst-case delay over all possible input patterns. In a ripple
carry adder, the worst-case delay occurs when a carry generated at the LSB
position propagates all the way to the MSB position. This carry is finally consumed
in the last stage to produce the sum. The delay is proportional to the number of
bits N in the input words and is approximated by
$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
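A minimal behavioural sketch of an N-bit ripple carry adder, together with the delay estimate above, is given below. The function names and the LSB-first list representation are assumptions made for illustration, and the delay values used in any call are arbitrary units rather than data from a real process.

```python
def ripple_carry_add(a_bits: list[int], b_bits: list[int], cin: int = 0):
    """Add two equal-length bit lists (LSB first); return (sum_bits, carry_out)."""
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        p = a ^ b                        # propagate
        sum_bits.append(p ^ carry)       # S = P xor Ci
        carry = (a & b) | (p & carry)    # Co = G + P*Ci
    return sum_bits, carry


def to_bits(value: int, n: int) -> list[int]:
    """Return the n low-order bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def ripple_delay(n: int, t_carry: float, t_sum: float) -> float:
    """Worst-case delay estimate: (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum
```

For example, `ripple_delay(16, 1.0, 1.5)` evaluates the expression for a 16-bit adder with the assumed unit delays, showing the linear growth with word length.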
Inversion Property
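This property is used in the design of the full adder: inverting all inputs to a full
adder results in inverted values for all outputs.
[Figure: a full adder (FA) with inputs A, B, Ci and outputs S, Co, shown with true and with inverted inputs and outputs]
Applying the inversion property to the Boolean expressions of the full adder:
S'(A, B, Ci) = S(A', B', Ci')
Co'(A, B, Ci) = Co(A', B', Ci')
In a ripple carry adder, this property is used to shorten the critical path by
reducing the number of inverting stages in the carry path.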
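The inversion property can be confirmed by checking all eight input combinations. The following is a small illustrative check; the function names are assumptions made for this example.

```python
from itertools import product


def full_adder(a: int, b: int, ci: int) -> tuple[int, int]:
    """Return (S, Co) of a one-bit full adder."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)
    return s, co


def inversion_property_holds() -> bool:
    """True if inverting all inputs inverts both outputs, for every input pattern."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (s_inv, co_inv) != (1 - s, 1 - co):
            return False
    return True
```

Because the property holds, alternate stages of a carry chain can operate on inverted signals, so no explicit inverter is needed in the carry path between stages.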
The Mirror Adder:
□ The NMOS and PMOS chains are completely symmetrical. A maximum of two series
transistors can be observed in the carry-generation circuitry.
When laying out the cell, the most critical issue is the minimization of the capacitance
at node Co. The reduction of the diffusion capacitances is particularly important.
□ The transistors connected to Ci are placed closest to the output. Only the transistors in
the carry stage have to be optimized for optimal speed. All transistors in the sum stage
can be minimal size.
4.2.3 TRANSMISSION GATE BASED ADDER:
A rather different full adder design uses transmission gates to form multiplexers and
XORs.
Figure (4.5) shows the transistor-level schematic using 24 transistors and providing
buffered outputs of the proper polarity with equal delay.
The design can be understood by parsing the transmission gate structures into
multiplexers and an “invertible inverter” XOR structure. Note that the multiplexer
choosing S is configured to compute P ⊕ C.
The sum output and carry output can be expressed in terms of carry generate and
carry propagate as
To determine whether a bit pair will generate a carry, the following logic works:
Gi=Ai.Bi
To determine whether a bit pair will propagate a carry, either of the following logic
statements work:
Pi=Ai⊕Bi
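Using these generate and propagate signals, every carry of a small block can be written directly in terms of the inputs and the block carry-in, which is the idea behind look-ahead. The sketch below expands the carries of a 4-bit block; the function name `cla_4bit` and the integer interface are assumptions made for illustration, and it models logic behaviour only, not the gate structure.

```python
def cla_4bit(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """Add two 4-bit integers with look-ahead carries; return (sum, carry_out)."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [x & y for x, y in zip(A, B)]   # Gi = Ai.Bi
    P = [x ^ y for x, y in zip(A, B)]   # Pi = Ai xor Bi

    # Each carry depends only on G, P and c0, never on a previous carry.
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    c4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))

    carries = [c0, c1, c2, c3]
    total = sum((P[i] ^ carries[i]) << i for i in range(4))
    return total, c4
```

The growing number of product terms in c3 and c4 illustrates why look-ahead is usually limited to small groups of bits.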
The carry look-ahead equations can be implemented using a mirror structure.
Design Example: Implementing a Look-ahead Adder in Dynamic Logic:
Generate block: Gi = Ai·Bi
Propagate block: Pi = Ai ⊕ Bi
One way of implementing the sum in domino logic is through sum selection.
Both candidate sums are computed: Si0 = Ai ⊕ Bi (for an incoming carry of 0) and
Si1 = (Ai ⊕ Bi)' (for an incoming carry of 1). A dynamic gate is then used to select
one of these possibilities, based on the incoming carry. The implementation of the
multiplexer gate requires three logic levels, because no complementary carry is
available in domino logic. Keepers should be placed at all dynamic nodes.
Fig 4.12 sum select in dynamic logic
5. HIGH SPEED ADDERS:
1. MANCHESTER CARRY CHAIN ADDER:
The carry propagation circuit can be simplified by adding Generate and Delete signals.
The propagate path is unchanged: it passes Ci to the Co output if the propagate signal
(Ai ⊕ Bi) is true. If the propagate condition is not satisfied, the output is either
pulled low by the Di signal or pulled up by Gi'.
A Manchester carry chain adder uses a cascade of pass transistors to
implement the carry chain. During the precharge phase (Φ = 0), all intermediate
nodes of the pass-transistor carry chain are precharged to VDD. During
evaluation, node k is discharged when there is an incoming carry and the
propagate signal Pk is high, or when the generate signal for stage k (Gk) is high.
Consider the 4-bit full adder shown in the figure. Suppose that the values of Ak and Bk
(k = 0, 1, 2, 3) are such that all propagate signals P0, P1, P2, P3 are high. An incoming
carry Ci,0 = 1 then propagates through the complete adder chain and causes an outgoing
carry Co,3 = 1.
If all the propagate signals are high (P0 P1 P2 P3 = 1), then Co,3 = Ci,0; otherwise a
carry is either generated or deleted within the block. When BP = P0 P1 P2 P3 = 1, the
incoming carry is forwarded immediately to the next block. Hence the name carry bypass
adder or carry skip adder.
[Figure: four cascaded full adders (FA) with propagate signals P0 to P3, generate signals G0 to G3, carry-in Ci,0 and carries Co,0 to Co,3; in the bypass version a multiplexer controlled by BP = P0 P1 P2 P3 selects between Ci,0 and the rippled carry to produce Co,3]
Idea: If (P0 and P1 and P2 and P3 = 1) then Co,3 = Ci,0, else “kill” or
“generate”.
Let us now compute the delay of an N-bit adder. We assume that the total adder is
divided into (N/M) equal-length bypass stages, each of which contains M bits.
Fig. 4.16 Carry bypass adder (N = 16)
$t_{add} = t_{setup} + M\,t_{carry} + (N/M)\,t_{mux} + t_{sum}$
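To get a feel for the numbers, the bypass delay expression given in these notes can be evaluated next to the ripple carry estimate. This is only a sketch: the function names are illustrative, the expression is taken as written in these notes (other texts use a slightly more detailed form), and the unit delays in the usage example are arbitrary assumptions.

```python
def ripple_delay(n: int, t_carry: float, t_sum: float) -> float:
    """Ripple carry estimate: (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum


def bypass_delay(n: int, m: int, t_setup: float, t_carry: float,
                 t_mux: float, t_sum: float) -> float:
    """Carry bypass estimate: t_setup + M*t_carry + (N/M)*t_mux + t_sum."""
    if n % m != 0:
        raise ValueError("N must be a multiple of the stage size M")
    return t_setup + m * t_carry + (n // m) * t_mux + t_sum
```

With all component delays assumed equal to one unit, `bypass_delay(16, 4, 1, 1, 1, 1)` gives 10 units against 16 units from `ripple_delay(16, 1, 1)`; the bypass term grows with N/M rather than with N.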
Fig 4.17 4 bit carry select module
4.6 Multiplier:
Multiplication is a very important operation. Often the speed of multiplication
limits the performance of the digital processor. Multiplications are used in many
digital signal processing applications: correlation, convolution, filtering, and
frequency analysis.
The analysis of the multiplier gives us some further insight into how to optimize
the performance (or the area) of complex circuit topologies.
Example: 12 x 5
The multiplication process may be viewed as consisting of the following two steps:
1. Evaluation of partial products.
2. Accumulation of the shifted partial products.
It should be noted that binary multiplication is equivalent to a logical AND operation.
Thus evaluation of partial products consists of the logical ANDing of the multiplicand
and the relevant multiplier bit. Each column of partial products must then be added
and, if necessary, any carry values passed to the next column.
The partial product terms are called summands. There are M*N summands, which are
generated in parallel by a set of M*N AND gates.
An n*n multiplier requires n(n-2) full adders, n half adders, and n^2 AND gates. The
worst-case delay is (2n+1)tg, where tg is the worst-case adder delay.
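The two steps, AND-based partial product generation followed by column-wise accumulation, can be sketched behaviourally as follows. The function names are assumptions made for this illustration, and the accumulation is done with ordinary integer addition rather than an explicit adder array.

```python
def partial_products(x: int, y: int, m: int, n: int) -> list[list[int]]:
    """Return the M*N summands: row j holds x_i AND y_j for i = 0..m-1."""
    return [[((x >> i) & 1) & ((y >> j) & 1) for i in range(m)]
            for j in range(n)]


def multiply(x: int, y: int, m: int, n: int) -> int:
    """Accumulate the summands, each weighted by its column position 2^(i+j)."""
    total = 0
    for j, row in enumerate(partial_products(x, y, m, n)):
        for i, bit in enumerate(row):
            total += bit << (i + j)
    return total


def array_multiplier_cost(n: int) -> dict[str, int]:
    """Cell counts quoted above for an n*n array multiplier."""
    return {"full_adders": n * (n - 2), "half_adders": n, "and_gates": n * n}
```

For the 12 x 5 example with 4-bit operands, `multiply(12, 5, 4, 4)` accumulates 16 summands, and `array_multiplier_cost(4)` gives 8 full adders, 4 half adders and 16 AND gates.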
For 4-bit numbers, the expression above may be expanded as in the table
below.
Array Multiplier:
4.6.2 Wallace-Tree Based Multiplier:
Principle
Sum N shifted partial products
Do the N-input addition efficiently
Reduce the N-input addition in steps
Use counters, e.g. carry-save adder (CSA) (3:2 reduction)
CSA is simple, it is just a full adder
At the end of the array you need to add two parts together.
This takes a fast adder, but you only need one at the end, not one for each partial
product.
The delay through the array addition (not including the carry-propagate adder, CPA) is
proportional to log_1.5(n), where n is the width of the Wallace tree.
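The 3:2 reduction can be sketched on whole words: each carry-save step turns three operands into a sum word and a carry word without propagating carries. The code below is an illustrative model only; the function names are assumptions, and the simple left-to-right grouping of operands into triples is a simplification of the column-wise wiring of a real Wallace tree.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 reduction: return (sum_word, carry_word) with x + y + z == sum + carry."""
    s = x ^ y ^ z
    c = ((x & y) | (y & z) | (x & z)) << 1
    return s, c


def wallace_reduce(operands: list[int]) -> tuple[list[int], int]:
    """Reduce operands to at most two words; return (remaining_words, levels)."""
    ops = list(operands)
    levels = 0
    while len(ops) > 2:
        nxt = []
        i = 0
        while i + 3 <= len(ops):
            s, c = carry_save_add(ops[i], ops[i + 1], ops[i + 2])
            nxt += [s, c]
            i += 3
        nxt += ops[i:]          # operands left over pass to the next level
        ops = nxt
        levels += 1
    return ops, levels
```

The two remaining words are then added by a single fast carry-propagate adder, for example `sum(wallace_reduce(partials)[0])` in this model.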
4.6.4 BRAUN MULTIPLIER:
The simplest parallel multiplier is the Braun array. All the partial products are
computed in parallel, then collected through a cascade of Carry Save Adders.
The completion time is limited by the depth of the carry save array, and by the
carry propagation in the adder. Note that this multiplier is only suited for
positive operands. The structure of the Braun algorithm for the unsigned binary
multiplication is shown in figure 4.20
4.6.3 BAUGH-WOOLEY MULTIPLIER:
In signed multiplication, the length of the partial products and the number of
partial products can be very high. The Baugh-Wooley algorithm was introduced for
signed multiplication and is one of the cost-effective ways to handle the sign bits.
The method was developed in order to design regular multipliers suited to 2's
complement numbers.
Let A and B be two n-bit numbers, which can be represented as
where ai and bi are the bits of A and B, respectively, and a(n-1) and b(n-1) are the
sign bits. The full-precision product, P = A × B, is given by the equation:
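For reference, a sketch of these expressions in standard two's-complement notation is given below. The symbols a_i, b_i and n follow the description above; the exact layout is an assumption based on the usual textbook form rather than a reproduction of the original figure.

```latex
A = -a_{n-1}\,2^{n-1} + \sum_{i=0}^{n-2} a_i\,2^{i},
\qquad
B = -b_{n-1}\,2^{n-1} + \sum_{j=0}^{n-2} b_j\,2^{j}

P = A \times B
  = a_{n-1} b_{n-1}\,2^{2n-2}
  + \sum_{i=0}^{n-2}\sum_{j=0}^{n-2} a_i b_j\,2^{i+j}
  - 2^{n-1}\sum_{i=0}^{n-2} a_i b_{n-1}\,2^{i}
  - 2^{n-1}\sum_{j=0}^{n-2} a_{n-1} b_j\,2^{j}
```

The first two terms carry positive weight and the last two carry negative weight, since each of the last two contains exactly one sign bit.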
The first two terms of above equation are positive and last two terms are
negative. In order to calculate the product, instead of subtracting the last two
terms, it is possible to add the opposite values. The above equation signifies the
Baugh-Wooley algorithm for multiplication process in two’s complement form.
The above algorithm implemented as architecture
4.6.4 Booth Multiplier:
Booth's algorithm is a multiplication algorithm for signed binary numbers in
two's complement notation. It starts from the observation that, with the ability
to both add and subtract, there are multiple ways to compute a product.
For example, when multiplying by 9 in decimal:
Multiply by 10 (easy, just shift the digits left), then subtract the multiplicand once.
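The same shift-and-subtract idea applies in binary. The sketch below uses plain radix-2 Booth recoding, in which each multiplier bit pair (y_i, y_{i-1}) is replaced by a digit y_{i-1} - y_i in {-1, 0, +1}; this is one common form of the algorithm and is assumed here for illustration, since the encoding table itself is not reproduced in these notes. The function names are likewise illustrative.

```python
def booth_recode(multiplier: int, n_bits: int) -> list[int]:
    """Return radix-2 Booth digits (LSB first) of an n-bit two's-complement value."""
    digits = []
    prev = 0                             # implicit y_{-1} = 0
    for i in range(n_bits):
        bit = (multiplier >> i) & 1
        digits.append(prev - bit)        # +1: add, -1: subtract, 0: skip
        prev = bit
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Multiply by adding or subtracting shifted copies of the multiplicand."""
    product = 0
    for i, digit in enumerate(booth_recode(multiplier, n_bits)):
        product += digit * (multiplicand << i)
    return product
```

For the 4-bit multiplier 7 (0111), the digits represent 8 - 1, so a run of ones costs one subtraction and one addition, just as 9 was handled as 10 - 1 above.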
Booth encoding Algorithm:
Tree Multiplier with Booth Encoding :
4.7 Barrel Shifter:
A barrel shifter performs a right rotate operation. As mentioned earlier, it
handles left rotations using the complementary shift amount. Barrel shifters
can also perform shifts when suitable masking hardware is included to
mask out the bits that are rotated off the end of the shifter.
Notice how, unlike funnel shifters, barrel shifters contain long wrap-around
wires. In a large shifter, it is beneficial to upsize or buffer the drivers for these
wires.
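The relationship between right rotation, left rotation and masked shifts can be sketched as follows. This is a behavioural illustration only; the function names and the explicit `width` parameter are assumptions made for the example.

```python
def rotate_right(word: int, amount: int, width: int) -> int:
    """Rotate the low `width` bits of word right by `amount` positions."""
    mask = (1 << width) - 1
    word &= mask
    amount %= width
    return ((word >> amount) | (word << (width - amount))) & mask


def rotate_left(word: int, amount: int, width: int) -> int:
    """Left rotation expressed as a right rotation by the complementary amount."""
    return rotate_right(word, (width - amount) % width, width)


def logical_shift_right(word: int, amount: int, width: int) -> int:
    """Right shift built from a rotate plus a mask on the wrapped-around bits."""
    if not 0 <= amount < width:
        raise ValueError("amount must satisfy 0 <= amount < width")
    keep = (1 << (width - amount)) - 1   # clears the bits that wrapped around
    return rotate_right(word, amount, width) & keep
```

For a 4-bit word 1011, a right rotate by one gives 1101, and masking the wrapped bit turns this into the logical shift result 0101.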
4.8 ALU:
An ALU (Arithmetic Logic Unit) performs arithmetic operations and Boolean
operations. The basic arithmetic operations are addition and subtraction. One may
either multiplex between an adder and a Boolean unit or merge the Boolean unit
into the adder, as in the classic transistor-transistor logic.
The heart of the ALU is a 4-bit adder circuit. A 4-bit adder must take the sum of
two 4-bit numbers; it is assumed that all 4-bit quantities are
presented in parallel form and that the shifter circuit is designed to accept and shift
a 4-bit parallel sum from the ALU. The sum is stored in parallel at the output
of the adder, from where it is fed through the shifter and back to the register array.
Therefore, a single 4-bit data bus is needed from the adder to the shifter, and
another 4-bit bus is required from the shifted output back to the register.
9. DESIGNING MEMORY AND ARRAY STRUCTURES
INTRODUCTION:
Memory is classified into two categories. They are
1. Background Memory
2. Foreground Memory
1. Background Memory:
2. Foreground Memory:
A memory that is embedded into the logic itself is called foreground memory.
Ex: Latch, Register and Flip-flop
Semiconductor Classification
Memories come in many different formats and styles. The type of memory unit
that is preferable for a particular application depends on its size, the time to access
stored data, the access pattern and the system requirements.
Fig. 4.27 Intuitive Architecture for NxM Memory
Fig. 4.28 A decoder reduces the number of address bits
To reduce the complexity of accessing the stored data, the column decoder
is used to select one particular cell out of M bits. This is shown in Fig. 4.27.
K = log2(N)
Again, the column decoder is used to select one particular cell from the M
bits. This is shown in Fig. 4.28.
Fig. 4.29 Array- Structured Memory Organisation
The horizontal select line that enables a single row of cells is called the
word line, while the wire that connects the cells in a single column to the
input/output circuitry is called the bit line. The area of a large memory module is
dominated by the size of the memory core, so it is crucial to keep the size of the
basic storage cell as small as possible.
line. So the memory is partitioned into smaller blocks (P). Thereby an extra
address line is required.
Problem:
A 4 Mbit SRAM can be designed as a composition of 32 blocks, each of which
contain 128Kbits. Each block is structured as an array with 1024 rows and 128
columns. Find out the number of row address(x) , column address(y) and block
address(z).
Solution:
Number of blocks = 32
2^z = 32 = 2^5, so block address z = 5 bits
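The row and column address widths follow in the same way from the array dimensions given in the problem. The short sketch below works through all three; the helper name `address_bits` is an assumption made for this example, and it assumes that a single column is selected per access.

```python
def address_bits(count: int) -> int:
    """Return log2(count) for a power-of-two count of rows, columns or blocks."""
    bits = count.bit_length() - 1
    if count <= 0 or (1 << bits) != count:
        raise ValueError("count must be a power of two")
    return bits


rows, columns, blocks = 1024, 128, 32

x = address_bits(rows)      # row address bits:    2^x = 1024
y = address_bits(columns)   # column address bits: 2^y = 128
z = address_bits(blocks)    # block address bits:  2^z = 32

total_cells = rows * columns * blocks   # total storage in bits
```

This gives x = 10, y = 7 and z = 5, and the product of rows, columns and blocks equals 4 Mbit, consistent with the problem statement.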
4.10 Memory Core
While designing large memories, the size of the memory cell must be as small
as possible without affecting the design quality such as speed and reliability. The
various types of memory core are read only, non volatile and read-write memory
cores.
ROM Cells:
Consider the simplest cell, which is the diode-based ROM cell. In this cell, the
presence or absence of a diode between WL and BL differentiates between ROM cells
storing a 1 or a 0, respectively.
All output driving current is provided by the MOS transistor in the cell. The
word-line driver is only responsible for charging and discharging the word-line
capacitance. An example of 4 x 4 OR ROM cell array is shown in Fig. 4.32
A 4 x 4 MOS NAND ROM is given in Fig. 4.34. According to NAND
concepts, all the outputs BL(0), BL(1), BL(2), BL(3) are 1 if the inputs WL(0),
WL(1), WL(2), WL(3) are 0. If WL(1) is 1, then BL(0) is 0.
i. EPROM:
It can be erased by using ultraviolet rays on the cells through a
transparent window on the IC package. The main disadvantage of EPROM is that, it
must be removed from the board before erasing procedure.
ii. EEPROM:
Charge can be injected into or removed from the floating gate by a
mechanism called Fowler-Nordheim tunnelling. The main
advantage is that erasing is achieved simply by reversing the voltage applied
during the writing process. However, repeated programming causes a drift in
Vt, which can lead to malfunction or the inability to reprogram the device.
Called self-limiting. The charge injected into the floating gate shifts the I-V
characteristics of the transistor, as shown in Fig. 3.11.
3. Read-Write Memory:
RWM or RAM memories are classified into two categories, depending on whether
storage relies on positive feedback or on capacitive charge.
The word line is used to read and write the data on the bit lines BL and BLbar. When
WL is 1, M5 and M6 are on, so the data stored in Q and Qbar is available on BL and
BLbar respectively (read operation). Similarly, when WL is 1, the data on BL and BLbar
is written to Q and Qbar respectively (write operation).
Key points are: 1. Periodic refresh is required to preserve data. 2. Requires less
area. 3. Slower. 4. Produces a single-ended output.
Fig. 4.36 3T Dynamic Memory Cell
Fig. 4.37 1-T Dynamic Memory Cell
Read operation: When the read word line (RWL) is 1, M3 turns on. Depending on the
value stored on node x, M2 is either on or off. If x is 1, M2 turns on and, since M3 is
already on, BL2 = 0. If x is 0, M2 is off and BL2 maintains its value of 1.
The complexity can be reduced by using the 1-T dynamic memory cell of Fig. 4.37,
whose operation is described below:
Read operation: When WL is 1, M1 turns on, so the data stored on the capacitance is
available on the BL line.
4. Content Addressable or Associative Memory (CAM)
A CAM is a special type of memory device that stores data but also has the
ability to compare all the stored data in parallel with incoming data in an efficient
manner. The cell combines a 6T RAM storage cell (M4-M9) with a 1-bit digital
comparator (M1-M3). When the cell is to be written, complementary data is forced
onto the bit lines while the word line is enabled, as in a standard SRAM cell.
In the compare mode, the stored data S and Sbar are compared to the
incoming data, which is provided on the complementary bit lines Bit and Bit bar. The
match line is tied to all the CAM cells in a given row, and is initially precharged to
VDD. If S and Bit match, the internal node int is discharged, and M1 is turned off,
keeping the match line high. However, if the stored and incoming bit are different,
int is charged to VDD-Vt causing the match line to discharge. The application of CAM
cell to a fully associative cache memory is shown in Fig. 4.38.
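The compare behaviour can be modelled in software as a match line per row that starts high and is pulled low by any mismatching bit. The class below is a behavioural sketch only; the class and method names are assumptions for illustration, and the loop over rows stands in for what the hardware does in parallel.

```python
class CamArray:
    """Behavioural model of a CAM: rows of stored words, each with a match line."""

    def __init__(self, width: int) -> None:
        self.width = width
        self.rows: list[int] = []

    def write(self, word: int) -> int:
        """Store a word (as in a normal SRAM write) and return its row index."""
        self.rows.append(word & ((1 << self.width) - 1))
        return len(self.rows) - 1

    def _match_line(self, stored: int, key: int) -> int:
        match = 1                                  # match line precharged high
        for bit in range(self.width):
            if ((stored >> bit) & 1) != ((key >> bit) & 1):
                match = 0                          # any mismatch discharges it
        return match

    def search(self, key: int) -> list[int]:
        """Return the indices of all rows whose match line stays high."""
        return [i for i, w in enumerate(self.rows) if self._match_line(w, key)]
```

In a fully associative cache, the stored words would be address tags and the returned row index would select the corresponding data entry.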
3.3. Memory Peripheral Circuitry
In the memory core, performance and reliability are always traded off against
reduced area. The circuit designer therefore relies on the peripheral circuitry
to recover both speed and electrical integrity. Some of the
memory peripheral circuits are
1. Address Decoder
2. Sense amplifier
3. Voltage Reference
4. I/O drivers/buffers
Address Decoder
Whenever a memory allows random address-based access, an address
decoder must be present. The two types of decoders are row decoders and column
decoders. The geometry of the decoders must match the cell dimensions of the memory
core; this is called pitch matching. If it fails, the result is additional delay and
power dissipation.
Row Decoder
Consider an 8-bit address decoder. Each of the outputs WLi is a logic
function of the 8 input address signals (A0 to A7). For example, the rows with address
0 and 127 are enabled by the following logic functions (taking A0 as the least
significant bit and writing A' for the complement of A):
WL0 = A0' A1' A2' A3' A4' A5' A6' A7'
WL127 = A0 A1 A2 A3 A4 A5 A6 A7'
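A behavioural sketch of such a row decoder is shown below: each word line is the AND of all address bits, taken true or complemented according to the row number. The function name and the convention that A0 is the least significant bit are assumptions made for this illustration.

```python
def row_decoder(address: int, n_bits: int = 8) -> list[int]:
    """Return the 2^n word-line values (one-hot) for the given address."""
    word_lines = []
    for row in range(1 << n_bits):
        wl = 1
        for k in range(n_bits):
            a_k = (address >> k) & 1
            # Use A_k where the row number has a 1, and its complement otherwise.
            wl &= a_k if (row >> k) & 1 else 1 - a_k
        word_lines.append(wl)
    return word_lines
```

For any address exactly one word line is high; with address 127 it is WL127, whose product term uses A0 to A6 true and A7 complemented under the bit ordering assumed here.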
This function can be implemented in two stages, using a single 8-input NAND gate
and an inverter. For a single-stage implementation, it is converted into a wide NOR
using De Morgan's rule:
WL0 = (A0 + A1 + A2 + A3 + A4 + A5 + A6 + A7)'
To implement this function, an 8-input NOR gate is needed per row. The propagation delay
of the decoder adds directly to both the read and write access times. The decoder can be
designed in the following ways:
Consider the 8-input NAND decoder. The expression for WL0 can be regrouped in the
following way:
WL0 = (A0' A1')(A2' A3')(A4' A5')(A6' A7')
Fig. 4.40 A NAND decoder using 2-input predecoders
4096 transistors.
2. As the number of inputs to the gate is halved, the propagation delay is reduced by
approximately a factor of 4.
NOR decoders are substantially faster but they consume more area than their
NAND counterparts and drastically more power. This is clear from the following observation:
only a single word line is being pulled down after the precharge in a NAND decoder while
only a single wire stays high in the NOR decoder.
Fig. 4.41. 2 input NOR Decoder
performance and architectural considerations. One implementation is based on a
CMOS pass-transistor multiplexer, which is shown in Fig. 4.43.
Fig. 4.43. Four-input pass-transistor-based column decoder using a predecoder
The main advantage of this approach is its speed, because only a single pass
transistor is inserted in the signal path, which introduces only a minimal extra
resistance. Column decoding is one of the last actions to be performed in the read
sequence, so the predecoding can be executed in parallel with other operations such
as the memory access and sensing, and can be performed as soon as the column address is
available. Its propagation delay therefore does not add to the overall memory access
time. Slower implementations such as NAND decoders might even be acceptable. The
disadvantage of the structure is its large transistor count: (K + 1)2^K + 2^K devices
are needed for a 2^K-input decoder. For example, a 1024-to-1 column decoder
requires 12,288 transistors.
A more efficient way of implementing the decoder is a tree decoder, which uses a binary
reduction scheme. No predecoder is required, and the transistor count is
drastically reduced.
$N_{tree} = 2^{K} + 2^{K-1} + \cdots + 4 + 2 = 2\,(2^{K} - 1)$
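The two device counts can be compared numerically. The sketch below simply evaluates the expressions quoted in these notes; the function names are illustrative assumptions.

```python
def pass_transistor_decoder_devices(k: int) -> int:
    """Device count (K + 1) * 2^K + 2^K for a 2^K-input pass-transistor decoder."""
    return (k + 1) * 2**k + 2**k


def tree_decoder_devices(k: int) -> int:
    """Device count 2^K + 2^(K-1) + ... + 4 + 2 for a tree decoder."""
    return sum(2**i for i in range(1, k + 1))
```

For a 1024-to-1 column decoder (K = 10), the first expression gives 12,288 devices, while the tree decoder sum gives 2,046, which equals 2 x (2^10 - 1).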
Sense amplifier
Sense amplifiers play a major role in the functionality, performance and
reliability of memory circuits. They perform the following functions:
2. Delay reduction: The amplifier compensates for the restricted fan-out driving
capability of the memory cell by accelerating the bit line transition, or by
detecting and amplifying small transitions on the bit line to large signal
output swings.
3. Power reduction: Reducing the signals going on the bit lines can eliminate a
substantial part of the power dissipation related to charging and discharging the bit
lines.
4. Signal restoration: Because the read and refresh functions are intrinsically linked in
1T DRAM, it is necessary to drive the bit lines to the full signal range after sensing.
The differential amplifier takes small-signal differential inputs (the bit
line voltages) and amplifies them to a large-signal single-ended output. A
differential approach presents numerous advantages. Such an amplifier
rejects noise that is equally injected into both inputs. This is especially attractive
in memories, where the exact value of the bit line signal varies from die to die and
even between locations on a single die. The effectiveness of a differential amplifier
is characterized by its ability to reject the common noise and amplify the true
difference between the signals. Signals common to both inputs are suppressed at
the output of the amplifier by a ratio called the common mode rejection ratio.
Similarly, spikes on the power supply are suppressed by a ratio called the power
supply rejection ratio.
The input signals bit and bit-bar are heavily loaded and driven by the SRAM memory cell.
The swing on those lines is small, as a small memory cell drives a large capacitive
load. The inputs are fed to the differential input devices M1 and M2, and transistors
M3 and M4 act as an active current mirror load. The amplifier is conditioned by the
sense amplifier enable signal SE. Initially, the inputs are precharged and equalized to
a common value while SE is low, disabling the sensing circuit. Once a read operation is
initiated, one of the bit lines drops. SE is enabled when a sufficient differential
signal has been established, and the amplifier evaluates.
$A_{sense} = -g_{m1}\,(r_{o2} \parallel r_{o4})$
where gm1 is the transconductance of the input transistor and ro is the small-signal
resistance of the transistor.
Voltage References:
The operation of sophisticated memories requires a number of voltage references and
supply levels, including the following:
▪ Boosted word line voltage: In a conventional 1T DRAM cell using an NMOS pass
transistor, the maximum voltage that can be written onto the cell equals
VDD-VT, which negatively impacts the reliability of the memory. Raising the word line
voltage to VDD+VT avoids this loss; a charge pump can be used to generate the
boosted voltage.
▪ Half VDD: DRAM bit lines are precharged to VDD/2. This voltage must be generated on
chip.
▪ Reduced internal supply: Most memory circuits operate at a lower supply voltage
than the external power supply. DRAMs use internal voltage regulators to generate
the required voltages while remaining compliant with the standard interface voltages.
The design of voltage references falls under the category of analog circuit
design. Some of the reference circuits are given below:
2. Charge Pump:
A charge pump is an ideal generator for loads, such as word line boosting, that do
not draw much current. The concept is best explained with the simple circuit of
Fig. 4.47. Transistors M1 and M2 are connected in diode style. Assume initially that
the clock is high. During this phase, node A is at ground and node B is at VDD-VT.
The charge stored in the capacitor is given by
The bottom devices M3 and M4 act as a current mirror to force the same current
through the drain of M1 and the resistor R1. By making the device M1 large and keeping
the current small enough, the source-to-gate voltage of M1 can be made approximately
equal to VTP:
$V_{GS,M1} = V_{TP} + \sqrt{2 I_{M1} / \beta_{M1}} \approx V_{TP}$
The current flowing through the resistor and the drain current of M1 are then both
equal to VTP/R1. Note that M2 acts as a biasing transistor. Since devices M1 and M5
experience the same gate-to-source voltage, the drain current of M1 is mirrored to M5.
The reference voltage is given by
$V_{REF} \approx V_{TP}\,(R_2 / R_1)$
Buffers/Drivers:
The length of word and bit lines increases with increasing memory sizes. Even
though some of the associated performance degradation can be alleviated by partitioning
the memory array, a large portion of the read and write access time can be attributed to
the wire delays. A major part of the memory Periphery area is therefore allocated to the
drivers, in particular the address buffers and I/O drivers.
Timing and Control:
Careful timing of the different events, such as address latching, word line
decoding, bit line precharging and equalization, sense amplifier enabling and output
driving, is necessary if maximum performance is to be achieved. Although the timing and
control circuitry occupies only a minimal amount of area, its design is an integral and
defining part of the memory design process. It requires careful optimization and
extensive, repetitive SPICE simulations over a range of operating
conditions. The different memory timing approaches can be classified as clocked and
self-timed.
In this scheme, the user must provide two main control signals, RAS (row-address
strobe) and CAS (column-address strobe), that indicate the presence of the
row and column addresses respectively. Another control signal (W) indicates whether
the intended operation is a read or a write. These signals can be interpreted as
external clock signals and are used to time the internal memory events. Similar to the
second clocking approach, the RAS and CAS signals must be sufficiently separated so
that all the ensuing operations have come to completion. Fig. 3.15 shows a
simplified timing diagram of a 1 x 4 Mbit DRAM memory.
Fig. 4.51 RDRAM Architecture
can be transferred in a short period. Multiple memory chips can be connected to the
bus called the Rambus channel. The schematic diagram of the input/ output circuitry
of an RDRAM is given in the Fig. 4.51.
5. Explain logarithmic shifter. CO4 K3
6.5 Part A Q & A
Unit-IV
S.No  Question and Answers  CO  K Level
2. delay of the carry path. A number of circuit topologies exist proving that
careful optimization of the circuit topology and the transistor sizes helps CO4 K1
6. CO4 K1
What are the advantages of carry skip adder?
7. CO4 K2
What is the logic of adder for increasing its performance?
8 CO4 K1
Define input ordering.
For PMOS and NMOS the inner inputs encounters the body effect and
requires high threshold voltage to turn on. By input ordering the rare
changing inputs are moved to inner inputs. This provides sufficient
power saving.
9. CO4 K2
Write down the expression for worst case delay for RCA.
t = (n-1) tc+ts
10. CO4 K2
Write down the expression to obtain the delay for an N-bit carry bypass adder.
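For reference, the form of this expression usually given in textbook treatments is shown below. It assumes the N-bit adder is split into N/M equal bypass stages of M bits each. The symbols $t_{setup}$ (time to form the propagate and generate signals) and $t_{bypass}$ (delay of the bypass multiplexer of one stage) are naming assumptions, since the answer is not written out here.

```latex
t_{adder} = t_{setup} + M\,t_{carry} + \left(\frac{N}{M} - 1\right) t_{bypass} + (M - 1)\,t_{carry} + t_{sum}
```

The first $M\,t_{carry}$ term covers the ripple through the first stage, the bypass term covers the skipped middle stages, and the last two terms cover the ripple and sum in the final stage.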
The simplest multiplier is the Braun multiplier. All the partial products
are computed in parallel, and then collected through a cascade of
Carry Save Adders. The completion time is limited by the depth of the
carry save array, and by the carry propagation in the adder. This
multiplier is suitable for positive operands.
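The structure described above (parallel partial products, carry-save accumulation, and a final carry-propagate adder) can be sketched behaviorally as follows. This is a word-level model using Python integers rather than a gate-level array, and the name `braun_multiply` and the 4-bit default width are assumptions made for the example.

```python
def braun_multiply(a, b, width=4):
    """Multiply two unsigned width-bit operands using carry-save accumulation."""
    # Partial products: row j is the multiplicand AND-ed with bit j of b,
    # shifted to its weight. All rows are independent of each other.
    rows = [(a if (b >> j) & 1 else 0) << j for j in range(width)]

    s, c = 0, 0                      # running sum word and carry word
    for row in rows:
        # Carry-save step: three inputs are compressed into two outputs
        # bit by bit, with no carry propagation along the word.
        new_s = s ^ c ^ row
        new_c = ((s & c) | (s & row) | (c & row)) << 1
        s, c = new_s, new_c

    # Only this last addition propagates carries (the vector-merging adder).
    return s + c

print(braun_multiply(13, 11))
```

As in the text, the operands are treated as positive (unsigned) numbers; signed operands would need a different scheme.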
12. CO4 K1
Why do we use Booth's algorithm?
Booth's algorithm is a method that reduces the number of
multiplicand multiples. For a given number of ranges to be
Logarithmic shifter
14. CO4 K1
What are the various shift operations available?
Logical left shift
Logical right shift
Arithmetic right shift
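The difference between these three operations is easiest to see on a fixed-width word. The sketch below assumes an 8-bit word and shift amounts between 0 and 8. The function names are illustrative only.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1          # 0xFF for an 8-bit word

def logical_left(x, k):
    """Shift left, fill with zeros on the right, drop bits beyond the word."""
    return (x << k) & MASK

def logical_right(x, k):
    """Shift right, fill with zeros on the left."""
    return (x & MASK) >> k

def arithmetic_right(x, k):
    """Shift right, fill on the left with copies of the sign bit."""
    x &= MASK
    shifted = x >> k
    if x >> (WIDTH - 1):                         # sign bit set: negative value
        shifted |= (MASK << (WIDTH - k)) & MASK  # set the k vacated top bits
    return shifted

x = 0b10010110
print(format(logical_left(x, 1), "08b"))
print(format(logical_right(x, 3), "08b"))
print(format(arithmetic_right(x, 3), "08b"))
```

For a positive operand (sign bit 0), the logical and arithmetic right shifts give the same result.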
15. CO4 K2
How many storage locations are available when a
128 columns.
$2^Z = 32 = 2^5$, so the block address is 5 bits.
17. CO4 K1
Why is a diode-based ROM cell not suitable for large memories?
The disadvantage of the diode cell is that it does not isolate the bit line
from the word line. All the current required to charge the bit-line
capacitance, which can be quite high for large memories, has to be provided
through the word line and its driver. Therefore it is suitable only for
small memories.
18 CO4 K2
Draw the schematic symbol of FAMOS
19. Draw 6T CMOS SRAM cell. CO4 K2
22. CO4 K2
What is CAM?
A CAM (content-addressable memory) is a special type of memory device that
stores data but can also compare all the stored data in parallel with incoming
24. CO4 K2
Why are NOR decoders faster than NAND decoders?
NOR decoders are substantially faster but they consume more area
than their NAND counterparts and drastically more power. This is clear
from the following observation: only a single word line is being pulled
down after the precharge in a NAND decoder while only a single wire
external world, sense amplifiers are used to amplify the internal swing
6.7 Supportive online Certification courses (NPTEL, Swayam, Coursera, Udemy, etc.)
for EC8095 VLSI DESIGN
Supportive Link to Videos UNIT IV
3 =5-PI4T25OXI
SRAM Operation - https://www.youtube.com/watch?v
4
Memory and Storage =0BM97a7p6Zo
Circuits
6.8 Real time Applications in day to day life and to Industry
2. CMOS technology is used in a wide range of analog circuits, including data
converters, image sensors, and highly integrated transceivers for several kinds of
communication.
6.9 Contents beyond the syllabus
Unit IV 8-BIT KOGGE STONE ADDER
The functioning of the KSA can be understood by analyzing it in terms of
three distinct parts:
1. Pre-processing: This step computes the generate and propagate signals
corresponding to each pair of bits in A and B. These signals are given by the logic
equations below:
$p_i = A_i \oplus B_i$
$g_i = A_i \cdot B_i$
2. Carry look-ahead network: This block differentiates the KSA from other adders and is
the main force behind its high performance. This step computes the carries corresponding
to each bit. It uses group propagate and generate as intermediate signals, which are given
by the logic equations below:
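A behavioral sketch of an 8-bit KSA is given here to make the stages concrete. It is written under assumptions: the group signals use the standard parallel-prefix combination G = G_hi OR (P_hi AND G_lo), P = P_hi AND P_lo, the sum is formed as s_i = p_i XOR c_(i-1), and the carry-in is taken as 0. The function name `kogge_stone_add` is hypothetical.

```python
def kogge_stone_add(a, b, width=8):
    """Add two unsigned width-bit numbers; returns (sum, carry_out)."""
    # 1. Pre-processing: bit propagate and generate signals.
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]

    # 2. Carry look-ahead network: log2(width) levels. At each level, bit i
    #    combines its group with the group ending 'dist' positions below it.
    G, P = g[:], p[:]
    dist = 1
    while dist < width:
        new_G, new_P = G[:], P[:]
        for i in range(dist, width):
            new_G[i] = G[i] | (P[i] & G[i - dist])   # group generate
            new_P[i] = P[i] & P[i - dist]            # group propagate
        G, P = new_G, new_P
        dist *= 2

    # With carry-in 0, G[i] is now the carry out of bit position i.
    carries = [0] + G[:-1]

    # 3. Sum formation (assumed here): s_i = p_i XOR c_(i-1).
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, G[-1]

print(kogge_stone_add(200, 100))
```

For 8 bits the prefix network needs only three levels (distances 1, 2 and 4), which is why the carry delay grows with log2 of the word length rather than linearly.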
5. Implementation
The schematic of the KSA is implemented using the following building blocks:
1. Bit propagate and generate This block implements the
7. ASSESSMENT SCHEDULE
Unit 1 Assignment
Assessment
Unit Test 1
Unit 2 Assignment
Assessment
Internal Assessment 1 27.02.2023
Retest for IA 1
Unit 3 Assignment
Assessment
Unit Test 2
Unit 4 Assignment
Assessment
Internal Assessment 2 18.04.2023
Retest for IA 2
Unit 5 Assignment
Assessment
Revision Test 1
Revision Test 2
University Exam
8. Prescribed Text Books & Reference Books
TEXT BOOKS:
1. Neil H.E. Weste, David Money Harris, "CMOS VLSI Design: A Circuits
and Systems Perspective", 4th Edition, Pearson, 2017 (UNIT I, II, V)
2. Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic, "Digital
Integrated Circuits: A Design Perspective", Second Edition, Pearson,
2016 (UNIT III, IV)
REFERENCES:
1. M.J. Smith, "Application Specific Integrated Circuits", Addison
Wesley, 1997
2. Sung-Mo Kang, Yusuf Leblebici, Chulwoo Kim, "CMOS Digital
Integrated Circuits: Analysis & Design", 4th Edition, McGraw Hill
Education, 2013
3. Wayne Wolf, "Modern VLSI Design: System On Chip",
Pearson Education, 2007
4. R. Jacob Baker, Harry W. Li, David E. Boyce, "CMOS Circuit
Design, Layout and Simulation", Prentice Hall of India, 2005.
9 MINI PROJECT