BACHELOR OF TECHNOLOGY
IN
Submitted by
18FE1A0499 - P. RIZWANA
18FE1A0487 - N. TEJASWINI
18FE1A0463 - K. YOGA SRIRAM
18FE1A0493 - P. SRI HARSHA
CERTIFICATE
This is to certify that the major project work entitled “IMPLEMENTATION OF HIGH-SPEED
AND AREA-EFFICIENT VLSI ARCHITECTURE OF THREE OPERAND BINARY ADDER”
is a bona fide work done by
18FE1A0499 - P.RIZWANA
18FE1A0487 - N.TEJASWINI
18FE1A0463 - K.YOGA SRIRAM
18FE1A0493 - P. HARSHA
Under my guidance and submitted in partial fulfilment of the requirements for the award of
the Degree of Bachelor of Technology in Electronics and Communication Engineering by
Jawaharlal Nehru Technological University, Kakinada.
External Examiner
ACKNOWLEDGEMENT
We would like to thank our beloved principal Dr. PHANEENDRA KUMAR for
providing a great support for us in completing our project and for giving us the opportunity of
doing the project.
We feel elated to thank Dr. B.HARISH Professor and our Head of the Department
(HOD), for inspiring us all the way and arranging all the facilities and resources needed for our
project.
It is with immense pleasure that we would like to express our indebted gratitude to our
guide Dr. VENKATA KISHORE PERLA who guided us a lot and encouraged us in every step of
our project work. His invaluable moral support and guidance through the project helped us to a
greater extent. We are thankful to him for his valuable suggestions and discussions during the
project.
We express our hearty thanks to all the staff members and non-teaching staff for all their
help and co-operation extended in bringing out this project successfully in time.
Project Associates
Ms. P.RIZWANA
Ms. N.TEJASWINI
Mr. K.SRIRAM
Mr. P.HARSHA
DECLARATION
We hereby declare that the project entitled “IMPLEMENTATION OF
HIGH-SPEED AND AREA-EFFICIENT THREE OPERAND BINARY
ADDER” has been undertaken by us and this work has been submitted to
“VIGNAN’S LARA INSTITUTE OF TECHNOLOGY AND SCIENCE”
affiliated to JNTUK, Kakinada in the partial fulfillment of the requirements for the
award of the degree of Bachelor of Technology (B.Tech) in Electronics and
Communication Engineering (ECE), is the result of the work done by us under the
guidance of Dr. P. VENKATA KISHORE PERLA Assistant Professor of ECE
department.
We further declare that this project work has not been submitted in full or
partial requirements for the award of any degree in any other educational institution.
Project Associates
Ms. P.RIZWANA
Ms. N.TEJASWINI
Mr. K.SRIRAM
Mr. P.HARSHA
ABSTRACT
A three-operand binary adder is the basic functional unit for performing modular arithmetic in various cryptography algorithms and pseudo-random bit generators. The carry-save adder (CS3A) is the most widely used technique for three-operand addition. However, the ripple-carry stage in the CS3A leads to a high propagation delay. A parallel prefix two-operand adder such as the Han-Carlson adder (HCA) significantly reduces the critical path delay but increases the area. Hence, a new high-speed and area-efficient adder architecture is proposed that uses pre-compute bitwise addition followed by carry prefix computation logic to perform three-operand binary addition. The proposed adder consumes substantially less area and power, drastically reduces the adder delay, and achieves a lower area-delay product (ADP) and power-delay product (PDP) than the existing three-operand adder techniques. It is implemented on a ZedBoard using Xilinx tools.
LIST OF CONTENTS
ABSTRACT
LIST OF CONTENTS
LIST OF FIGURES
CHAPTER I: INTRODUCTION
4.2 Operation of proposed Adder
LIST OF FIGURES
Fig 7.3: Proposed Adder Simulation Waveforms
IMPLEMENTATION OF HIGH-SPEED AND AREA-EFFICIENT VLSI ARCHITECTURE OF THREE OPERAND BINARY ADDER
CHAPTER – 1
INTRODUCTION
Battery-driven and portable devices are in great demand in many industrial applications, which calls for low-power and area-efficient circuits. Moore's law was formulated in 1965 by Gordon Moore, a co-founder of Intel Corporation. His observation set the pace for the modern digital revolution, in which computing increases in power and decreases in cost. He predicted that the number of transistors in an integrated circuit would double at regular intervals, a rate now usually quoted as about every two years. This prediction is known as Moore's Law. Today, many industrial applications are designed in the nanometer range. Transistor scaling is limited by short-channel effects, including the hot-carrier effect and tunnelling through the gate oxide. In the CPU, the arithmetic logic unit (ALU) is a crucial part, and the adder cell is an important unit of the ALU. Adders are used in many digital circuits to perform the addition of numbers. In many computers, adders are also used in other parts of the processor to calculate addresses, table indices and similar operations. With the growing demand for portable devices such as mobile phones, laptops and tablets, the need for area- and power-efficient VLSI circuits has arisen. Low-power adder cells are used in low-power applications.
DEPARTMENT OF ECE,VLITS 1
Before the introduction of VLSI, most ICs could perform only a limited set of tasks. As VLSI allows the designer to integrate the CPU, RAM, ROM and external peripherals into a single chip, several functions can be performed using one IC.
The steps of the VLSI design cycle are shown in Figure 1.1. These steps are system specification, functional design, logic design, circuit design, physical design, fabrication and testing.
System Specifications
Design specifications are required to set out the standards for the design. The main factors to be considered in this step include the physical dimensions (size of the chip), performance, functionality, choice of fabrication technology and design techniques [10]. The expected end results of the whole process are the specifications for the speed, size, functionality and power of the VLSI circuit.
Behavioural Description
A behavioural description is then created to analyse the design in terms of functionality, performance, compliance with given standards, and other specifications. The outcome of this step is typically a timing diagram or other relationships between sub-units. This stage serves to improve the overall design process and reduce the complexity of the subsequent stages.
The logic design step converts the behavioural specification into a register transfer level (RTL) description that includes the word widths, control flow, register allocation, and logic and arithmetic operations. Further, the functional units are expressed as primitive logic operations (NAND, NOT, etc.). This description can be represented in a Hardware Description Language (HDL), namely Verilog or VHDL. The primary goal of this step is to minimise the number of Boolean expressions.
Logic Synthesis
Physical design
Physical design is the step in the standard design cycle that follows circuit design. At this step, circuit representations of the components (devices and interconnects) of the design are converted into geometric representations of shapes which, when manufactured in the corresponding layers of materials, ensure the required functioning of the components. This geometric representation is called the integrated circuit layout. The final performance of the circuit is determined by a compact arrangement of the area and accurate routing of the wires. Being an NP-hard problem, physical design is further divided into several sub-problems, namely partitioning, placement and routing. The focus of this study is on partitioning and placement.
Finally, the wafer is fabricated and diced in a fabrication facility. To ensure that the chips meet all the design and functional requirements, each chip is packaged and tested. In any case, the success of the whole process depends strongly on the relationship between abstract models at the higher level and physical implementation at the lower level.
Logic synthesis and circuit design produce the circuit components, which are taken from a physical library and converted into rectangular shapes with fixed dimensions. The circuit components are called cells or modules, and the interconnections are called nets, which are collected in a netlist. The timing constraints on signal propagation paths along nets are defined. A complete layout of the circuit, in which all the cells are placed on the chip without overlap and all the interconnection paths are completed, is the output of the physical design stage. This layout is achieved in several stages: partitioning, floor planning, placement, routing and compaction. Figure 1.2 shows the stages of circuit design.
Partitioning
An effective and efficient partitioning tool can handle engineering change requests by greatly reducing the complexity of the design process. In addition, the final product, in terms of fabrication cost and system performance, is judged on the quality of the partitioning. Partitioning is a process that divides a circuit or system into a collection of smaller parts. It is a design task that breaks a large system into smaller pieces to be implemented on separate interacting components. At the same time, it also serves as an algorithmic method for solving difficult and complex combinatorial optimisation problems, as in logic.
The size of VLSI designs has grown to systems with a very large number of transistors. The complexity of the circuit has become so high that it is difficult to design and simulate the whole system without decomposing it into sets of smaller sub-systems. Consequently, the circuits are partitioned by grouping the components into blocks, also called sub-circuits or modules. However, the actual partitioning process depends on factors such as the number of blocks, the size of the blocks, and the number of interconnections between the blocks. The output of partitioning is a set of blocks along with the interconnections required between the blocks, which is referred to as a netlist.
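The dependence of partitioning quality on the number of interconnections between blocks can be illustrated with a small sketch. The netlist, the cell names and the function cut_size below are hypothetical and are not taken from this report; the code only counts how many nets cross a block boundary for a given assignment of cells to blocks.

```python
def cut_size(nets, block_of):
    """Count the nets whose cells are spread over more than one block."""
    cut = 0
    for net in nets:
        blocks = {block_of[cell] for cell in net}
        if len(blocks) > 1:
            cut += 1
    return cut


# Hypothetical netlist: each net is the tuple of cells it connects.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]

# Two candidate assignments of the four cells to blocks 0 and 1.
split_1 = {"a": 0, "b": 0, "c": 1, "d": 1}
split_2 = {"a": 0, "b": 1, "c": 0, "d": 1}

print(cut_size(nets, split_1))  # 2 nets cross the boundary
print(cut_size(nets, split_2))  # 4 nets cross the boundary
```

Under this measure the first assignment would be preferred, since fewer interconnections have to run between the two blocks.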
Floor planning
Floor planning is the process of identifying structures that should be placed close to each other, and allocating space for them so as to meet the sometimes conflicting goals of available space (cost of the chip), required performance, and the desire to have everything close to everything else. Floor planning includes finding the placement and relative orientation of the modules so that the total device area is minimised. The modules are placed on the basis that strongly connected components lie closer to one another. After floor planning, the routing area must be partitioned into channels and switchboxes.
Placement
A crucial task in chip design is placement, which requires the positions of the modules to be fixed on a given chip area. The placement affects the total length of the wires required to interconnect the modules, and therefore the performance of the chip, the signal transition times and the power consumption. Meeting timing constraints was not a difficult task in the past, but today, because of technological advancement and growing complexity, chip design has become a complex and tedious process. In addition, related optimisation objectives and tight constraints need to be addressed. Hence, the placement requirements keep changing, as placement tools have become an essential component in integrated design flows at various stages and in various scenarios. Because placement is iterative, the overall turnaround time is highly sensitive to the runtime of placement. These reasons justify the need for flexible yet powerful and fast placement algorithms, which are essential for faster turnaround and minimum time to market.
To sum up, the main objective of placement is to find a minimum-area arrangement of the blocks that allows completion of the interconnections between the blocks. Standard cell placement includes two stages. The first stage creates an initial placement. The second stage evaluates the initial placement and then improves it iteratively until the layout has minimum area and meets the design specifications. To allow interconnections, space is left free between the blocks. During placement, an estimated amount of routing space is added between the cells. In one of the author's previous works, it has been shown that estimating the space is crucial, as too much space can lead to sub-optimal layouts, while too little space may rule out the optimal (shortest) routes for all nets, and the completion of the interconnections can also become difficult.
In any case, the quality of the placement cannot be evaluated until the routing stage is completed. Another round of placement may be carried out if a routable design is not achieved because of time constraints. Further, effort is made to reduce the number of iterations by estimating the required routing space, since once the positions of the blocks are fixed, it is difficult to improve the routing as well as the overall performance of the circuit. Thus, it is evident that an efficient placement algorithm is essential for good routing and circuit performance.
Routing
Routing is regarded as one of the most complicated steps in the back-end design process. It refers to finding suitable paths within the available layout space along which wires are connected to the desired set of pins. It tries to minimise the total wire length subject to track capacity constraints. The main goal of this stage is to ensure that the interconnections between the blocks are completed according to the specified netlists in semi-custom design. The free spaces are divided into channels and switchboxes, which are used to complete all circuit interconnections with the shortest possible wire length. Routing takes a two-stage approach: global routing and detailed routing. Global routing connects the blocks of the circuit without considering the exact geometric details of each wire and pin. Global routing specifies the "loose route" of a wire through different regions of the routing space. Detailed routing assigns actual tracks and vias to nets.
The two particular problems that arise in routing are balancing the densities of the routing channels and assigning specific wire segments to every connection. The goals of the global routing algorithm are to distribute the connections among the channels so that the channel densities are balanced, and to reduce the number of "turns" for every connection. The main drawback of the two-stage routing approach, i.e., global and detailed routing, is that it does not give adequate opportunity to handle problems that arise from signal delay, crosstalk and process constraints. Batterywala et al. proposed an intermediate step of track assignment between global and detailed routing to address these problems. During this stage, the global routing information can be used to address these problems efficiently and to help the detailed router achieve wiring completion.
Testing of digital devices is made more difficult by the restricted number of pins. Real-life, at-speed testing involves both the analog and the digital subsystems. Analog testing is ad hoc in nature and is particularly difficult because there are not enough access terminals for applying test signals and reading out outputs. Such complex systems require specific tests. General test techniques that can be considered adequate have long existed for digital circuitry, but it is not always possible to find efficient solutions for complex digital systems. For today's advanced technologies, digital fault models are more complex and not as well established as those used in the past. Traditional fault models, such as stuck-at models, do not properly reflect the faults found in modern metal oxide semiconductor technologies.
The failure mechanisms affecting micro sensors (or nano sensors) are far from being satisfactorily modelled at the electrical level. New interconnect approaches are being developed as alternatives to the complex system on a chip, such as System-in-Package (SiP) construction, which addresses 3-D integration. This is based on the idea of mounting two or more chips together and connecting them through the silicon. The advantage is a higher density for every chip. Power dissipation is still a problem, and the technology is not yet mature.
Timing is also an issue: efficient clock distribution is a formidable task for large chips. Totally synchronous designs face many problems related to clock skew, clock distribution along the chip, clock power consumption and fan-out. The task is even more complex when mixed-signal components require a clock. Partially synchronous techniques are beginning to become popular, although their advantages do not always compensate for their disadvantages. To avoid the problems of synchronism and the power dissipation related to clocking itself, chip activity can be restricted to the processing of events.
Circuit parts that are not operative can be put to sleep when the clock is controlled by activity instead of being periodic. Even mixed-signal circuitry (basically, data converters) is now being targeted by the non-synchronous paradigm. As far as incorporating sensors and actuators into very large scale integration is concerned, there are other difficulties related to the technology and its modelling. Sensors and actuators may require non-electrical models, which have to be compatible with a regular design flow based on electrical simulators. Such extended models, and the corresponding extensions or modifications of the design flow, must be targeted by current very large scale integration research.
The incorporation of non-digital devices forces much more complex design flows, in which compatibility between components is essential. In addition, verification becomes more necessary, since guaranteeing working first silicon is not an easy task. Integrated design flows, in which the tasks are supported by Computer Aided Design tools, have long been brought together in so-called design frameworks and platforms, which are becoming more complex. The number and diversity of the tools to be used depend on the particular chip to be realised.
Topics related to complex very large scale integration can be considered mature enough within the digital world. Nevertheless, all these topics need some revisiting, since their impact on complexity is increasing and the digital circuitry has to coexist and cooperate with the More-than-Moore devices within the chip. Architectural issues need to adapt in order to manage the non-digital subsystems efficiently.
As can be seen from the architecture, there are various blocks that need to be coded in the tool. The question is why Verilog alone is not used for all of them, and in what way SystemVerilog has an advantage over Verilog. The DUT block is the device under test, i.e., the top module, which is coded in Verilog. If all the remaining blocks were coded in Verilog, each block would have to be instantiated by its module name, and module-to-module communication would be very cumbersome. We therefore prefer SystemVerilog to Verilog for coding the other blocks. The main reason is that SystemVerilog includes object-oriented programming (OOP) concepts, so defining each block's code as a class makes coding easier than in Verilog.
Many companies provide advanced simulation and verification tools, such as QuestaSim, ModelSim, Xilinx and Cadence. Full adders are important components in applications such as digital signal processor (DSP) architectures and microprocessors. Apart from basic addition, adders are also used to perform useful operations such as subtraction, multiplication, division and address calculation. In most of these systems the adder lies in the critical path that determines the overall performance of the system. Conventional complementary metal oxide semiconductor (CMOS) and adiabatic adder circuits have been analyzed in terms of power and transistor count using 0.18 µm technology. Some communication systems transmit information using the electrical distribution network as the communication channel. To enable the transmission, a data signal modulated on a carrier signal is superimposed on the electrical wires. Typical power lines are designed to carry a 50/60 Hz AC power signal. Power has become one of the most important paradigms of design convergence for multi-gigahertz communication systems such as optical data links, wireless products, and microprocessor and ASIC/SoC designs.
Lowering the supply voltage, however, also reduces the performance of the circuit, which is usually unacceptable. One way to overcome this limitation, available in some application domains, is to replicate the circuit block whose supply voltage is being reduced in order to maintain the same throughput. Design aspects of a low-power phase-locked loop in VLSI technology have also been reported. That phase-locked loop is designed using 45 nm process technology parameters, which in turn offer high-speed performance at low power. The main novelties of the 45 nm technology are the high-k gate oxide, the metal gate and the very low-k interconnect dielectric. VLSI technology includes process design, trends, chip fabrication, real circuit parameters, circuit design, electrical characteristics, configuration building blocks, switching circuitry, translation onto silicon, CAD, and practical experience in layout design.
The majority of Digital Signal Processing (DSP) applications require arithmetic blocks such as multipliers and adders for the hardware realization of complex algorithms. The power consumption of arithmetic blocks needs to be minimized by the use of low-power techniques. In reported work, an experimental setup was developed to identify the sources of power dissipation and the remedies that can be adopted to minimize power dissipation in arithmetic blocks. Low-power techniques such as multi-Vt, variable Vt, pipelining, geometry scaling and the use of appropriate load capacitance were used to reduce power dissipation. A 4-bit pipelined adder was designed and its power dissipation was reduced from 9.6 µW to 4.17 µW. The designed pipelined adder can be used for DSP applications.
1. DSP
2. Communications
3. Microwave and RF
4. MEMS
5. Cryptography
6. Consumer Electronics
7. Automobiles
8. Space Applications
9. Robotics
5. Higher Reliability
Chapter- 2
Literature Survey
Types of Adders:
Many adders have been proposed, each with its own advantages and limitations. Different adders have been proposed to overcome the drawbacks of earlier ones. Let us briefly discuss these adders.
selecting the carry. As the name suggests, the carry that is generated is selected on the basis of the Cin provided at the start. Here we use two different carry-in values, i.e., 0 and 1, as shown in Fig. 4. Traditionally, one carry value generated by the previous full adder (FA) block is used by the full adder ahead of it, but here both carry values, 0 and 1, are considered instead of one. As there are two carry-in values, two sets of full adders are used to calculate the sum and carry for each carry-in value. So the sum and carry values are pre-calculated for the particular block. Then, on the basis of Cin, the sum and carry values are forwarded by the multiplexers and the AND and OR gates, giving the final sum and carry values. In terms of speed, it is faster than the ripple carry adder (RCA): the carry and sum each have two candidate output values for the two possible values of Cin, i.e., 0 and 1, and depending on the actual value of Cin the corresponding carry and sum are taken at the output.
However, the carry select adder (CSA) still has some delay: although the values are calculated beforehand, the use of twice the number of FAs along with the MUX and AND gates adds delay. Its cost is higher than that of RCA and CLA, because the number of FAs is double that in RCA, multiplexers are needed, and the area occupied is larger.
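The selection scheme described above can be sketched as a small bit-level model. This is only an illustrative sketch, not the circuit from the figure: the 16-bit width, the 4-bit block size and all function names are assumptions made for the example. Each block is added twice, once for a carry-in of 0 and once for 1, and the incoming carry then picks one of the two pre-computed results, which is the role the multiplexer plays in hardware.

```python
def ripple_add(a_bits, b_bits, cin):
    """Ripple-carry addition over LSB-first bit lists; returns (sum_bits, cout)."""
    carry = cin
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry


def to_bits(x, n):
    """Split an integer into n bits, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Rebuild an integer from an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def carry_select_add(a, b, n=16, block=4, cin=0):
    """Carry-select model: returns (n-bit sum, carry out)."""
    a_bits, b_bits = to_bits(a, n), to_bits(b, n)
    carry = cin
    result = []
    for lo in range(0, n, block):
        blk_a = a_bits[lo:lo + block]
        blk_b = b_bits[lo:lo + block]
        # Both candidate results are computed before the real carry is known.
        sum0, c0 = ripple_add(blk_a, blk_b, 0)
        sum1, c1 = ripple_add(blk_a, blk_b, 1)
        # Multiplexer: the incoming carry selects one candidate.
        result.extend(sum1 if carry else sum0)
        carry = c1 if carry else c0
    return from_bits(result), carry
```

For example, carry_select_add(1234, 4321) returns (5555, 0). The model shows why twice the number of full adders is needed: every block holds one ripple chain per candidate carry-in.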
It is faster than the RCA, CLA, CSA and CSVA adders. The carry is selected depending on the Pi signals fed to the AND logic block. If the output of the AND logic block is 1, then Cin is selected as the carry (skipping the addition via the FAs), and if the output is 0, the carry is generated via the FAs. The delay therefore depends on the values of Pi and the AND logic block: if the AND logic block outputs '1', the calculation through the FAs is skipped; otherwise the calculation happens via the FAs.
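The skip condition described above can be modelled as follows. This is a sketch under assumptions: the per-bit propagate signal is taken as Pi = ai XOR bi, the 16-bit width and 4-bit block size are chosen only for illustration, and ordinary integer addition stands in for the chain of full adders inside each block. The function also reports how many blocks took the skip path.

```python
def carry_skip_add(a, b, n=16, block=4, cin=0):
    """Carry-skip model: returns (n-bit sum, carry out, number of skipped blocks)."""
    carry = cin
    total = 0
    skipped = 0
    for lo in range(0, n, block):
        width = min(block, n - lo)
        mask = (1 << width) - 1
        blk_a = (a >> lo) & mask
        blk_b = (b >> lo) & mask
        # AND of all per-bit propagate signals Pi = ai XOR bi in this block.
        block_propagate = (blk_a ^ blk_b) == mask
        # Integer addition stands in for the full-adder chain of the block.
        blk_sum = blk_a + blk_b + carry
        total |= (blk_sum & mask) << lo
        if block_propagate:
            # Skip path: the carry out of the block equals its carry in.
            skipped += 1
        else:
            # Normal path: the carry is generated inside the block.
            carry = blk_sum >> width
    return total, carry, skipped
```

For example, carry_skip_add(0x0FFF, 0x0001) returns (0x1000, 0, 2): the carry generated in the lowest block bypasses the two middle blocks, whose propagate signals are all 1.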
If the number of inputs increases, the number of FAs increases, which increases the area. It is used in applications where more than two numbers must be added at a time. The number of FAs used here is less than in other types of adders.
When it comes to speed, it is slower than the CSKA and PPA adders, and the speed decreases as the number of inputs increases. Here the adder combines the carry of one FA block with the sum of the next FA block to get the final sum. As shown in the upper part of the diagram, the important point is that the sum and carry are calculated separately and not simultaneously. First, the sum is calculated without taking the carry into consideration. Then the adder calculates the carry, shifted one position from the LSB side as there is no carry at the start. After the sum and carry values are calculated, the adder adds them together to get the final number, as shown in the lower part of the diagram. The delay is variable: it increases as the number of inputs increases, and since the final addition in the CSVA is done by an RCA, the delay is higher. The cost is nominal, as its basic component is the RCA, but it may increase if the number of inputs increases.
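The separate sum and carry calculation described above can be sketched on whole operands. The function names are hypothetical, and Python integers are used so that no bit width has to be fixed. One layer of full adders turns three operands into a sum vector and a carry vector shifted one place to the left; the layers are repeated until two operands remain, and only then is a carry-propagating addition performed.

```python
def carry_save_step(x, y, z):
    """One layer of full adders: three operands in, (sum vector, carry vector) out."""
    sum_vec = x ^ y ^ z                          # sum bits, carries ignored
    carry_vec = ((x & y) | (y & z) | (z & x)) << 1  # carries, moved one place left
    return sum_vec, carry_vec


def carry_save_add(operands):
    """Reduce any number of operands with carry-save layers, then add the last two."""
    ops = list(operands)
    while len(ops) > 2:
        s, c = carry_save_step(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, c]
    # Final carry-propagating addition (done by an RCA in the adder described above).
    return sum(ops)
```

For example, carry_save_step(5, 3, 6) returns (0, 14), and 0 + 14 equals 5 + 3 + 6. Each extra operand costs one more layer of full adders, which is why area and delay grow with the number of inputs.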
CHAPTER-3
EXISTING ADDERS
The CS3A contains arrays of full adders in two stages. Stage 1 adds the input bits directly, without considering the previous carry. Stage 2 adds the stage-1 output to the carry generated by the adjacent previous adder in the same stage. Hence the sum and carry are produced.
The CS3A operates at good speed when the operand size is small. As the operand size increases, the delay of the circuit also increases. The CS3A occupies less area than parallel prefix adders. It is mostly adopted when the operand size is small and in applications where area plays a vital role.
➢ The dotted line indicates the critical path delay of the circuit.
➢ a, b, c are the n-bit inputs and Cin is the input carry.
➢ S is the n-bit sum output and Cout is the carry out.
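The two stages of the CS3A can be sketched as a bit-level model. This is a sketch under stated assumptions rather than the circuit in the figure: the function names and the 8-bit default are chosen for illustration, Cin is assumed to enter through the vacant least significant position of the shifted carry vector, and the carry out is returned as the value of the bits above position n-1.

```python
def full_adder(x, y, z):
    """Single-bit full adder: returns (sum, carry)."""
    return x ^ y ^ z, (x & y) | (y & z) | (z & x)


def cs3a(a, b, c, cin=0, n=8):
    """Carry-save three-operand model: returns (n-bit sum S, carry-out value)."""
    # Stage 1: n independent full adders, with no carry propagation between them.
    stage1 = [full_adder((a >> i) & 1, (b >> i) & 1, (c >> i) & 1)
              for i in range(n)]
    partial = [s for s, _ in stage1] + [0]      # stage-1 sums, padded to n+1 bits
    carries = [cin] + [cy for _, cy in stage1]  # stage-1 carries, one place left
    # Stage 2: ripple-carry chain, where each carry waits for the previous one.
    ripple = 0
    out_bits = []
    for x, y in zip(partial, carries):
        s, ripple = full_adder(x, y, ripple)
        out_bits.append(s)
    out_bits.append(ripple)
    total = sum(bit << i for i, bit in enumerate(out_bits))
    return total & ((1 << n) - 1), total >> n
```

For example, cs3a(255, 255, 255, 1) returns (254, 2), i.e. 2 x 256 + 254 = 766. The loop in stage 2 runs once per bit position, which reflects why the delay of the ripple stage grows with the operand size.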
Advantages of CS3A:
Limitations of CS3A:
To overcome the problem encountered in the CS3A, i.e., to shorten the critical path delay, two stages of a parallel prefix two-operand adder can be used instead. In the literature, parallel prefix (or logarithmic prefix) adders are the fastest two-operand adder techniques.
One such parallel prefix adder is the Han-Carlson adder. It was originally designed for two-operand addition. To implement three-operand addition, two Han-Carlson adders are connected so that the output of the first stage is given as input to the second stage; the result is named the Han-Carlson three-operand binary adder (HC3A).
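The cascade described above can be sketched with a behavioural model of a Han-Carlson prefix network. The sketch rests on assumptions not stated in this report: the function names are hypothetical, no carry-in is modelled, and the second adder is taken to be one bit wider than the first so that the carry out of the first stage is not lost. In the prefix network, the odd bit positions are combined in logarithmic steps and the even positions are resolved in one last level.

```python
def han_carlson_add(a, b, n):
    """Two-operand Han-Carlson prefix addition of n-bit values; returns n+1 bits."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]  # bit generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]  # bit propagate
    G, P = g[:], p[:]
    # First level: each odd position absorbs its even neighbour.
    for i in range(1, n, 2):
        G[i] = g[i] | (p[i] & g[i - 1])
        P[i] = p[i] & p[i - 1]
    # Logarithmic prefix levels on the odd positions only.
    d = 2
    while d < n:
        prev_G, prev_P = G[:], P[:]
        for i in range(1, n, 2):
            if i - d >= 0:
                G[i] = prev_G[i] | (prev_P[i] & prev_G[i - d])
                P[i] = prev_P[i] & prev_P[i - d]
        d *= 2
    # Last level: each even position takes the carry of the odd position below it.
    for i in range(2, n, 2):
        G[i] = g[i] | (p[i] & G[i - 1])
    carries = [0] + G  # carries[i] is the carry into bit i
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total | (carries[n] << n)


def hc3a(a, b, c, n=8):
    """Three-operand addition with two cascaded Han-Carlson adders."""
    first = han_carlson_add(a, b, n)          # (n+1)-bit intermediate result
    return han_carlson_add(first, c, n + 1)   # (n+2)-bit final result
```

For example, hc3a(255, 255, 255) returns 765. The model contains two complete prefix networks, which reflects the area cost of this approach compared with the CS3A.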
Advantage:
• Delay is less compared to CS3A.
Limitation:
• Area occupied is very high compared to CS3A.
CHAPTER-4
PROPOSED ADDER
The proposed adder is a parallel prefix adder. It contains four stages. Each stage contributes to the final result of the adder.
4.2 Operation
Stage-1: Bit Addition Logic
• Stage-1 consists of an array of full adders. For n-bit addition there are n full adders.
• Each full adder produces two outputs based on the given inputs.
• Let a, b, c be the vector inputs whose addition has to be done.
Let the outputs of the i-th full adder be S'i and cyi. They are given by the standard full-adder equations: S'i = ai XOR bi XOR ci, and cyi = ai·bi + bi·ci + ci·ai.
• This stage takes the stage-1 outputs as its inputs and computes the generate and propagate bits with the following logic.
• Stage-3 contains an array of black and grey cells, arranged as shown in Fig. 4.1.
• Stage-4 is the last stage. It computes the sum and carry bits, taking the generate and
propagate bits from the previous stage as its inputs.
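The four stages can be put together in a short behavioural model. The Python sketch below rests on stated assumptions and is not the authors' netlist. The cell equations follow the gray_cell and black_cell modules in the appendix. The prefix network uses a Kogge-Stone arrangement for brevity, which is a swapped-in arrangement rather than the one in Fig. 4.1. Pairing carry bit i-1 with sum bit i, with cin in the vacant bit-0 slot, is an assumption. All function names are hypothetical.

```python
def grey_cell(p_hi, g_hi, g_lo):
    """Merge a group with a lower group that already reaches bit 0."""
    return g_hi | (p_hi & g_lo)


def black_cell(p_hi, g_hi, p_lo, g_lo):
    """Merge two adjacent groups; returns (generate, propagate)."""
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def three_operand_add(a, b, c, cin=0, n=16):
    mask = (1 << n) - 1
    a, b, c = a & mask, b & mask, c & mask
    # Stage 1: bit addition logic (one full adder per bit position).
    s = a ^ b ^ c
    cy = (a & b) | (b & c) | (c & a)
    # Stage 2: base logic. ASSUMPTION: carry bit i-1 is paired with sum bit i,
    # and cin fills the vacant bit-0 slot of the shifted carry vector.
    shifted = (cy << 1) | (cin & 1)
    width = n + 1
    p = [((s >> i) ^ (shifted >> i)) & 1 for i in range(width)]
    g = [((s >> i) & (shifted >> i)) & 1 for i in range(width)]
    # Stage 3: prefix network (Kogge-Stone arrangement, chosen for brevity).
    G, P = g[:], p[:]
    d = 1
    while d < width:
        new_g, new_p = G[:], P[:]
        for i in range(d, width):
            if i < 2 * d:   # the lower group already spans down to bit 0
                new_g[i] = grey_cell(P[i], G[i], G[i - d])
            else:
                new_g[i], new_p[i] = black_cell(P[i], G[i], P[i - d], G[i - d])
        G, P = new_g, new_p
        d *= 2
    # Stage 4: sum logic. Bit i is the propagate bit XOR the carry into bit i.
    total = p[0]
    for i in range(1, width):
        total |= (p[i] ^ G[i - 1]) << i
    return total, G[width - 1]   # (n+1)-bit sum, carry of weight 2**(n+1)


total, cout = three_operand_add(40000, 50000, 60000, cin=1)
assert total + (cout << 17) == 40000 + 50000 + 60000 + 1
```

Only stage 3 grows with the operand width (logarithmically), which is where the speed advantage over a ripple-style second row comes from.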
CHAPTER-5
XILINX SOFTWARE
Design Entry
Design entry is the first step in the ISE design flow. During design entry, you create your
source files based on your design objectives. The top-level design file can be created using a
Hardware Description Language (HDL), such as VHDL, Verilog, or ABEL, or using a schematic.
Multiple formats can be used for the lower-level source files in your design.
Synthesis
Synthesis is the next step, which comes after design entry and optional simulation. In this
process a netlist is created from the VHDL, Verilog, or mixed-language design made by the user.
The resulting netlist is given as input to the implementation step.
Implementation
You can run design implementation after synthesis. Implementation converts the logical
design into a physical file format that can be downloaded to the selected target device. From
Project Navigator, you can run the implementation process in one step, or you can run each of
the implementation processes separately. The implementation processes vary depending on
whether you are targeting a Field Programmable Gate Array (FPGA) or a Complex Programmable
Logic Device (CPLD).
Verification
You can verify the functionality of your design at several points in the design flow.
You can use simulator software to verify the functionality and timing of your design or a
portion of your design. The simulator interprets VHDL or Verilog code into circuit
functionality and displays the logical results of the described HDL so that you can determine
correct circuit operation. Simulation allows you to create and verify complex functions in a
relatively small amount of time. You can also run in-circuit verification after programming
your device.
Device Configuration
To configure your device, a programming file is generated first. Configuration is then
done by downloading the programming file from a host computer to a Xilinx device.
1. Toolbar
2. Sources window
3. Processes window
4. Workspace
5. Transcript window
Over the past 25 years, field programmable gate arrays (FPGAs) have relentlessly
increased their logic cell counts and functionality while lowering the cost per transistor.
FPGAs have steadily taken market share from the gate array and ASIC markets.
The progression of the FPGA helped render gate arrays obsolete about 15 years ago.
Numerous trends are now converging: the exorbitant cost of designing and manufacturing
ASICs, changing standards, the need to reduce the bill of materials, and the need for both
hardware and software programmability in rough economic times with reduced staffing.
Together they create an environment in which designers of electronic products are dumping
ASICs in favor of FPGAs at a greater pace. The convergence of these trends is called the
programmable imperative.
With several hundred thousand programmable cells, FPGAs are available with transceivers
of up to 11.2 Gbps, 38 Mb of Block RAM, and 2,000 digital signal processing slices. FPGAs are
therefore allowing designers to address an ever-growing number of applications. All things
considered, there is an opportunity for FPGAs to step up the pace in gobbling up applications.
Xilinx is actively moving to help customers get their FPGA innovations to market
quickly. This year, we introduce the Spartan-6 and Virtex-6 FPGA families with this goal in
mind, emphasizing the support needed for hardware and software development: design tools and
boards. These elements are offered to our customers in closely defined and refined flows that
are tied to the targeted silicon.
The Targeted Design Platform approach can be pictured as a pyramid (see Figure 1). The
Base Platform serves as the foundation layer of the pyramid. It is composed of our Virtex-6 and
Spartan-6 FPGA silicon, our ISE Design Suite, and base development boards. The next layer
above it is the Domain-Specific Platform, in which we offer embedded, DSP, and connectivity
IP, domain tools, reference designs, and FMC daughter cards that plug into the base boards.
The top layer holds the Market-Specific Platforms for customers in various markets. These are
composed of IP, custom tools, and custom boards for markets such as communication, video,
or AVB.
If they want, customers can design every function from scratch. Most customers will
certainly choose to concentrate their design efforts on the value-added portions of their
designs rather than reinventing the wheel, which significantly reduces their overall design
time.
As part of the Targeted Design Platform approach, we are also refining our tool flows to suit
specific design disciplines. Traditionally, we offered all our tools as a smorgasbord and left it up to
users to figure out which tools matched their tasks and to obtain the number of licenses suitable to
their budgets. We will soon offer domain-specific editions of the ISE Design Suite to help pair users
who have specific jobs with the right tools.
One of these editions, the DSP Edition, bundles System Generator for DSP, AccelDSP synthesis,
and DSP-specific IP, running on top of the ISE design environment. The DSP Edition is primarily
targeted at algorithm developers. Logic designers who are not HDL specialists but do some amount
of algorithm development will also find it useful. If DSP Edition users want to do a bit of software
application development for their algorithms, they can add the Software Development Kit. The
Xilinx Software Development Kit (SDK) is a stand-alone tool.
The move to a Targeted Design Platform approach makes sense given how the FPGA and
ASIC businesses have fared over the last 10 years. Ten years ago, roughly 80 percent of
Xilinx's business came from the wired and wireless communications industry. When the
communications bubble burst with the dot-com crash circa 2001, our business was adversely
affected and declined along with the rest of the semiconductor industry. So that this would not
happen again, in 2002 Xilinx quickly created vertical application groups in aerospace and
defense, automotive, broadcast, and Industrial, Scientific, and Medical (ISM), as well as in
wired and wireless communications. Xilinx thereby established a much broader customer base.
again drop by half or more. It is not a matter of whether they will drop; it is certainly a
matter of how much they will drop.
STEP-2: The New Project wizard appears as shown below. In that window, type the name of
the project and then click Next.
Another window then appears as shown below. In it, select the simulator, preferred
language, etc. After that, click Next and then click Finish.
The New Source Wizard appears as shown below. In it, select a Source Type such as VHDL
Module or Verilog Module, give a File Name, and then click the Next button.
The following window appears. In it, define the port variables such as input ports and
output ports. Then click the Next button.
After you click Next, the Summary window appears; click Finish. The code window opens
immediately. Type the code in the code window.
STEP-4: Then select Implementation, select the file name, and double-click Check Syntax.
If there are errors in the code, a red cross mark is shown; otherwise a tick mark is shown,
as in the figure below.
Then double-click View RTL Schematic to display the RTL schematic.
Then double-click View Technology Schematic to display the technology schematic.
Click on OK button.
Double-click Behavioral Check Syntax. If there are no errors, the following window appears.
Double-click Simulate Behavioral Model. The waveform window then appears as shown
below.
In the waveform window, right-click an input port variable and select Force Constant.
Another window appears immediately. In that window, enter the input value in the Force to
Value field, click Apply, and then click OK.
CHAPTER-6
APPLICATIONS
• A Full Adder’s circuit can be used as a part of many other larger circuits like Ripple Carry
Adder, which adds n-bits simultaneously.
• The dedicated multiplication circuit uses Full Adder’s circuit to perform Carryout
Multiplication.
• Full Adders are used in ALU- Arithmetic Logic Unit. In order to generate memory
addresses inside a computer and to make the Program Counter point to next instruction,
the ALU makes use of Full Adders.
• Full adders are also used in generating Pseudo-Random Bits and in many Cryptographic
algorithms.
6.1 Cryptography:
secured algorithm is designed based on the principles of an LCG named as MODIFIED DUAL
COMBINED LINEAR CONGRUENTIAL GENERATOR (MDCLCG).
• Considering multiplier constants as a1, a2, a3, a4 and increment constants as b1,b2,
b3,b4 for 32 bit MDCLCG.
Xi={x1,x2,x3,x4,…}
Yi={y1,y2,y3,y4,…}
Pi={p1,p2,p3,p4,…}
Qi={q1,q2,q3,q4,…}
The 32-bit MDCLCG architecture is designed with the constant values a1 = 65, b1 = 117,
a2 = 16385, b2 = 221, a3 = 4097, b3 = 21359, a4 = 65537, b4 = 533, m = 2^32 and initial seeds
(x0, y0, p0, q0) = (5183, 91356, 39771, 7392), which generates the sequence as follows:
x1 = (a1 × x0 + b1) mod m
= (65 × 5183 + 117) mod 2^32
= 337012
In the above equation, the three-input modulo-2^n addition is performed using the proposed
three-operand adder technique. Similarly, the sequences x2, x3, ... can be computed as follows:
In the same way, other sequences such as Yi , Pi and Qi are also computed as follows,
Qi = {484450037,1003823370, 1558127391,2147362100,...}
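The recurrence can be reproduced in a few lines of Python. The sketch below relies on an observation about the listed constants: each multiplier has the form 2^r + 1 (65 = 2^6 + 1, 16385 = 2^14 + 1, 4097 = 2^12 + 1, 65537 = 2^16 + 1). That lets a × x + b be written as the three-operand sum (x << r) + x + b. The function names are hypothetical, and Python integers stand in for the hardware adder.

```python
M = 1 << 32   # modulus m = 2^32 from the text


def lcg_next(x, r, b, m=M):
    """One LCG step with multiplier a = 2**r + 1, written as a 3-operand add."""
    return ((x << r) + x + b) % m


def lcg_sequence(seed, r, b, count, m=M):
    """Return the first `count` values generated after the seed."""
    values, x = [], seed
    for _ in range(count):
        x = lcg_next(x, r, b, m)
        values.append(x)
    return values


x_seq = lcg_sequence(5183, 6, 117, 3)     # a1 = 65,    b1 = 117, x0 = 5183
q_seq = lcg_sequence(7392, 16, 533, 4)    # a4 = 65537, b4 = 533, q0 = 7392

assert x_seq[0] == 337012
assert q_seq == [484450037, 1003823370, 1558127391, 2147362100]
```

The shift costs only wiring in hardware, so each LCG step reduces to one three-operand addition with the upper bits discarded.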
The sequences Bi and Ci in the MDCLCG architecture are generated by comparing Xi with
Yi and Pi with Qi respectively using the magnitude comparator as follows,
Bi = {0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0,...}
Ci = {0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,...}
The pseudorandom bit sequence Zi is obtained by Bi ⊕ Ci as highlighted below,
Zi = {0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1,...}
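The comparator and XOR steps can be checked directly against the listed sequences. In the Python sketch below, `comparator_bit` assumes the comparator outputs 1 when its first input is strictly greater than its second. That direction is an assumption, chosen because it agrees with the first listed bits of Bi and Ci. The function names are hypothetical.

```python
M = 1 << 32

# Bit sequences as listed in the text.
B = [0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
C = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]


def comparator_bit(u, v):
    """ASSUMPTION: the magnitude comparator outputs 1 when u > v, else 0."""
    return 1 if u > v else 0


def output_bits(b_bits, c_bits):
    """Pseudorandom output Z = B xor C, taken bit by bit."""
    return [b ^ c for b, c in zip(b_bits, c_bits)]


# First step of each generator, using the constants and seeds from the text.
x1 = (65 * 5183 + 117) % M
y1 = (16385 * 91356 + 221) % M
p1 = (4097 * 39771 + 21359) % M
q1 = (65537 * 7392 + 533) % M

assert comparator_bit(x1, y1) == B[0]
assert comparator_bit(p1, q1) == C[0]

Z = output_bits(B, C)
assert Z == [0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
```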
CHAPTER-7
RESULTS
7.1 Simulation Waveforms
CHAPTER-8
APPENDIX
8.1 CODES:
8.1.1 CS3A:
module CS3A_16bit(
input [15:0] a,
input [15:0] b,
input [15:0] c,
input cin,
output [16:0] sum,
output cout
);
wire [15:0] M;
wire [15:0] L;
wire [15:0] k;
genvar i;
generate for(i=0;i<16;i=i+1)begin: Full_Adder1
FA stage1(.a(a[i]),.b(b[i]),.c(c[i]),.sum(M[i]),.carry(L[i]));
end
endgenerate
HA label1(.a(cin),.b(M[0]),.sum(sum[0]),.carry(k[0]));
genvar j;
generate for(j=1;j<16;j=j+1)begin: Full_adder2
FA stage2(.a(L[j-1]),.b(M[j]),.c(k[j-1]),.sum(sum[j]),.carry(k[j]));
end
endgenerate
HA label2(.a(k[15]),.b(L[15]),.sum(sum[16]),.carry(cout));
endmodule
// FULL ADDER
module FA(
input a,
input b,
input c,
output sum,
output carry
);
assign sum= a ^ b^c;
assign carry=(a & b)|(b & c)|(c & a);
endmodule
// HALF ADDER
module HA(
input a,
input b,
output sum,
output carry
);
assign sum= (a ^ b);
assign carry=(a & b);
endmodule
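A software reference model is useful when checking the CS3A listing in simulation. The Python sketch below mirrors the structure of CS3A_16bit gate for gate: a row of full adders, then a ripple row that starts and ends with a half adder. It is an illustrative model only, and the function names are hypothetical.

```python
def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (b & c) | (c & a)


def half_adder(a, b):
    return a ^ b, a & b


def bit(value, i):
    return (value >> i) & 1


def cs3a(a, b, c, cin=0, n=16):
    """Carry-save three-operand adder: returns ((n+1)-bit sum, cout)."""
    # First row: n independent full adders give sum bits M and carry bits L.
    M, L = [], []
    for i in range(n):
        s, cy = full_adder(bit(a, i), bit(b, i), bit(c, i))
        M.append(s)
        L.append(cy)
    # Second row: ripple chain adding M, the shifted carries L, and cin.
    total, k = half_adder(cin, M[0])
    for j in range(1, n):
        s, k = full_adder(L[j - 1], M[j], k)
        total |= s << j
    top, cout = half_adder(k, L[n - 1])
    total |= top << n
    return total, cout   # cout has weight 2**(n+1)


total, cout = cs3a(40000, 50000, 60000, cin=1)
assert total + (cout << 17) == 40000 + 50000 + 60000 + 1
```

The ripple chain in the second row is the critical path that the prefix-based designs shorten.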
8.1.2 HC3A:
module HC3A_16bit(
input [15:0] a,
input [15:0] b,
input [15:0] c,
input cin,
output [15:0] sum,
output cout
);
wire [15:0] s;
wire c1, c2;
// Stage 1 adds a and b; stage 2 adds the partial sum and c.
// c1 has weight 2^16, so it must not be fed to the stage-2 carry-in.
HCA2_code stage1(.a(a),.b(b),.cin(cin),.sum(s),.cout(c1));
HCA2_code stage2(.a(s),.b(c),.cin(1'b0),.sum(sum),.cout(c2));
// cout flags that the three-operand sum overflowed 16 bits.
assign cout = c1 | c2;
endmodule
// 2 Operand Han Carlson Adder
module HCA2_code(
input [15:0] a,
input [15:0] b,
input cin,
output [15:0] sum,
output cout
);
wire [15:0] p;
wire [15:0] g;
wire [15:0] gr;
endgenerate
//stage 3
gray_cell label5(.a(bp2[5]),.b(gr[1]),.c(bg2[5]),.g(gr[5]));
gray_cell label6(.a(bp2[7]),.b(gr[3]),.c(bg2[7]),.g(gr[7]));
generate for(i=9;i<16;i=i+2)begin: stage3
black_cell label7(.a(bp2[i]),.b(bg2[i-4]),.c(bg2[i]),.r(bp2[i]),.s(bp2[i-4]),.g(bg3[i]),.p(bp3[i]));
end
endgenerate
//stage 4
gray_cell label8(.a(bp3[9]),.b(gr[1]),.c(bg3[9]),.g(gr[9]));
gray_cell label9(.a(bp3[11]),.b(gr[3]),.c(bg3[11]),.g(gr[11]));
gray_cell label10(.a(bp3[13]),.b(gr[5]),.c(bg3[13]),.g(gr[13]));
gray_cell label11(.a(bp3[15]),.b(gr[7]),.c(bg3[15]),.g(gr[15]));
generate for(i=2;i<16;i=i+2) begin: even_stage
gray_cell label12(.a(p[i]),.b(gr[i-1]),.c(g[i]),.g(gr[i]));
end
endgenerate
assign sum[0] =p[0]^cin;
assign sum[1]=p[1]^g[0];
generate for(i=2;i<16;i=i+1)begin: assignment
assign sum[i]=p[i] ^ gr[i-1];
end
endgenerate
assign cout=gr[15];
endmodule
// HALF ADDER
module HA(
input a,
input b,
output sum,
output carry
);
assign sum= (a ^ b);
assign carry=(a & b);
endmodule
// GREY CELL
module gray_cell(
input a,
input b,
input c,
output g
);
assign g=(a & b) | c;
endmodule
//BLACK CELL
module black_cell(
input a,
input b,
input c,
input r,
input s,
output g,
output p
);
assign g=(a & b) | c;
assign p=(r & s);
endmodule
input [15:0] b,
input [15:0] c,
input cin,
output [15:0] sum,
output cout
);
// FULL ADDER
module FA(
input a,
input b,
input c,
output sum,
output carry
);
assign sum= a ^ b^c;
assign carry=(a & b)|(b & c)|(c & a);
endmodule
// BASE LOGIC
module base(
input s,
input cy,
output p,
output g
);
assign p=s ^ cy;
assign g=s & cy;
endmodule
//stage 3
gray_cell label5(.a(bp2[5]),.b(gr[1]),.c(bg2[5]),.g(gr[5]));
gray_cell label6(.a(bp2[7]),.b(gr[3]),.c(bg2[7]),.g(gr[7]));
generate for(i=9;i<16;i=i+2)begin: stage3
black_cell label7(.a(bp2[i]),.b(bg2[i-4]),.c(bg2[i]),.r(bp2[i]),.s(bp2[i-4]),.g(bg3[i]),.p(bp3[i]));
end
endgenerate
//stage 4
gray_cell label8(.a(bp3[9]),.b(gr[1]),.c(bg3[9]),.g(gr[9]));
gray_cell label9(.a(bp3[11]),.b(gr[3]),.c(bg3[11]),.g(gr[11]));
gray_cell label10(.a(bp3[13]),.b(gr[5]),.c(bg3[13]),.g(gr[13]));
gray_cell label11(.a(bp3[15]),.b(gr[7]),.c(bg3[15]),.g(gr[15]));
generate for(i=2;i<16;i=i+2) begin: even_stage
gray_cell label12(.a(p[i]),.b(gr[i-1]),.c(g[i]),.g(gr[i]));
end
endgenerate
assign sum[0] =p[0]^cin;
assign sum[1]=p[1]^g[0];
generate for(i=2;i<16;i=i+1)begin: assignment
assign sum[i]=p[i] ^ gr[i-1];
end
endgenerate
assign cout=gr[15];
endmodule
// GRAY CELL
module gray_cell(
input a,
input b,
input c,
output g
);
assign g=(a & b) | c;
endmodule
// BLACK CELL
module black_cell(
input a,
input b,
input c,
input r,
input s,
output g,
output p
);
assign g=(a & b) | c;
assign p=(r & s);
endmodule
input [15:0] m,
output [5:1] Z
);
wire [5:1] B,C;
CLCG_Proposed m1(.a1(a1),.a2(a2),.x0(x0),.y0(y0),.b1(b1),.b2(b2),.m(m),.B(B));
CLCG_Proposed m2(.a1(a3),.a2(a4),.x0(p0),.y0(q0),.b1(b3),.b2(b4),.m(m),.B(C));
// Output bit Z[i] = B[i] XOR C[i], as defined in Section 6.1
assign Z[1] = B[1]^C[1];
assign Z[2] = B[2]^C[2];
assign Z[3] = B[3]^C[3];
assign Z[4] = B[4]^C[4];
assign Z[5] = B[5]^C[5];
endmodule
module CLCG_Proposed(
input [15:0] a1,
input [15:0] a2,
endmodule
module LCG_Proposed_Adder(
input [15:0] a,
input [15:0] x0,
input [15:0] b,
input [15:0] m,
output [15:0] x1,x2,x3,x4,x5
);
wire [15:0] k0,k1,k2,k3,k4,c,sum1,sum2,sum3,sum4,sum0;
wire c1,cin;
assign cin=0;
genvar i;
generate for(i=0;i<16;i=i+1) begin: Assigning
assign c[i]=0;
end
endgenerate
// First product: k0 = a * x0
assign k0=(a*x0);
// Each step computes x(i+1) = (a*x(i) + b) mod 2**m with the proposed adder.
// cout is left unconnected: tying all five cout ports to one wire would
// create multiple drivers, and the carry is discarded by the modulo anyway.
proposed_adder16 l1(.a(k0),.b(b),.c(c),.cin(cin),.sum(sum0),.cout());
assign x1= sum0 % (2 ** m);
assign k1=(a*x1);
proposed_adder16 l2(.a(k1),.b(b),.c(c),.cin(cin),.sum(sum1),.cout());
assign x2= sum1 % (2 ** m);
assign k2=(a*x2);
proposed_adder16 l3(.a(k2),.b(b),.c(c),.cin(cin),.sum(sum2),.cout());
assign x3= sum2 % (2 ** m);
assign k3=(a*x3);
proposed_adder16 l4(.a(k3),.b(b),.c(c),.cin(cin),.sum(sum3),.cout());
assign x4= sum3 % (2 ** m);
assign k4=(a*x4);
proposed_adder16 l5(.a(k4),.b(b),.c(c),.cin(cin),.sum(sum4),.cout());
assign x5= sum4 % (2 ** m);
endmodule
8.2. REFERENCES:
3. Z. Liu, D. Liu, and X. Zou, “An efficient and flexible hardware implementation of the dual-
field elliptic curve cryptographic processor,” IEEE Trans. Ind. Electron., vol. 64, no. 3, pp.
2353–2362, Mar. 2017.
4. B. Parhami, Computer Arithmetic: Algorithms and Hardware Design. New York, NY, USA:
Oxford Univ. Press, 2000.
5. P. L. Montgomery, “Modular multiplication without trial division,” Math. Comput., vol. 44,
no. 170, pp. 519–521, Apr. 1985.
6. S.-R. Kuang, K.-Y. Wu, and R.-Y. Lu, “Low-cost high-performance VLSI architecture for
Montgomery modular multiplication,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol.
24, no. 2, pp. 434–443, Feb. 2016.
7. S.-R. Kuang, J.-P. Wang, K.-C. Chang, and H.-W. Hsu, “Energy-efficient high-throughput
Montgomery modular multipliers for RSA cryptosystems,” IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 21, no. 11, pp. 1999–2009, Nov. 2013.
10. A. K. Panda and K. C. Ray, “Modified dual-CLCG method and its VLSI architecture for
pseudorandom bit generation,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 66, no. 3, pp. 989–
1002, Mar. 2019.
11. A. Kumar Panda and K. Chandra Ray, “A coupled variable input LCG method and its VLSI
architecture for pseudorandom bit generation,” IEEE Trans. Instrum. Meas., vol. 69, no. 4, pp.
1011–1019, Apr. 2020.
12. N. Weste and K. Eshraghian, Principles of CMOS VLSI Design—A Systems Perspective.
Reading, MA, USA: Addison-Wesley, 1985.
13. T. Kim, W. Jao, and S. Tjiang, “Circuit optimization using carry-save-adder cells,” IEEE
Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 17, no. 10, pp. 974–984, Oct. 1998.
15. A. K. Panda and K. C. Ray, “Design and FPGA prototype of 1024-bit Blum-Blum-Shub
PRBG architecture,” in Proc. IEEE Int. Conf. Inf. Commun. Signal Process. (ICICSP),
Singapore, Sep. 2018, pp. 38–43.