
A Major Project Report on

Implementation of High-Speed and Area-Efficient VLSI


Architecture of Three-Operand Binary Adder

Submitted in partial fulfilment of the requirement for the award of degree of

BACHELOR OF TECHNOLOGY
IN

ELECTRONICS AND COMMUNICATION ENGINEERING

Submitted by

18FE1A0499 - P. RIZWANA
18FE1A0487 - N. TEJASWINI
18FE1A0463 - K. YOGA SRIRAM
18FE1A0493 - P. SRI HARSHA

Under the Esteemed Guidance of

Dr. VENKATA KISHORE PERLA


Assistant Professor

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

VIGNAN’S LARA INSTITUTE OF TECHNOLOGY & SCIENCE


(An ISO 9001:2015 Certified, Approved by AICTE, Affiliated to JNTU, KAKINADA)
VADLAMUDI-522213, GUNTUR Dist., ANDHRA PRADESH.
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

VIGNAN’S LARA INSTITUTE OF TECHNOLOGY & SCIENCE


(An ISO 9001:2015 Certified, Approved by AICTE, Affiliated to JNTU, KAKINADA)
VADLAMUDI-522213, GUNTUR Dist., ANDHRA PRADESH.

CERTIFICATE
This is to certify that the major project work entitled “IMPLEMENTATION OF HIGH-SPEED
AND AREA-EFFICIENT VLSI ARCHITECTURE OF THREE OPERAND BINARY ADDER”
is a bona fide work done by

18FE1A0499 - P.RIZWANA

18FE1A0487 - N.TEJASWINI
18FE1A0463 - K.YOGA SRIRAM

18FE1A0493 - P. HARSHA

Under my guidance and submitted in partial fulfilment of the requirements for the award of
the Degree of Bachelor of Technology in Electronics and Communication Engineering by
Jawaharlal Nehru Technological University, Kakinada.

Head of the Department


Project Guide

(Dr. VENKATA KISHORE PERLA) (Dr. B.HARISH)


M.Tech, Ph.D M.Tech,Ph.D

External Examiner
ACKNOWLEDGEMENT

We are grateful to the Department of Electronics and Communication Engineering,
VIGNAN’S LARA INSTITUTE OF TECHNOLOGY & SCIENCE, which gave us the
opportunity to gain profound technical knowledge, thereby enabling us to complete the project.

We would like to thank our beloved principal Dr. PHANEENDRA KUMAR for
his great support in completing our project and for giving us the opportunity to
carry it out.

We feel elated to thank Dr. B.HARISH, Professor and Head of the Department
(HOD), for inspiring us all the way and arranging all the facilities and resources needed for our
project.

We are very thankful to our beloved coordinators Dr. RAJA CHANDRASEKAR,
Dr. JITENRA KUMAR SAINI, Mrs. SANDHYA RANI and Mr. K.VIJAYA VARDHAN for
inspiring us all the way and arranging all the facilities and resources needed for the project. Their
efforts in this regard are beyond the purview of this acknowledgement.

It is with immense pleasure that we express our gratitude to our guide
Dr. VENKATA KISHORE PERLA, who guided and encouraged us at every step of
our project work. His invaluable moral support and guidance throughout the project helped us to a
great extent. We are thankful to him for his valuable suggestions and discussions during the
project.

We express our hearty thanks to all the teaching and non-teaching staff for their
help and co-operation in bringing out this project successfully and on time.

Project Associates

Ms. P.RIZWANA
Ms. N.TEJASWINI

Mr. K.SRIRAM
Mr. P.HARSHA
DECLARATION
We hereby declare that the project entitled “IMPLEMENTATION OF
HIGH-SPEED AND AREA-EFFICIENT THREE OPERAND BINARY
ADDER”, submitted to “VIGNAN’S LARA INSTITUTE OF TECHNOLOGY AND SCIENCE”
affiliated to JNTUK, Kakinada, in partial fulfilment of the requirements for the
award of the degree of Bachelor of Technology (B.Tech) in Electronics and
Communication Engineering (ECE), is the result of the work done by us under the
guidance of Dr. P. VENKATA KISHORE PERLA, Assistant Professor, ECE
department.

We further declare that this project work has not been submitted, in full or
in part, for the award of any degree at any other educational institution.

Project Associates

Ms. P.RIZWANA

Ms. N.TEJASWINI
Mr. K.SRIRAM

Mr. P.HARSHA
ABSTRACT

The three-operand binary adder is the basic functional unit for performing modular arithmetic in
various cryptography and pseudo-random bit generator algorithms. The carry-save adder
(CS3A) is the most widely used existing technique for three-operand addition. However, the
ripple-carry stage in the CS3A leads to a high propagation delay. A parallel prefix
two-operand adder such as the Han-Carlson adder (HCA) significantly reduces the critical path
delay but increases the area. Hence, a new high-speed and area-efficient adder architecture is
proposed, which uses pre-compute bitwise addition followed by carry prefix computation logic to
perform three-operand binary addition. The proposed adder consumes substantially less area and
power and drastically reduces the adder delay. It also achieves a lower area-delay product (ADP)
and power-delay product (PDP) than the existing three-operand adder techniques. It is
implemented on a ZedBoard using Xilinx tools.

LIST OF CONTENTS Page no:

ABSTRACT ....................................................................................................... v
LIST OF CONTENTS....................................................................................... vi-vii
LIST OF FIGURES ...........................................................................................viii-ix
CHAPTER I: INTRODUCTION .................................................................... 1- 11

1.1 Introduction to VLSI......................................................................... 1


1.2 VLSI Design Cycle ............................................................................ 2

1.3 Physical Design Cycle ....................................................................... 4


1.4 History of VLSI ................................................................................. 8

1.5 Applications of VLSI ....................................................................... 10

1.6 Advantages of VLSI Technology ..................................................... 11

CHAPTER II: LITERATURE SURVEY ...................................................... 12-18


2.1 Types of Adder
2.1.1 Ripple Carry Adder ........................................................... 12

2.1.2 Carry Look Ahead Adder ................................................ 13

2.1.3 Carry Select Adder ........................................................... 13

2.1.4 Carry Skip Adder ............................................................. 15

2.1.5 Carry Save Adder ............................................................. 16

2.1.6 Parallel Prefix Adder........................................................ 17

2.2 A Comparison table among different types of adders ........................ 18

CHAPTER III: EXISTING ADDERS ........................................................... 19-22

3.1 Carry Save 3-Operand Adder ..........................................................19


3.2 Han Carlson 3-Operand Binary Adder ........................................... 20

3.3 Reason for Implementing Proposed Adder .................................... 22

CHAPTER IV: PROPOSED ADDER ............................................................. 23-26


4.1 Block Diagram of Proposed Adder .................................................... 23

4.2 Operation of proposed Adder ............................................................. 23

4.3 Advantages of Proposed Adder ......................................................... 26

CHAPTER V: XILINX SOFTWARE ............................................................... 27-42


5.1 Xilinx ISE Overview ......................................................................... 27
5.2 Project Navigator Overview ............................................................. 28

5.3 Creating a Project .............................................................................. 33

CHAPTER VI: APPLICATIONS ........................................................................ 43-46


6.1 Cryptography........................................................................................ 43

6.2 Types of Cryptography Algorithms .................................................... 43


6.3 Block Diagram of MDCLCG ............................................................... 44
6.3.1 Internal Block Diagram of LCG ........................................... 44
6.4 Operation of MDCLCG........................................................................ 45

6.5 Simulation Waveforms .......................................................................... 46

CHAPTER VII: RESULTS .................................................................................. 47-48


7.1 Simulation waveforms .......................................................................... 47

7.2 Output Comparison ............................................................................... 48

7.3 Application Results ............................................................................... 48

CHAPTER VIII: APPENDIX ............................................................................. 49-58

8.1 Codes ..................................................................................................... 49

8.1.1 CS3A ....................................................................................49


8.1.2 HC3A .................................................................................. 50

8.1.3 Proposed Adder ................................................................... 53

8.1.5 Code for MDCLCG ............................................................ 58

8.2 References .................................................................................................61

LIST OF FIGURES Page no:

Fig 1.1: VLSI Design flow ............................................................................................... 3


Fig 1.2: Design Process Steps of Circuit Layout ............................................................5
Fig 2.1: Ripple Carry Adder ............................................................................................ 12
Fig 2.2: Carry Look ahead Adder ................................................................................... 13
Fig 2.3: Carry Select Adder............................................................................................. 14

Fig 2.4: Carry Skip Adder ................................................................................................ 15


Fig 2.5: Carry Save Adder ................................................................................................ 16
Fig 2.6: 16-bit Kogge-Stone Adder .................................................................................. 17
Fig 2.7: Processing Unit .................................................................................................... 18
Fig 2.8: Buffer Unit ........................................................................................................... 18

Fig 3.1: Block diagram of Carry Save 3-Operand Adder ..................................................... 19


Fig 3.2: Block diagram of Han Carlson 3-Operand Adder… .......................................... 20
Fig 3.3: 8-bit Han Carlson Adder....................................................................................... 21

Fig 3.4: Internal circuit of black and grey cells ................................................................ 21


Fig 4.1: Proposed Adder ..................................................................................................... 23

Fig 4.2: Bit-addition logic ................................................................................................... 24

Fig 4.3: Base Cell ................................................................................................................. 24

Fig 4.4: Black cell ................................................................................................................25


Fig 4.5: Grey cell ................................................................................................................ 25

Fig 4.6: Sum Logic ............................................................................................................. 26


Fig 5.1:Project Navigator .................................................................................................... 29
Fig 5.2: Sources Window .................................................................................................... 30

Fig 6.1: Block diagram of MDCLCG.............................................................................. 44


Fig 6.2: Linear Congruential Generator ............................................................................ 44
Fig 6.3: Simulation Waveforms of MDCLCG ................................................................... 46

Fig 7.1: CS3A Simulation Waveforms ................................................................................. 47


Fig 7.2: HC3A Simulation Waveforms .............................................................................. 47

Fig 7.3: Proposed Adder Simulation Waveforms ............................................................. 47

Fig 7.4: Comparing Delay and area of Different Adders .................................................. 48

Fig 7.5: Results of MDCLCG using Different Adders .....................................................48

IMPLEMENTATION OF HIGH-SPEED AND AREA-EFFICIENT VLSI ARCHITECTURE OF THREE OPERAND BINARY ADDER

CHAPTER – 1
INTRODUCTION

With the advance of VLSI technology, many computing-intensive applications such as
multimedia processing and digital communication can now be realized in hardware to either speed
up the operation or reduce energy consumption. The full adder lies at the heart of digital
computing. The design criteria of a full adder are usually multi-fold. Transistor count is a
primary concern, as it largely affects the design complexity of many functional units such as the
multiplier and the arithmetic logic unit (ALU). Two other important yet often conflicting design
criteria are power consumption and speed. A better metric is the power-delay product, or
energy consumption per operation, which indicates the optimal design trade-off. Related to power
consumption is the lowest supply voltage at which the design can still operate properly.
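
The behaviour of a one-bit full adder and the power-delay product mentioned above can be sketched in a few lines of Python. This is only an illustrative model, not the adder designed in this project: the function names and the power and delay figures are assumptions chosen for the example.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out) for input bits a, b, cin."""
    sum_bit = a ^ b ^ cin
    carry_out = (a & b) | (cin & (a ^ b))
    return sum_bit, carry_out


def power_delay_product(power_watts, delay_seconds):
    """Energy per operation in joules: average power multiplied by delay."""
    return power_watts * delay_seconds


# Exhaustive check against ordinary integer addition.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            sum_bit, carry_out = full_adder(a, b, cin)
            assert 2 * carry_out + sum_bit == a + b + cin

# Hypothetical figures: design A is faster but draws more power than design B.
pdp_a = power_delay_product(2.0e-3, 1.0e-9)
pdp_b = power_delay_product(0.5e-3, 3.0e-9)
```

With these made-up numbers the slower design B has the lower energy per operation, which is why the power-delay product is a more informative metric than power or delay taken alone.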

Battery-driven and portable devices are in great demand in many industrial applications,
which calls for low-power and area-efficient devices. Moore's law was formulated in 1965 by
Gordon Moore, a co-founder of Intel Corporation. It has set the pace for the modern digital
revolution, in which computing increases in power and decreases in cost. Moore predicted that
the number of transistors in an integrated circuit would double at regular intervals, a rate now
usually quoted as every two years. This prediction is known as Moore's Law. Today, many
industrial applications are designed in the nanometre range. Transistor scaling is limited by
phenomena such as short-channel effects, including the hot-carrier effect and tunnelling through
the gate oxide. In the CPU, the arithmetic logic unit (ALU) is a crucial part, and the adder cell
is an important unit of the ALU. Adders are digital circuits used to perform the addition of
numbers. In many computers, adders are also used in other parts of the processor to calculate
addresses, table indices and similar operations. The growing demand for portable
devices such as mobile phones, laptops and tablets has created a need for area- and
power-efficient VLSI circuits. Low-power adder cells are used in low-power applications.

1.1 Introduction to VLSI:

VLSI (Very Large Scale Integration) is the process of creating an integrated circuit (IC) by
combining a large number of transistors on a single chip. It came into existence in the 1970s,
when complex semiconductor and communication technologies were being developed.

DEPARTMENT OF ECE,VLITS 1

Before the introduction of VLSI, most ICs could perform only a limited set of tasks.
Because VLSI allows the designer to integrate the CPU, RAM, ROM and external peripherals on a
single chip, several functions can be performed by one IC.

Owing to the evolution of technology in recent years, there has been rapid advancement
in large scale integration and system design applications. With the advent of Very Large Scale
Integration (VLSI) designs, the number of applications of integrated circuits (ICs) in
high-performance computing, controls, telecommunications, and image and video processing has
been rising at a very fast pace.

Computer-Aided Design (CAD) has further supported the growth in complexity and
performance of integrated circuits in VLSI technology. With such a dramatic increase in
complexity, it is more important than ever to manage the design process so as to maintain the
reliability, quality and extensibility of a given design. This process includes the definition,
execution and control of design methodologies in a flexible and configurable way. The pace of
progress in high-performance computing, telecommunications and consumer electronics in a
rapidly changing market, the development costs, and the cost incurred in case of mistakes all
play a critical role in a business environment. Consequently, designs are needed that can be
produced quickly and inexpensively, with mistakes brought to light as early as possible,
preferably before the fabrication stage.

1.2 VLSI Design Cycle

VLSI design concerns the design of a single integrated circuit that implements a
complex digital function. Typically, the design process is an iterative one that refines an idea
into a manufacturable device through several levels of design abstraction. The process is
complex and involves a series of steps from specification to fabrication, in which the
integrated circuit is produced. Starting with abstract requirements, the process converts these
requirements into a register transfer description (e.g., control flow, registers, and arithmetic
and logical operations), which is simulated and tested. It is then translated into a circuit
description comprising gates, transistors and interconnections. At this point, simulation is
used to verify each component. The circuit is finally represented as geometric shapes embodying
the circuit components and their interconnections.

The specific steps involved in the VLSI design cycle are illustrated in Figure 1.1.
These steps are system specification, functional design, logic design, circuit design,
physical design, fabrication and testing.


Fig 1.1: VLSI Design Flow

System Specifications

Design specifications are required to lay down the rules for the design. While working
on the design, the main factors to be considered include physical dimensions (size of the
chip), performance, functionality, choice of fabrication technology and design techniques [10].
The expected end results of the whole process are the specifications for the speed, size,
functionality and power of the VLSI circuit.

Behavioural Description

A behavioural description is then created to analyse the design in terms of functionality,
performance, compliance with given standards, and other specifications. The outcome of this
step is usually a timing diagram or other relationships between sub-units. This stage serves to
improve the overall design process and reduce the complexity of the subsequent stages.

High Level Synthesis

The logic design step converts the behavioural specification into a register transfer
level (RTL) description that includes the word widths, control flow, register allocation, and
logic and arithmetic operations. Further, the functional units are expressed as primitive
logic operations (NAND, NOT, etc.). This description can be written in a Hardware
Description Language (HDL), namely Verilog or VHDL. The main objective of this
step is to minimise the number of Boolean expressions.
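
To illustrate what expressing a functional unit in terms of primitive logic operations means, the following Python sketch builds a one-bit full adder from two-input NAND gates only, using the classic nine-gate arrangement. It is a behavioural model for illustration, not synthesised HDL, and the function names are assumptions made for this example.

```python
def nand(a, b):
    """Two-input NAND gate on single bits."""
    return 1 - (a & b)


def full_adder_nand(a, b, cin):
    """Full adder built from nine two-input NAND gates; returns (sum_bit, carry_out)."""
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    x = nand(n2, n3)           # x = a XOR b
    n4 = nand(x, cin)
    n5 = nand(x, n4)
    n6 = nand(cin, n4)
    sum_bit = nand(n5, n6)     # sum = x XOR cin
    carry_out = nand(n1, n4)   # carry = (a AND b) OR (x AND cin)
    return sum_bit, carry_out


# Collect the full truth table so it can be compared with integer addition.
truth_table = {
    (a, b, cin): full_adder_nand(a, b, cin)
    for a in (0, 1) for b in (0, 1) for cin in (0, 1)
}
```

Sharing the intermediate signals `n1` and `n4` between the sum and carry paths is what keeps the gate count at nine, a small instance of the minimisation that this design step aims for.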


Logic Synthesis

Logic synthesis is a process by which an abstract form of the desired circuit behaviour
is converted into a gate-level implementation. A technology-dependent description of the
circuit is created, which converts the logic expressions into a circuit representation with
components such as cells, macros, gates, transistors and interconnections collected in a
netlist. During the implementation of certain topologies, logic equations are extracted and
mapped to the physical circuit blocks available in the circuit topology. The correctness and
timing of each component are verified by logic synthesis.

Physical design

Physical design is the stage in the standard design cycle that follows circuit
design. At this step, circuit representations of the components (devices and interconnects) of
the design are converted into geometric representations of shapes which, when manufactured in
the corresponding layers of materials, ensure the required functioning of the components. This
geometric representation is called the integrated circuit layout. The final performance of the
circuit is assessed through a compact arrangement of the area and accurate routing of wires.
Being an NP-hard problem, physical design is further divided into several sub-problems,
referred to as partitioning, placement and routing. The focus of this research is on
partitioning and placement.

Fabrication and testing

Finally, the wafer is fabricated and diced in a fabrication facility. To ensure that
the chips meet all the design and functional requirements, each chip is packaged and tested. In
any case, the success of the whole process depends strongly on the correspondence between the
abstract models at the higher level and the physical implementation at the lower level.

1.3 Physical Design Cycle

Logic synthesis and circuit design produce the circuit components, which are
extracted from a physical library and converted into rectangular shapes with fixed
dimensions. The circuit components are called cells or modules and the interconnections are
called nets, which are collected in a netlist. The timing constraints on signal propagation
paths along nets are defined. The output of the physical design stage is a complete layout of
the circuit, in which all the cells are placed on the chip without overlapping and all the
interconnection paths are completed. This layout is achieved in several stages: partitioning,
floor planning, placement, routing and compaction. Figure 1.2 illustrates the stages of circuit
layout.


Fig. 1.2: Design Process Steps of Circuit Layout

Partitioning

Engineering change requests can be handled by an effective and efficient partitioning
tool, which greatly reduces the complexity of the design process. Moreover, the final product
is evaluated, with respect to fabrication cost and system performance, on the quality of the
partitioning. Partitioning is a process that divides a circuit or system into a collection of
smaller parts. It is a design task that breaks a large system into smaller pieces to be
implemented on separate interacting components. At the same time, it also serves as an
algorithmic technique for solving difficult and complex combinatorial optimisation problems, as
in logic.

The size of VLSI designs has grown to systems with a very large number of transistors. The
complexity of the circuit has become so high that it is difficult to design and simulate the
whole system without decomposing it into sets of smaller sub-systems. Consequently, the
circuits are partitioned by grouping the components into blocks, otherwise called sub-circuits
or modules. However, the actual partitioning process depends on factors such as the number of
blocks, the size of the blocks, and the number of interconnections between the blocks. The
output of partitioning is a set of blocks along with the interconnections required between the
blocks, which are referred to as a netlist.
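
The number of interconnections between blocks can be made concrete by counting the nets that a given partition cuts. The Python sketch below is a minimal illustration under assumed data: the four-cell netlist, the cell names and the two candidate partitions are hypothetical and are not taken from this project.

```python
def cut_nets(netlist, block_of):
    """Return the names of nets whose cells are spread over more than one block."""
    return [
        name
        for name, cells in netlist.items()
        if len({block_of[cell] for cell in cells}) > 1
    ]


# Hypothetical netlist: net name -> cells connected by that net.
netlist = {
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "n3": ["c", "d"],
    "n4": ["a", "d"],
}

# Two candidate two-way partitions: cell -> block index.
partition_1 = {"a": 0, "b": 0, "c": 1, "d": 1}
partition_2 = {"a": 0, "b": 1, "c": 0, "d": 1}

cut_1 = cut_nets(netlist, partition_1)
cut_2 = cut_nets(netlist, partition_2)
```

Here `partition_1` cuts two nets while `partition_2` cuts all four, so the first grouping needs fewer connections between blocks and would normally be preferred.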


Floor planning

Floor planning is the process of identifying structures that should be placed close to
each other, and allocating space for them so as to meet the sometimes conflicting
goals of available space (cost of the chip), required performance, and the desire to have
everything close to everything else. Floor planning involves determining the placement and
relative orientation of the modules so that the total device area is minimised. The
modules are placed on the basis that strongly connected components lie
closer to one another. After floor planning, the routing region must be divided into channels
and switchboxes.

Placement

A crucial task in chip design is placement, which requires the positions of the
modules to be fixed on a given chip area. The placement affects the total length of
wires required to interconnect them, and consequently the performance of the chip, the signal
transition times and the power consumption. Meeting timing constraints was not a difficult
task in the past, but in modern times, owing to technological advancement and growing
complexity, chip design has become a complex and tedious process.
Likewise, additional optimisation goals and tight constraints must be addressed. Thus,
the placement requirements keep changing, as placement tools have become an essential
component of integrated design flows at various stages and in various situations. Because of
its iterative nature, placement has made the overall turnaround time highly sensitive to the
runtime of the placer. These reasons justify the need for flexible yet powerful and
fast placement algorithms, which are essential for faster turnaround and minimum time to market.

To sum up, the main objective of placement is to find a minimum-area arrangement of the
blocks that allows completion of the interconnections between the blocks. Standard cell
placement comprises two stages. The first stage creates an initial placement. The second stage
evaluates the initial placement and then improves it iteratively until the layout has minimum
area and meets the design specifications. To permit interconnections, space is left empty
between the blocks. During placement, an estimated amount of routing space is included between
the cells. In one of the author's earlier works, it has been shown that estimating this space
is crucial: too much space can lead to sub-optimal layouts, while too little space may rule out
the optimal (shortest) routes for all nets, and completion of the interconnections may even
become impossible. In such an event, a re-arrangement of the cells becomes necessary.
Consequently, it is wise to include the computation of the routes in the placement task.

However, the quality of the placement cannot be evaluated until the routing stage is completed.
Another round of placement may be carried out if a routable design is not achieved because of
timing constraints. Further, an effort is made to reduce the number of iterations by estimating
the required routing space, since once the positions of the blocks are fixed, it is difficult
to improve the routing as well as the overall performance of the circuit. Thus, it is evident
that an efficient placement algorithm is essential for good routing and circuit performance.
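
Since the quality of a placement can only be judged fully after routing, placement tools commonly rely on a quick wire-length estimate instead. The Python sketch below uses the half-perimeter wire length (HPWL), a standard estimate that is introduced here purely for illustration and is not a method described in this report. The netlist, cell names and coordinates are hypothetical.

```python
def hpwl(netlist, position):
    """Half-perimeter wire length: sum over nets of bounding-box width plus height."""
    total = 0
    for cells in netlist.values():
        xs = [position[cell][0] for cell in cells]
        ys = [position[cell][1] for cell in cells]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Hypothetical netlist: net name -> cells connected by that net.
netlist = {"n1": ["a", "b"], "n2": ["b", "c", "d"]}

# Two candidate placements: cell -> (x, y) grid position.
placement_compact = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
placement_spread = {"a": (0, 0), "b": (3, 0), "c": (0, 2), "d": (3, 2)}

wl_compact = hpwl(netlist, placement_compact)
wl_spread = hpwl(netlist, placement_spread)
```

The compact placement gives the smaller estimate, which matches the intuition that keeping connected cells close together shortens the wiring that the router must later complete.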

Routing

Routing is regarded as one of the most complicated steps in the back-end
design process. It refers to finding suitable paths within the available layout space
along which wires connect the desired sets of pins. It attempts to minimise total wire
length, subject to track capacity constraints. The main goal of this stage is to ensure
that the interconnections between the blocks are completed according to the specified netlists
in semi-custom design. Free spaces that are unoccupied are divided into channels and
switchboxes, which are used to complete all circuit interconnections with the shortest
possible wire length. Routing takes a two-stage approach: global routing and
detailed routing. Global routing connects the blocks of the circuit without considering
the exact geometric details of each wire and pin. Global routing determines the "loose
route" of a wire through different regions of the routing space. Detailed routing assigns
actual tracks and vias to nets.

The two specific problems that arise in routing are balancing the densities of
the routing channels and assigning specific wire segments to each connection. The goals of the
global routing algorithm are to distribute the connections among the channels so that
the channel densities are balanced and to reduce the number of "bends" for each connection. The
main drawback of the two-stage routing approach, i.e., global and detailed
routing, is that it does not provide adequate opportunities to handle problems arising from
signal delay, crosstalk and process constraints. Batterywala et al. proposed an intermediate
step of track assignment between global and detailed routing to address these issues.
During this stage, the global routing information can be used to address these issues
efficiently and to help the detailed router achieve wiring completion.
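
Finding a shortest path for a single wire through free layout space can be sketched with a breadth-first search on a grid, in the style of Lee's maze-routing algorithm. This is an illustrative stand-in, not the global, detailed or track-assignment routers discussed above. The grid encoding (0 for a free cell, 1 for a blocked cell) and the function name are assumptions made for this example.

```python
from collections import deque


def maze_route(grid, start, goal):
    """Shortest rectilinear path from start to goal on a grid, or None if blocked."""
    rows, cols = len(grid), len(grid[0])
    previous = {start: None}          # cell -> cell it was reached from
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:   # walk back from goal to start
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in previous:
                previous[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# Hypothetical routing region: a blockage forces the wire around the right edge.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = maze_route(grid, (0, 0), (2, 0))
```

Because breadth-first search explores cells in order of distance, the first time the goal is reached the path has the minimum number of grid steps. A practical router must additionally handle many nets competing for the same tracks.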


1.4 HISTORY OF VLSI

Testing digital devices is made more difficult by the restricted number of pins. Real-life,
at-speed testing involves both the analog and digital subsystems. Analog testing has an ad-hoc
nature, and it is particularly difficult because there are not enough access terminals to apply
test signals and read out outputs. Such complex systems require specific tests, which are under
study. General test techniques that can be considered adequate have long existed for digital
circuitry. It is not always possible, however, to find efficient solutions for such complex
digital systems. The digital fault models used in the past are not as well established for
today's more complex advanced technologies. Traditional fault models, such as stuck-at models,
do not reflect well the faults found in modern metal oxide semiconductor technologies.

The failure mechanisms affecting micro-sensors (or nano-sensors) are far from being
satisfactorily modelled at the electrical level. New interconnect alternatives for complex
systems on a chip are under development, such as system-in-package construction, which
addresses 3-D integration. This is connected to the idea of mounting two or even more chips
together, connected through the silicon. This approach gains a higher density for each chip.
Power dissipation is still a problem, however, and the technology is not yet mature.

Efficient clock distribution is also an issue, and timing is quite a formidable task
for large chips. Totally synchronous designs suffer from many problems related to clock skew,
clock distribution along the chip, clock power consumption and fan-out. When there
are mixed-signal components that require a clock, it is even more complex. Partially
synchronous techniques are beginning to be popular, although their advantages do not always
compensate for their disadvantages. As regards clocking itself, chip activity can be restricted
to the processing of events, which avoids the problems and the power dissipation that
synchronism brings.

Circuit parts that are not operating can be put to sleep when the clock is controlled by
activity instead of being periodic. Even mixed-signal circuitry (basically, data converters) is
now a target for the non-synchronous paradigm. Other difficulties relate to the technology and
its modeling as far as incorporating sensors and actuators into very large scale integration is
concerned. Sensors and actuators may require non-electrical models, which have to be compatible
with the regular design flow based on electrical simulators. Extended models, together with the
corresponding design flow extensions or modifications, must therefore be targeted by modern very
large scale integration research.


The incorporation of non-digital devices forces much more complex design flows, in which
compatibility between components is essential. In addition, guaranteeing working first silicon
is not an easy task, so verification issues need more attention. The tasks of a design flow have
long been supported by Computer Aided Design tools integrated into so-called design frameworks
and platforms, which are becoming more complex. The number and diversity of the tools to be used
depend on the particular chip to be realized.

Topics related to complex very large scale integration can be considered mature enough
within the digital world. Nevertheless, all these topics need some revisiting, since their
impact on complexity is increasing and the digital circuitry has to coexist and cooperate with
More-than-Moore devices within the chip. Architectural issues need to be adapted to manage
non-digital subsystems efficiently.

As can be seen in the architecture, there are various blocks that need to be coded in the
tool. The question is where Verilog should be used and in what way SystemVerilog has an advantage
over Verilog. The DUT block is the device under test, i.e. the top module, which is coded in
Verilog. If all the remaining blocks were also coded in Verilog, each block would have to be
defined as a module and instantiated by its module name, and module-to-module communication
would become very tedious. We therefore prefer SystemVerilog to Verilog for coding the other
blocks. The main reason is that SystemVerilog includes object-oriented programming (OOP)
concepts, so defining each block's code as a class is easier than coding it in Verilog.

Many vendors provide simulation and advanced verification tools, for example QuestaSim and
ModelSim, as well as the tools from Xilinx and Cadence. Full adders are important components in
applications such as digital signal processor (DSP) architectures and microprocessors. Apart from
basic addition, adders are also used to perform useful operations such as subtraction,
multiplication, division and address calculation. In most of these systems the adder lies in the
critical path that determines the overall performance of the system. In this paper conventional
complementary metal oxide semiconductor (CMOS) and adiabatic adder circuits are analyzed in terms
of power and transistor count using 0.18 µm technology.

Communication systems use the concept of transmitting information using the electrical
distribution network as a communication channel. To enable the transmission, a data signal
modulated on a carrier signal is superimposed on the electrical wires. Typical power lines are
designed to handle a 50/60 Hz AC power signal. Power has become one of the most important
paradigms of design convergence for multi-gigahertz communication systems such as optical data
links, wireless products, microprocessors and ASIC/SoC designs.


Power consumption has become a bottleneck in microprocessor design. The core of a
microprocessor has the largest power density on the chip. In an effort to reduce the power
consumption of the circuit, the supply voltage can be reduced, leading to a reduction of both
dynamic and static power consumption.

Lowering the supply voltage, however, also reduces the performance of the circuit, which is
usually unacceptable. One way to overcome this limitation, available in some application domains,
is to replicate the circuit block whose supply voltage is being reduced in order to maintain the
same throughput. This paper introduces design aspects for a low power phase locked loop using VLSI
technology. This phase locked loop is designed using the latest 45 nm process technology
parameters, which in turn offer high speed performance at low power. The main novelties of the
45 nm technology, such as the high-k gate oxide, metal gate and very low-k interconnect
dielectric, are described. VLSI technology includes process design, trends, chip fabrication,
real circuit parameters, circuit design, electrical characteristics, configuration building
blocks, switching circuitry, translation onto silicon, CAD, and practical experience in layout
design.
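
The supply-voltage trade-off described above can be illustrated with the standard first-order model of CMOS dynamic power, P = alpha * C * Vdd^2 * f. The following is only a minimal sketch: the function name, the numeric values and the crude assumption that halving the supply voltage also halves the usable clock frequency are illustrative and are not taken from this report.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order CMOS dynamic (switching) power: alpha * C * Vdd^2 * f."""
    return alpha * c_load * vdd ** 2 * freq

# Hypothetical operating point (assumed values, for illustration only).
ALPHA, C_LOAD = 0.1, 1e-12
base = dynamic_power(ALPHA, C_LOAD, vdd=1.0, freq=1e9)

# Assumption: at half the supply voltage the block can only run at half speed.
half_speed = dynamic_power(ALPHA, C_LOAD, vdd=0.5, freq=0.5e9)

# Two replicated blocks at the reduced voltage restore the original throughput.
replicated = 2 * half_speed

print(half_speed / base)   # about 0.125, but at half the throughput
print(replicated / base)   # about 0.25 at the original throughput
```

Under these assumptions the replicated design keeps the throughput while using roughly a quarter of the dynamic power, at the price of twice the area.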

The majority of Digital Signal Processing (DSP) applications require arithmetic blocks such
as multipliers and adders for the hardware realization of complex algorithms. The power
consumption of arithmetic blocks needs to be minimized by the use of low power techniques. In
this paper, an experimental setup is developed to identify the sources of power dissipation and
the remedies that can be adopted to minimize power dissipation in arithmetic blocks. Low power
techniques such as multi-Vt, variable Vt, pipelining, geometry scaling and the use of an
appropriate load capacitance have been used to reduce power dissipation. A 4-bit pipelined adder
is designed and its power dissipation is reduced from 9.6 µW to 4.17 µW. The designed pipelined
adder can be used for DSP applications.

1.5 Applications of VLSI

1. DSP

2. Communications

3. Microwave and RF

4. MEMS


5. Cryptography

6. Consumer Electronics

7. Automobiles

8. Space Applications

9. Robotics

10. Health domain/Medical field.

1.6 Advantages of VLSI Technology

1. Reduces the Size of Circuits.

2. Reduces the effective cost of the devices.

3. Increases the Operating speed of circuits

4. Requires less power than Discrete components.

5. Higher Reliability

6. Occupies a relatively smaller area.


CHAPTER-2
LITERATURE SURVEY
2.1 Types of Adders:
Many adders have been proposed. Each adder has its advantages and limitations, and
different adders have been proposed to overcome the drawbacks of the previous ones. Let us
briefly discuss these adders.

2.1.1 Ripple Carry Adder:


As the name suggests, the carry ripples from one stage to the next during the addition.
A ripple carry adder consists of full adders (FAs), and the number of full adders depends on the
number of bits to be added, as shown in Fig 2.1. It occupies less area than other adders because
the FA is the only adding component. The adder takes two operands as inputs and produces two
outputs, the sum and the carry. Its speed is low, because the carry output of the previous FA is
needed as an input to the next FA. Each FA block generates its sum bit when its two input bits
are added, but the carry follows a different pattern: when a carry is generated, it flows into
the next full adder as an input, and this flow of the generated carry into the next full adder
continues along the chain, hence the name ripple carry adder. The computation time is long
because the sum and carry values must be calculated by all the FAs in sequence. This adder is
the basic building block of the other adders. Its advantages are that its power consumption is
comparatively lower than that of the other adders and its cost is lower, as it occupies a
smaller area and uses only FAs.

Fig 2.1: Ripple Carry Adder
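
The rippling of the carry described above can be sketched as a small behavioral model. This is an illustrative sketch only, not the report's implementation; the function names `full_adder` and `ripple_carry_add` are assumptions introduced here.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout

def ripple_carry_add(a, b, n, cin=0):
    """Add two n-bit integers by chaining n full adders, LSB first."""
    carry = cin
    total = 0
    for i in range(n):
        # The carry-out of bit i becomes the carry-in of bit i + 1.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry  # (n-bit sum, carry out)

print(ripple_carry_add(0b1011, 0b0110, 4))  # 11 + 6 = 17 -> (1, 1)
```

Each loop iteration has to wait for the carry of the previous one, which mirrors why the delay of the hardware adder grows linearly with the number of bits.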


2.1.2 Carry Look ahead Adder (CLA) :

It is a type of adder designed to overcome the disadvantages of the RCA. The importance
of the carry look ahead adder is that its delay is much lower than that of the RCA: the carry for
every bit is obtained from the propagate and generate signals, starting from P0 and G0, so there
is no need to wait for the carry to pass through all the FAs. Here, the adder produces propagate
and generate signals, which are used to calculate the value of the carry. The propagate signal
is denoted by 'Pi' and has the formula Pi = Ai + Bi. The generate signal is denoted by 'Gi' and
has the formula Gi = Ai * Bi [2], where + denotes logical OR and * denotes logical AND. The
formula for the carry is Ci+1 = Gi + Pi*Ci, where 'i' is the bit for which we are calculating
the carry. By repeated substitution, Ci+2 and the other carry outputs can be derived from those
values, which is the main advantage of this adder. Its speed is therefore higher than that of
the RCA, but the area occupied is more than that of the RCA because of the CLA logic that is
present in addition to the regular RCA and the P and G signals, as seen in Fig 2.2. Also, the
full adders here are slightly modified compared with regular full adders. The cost is higher
than that of the RCA, as it has more wiring tracks and more area.

Fig 2.2: Carry Look ahead Adder
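
The carry equations above can be checked with a short behavioral sketch. It uses the formulas given in the text (Pi = Ai + Bi, Gi = Ai * Bi, Ci+1 = Gi + Pi*Ci) and expands each carry so that it depends only on the P and G signals and the initial carry. The function names are assumptions made for illustration; a hardware CLA evaluates these expanded expressions in parallel, which a sequential program can only imitate.

```python
def lookahead_carry(g, p, cin, i):
    """Carry into bit i + 1, expanded so it depends only on g, p and cin:
    C(i+1) = Gi + Pi*G(i-1) + Pi*P(i-1)*G(i-2) + ... + Pi*...*P0*Cin."""
    carry = g[i]
    prod = p[i]
    for j in range(i - 1, -1, -1):
        carry |= prod & g[j]
        prod &= p[j]
    return carry | (prod & cin)

def carry_lookahead_add(a, b, n, cin=0):
    """n-bit addition using propagate/generate signals; returns (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    p = [x | y for x, y in zip(a_bits, b_bits)]  # Pi = Ai + Bi (OR)
    g = [x & y for x, y in zip(a_bits, b_bits)]  # Gi = Ai * Bi (AND)
    carries = [cin] + [lookahead_carry(g, p, cin, i) for i in range(n)]
    total = 0
    for i in range(n):
        total |= (a_bits[i] ^ b_bits[i] ^ carries[i]) << i
    return total, carries[n]

print(carry_lookahead_add(0b1011, 0b0110, 4))  # 11 + 6 = 17 -> (1, 1)
```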

2.1.3 Carry Select Adder (CSA):


This type of adder is used to overcome the demerits of the RCA. The area occupied is more
than that of the CLA and about twice that of the RCA, because it has twice the number of FAs of
the RCA and CLA, together with 2:1 MUXes for selecting the carry. As the name suggests, the carry
generated is selected on the basis of the Cin provided at the start. Here we use two different
carry-in values, 0 and 1, as shown in Fig 2.3. Traditionally, a full adder uses the single carry
value generated by the previous FA block, but here both carry values, 0 and 1, are considered
instead of one. As there are two carry-in values, we use two sets of full adders to calculate
the sum and carry for each carry-in value. So, we pre-calculate the sum and carry values for the
particular block. Then, on the basis of Cin, the corresponding sum and carry values are forwarded
by the multiplexers and the AND and OR gates, and hence we get the final sum and carry values.
In terms of speed, it is faster than the RCA: the carry and the sum each have two precomputed
values for the two possible values of Cin, 0 and 1, and depending on the actual value of Cin,
one value of the carry and sum is taken at the output.

However, some delay remains in the CSA: although the values are calculated beforehand, the MUX
and AND gate stages, along with twice the number of FAs, add to the delay. The cost is higher
than that of the RCA and CLA, because the number of FAs is double that of the RCA, MUXes are
used, and more area is occupied.

Fig-2.3: Carry Select Adder
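
The pre-calculation and selection described above can be sketched as follows. This is a behavioral illustration under assumptions: the block size of 4 bits and the function names are chosen here and are not specified in the report.

```python
def ripple_add(a, b, width, cin):
    """Ripple-carry addition of two width-bit values; returns (sum, carry_out)."""
    total, carry = 0, cin
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry

def carry_select_add(a, b, n, block=4, cin=0):
    """Carry select addition of two n-bit values; returns (sum, carry_out)."""
    result, carry = 0, cin
    for lo in range(0, n, block):
        width = min(block, n - lo)
        mask = (1 << width) - 1
        a_blk, b_blk = (a >> lo) & mask, (b >> lo) & mask
        # Both candidate results are computed without knowing the real carry-in.
        sum0, cout0 = ripple_add(a_blk, b_blk, width, 0)
        sum1, cout1 = ripple_add(a_blk, b_blk, width, 1)
        # The incoming carry plays the role of the 2:1 MUX select line.
        blk_sum, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= blk_sum << lo
    return result, carry

print(carry_select_add(200, 100, 8))  # 300 = 256 + 44 -> (44, 1)
```

In hardware the two candidate additions of every block run at the same time, so only the MUX selection has to wait for the carry of the previous block.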


2.1.4 Carry Skip Adder (CSKA):


The main function of this type of adder is to skip the carry calculation process where
possible. Here the adder computes the Pi and Gi signals as in the carry look ahead adder, as
seen in Fig 2.4. Its area is less than that of the CSA: it uses FAs, AND logic gates for
combining the propagate signals and a 2:1 MUX for selecting the output. All the inputs are
processed simultaneously to generate the sum and Pi signals. The Pi signals perform the
important function of deciding whether to skip the carry calculation or not. After the sum and
Pi values are generated, the adder ANDs all the Pi values, and the output of the AND gate acts
as the select line of the 2:1 MUX. If the output of the AND gate is 1, i.e. all the Pi signals
are 1, Cin is used as the carry output and the adder skips the carry calculation through the
FAs. If the output of the AND gate is 0, i.e. at least one Pi signal is 0, the carry is
calculated as in the RCA, that is, from LSB to MSB. This type of adder thus uses both RCA and
CLA techniques for its calculation.

It is faster than the RCA, CLA, CSA and CSVA adders. Its delay depends on the values
of the Pi signals and on the output of the AND logic block: if the AND logic block outputs '1',
Cin is selected as the carry and the calculation through the FAs is skipped; otherwise, the
carry is generated through the FAs.

Fig-2.4: Carry Skip Adder
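
The skip decision described above can be modeled with the short sketch below. It is illustrative only: the block size, the function name and the choice of Pi = Ai XOR Bi as the propagate signal are assumptions made here (with XOR, a block whose Pi signals are all 1 passes its carry-in straight through).

```python
def carry_skip_add(a, b, n, block=4, cin=0):
    """Carry skip addition of two n-bit values; returns (sum, carry_out)."""
    result, carry = 0, cin
    for lo in range(0, n, block):
        width = min(block, n - lo)
        block_cin = carry
        all_propagate = 1
        for i in range(lo, lo + width):
            x, y = (a >> i) & 1, (b >> i) & 1
            p = x ^ y                       # assumed propagate: Pi = Ai XOR Bi
            result |= (p ^ carry) << i      # sum bit of the full adder
            carry = (x & y) | (p & carry)   # rippled carry inside the block
            all_propagate &= p              # AND of all Pi in the block
        # 2:1 MUX: when every Pi is 1 the block carry-out is simply its carry-in.
        if all_propagate:
            carry = block_cin
    return result, carry

print(carry_skip_add(0b00001111, 0b00000001, 8))  # 15 + 1 = 16 -> (16, 0)
```

The software model produces the same carry on both paths; the benefit in hardware is that the MUX path is much shorter than the ripple path through the FAs of the block.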


2.1.5 CARRY SAVE ADDER (CSVA):


In this adder, rather than adding only two inputs at a time and then adding Cin as a carry
bit, the adder takes Cin as an input along with the other two inputs. The area is directly
proportional to the number of inputs; the FAs occupy less area for a smaller number of inputs.

Fig- 2.5: Carry Save Adder

If the number of inputs increases, the number of FAs increases, which increases the
area. It is used in applications where more than two numbers have to be added at a time. The
number of FAs used here is smaller than in other types of adders.

In terms of speed, it is slower than the CSKA and PPA adders, and the speed decreases as
the number of inputs increases. Here the adder combines the carry of one FA block with the sum
of the next FA block to get the final sum. As shown in the upper part of the diagram, the
important point is that the sum and the carry are calculated separately. First, the sum is
calculated without taking the carry into consideration. Then the adder calculates the carry,
leaving one position empty on the LSB side, as there is no carry at the start. After the sum and
carry values are calculated, the adder adds both of them together to get the final number, as
shown in the lower part of the diagram. The delay is variable: as the number of inputs increases,
the delay also increases, and because the final addition in the CSVA is performed by an RCA, the
delay is larger. The cost is nominal, as its basic component is the RCA, but it may increase if
the number of inputs increases.
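
The separate sum and carry calculation described above can be sketched for whole operands at once. This is a minimal illustration using Python integers as bit vectors; the function names are assumptions made here, and the final two-operand addition is left to Python's `+` as a stand-in for the RCA.

```python
def carry_save(a, b, c):
    """One carry-save level: three operands in, a sum vector and a carry vector out."""
    sum_vec = a ^ b ^ c                             # bitwise sum, carries ignored
    carry_vec = ((a & b) | (b & c) | (a & c)) << 1  # carries, shifted one place left
    return sum_vec, carry_vec

def carry_save_add(operands):
    """Reduce any number of operands to two with carry-save levels, then add once."""
    ops = list(operands)
    while len(ops) > 2:
        s, cy = carry_save(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, cy]
    return sum(ops)  # stand-in for the final carry-propagate (ripple) addition

print(carry_save(5, 6, 7))              # (4, 14), and 4 + 14 = 18 = 5 + 6 + 7
print(carry_save_add([5, 6, 7, 8, 9]))  # 35
```

The left shift of the carry vector corresponds to leaving one position empty on the LSB side, as described in the text.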


2.1.6 Parallel Prefix Adder:


Although the CSA and CLA are effective, their performance degrades in terms of delay as the
number of bits increases, i.e. when it rises to 32, 64 or 128 bits. To overcome this, the PPA uses
a multi-level tree of lookahead structures. Examples of PPAs are the Brent-Kung, Sklansky,
Kogge-Stone (Fig 2.6) and Han-Carlson adders. These adders work on two basic operators, namely the
black and gray dot operators. Each of these adders comprises three blocks: pre-computation (which
includes the generation of the P and G signals, Fig.6), the prefix network (which includes dot
operators and buffers, as shown in Fig 2.7 and Fig 2.8) [7] and post-computation (which includes
the final sum generation block) [6].

PPAs are categorized on the basis of a few factors such as logic levels, wiring tracks and
fan-out. The Brent-Kung adder has a small area in comparison to the Kogge-Stone adder, which has
the largest area. The Han-Carlson adder has a smaller area than the Sklansky adder, owing to its
logic levels, black and gray dot operators and buffers. The Han-Carlson adder is faster than all
the adders mentioned before, as it has a lower number of logic levels, fewer wiring tracks and a
smaller prefix network. Its delay depends on factors such as the wiring tracks, the logic levels
and the middle prefix network, which includes black and gray operators, AND, OR and buffer
operations. Each type of adder has its merits and demerits, so the appropriate adder can be
selected depending on the application.

To calculate the different trade-off factors [7] we use the parameters L = log2(n), l, f and t,
where:
Logic levels: L + l
Fan-out: 2^f + 1
Wiring tracks: 2^t

The cost may increase or decrease depending on a few factors: the wiring tracks, operators,
buffers and logic levels decide the cost of the adder. The fewer the components, the lower the
cost.

Fig-2.6: 16-bit Kogge-stone Adder
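
The three blocks of a parallel prefix adder (pre-computation, prefix network and post-computation) can be illustrated with a behavioral sketch of the Kogge-Stone arrangement. This is a sketch under assumptions: the function name is introduced here, the carry-in is folded into the generate signal of bit 0, and the parallel hardware levels are imitated by a loop.

```python
def kogge_stone_add(a, b, n, cin=0):
    """n-bit Kogge-Stone style prefix addition; returns (sum, carry_out)."""
    # Pre-computation: bit-level propagate and generate signals.
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & cin)  # assumption: carry-in folded into bit 0

    # Prefix network: log2(n) levels, combining spans at distance 1, 2, 4, ...
    d = 1
    while d < n:
        new_G, new_P = G[:], P[:]
        for i in range(d, n):
            new_G[i] = G[i] | (P[i] & G[i - d])  # dot operator, generate part
            new_P[i] = P[i] & P[i - d]           # dot operator, propagate part
        G, P = new_G, new_P
        d *= 2

    # Post-computation: sum bit i is p[i] XOR the carry into bit i.
    total = p[0] ^ cin
    for i in range(1, n):
        total |= (p[i] ^ G[i - 1]) << i
    return total, G[n - 1]

print(kogge_stone_add(0b1011, 0b0110, 4))  # 11 + 6 = 17 -> (1, 1)
```

After the last level, `G[i]` holds the carry out of bit i, so every sum bit is available after about log2(n) dot-operator levels instead of n full-adder delays.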


Fig- 2.7: Processing Unit
Fig - 2.8: Buffer Unit

2.2 Comparison table among different types of adders


CHAPTER-3
EXISTING ADDERS

3.1 Carry Save 3-Operand Adder:


Three-operand binary addition can be carried out either by using two two-operand
adders or by using one three-operand adder. The carry-save adder (CS3A) is the area-efficient and
widely adopted technique for performing three-operand binary addition in the modular arithmetic
used in cryptography algorithms.

The CS3A contains an array of full adders arranged in two stages. Stage-1 directly adds
the input bits at each position without considering the carry from the previous position.
Stage-2 adds the output of the previous stage to the carry generated by the adjacent previous
adder in the same stage. Hence the sum and carry are produced.

The CS3A operates at good speed when the operand size is small. As the operand size
increases, the delay of the circuit also increases. The CS3A occupies less area than parallel
prefix adders. It is mostly adopted when the operand size is small and in applications where
area plays a vital role.

Block diagram of CS3A:

Fig - 3.1: Block diagram of Carry Save 3-Operand Adder


The dotted line indicates the critical path delay of the circuit.

a, b, c are the n-bit inputs and Cin is the input carry.

S is the n-bit sum output and Cout is the carry out.
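
The two stages of the CS3A described above can be sketched at bit level as follows. This is an illustrative model, not the circuit of Fig 3.1: the function name is an assumption, and the sketch keeps n + 1 sum bits plus a carry out so that no result bits are lost, which is also an assumption about the output width.

```python
def cs3a_add(a, b, c, n, cin=0):
    """Carry-save three-operand addition; returns ((n+1)-bit sum, carry_out)."""
    # Stage 1: n independent full adders, one per bit position.
    s1 = [((a >> i) ^ (b >> i) ^ (c >> i)) & 1 for i in range(n)]
    cy = [(((a >> i) & (b >> i)) | ((b >> i) & (c >> i)) | ((a >> i) & (c >> i))) & 1
          for i in range(n)]

    # Stage 2: ripple chain adding s1[i] to cy[i-1] (cin takes the place of cy[-1]).
    x = s1 + [0]      # stage-1 sum bits, padded to n + 1 positions
    y = [cin] + cy    # stage-1 carries shifted one position towards the MSB
    total, carry = 0, 0
    for i in range(n + 1):
        total |= (x[i] ^ y[i] ^ carry) << i
        carry = (x[i] & y[i]) | (carry & (x[i] ^ y[i]))
    return total, carry

print(cs3a_add(7, 7, 7, 3, 1))  # 7 + 7 + 7 + 1 = 22 -> (6, 1)
```

The loop in stage 2 is the ripple path, which is why the delay of the CS3A grows with the operand size.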

Advantages of CS3A:

• Area occupied by the circuit is less.


Limitations of CS3A:

• High circuit delay.

3.2 Han Carlson 3-Operand Binary Adder :

To overcome the problem encountered in the CS3A, that is, to shorten the critical path delay,
two stages of parallel prefix two-operand adders can be used instead. In the literature, parallel
prefix or logarithmic prefix adders are the fastest two-operand adder techniques.

One of the parallel prefix adders is the Han Carlson adder. It was originally designed
as a two-operand adder. In order to implement three-operand addition, two stages of Han Carlson
adders are connected in such a way that the output of the first stage is given as an input to
the second stage. This arrangement is named the Han Carlson 3-operand binary adder (HC3A).
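
The cascading of two Han Carlson stages can be sketched as below. This is a behavioral illustration under assumptions: the function names are introduced here, no carry-in is modeled, and the second stage is simply made one bit wider than the first so that the intermediate carry is not lost. The prefix network follows the usual Han-Carlson pattern (one level on odd positions, Kogge-Stone style levels on odd positions only, and a last level for the even positions).

```python
def han_carlson_add(a, b, n):
    """Two-operand Han-Carlson style prefix addition; returns (n-bit sum, carry_out)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # bit propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # bit generate
    G, P = g[:], p[:]
    # First level: every odd position absorbs its even neighbour.
    for i in range(1, n, 2):
        G[i], P[i] = G[i] | (P[i] & G[i - 1]), P[i] & P[i - 1]
    # Middle levels: distance doubles each level, odd positions only.
    d = 2
    while d < n:
        new_G, new_P = G[:], P[:]
        for i in range(d + 1, n, 2):
            new_G[i] = G[i] | (P[i] & G[i - d])
            new_P[i] = P[i] & P[i - d]
        G, P = new_G, new_P
        d *= 2
    # Last level: every even position absorbs the finished odd prefix below it.
    for i in range(2, n, 2):
        G[i] = G[i] | (P[i] & G[i - 1])
    # Sum logic: sum bit i is p[i] XOR the carry out of bit i - 1.
    total = p[0]
    for i in range(1, n):
        total |= (p[i] ^ G[i - 1]) << i
    return total, G[n - 1]

def hc3a_add(a, b, c, n):
    """Three-operand addition with two cascaded two-operand stages."""
    s1, c1 = han_carlson_add(a, b, n)           # stage 1: a + b
    first = s1 | (c1 << n)                      # (n + 1)-bit intermediate result
    s2, c2 = han_carlson_add(first, c, n + 1)   # stage 2: (a + b) + c
    return s2 | (c2 << (n + 1))

print(hc3a_add(15, 15, 15, 4))  # 45
```

Both stages contain a complete prefix network, which reflects the area cost of the HC3A noted in this section.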

Block diagram of HC3A:

Fig-3.2: Block Diagram of HC3A


Internal view of HCA:

Fig - 3.3: 8-bit Han Carlson Adder

Fig – 3.4: Internal circuit of black and grey cells


Advantage:
• Delay is less compared to CS3A.
Limitation:
• Area occupied is very high compared to CS3A.

3.3 Reason for Implementing Proposed Adder:


As discussed above, the existing methods are limited either by high delay (CS3A) or by
high area occupation (HC3A). The proposed adder was developed to achieve a trade-off between
area and delay.


CHAPTER-4
PROPOSED ADDER

The proposed adder is a parallel prefix adder. It contains four stages. Each stage
contributes to the final result of the adder.

Below is a figure that shows the block diagram of proposed adder.

4.1 Block Diagram of Proposed Adder

Fig 4.1: Proposed Adder

• The dotted line shows the propagation delay.

4.2 Operation
Stage-1: Bit Addition Logic

• Stage-1 consists of an array of full adders. For n-bit addition there are n full adders.
• Full adder produces two outputs based on the given inputs.
• Let a, b and c be the input vectors whose addition has to be done. Let the outputs of a
full adder be S' and cy; they are given as:


Fig – 4.2: Bit-addition logic

Stage-2: Base Logic

• It takes the stage-1 outputs as inputs and computes the generate and propagate bits with the
following logic.

Fig-4.3: Base Logic

Stage-3: Propagate and Generate logic

• Stage-3 contains an array of black and grey cells, which are arranged as shown in
Fig 4.1.


Fig - 4.4: Black cell.

Fig – 4.5: Grey cell

Stage-4: Sum Logic

• This is the last stage which computes sum and carry bits by taking generate and propagate
bits as inputs from the previous stage output.


Fig – 4.6: Sum Logic
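
The four stages described in this chapter can be tied together in a behavioral sketch. The logic equations appear only in the figures, which are not reproduced in the text, so the Boolean expressions below are assumptions based on the standard full adder and prefix adder definitions. The prefix network used here is a simple Kogge-Stone style stand-in and is not the cell arrangement of Fig 4.1; the function name and the output width (n + 1 sum bits plus a carry out) are likewise assumptions.

```python
def proposed_three_operand_add(a, b, c, n, cin=0):
    """Four-stage three-operand addition; returns ((n+1)-bit sum, carry_out)."""
    # Stage 1, bit-addition logic: one full adder per bit gives S' and cy.
    s1 = [((a >> i) ^ (b >> i) ^ (c >> i)) & 1 for i in range(n)]
    cy = [(((a >> i) & (b >> i)) | ((b >> i) & (c >> i)) | ((c >> i) & (a >> i))) & 1
          for i in range(n)]

    # Stage 2, base logic: generate/propagate from S'[i] and cy[i-1]
    # (assumed: cin stands in for cy[-1], and S'[n] is taken as 0).
    x = s1 + [0]
    y = [cin] + cy
    g = [x[i] & y[i] for i in range(n + 1)]
    p = [x[i] ^ y[i] for i in range(n + 1)]

    # Stage 3, propagate and generate logic: prefix computation with
    # black/grey-cell style operators (Kogge-Stone style stand-in).
    G, P = g[:], p[:]
    d = 1
    while d < n + 1:
        new_G, new_P = G[:], P[:]
        for i in range(d, n + 1):
            new_G[i] = G[i] | (P[i] & G[i - d])
            new_P[i] = P[i] & P[i - d]
        G, P = new_G, new_P
        d *= 2

    # Stage 4, sum logic: sum bit i is p[i] XOR the group carry below it.
    total = p[0]
    for i in range(1, n + 1):
        total |= (p[i] ^ G[i - 1]) << i
    return total, G[n]

print(proposed_three_operand_add(7, 7, 7, 3, 1))  # 22 -> (6, 1)
```

Only one prefix network is needed after the full adder array, whereas the HC3A needs two; this is consistent with the area advantage claimed for the proposed adder.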

4.3 Advantages of Proposed Adder:

• It occupies less area than the Han Carlson adder.

• It has less delay than the carry save adder.


CHAPTER-5

XILINX SOFTWARE

5.1 XILINX ISE OVERVIEW


The Integrated Software Environment (ISE™) is the Xilinx® design software suite that
allows you to take your design from design entry through Xilinx device programming. The Project
Navigator manages and processes the design through the following steps of the ISE design flow.

Design Entry

Design entry is the first step in the ISE design flow. During design entry, the source
files are created based on your design objectives. The top-level design can be created using a
Hardware Description Language (HDL), such as VHDL, Verilog, or ABEL, or using a schematic.
Multiple formats can be used for the lower-level source files in your design.

Synthesis

Synthesis is the step that comes after design entry and optional simulation. In this
process, a netlist is created from the VHDL, Verilog, or mixed-language designs made by the
user. The resulting netlist is given as input to the implementation step.

Implementation

After synthesis, you can run design implementation, which converts the logical design
into a physical file format that can be downloaded to the selected target device. From the
Project Navigator, you can run the implementation process in one step, or you can run each of
the implementation processes separately. The implementation processes vary depending on whether
you are targeting a Field Programmable Gate Array (FPGA) or a Complex Programmable Logic Device
(CPLD).

Verification

You can verify the functionality of your design at several points in the design flow. You can
use simulator software to verify the functionality and timing of your design or a portion of your
design. The simulator interprets VHDL or Verilog code into circuit functionality and displays the
logical results of the described HDL to determine correct circuit operation. Simulation allows
you to create and verify complex functions in a relatively small amount of time. You can also run
in-circuit verification after programming your device.

Device Configuration

In order to configure your device, a programming file is first generated. The
configuration is then done by downloading the programming files from a host computer to a Xilinx
device.

5.2 PROJECT NAVIGATOR OVERVIEW


The Project Navigator organizes your design files and runs processes to move the design from
design entry through implementation to programming the targeted Xilinx® device. The Project
Navigator is the high-level manager for your Xilinx FPGA and CPLD designs, which allows you to do
the following:
• Add and create design source files, which appear in the Sources window.
• Modify your source files in the Workspace.
• Run processes on your source files in the Processes window.
• View the output from the processes in the Transcript window.

Project Navigator Main Window


Fig 5.1 shows the Project Navigator main window, which allows you to manage your
design from design entry through device configuration.


Fig - 5.1: Project Navigator

1. Toolbar
2. Sources window
3. Processes window
4. Workspace
5. Transcript window

Using The Sources Window


The first step in implementing your design for a Xilinx® FPGA or CPLD is to assemble
the design source files into a project. The Sources tab in the Sources window shows the source
files that you create and add to your project, as shown in Fig 5.2. For more information about
creating projects and source files, see Creating a Project and Creating a Source File.


Fig - 5.2: Sources Window


The Design View ("Sources for") drop-down list at the top of the Sources tab lets you
view only the source files associated with the selected Design View (for example,
Synthesis/Implementation). See Using the Design Views for details. The "Number of"
drop-down list, Resources column, and Preserve column are available for designs that
use Partitions. See Using Partitions for details. The Sources tab shows the hierarchy
of your design. You can expand and collapse the levels by clicking the plus (+) or
minus (-) icons. Each source file appears next to an icon that shows its file type. The
file you select determines the processes available in the Processes window. You can
double-click a source file to open it for editing in the Workspace. See Source File
Types for information on the different file types.
You can change the project properties, such as the device family to target, the
top-level module type, the synthesis tool, the simulator, and the generated simulation
language. For details, see Changing Project, Source, and Snapshot Properties. Depending
on the source file you are working with, additional tabs are available in the Sources
window.
• Always accessible: Sources tab, Snapshots tab, Libraries tab
• Constraints Editor: Timing Constraints tab
• Floorplan Editor: Translated Netlist tab, Implemented Objects tab
• iMPACT: Configuration Modes tab
• Schematic Editor: Symbols tab

• RTL and Technology Viewers: Design tab
• Timing Analyzer: Timing tab

The Xilinx Integrated Software Environment (ISE) is a powerful and complex set of
tools. The purpose of this guide is to help new users get started using ISE to compile their
designs. This guide provides a very high-level overview of how the tools work, and takes the
reader through the process of compiling.

Over the past 25 years, field programmable gate arrays have relentlessly increased in
logic cell count and functionality at a lower cost per transistor. Field programmable gate
arrays have steadily taken market share from the ASIC and gate array markets.

The progression of the field programmable gate array helped render gate arrays obsolete
about 15 years ago. Numerous trends are now converging: the exorbitant cost of designing and
manufacturing ASICs, changing standards, the need to reduce the bill of materials, and the need
for both hardware and software programmability in the face of rough economic times and reduced
staffing. Together they create an environment in which designers of electronic products are
dumping ASICs in favor of field programmable gate arrays at a greater pace. This convergence of
trends is the programmable imperative.

Field programmable gate arrays are now available with several hundred thousand programmable
cells, transceivers of up to 11.2 Gbps, 38 Mb of block RAM and 2,000 digital signal processing
slices. Designers are leveraging field programmable gate arrays to address an ever-growing
number of applications. All things considered, there is an opportunity for field programmable
gate arrays to step up the pace in gobbling up applications.
Xilinx is actively moving to help customers get their field programmable gate array
innovations to market quickly. This year, with this goal in mind, Xilinx introduced the
Spartan-6 and Virtex-6 field programmable gate array families, with an emphasis on the design
tools, boards and support needed for hardware and software development. These elements offer
customers closely defined and refined flows that are tied to the targeted silicon.

The Targeted Design Platform approach can be viewed as a pyramid (see Figure 1). The Base
Platform serves as the foundation layer of the pyramid. It is composed of the Virtex-6 and
Spartan-6 field programmable gate array silicon, the ISE Design Suite and base development
boards. Above it, the next layer is the Domain-Specific Platform, which offers embedded, DSP and
connectivity domain tools, IP and reference designs, along with daughter cards that plug into
the base boards. At the top layer are the various Market-Specific Platforms, which are composed
of IP, custom tools and custom boards for markets such as communications, video or AVB.

By choosing these designs, customers can concentrate the majority of their design
efforts on the value-added portions of their designs and significantly reduce their overall
design time. Customers can, if they want, design every function from scratch. Most customers,
however, will certainly choose to concentrate on the value-added portions of their designs
rather than reinventing the wheel.

Changes to the Smorgasbord Approach

As part of the Targeted Design Platform approach, the tool flows are also being refined to suit
specific design disciplines. Traditionally, all the tools were offered to all users as a
smorgasbord, and it was left up to the users to figure out which tools matched which tasks and to
obtain the number of licenses suitable to their budgets. Domain-specific editions of the ISE
Design Suite will soon be offered to help pair users with the tools for their specific jobs.

One of these, the DSP Edition, bundles System Generator for DSP, AccelDSP synthesis, and
DSP-specific IP on top of the ISE design environment. The DSP Edition is primarily targeted at
algorithm developers who are not logic designers or HDL experts, although logic designers who do
some amount of algorithm development may also find it useful. If DSP Edition users want to do a bit
of software application development for their algorithms, they can add the Xilinx Software
Development Kit (SDK), which is a stand-alone tool.

The move to a Targeted Design Platform approach makes sense given how the FPGA and
application-specific integrated circuit (ASIC) businesses have fared over the last 10 years. Ten
years ago, roughly 80 percent of Xilinx's business came from the wired and wireless communications
industry. When the dot-com communications bubble burst circa 2001, our business was adversely
affected and declined along with the rest of the semiconductor industry. To ensure that this would
not happen again, in 2002 Xilinx quickly created vertical application groups in aerospace and
defense, automotive, broadcast, and industrial, scientific, and medical (ISM), as well as in wired
and wireless communications, to establish a much broader customer base.

Today, wired and wireless communications account for only 44 percent of our business. The
investment in vertical groups has allowed us to diversify to the point that the balance is spread
across data processing, broadcasting, commercial, ISM, and aerospace and defense (A&D). With this
much broader scope, Xilinx has fared relatively well despite the economic downturn brought on by
the mortgage crisis in the U.S.
In fact, Xilinx and the rest of the FPGA business have an opportunity to take share more rapidly
from application-specific integrated circuits (ASICs) and application-specific standard products
(ASSPs). While FPGAs and many other semiconductor sectors bounced back and grew out of the dot-com
recession, one segment that did not was ASICs.
According to a 2007 report from the research firm Gartner Dataquest, there were roughly 7,750
ASIC design starts in the year 2000, at the height of the dot-com boom. By 2005, the number of
design starts had been cut by about half, to 3,623. All research firms predict that ASIC design
starts have continued to decline steadily and will continue to decline for the foreseeable future.
Integrated circuits have become far more complex, and the processes used to manufacture them
have become exponentially more expensive, so ASICs and ASSPs are feasible for a smaller number of
applications. Taking into account the impact that the last recession had on ASICs and ASSPs, and
given that most economists predict this recession will be longer and deeper than the last one, one
has to wonder whether ASIC design starts will again drop by half or more. It is certainly not a
matter of whether they will drop, but of how much.

5.3 CREATING A PROJECT

STEP-1: Go to File and select New Project.

STEP-2: The New Project Wizard opens, as shown below. In that window, type the name of
the project and then click Next.

After you click Next, another window appears, as shown below. In it, select the
simulator, the preferred language, etc. Then click Next and then click Finish.

STEP-3: Right-click the project name and select New Source.

The New Source Wizard opens, as shown below. In it, select the Source Type (VHDL Module,
Verilog Module, etc.), give the File Name, and then click the Next button.

The following window appears. In it, specify the port variables, such as the input ports and
output ports. Then click the Next button.

The Summary window appears; click Finish. The code window opens immediately. Type the code
in the code window.

After typing the code, click the Save button or press Ctrl+S.

STEP-4: Select Implementation → select the file name → double-click Check Syntax.
If there are errors in the code, a red cross mark is shown; otherwise, a tick mark is shown,
as in the figure below.

Then double-click View RTL Schematic to display the RTL schematic.

Then double-click View Technology Schematic to display the technology schematic.

Click the OK button.

STEP-5: Select Simulation and then select the file name.

Double-click Behavioral Check Syntax. If there are no errors, the following window appears.

Double-click Simulate Behavioral Model. The waveform window then opens, as shown
below.

In the waveform window, right-click an input port variable and select Force Constant.

Another window opens immediately. In it, enter the input value in the Force to Value field,
click Apply, and then click OK.

Repeat the process for the second input.

STEP-6: Finally, click the Run button.

The input and output waveforms are then shown in the waveform window.

CHAPTER-6
APPLICATIONS

• A full-adder circuit can be used as a building block of larger circuits such as the ripple-carry
adder, which adds n bits simultaneously.
• Dedicated multiplication circuits use full-adder circuits to perform carry-out
multiplication.
• Full adders are used in the arithmetic logic unit (ALU). The ALU uses full adders to generate
memory addresses inside a computer and to make the program counter point to the next
instruction.
• Full adders are also used to generate pseudorandom bits and in many cryptographic
algorithms.

6.1 Cryptography:

Cryptography is the study of secure communications techniques that allow only the sender
and intended recipient of a message to view its contents. The term is derived from the
Greek word kryptos, which means hidden.

To achieve optimal system performance while maintaining physical security, it is necessary
to implement the cryptography algorithms in hardware. Modular arithmetic such as
modular exponentiation, modular multiplication and modular addition is frequently used for the
arithmetic operations in various cryptography algorithms. Therefore, the performance of a
cryptography algorithm depends on the efficient implementation of the congruential modular
arithmetic operation. The most efficient approach to implementing modular multiplication and
exponentiation is the Montgomery algorithm, whose critical operation is based on three-operand
binary addition.
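To illustrate why three-operand addition is the critical operation, the following is a minimal Python sketch of a textbook radix-2 Montgomery multiplication. It is a software model under stated assumptions, not the hardware described in this report: the function name and parameters are hypothetical, the modulus n is assumed odd, and the operands are assumed to satisfy a < 2^k and b < n < 2^k.

```python
def montgomery_multiply(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * 2**(-k) mod n (assumes n odd, a < 2**k, b < n < 2**k)."""
    if n % 2 == 0:
        raise ValueError("the modulus must be odd")
    s = 0
    for i in range(k):
        a_i = (a >> i) & 1                  # i-th bit of operand a
        q_i = (s + a_i * b) & 1             # chosen so that the sum below is even
        # Critical step: a three-operand addition (s, a_i*b, q_i*n),
        # followed by an exact division by two.
        s = (s + a_i * b + q_i * n) >> 1
    return s - n if s >= n else s           # final conditional subtraction


if __name__ == "__main__":
    n, k = 239, 8
    r = montgomery_multiply(123, 45, n, k)
    # The result carries a factor 2**(-k); multiplying it back recovers a*b mod n.
    print(r, (r << k) % n == (123 * 45) % n)
```

Each loop iteration adds three values, which is the operation a three-operand adder computes in one step.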

6.2 Types of Cryptography Algorithms

The linear congruential generator (LCG) is one of the fundamental algorithms used in
cryptography. An LCG is easy to design, but it does not provide much security: because it is
based on a simple algorithm, its output can be predicted easily. Hence, a more secure algorithm
based on the principles of the LCG has been designed, named the MODIFIED DUAL
COMBINED LINEAR CONGRUENTIAL GENERATOR (MDCLCG).

6.3 Block Diagram of MDCLCG:

Fig - 6.1: Block diagram of MDCLCG

6.3.1 Internal Block Diagram of LCG:

Fig – 6.2: Linear Congruential Generator

6.4 Operation of MDCLCG:

• Consider the multiplier constants a1, a2, a3, a4 and the increment constants b1, b2,
b3, b4 for a 32-bit MDCLCG.

• Let the initial seeds be (x0, y0, p0, q0).

• The general LCG recurrence is x(i) = (a × x(i-1) + b) mod 2^n, where a is the multiplier,
b is the increment and n is the number of bits.

• The sequences are computed as follows:

Xi = {x1, x2, x3, x4, …}

Yi = {y1, y2, y3, y4, …}

Pi = {p1, p2, p3, p4, …}

Qi = {q1, q2, q3, q4, …}

• The sequences Bi and Ci in the MDCLCG architecture are generated by comparing Xi with
Yi and Pi with Qi, respectively, using magnitude comparators.

• Finally, the pseudorandom bit Zi is generated as Bi XOR Ci.

Let us now take an example:

The 32-bit MDCLCG architecture is designed with the constant values a1 = 65, b1 = 117, a2
= 16385, b2 = 221, a3 = 4097, b3 = 21359, a4 = 65537, b4 = 533, m = 2^32 and the initial seeds
(x0, y0, p0, q0) = (5183, 91356, 39771, 7392). Because a1 = 2^r1 + 1 with r1 = 6, the first
value of the sequence is generated as follows:

x1 = (a1 × x0 + b1) mod 2^32

= (2^r1 × x0 + x0 + b1) mod 2^32

= (65 × 5183 + 117) mod 2^32

= (2^6 × 5183 + 5183 + 117) mod 2^32

= (331712 + 5183 + 117) mod 2^32

= 337012 mod 2^32

= 337012

In the above equation, the three-input modulo-2^n addition is performed using the proposed
three-operand adder technique. Similarly, the values x2, x3, ... can be computed, giving

Xi = {337012, 21905897, 1423883422, 2358109331,...}
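The shift-and-add form of the recurrence can be checked with a short Python sketch. It assumes a multiplier of the form a = 2^r + 1 (here a1 = 65, so r = 6); the function names are hypothetical and the code is a software model, not the hardware datapath.

```python
def lcg_step_shift_add(x: int, r: int, b: int, n_bits: int = 32) -> int:
    """One LCG step for a multiplier of the form a = 2**r + 1."""
    mask = (1 << n_bits) - 1
    # a * x + b == (x << r) + x + b: three operands and no multiplier.
    return ((x << r) + x + b) & mask


def lcg_sequence(x0: int, r: int, b: int, count: int, n_bits: int = 32) -> list:
    """Return the first `count` values x1, x2, ... starting from seed x0."""
    values = []
    x = x0
    for _ in range(count):
        x = lcg_step_shift_add(x, r, b, n_bits)
        values.append(x)
    return values


if __name__ == "__main__":
    # Seed and constants taken from the example above (x0 = 5183, a1 = 65, b1 = 117).
    print(lcg_sequence(5183, 6, 117, 4))
    # prints [337012, 21905897, 1423883422, 2358109331]
```

The masking with 2^n - 1 plays the role of the modulo-2^n reduction, which in hardware amounts to discarding the carry out of the most significant bit.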

In the same way, other sequences such as Yi , Pi and Qi are also computed as follows,

Yi = {1496868281, 1923524246, 474752883, 640215120,...}

Pi = {162963146, 1940099641, 2898752936, 606226711,...}

Qi = {484450037, 1003823370, 1558127391, 2147362100,...}

The sequences Bi and Ci in the MDCLCG architecture are generated by comparing Xi with
Yi and Pi with Qi respectively using the magnitude comparator as follows,

Bi = {0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0,...}

Ci = {0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,...}

The pseudorandom bit sequence Zi is obtained as Bi ⊕ Ci, as shown below,

Zi = {0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1,...}
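The whole example can be reproduced with a small software model. The sketch below is an illustration under assumptions: the function names are hypothetical, and the comparator is taken to output 1 when the first input is strictly greater than the second, which is consistent with the listed Bi and Ci values but is not stated explicitly in the text.

```python
from itertools import islice


def lcg(seed: int, a: int, b: int, n_bits: int = 32):
    """Yield x1, x2, ... of x(i) = (a * x(i-1) + b) mod 2**n_bits."""
    mask = (1 << n_bits) - 1
    x = seed
    while True:
        x = (a * x + b) & mask
        yield x


def mdclcg_bits(seeds, multipliers, increments, count: int, n_bits: int = 32) -> list:
    """Return the first `count` output bits Zi = Bi XOR Ci."""
    gens = [lcg(s, a, b, n_bits) for s, a, b in zip(seeds, multipliers, increments)]
    bits = []
    for x, y, p, q in islice(zip(*gens), count):
        b_i = int(x > y)   # assumed comparator convention: 1 when Xi > Yi
        c_i = int(p > q)   # assumed comparator convention: 1 when Pi > Qi
        bits.append(b_i ^ c_i)
    return bits


if __name__ == "__main__":
    seeds = (5183, 91356, 39771, 7392)          # (x0, y0, p0, q0)
    multipliers = (65, 16385, 4097, 65537)      # (a1, a2, a3, a4)
    increments = (117, 221, 21359, 533)         # (b1, b2, b3, b4)
    print(mdclcg_bits(seeds, multipliers, increments, 4))  # prints [0, 1, 0, 1]
```

The four LCGs run independently; only the two comparators and the final XOR combine their outputs.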

6.5 Simulation Waveforms of MDCLCG using Proposed Adder

Fig-6.3: Simulation Waveforms of MDCLCG

CHAPTER-7
RESULTS
7.1 Simulation Waveforms

Fig – 7.1: CS3A Simulation Waveforms

Fig – 7.2: HC3A Simulation Waveforms

Fig – 7.3: Proposed Adder Simulation Waveforms

7.2 Comparing Delay and Area of Different Adders

Adder             Delay (ns)    Area (LUTs)
CS3A              15.021        66
HC3A              13.055        192
Proposed Adder    9.345         128

Fig 7.4: Comparing Delay and Area of Different Adders

7.3 Results of MDCLCG

MDCLCG using      Delay (ns)    Area (LUTs)
CS3A              89.170        3413
HC3A              80.395        5333
Proposed Adder    71.163        4533

Fig-7.5: Results of MDCLCG using Different Adders
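The relative improvements implied by the two tables can be derived with a few lines of Python. This is only an arithmetic sketch over the values reported in Sections 7.2 and 7.3; the dictionary and function names are hypothetical.

```python
def percent_reduction(baseline: float, value: float) -> float:
    """Reduction of `value` relative to `baseline`, in percent (negative = increase)."""
    return 100.0 * (baseline - value) / baseline


# (delay in ns, area in LUTs), copied from the tables in Sections 7.2 and 7.3
ADDERS = {"CS3A": (15.021, 66), "HC3A": (13.055, 192), "Proposed": (9.345, 128)}
MDCLCG = {"CS3A": (89.170, 3413), "HC3A": (80.395, 5333), "Proposed": (71.163, 4533)}


def compare_to_proposed(results: dict) -> dict:
    """Map each baseline to (delay reduction %, area reduction %) of the proposed design."""
    prop_delay, prop_area = results["Proposed"]
    return {
        name: (percent_reduction(delay, prop_delay), percent_reduction(area, prop_area))
        for name, (delay, area) in results.items()
        if name != "Proposed"
    }


if __name__ == "__main__":
    for label, table in (("Adder", ADDERS), ("MDCLCG", MDCLCG)):
        for name, (d, a) in compare_to_proposed(table).items():
            print(f"{label} vs {name}: delay {d:.1f}% lower, area {a:.1f}% lower")
```

With these figures, the proposed adder is about 37.8% faster than CS3A and about 28.4% faster than HC3A, while using about 33.3% fewer LUTs than HC3A; a negative area value against CS3A indicates that the proposed adder uses more LUTs than CS3A.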

CHAPTER-8
APPENDIX
8.1 CODES:

8.1.1 CS3A:

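module CS3A_16bit(
    input  [15:0] a,
    input  [15:0] b,
    input  [15:0] c,
    input         cin,
    output [16:0] sum,
    output        cout
    );
    wire [15:0] M;   // bitwise sums of the carry-save stage
    wire [15:0] L;   // bitwise carries of the carry-save stage
    wire [15:0] k;   // ripple carries of the final addition
    genvar i;
    // Stage 1: carry-save stage, one full adder per bit position
    generate for (i = 0; i < 16; i = i + 1) begin : Full_Adder1
        FA stage1(.a(a[i]), .b(b[i]), .c(c[i]), .sum(M[i]), .carry(L[i]));
    end
    endgenerate
    // Stage 2: ripple-carry addition of M, L (shifted left by one bit) and cin
    HA label1(.a(cin), .b(M[0]), .sum(sum[0]), .carry(k[0]));
    genvar j;
    generate for (j = 1; j < 16; j = j + 1) begin : Full_adder2
        FA stage2(.a(L[j-1]), .b(M[j]), .c(k[j-1]), .sum(sum[j]), .carry(k[j]));
    end
    endgenerate
    HA label2(.a(k[15]), .b(L[15]), .sum(sum[16]), .carry(cout));
endmodule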

// FULL ADDER
module FA(
    input  a,
    input  b,
    input  c,
    output sum,
    output carry
    );
    assign sum   = a ^ b ^ c;
    assign carry = (a & b) | (b & c) | (c & a);
endmodule

// HALF ADDER
module HA(
    input  a,
    input  b,
    output sum,
    output carry
    );
    assign sum   = (a ^ b);
    assign carry = (a & b);
endmodule
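As a cross-check of the CS3A listing, the following Python sketch mirrors its structure bit by bit (carry-save stage followed by a ripple-carry stage) and compares the result with ordinary integer addition. It is a software reference model with hypothetical function names, not part of the synthesized design.

```python
def full_adder(a: int, b: int, c: int):
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (c & a)


def cs3a(a: int, b: int, c: int, cin: int = 0, width: int = 16) -> int:
    """Bit-level model of the carry-save three-operand adder; returns {cout, sum}."""
    m, l = [], []
    for i in range(width):                       # stage 1: carry-save full adders
        s, cy = full_adder((a >> i) & 1, (b >> i) & 1, (c >> i) & 1)
        m.append(s)
        l.append(cy)
    s, k = full_adder(cin, m[0], 0)              # half adder on bit 0
    result = s
    for j in range(1, width):                    # stage 2: ripple-carry chain
        s, k = full_adder(l[j - 1], m[j], k)
        result |= s << j
    s, cout = full_adder(k, l[width - 1], 0)     # half adder producing sum[width]
    result |= s << width
    result |= cout << (width + 1)
    return result


if __name__ == "__main__":
    print(cs3a(0xFFFF, 0xFFFF, 0xFFFF, 1))  # prints 196606
```

Because the model returns the 17-bit sum together with cout as one integer, it can be compared directly with a + b + c + cin.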

8.1.2 HC3A:
module HC3A_16bit(
    input  [15:0] a,
    input  [15:0] b,
    input  [15:0] c,
    input         cin,
    output [15:0] sum,
    output        cout
    );
    wire [15:0] s;
    wire        c1;   // carry out of bit 15 of a + b + cin
    // Stage 1: s = (a + b + cin) mod 2^16
    HCA2_code stage1(.a(a), .b(b), .cin(cin), .sum(s), .cout(c1));
    // Stage 2: sum = (s + c) mod 2^16. The carry-in is tied low because c1 has
    // weight 2^16 and must not be added at bit 0; cout reports only the carry
    // out of this second addition.
    HCA2_code stage2(.a(s), .b(c), .cin(1'b0), .sum(sum), .cout(cout));
endmodule
// 2 Operand Han Carlson Adder
module HCA2_code(
    input  [15:0] a,
    input  [15:0] b,
    input         cin,
    output [15:0] sum,
    output        cout
    );
    wire [15:0] p;
    wire [15:0] g;
    wire [15:0] gr;
    wire [15:0] bg1, bp1, bg2, bp2, bg3, bp3;
    wire        g0;
    genvar i;
    // Pre-processing: bitwise propagate (p) and generate (g)
    generate for (i = 0; i < 16; i = i + 1) begin : HA_stage_1
        HA stage1(.a(a[i]), .b(b[i]), .sum(p[i]), .carry(g[i]));
    end
    endgenerate
    // Fold cin into the bit-0 generate so that it reaches the upper bits
    assign g0 = g[0] | (p[0] & cin);
    // stage 1
    gray_cell label1(.a(p[1]), .b(g0), .c(g[1]), .g(gr[1]));
    generate for (i = 3; i < 16; i = i + 2) begin : stage1
        black_cell label2(.a(p[i]), .b(g[i-1]), .c(g[i]), .r(p[i]), .s(p[i-1]),
                          .g(bg1[i]), .p(bp1[i]));
    end
    endgenerate
    // stage 2
    gray_cell label3(.a(bp1[3]), .b(gr[1]), .c(bg1[3]), .g(gr[3]));
    generate for (i = 5; i < 16; i = i + 2) begin : stage2
        black_cell label4(.a(bp1[i]), .b(bg1[i-2]), .c(bg1[i]), .r(bp1[i]), .s(bp1[i-2]),
                          .g(bg2[i]), .p(bp2[i]));
    end
    endgenerate
    // stage 3
    gray_cell label5(.a(bp2[5]), .b(gr[1]), .c(bg2[5]), .g(gr[5]));
    gray_cell label6(.a(bp2[7]), .b(gr[3]), .c(bg2[7]), .g(gr[7]));
    generate for (i = 9; i < 16; i = i + 2) begin : stage3
        black_cell label7(.a(bp2[i]), .b(bg2[i-4]), .c(bg2[i]), .r(bp2[i]), .s(bp2[i-4]),
                          .g(bg3[i]), .p(bp3[i]));
    end
    endgenerate
    // stage 4 (input c of label8 is the group generate bg3[9])
    gray_cell label8(.a(bp3[9]), .b(gr[1]), .c(bg3[9]), .g(gr[9]));
    gray_cell label9(.a(bp3[11]), .b(gr[3]), .c(bg3[11]), .g(gr[11]));
    gray_cell label10(.a(bp3[13]), .b(gr[5]), .c(bg3[13]), .g(gr[13]));
    gray_cell label11(.a(bp3[15]), .b(gr[7]), .c(bg3[15]), .g(gr[15]));
    // Carries of the even bit positions
    generate for (i = 2; i < 16; i = i + 2) begin : even_stage
        gray_cell label12(.a(p[i]), .b(gr[i-1]), .c(g[i]), .g(gr[i]));
    end
    endgenerate
    // Sum logic
    assign sum[0] = p[0] ^ cin;
    assign sum[1] = p[1] ^ g0;
    generate for (i = 2; i < 16; i = i + 1) begin : assignment
        assign sum[i] = p[i] ^ gr[i-1];
    end
    endgenerate
    assign cout = gr[15];
endmodule
// HALF ADDER
module HA(
    input  a,
    input  b,
    output sum,
    output carry
    );
    assign sum   = (a ^ b);
    assign carry = (a & b);
endmodule

// GRAY CELL
module gray_cell(
    input  a,
    input  b,
    input  c,
    output g
    );
    assign g = (a & b) | c;
endmodule

//BLACK CELL
module black_cell(
input a,
input b,
input c,
input r,
input s,
output g,
output p
);
assign g=(a & b) | c;
assign p=(r & s);
endmodule

8.1.3 Proposed Adder:


module proposed_adder16(
    input  [15:0] a,
    input  [15:0] b,
    input  [15:0] c,
    input         cin,
    output [15:0] sum,
    output        cout
    );
    wire [15:0] cy;
    wire [15:0] s, p, g;
    wire        x;
    // The external carry-in is already absorbed by the base logic at bit 0,
    // so the prefix stage receives no additional carry-in.
    assign x = 1'b0;
    genvar i;
    // stage 1 ---> bit addition logic
    generate for (i = 0; i < 16; i = i + 1) begin : stage1
        FA bit_addition_logic(.a(a[i]), .b(b[i]), .c(c[i]), .sum(s[i]), .carry(cy[i]));
    end
    endgenerate
    // stage 2 ---> base logic
    base l1(.s(s[0]), .cy(cin), .p(p[0]), .g(g[0]));
    generate for (i = 1; i < 16; i = i + 1) begin : stage2
        base base_logic(.s(s[i]), .cy(cy[i-1]), .p(p[i]), .g(g[i]));
    end
    endgenerate
    // stage 3 ---> propagate and generate logic
    // cy[15] has weight 2^16 and is not part of the 16-bit sum; cout is the
    // carry out of the prefix stage only.
    stage3_proposal pg_logic(.p(p), .g(g), .cin(x), .sum(sum), .cout(cout));
endmodule

// FULL ADDER
module FA(
    input  a,
    input  b,
    input  c,
    output sum,
    output carry
    );
    assign sum   = a ^ b ^ c;
    assign carry = (a & b) | (b & c) | (c & a);
endmodule

// BASE LOGIC
module base(
input s,
input cy,
output p,
output g
);
assign p=s ^ cy;
assign g=s & cy;
endmodule

// PROPAGATE AND GENERATE LOGIC
module stage3_proposal(
    input  [15:0] p,
    input  [15:0] g,
    input         cin,
    output [15:0] sum,
    output        cout
    );
    wire [15:0] gr;
    wire [15:0] bg1, bp1, bg2, bp2, bg3, bp3;
    wire        g0;
    genvar i;
    // Fold cin into the bit-0 generate so that it reaches the upper bits
    assign g0 = g[0] | (p[0] & cin);
    // stage 1
    gray_cell label1(.a(p[1]), .b(g0), .c(g[1]), .g(gr[1]));
    generate for (i = 3; i < 16; i = i + 2) begin : stage1
        black_cell label2(.a(p[i]), .b(g[i-1]), .c(g[i]), .r(p[i]), .s(p[i-1]),
                          .g(bg1[i]), .p(bp1[i]));
    end
    endgenerate
    // stage 2
    gray_cell label3(.a(bp1[3]), .b(gr[1]), .c(bg1[3]), .g(gr[3]));
    generate for (i = 5; i < 16; i = i + 2) begin : stage2
        black_cell label4(.a(bp1[i]), .b(bg1[i-2]), .c(bg1[i]), .r(bp1[i]), .s(bp1[i-2]),
                          .g(bg2[i]), .p(bp2[i]));
    end
    endgenerate
    // stage 3
    gray_cell label5(.a(bp2[5]), .b(gr[1]), .c(bg2[5]), .g(gr[5]));
    gray_cell label6(.a(bp2[7]), .b(gr[3]), .c(bg2[7]), .g(gr[7]));
    generate for (i = 9; i < 16; i = i + 2) begin : stage3
        black_cell label7(.a(bp2[i]), .b(bg2[i-4]), .c(bg2[i]), .r(bp2[i]), .s(bp2[i-4]),
                          .g(bg3[i]), .p(bp3[i]));
    end
    endgenerate
    // stage 4 (input c of label8 is the group generate bg3[9])
    gray_cell label8(.a(bp3[9]), .b(gr[1]), .c(bg3[9]), .g(gr[9]));
    gray_cell label9(.a(bp3[11]), .b(gr[3]), .c(bg3[11]), .g(gr[11]));
    gray_cell label10(.a(bp3[13]), .b(gr[5]), .c(bg3[13]), .g(gr[13]));
    gray_cell label11(.a(bp3[15]), .b(gr[7]), .c(bg3[15]), .g(gr[15]));
    // Carries of the even bit positions
    generate for (i = 2; i < 16; i = i + 2) begin : even_stage
        gray_cell label12(.a(p[i]), .b(gr[i-1]), .c(g[i]), .g(gr[i]));
    end
    endgenerate
    // Sum logic
    assign sum[0] = p[0] ^ cin;
    assign sum[1] = p[1] ^ g0;
    generate for (i = 2; i < 16; i = i + 1) begin : assignment
        assign sum[i] = p[i] ^ gr[i-1];
    end
    endgenerate
    assign cout = gr[15];
endmodule

// GRAY CELL
module gray_cell(
    input  a,
    input  b,
    input  c,
    output g
    );
    assign g = (a & b) | c;
endmodule

// BLACK CELL
module black_cell(
    input  a,
    input  b,
    input  c,
    input  r,
    input  s,
    output g,
    output p
    );
    assign g = (a & b) | c;
    assign p = (r & s);
endmodule
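The stages of the proposed adder (bit-addition logic, base logic, propagate-and-generate logic, sum logic) can also be modelled in software. The Python sketch below is a reference model under assumptions: the function names are hypothetical, the result is the 16-bit sum modulo 2^16, and the prefix step uses a generic Kogge-Stone-style doubling loop rather than the exact Han-Carlson wiring of the Verilog listing, since any correct prefix network yields the same carries.

```python
def full_adder(a: int, b: int, c: int):
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (c & a)


def prefix_carries(p: list, g: list) -> list:
    """Return the group generates G[i:0] using a Kogge-Stone-style prefix scan."""
    big_g, big_p = list(g), list(p)
    d = 1
    while d < len(big_g):
        # Walk downwards so that index i - d still holds the previous level's value.
        for i in range(len(big_g) - 1, d - 1, -1):
            big_g[i] |= big_p[i] & big_g[i - d]
            big_p[i] &= big_p[i - d]
        d *= 2
    return big_g


def three_operand_add(a: int, b: int, c: int, cin: int = 0, width: int = 16) -> int:
    """Return (a + b + c + cin) mod 2**width using the staged structure."""
    s, cy = [], []
    for i in range(width):                       # stage 1: bit-addition logic
        s_i, cy_i = full_adder((a >> i) & 1, (b >> i) & 1, (c >> i) & 1)
        s.append(s_i)
        cy.append(cy_i)
    shifted = [cin] + cy[:-1]                    # carry of bit i-1 aligns with bit i
    p = [s[i] ^ shifted[i] for i in range(width)]   # stage 2: base logic
    g = [s[i] & shifted[i] for i in range(width)]
    carries = prefix_carries(p, g)               # stage 3: propagate/generate logic
    total = p[0]                                 # stage 4: sum logic
    for i in range(1, width):
        total |= (p[i] ^ carries[i - 1]) << i
    return total


if __name__ == "__main__":
    print(hex(three_operand_add(0xFFFF, 0xFFFF, 0xFFFF, 1)))  # prints 0xfffe
```

Comparing this model against (a + b + c + cin) mod 2^16 for a set of test vectors gives expected values for the simulation waveforms.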

8.1.4 Code for MDCLCG:


module MDCLCG_Proposed_Adder(
    input  [15:0] a1,
    input  [15:0] a2,
    input  [15:0] a3,
    input  [15:0] a4,
    input  [15:0] x0,
    input  [15:0] y0,
    input  [15:0] p0,
    input  [15:0] q0,
    input  [15:0] b1,
    input  [15:0] b2,
    input  [15:0] b3,
    input  [15:0] b4,
    input  [15:0] m,
    output [5:1]  Z
    );
    wire [5:1] B, C;
    // B compares the X and Y sequences, C compares the P and Q sequences
    CLCG_Proposed m1(.a1(a1), .a2(a2), .x0(x0), .y0(y0), .b1(b1), .b2(b2), .m(m), .B(B));
    CLCG_Proposed m2(.a1(a3), .a2(a4), .x0(p0), .y0(q0), .b1(b3), .b2(b4), .m(m), .B(C));
    // Pseudorandom output bits: Zi = Bi XOR Ci (Section 6.4)
    assign Z = B ^ C;
endmodule

module CLCG_Proposed(
    input  [15:0] a1,
    input  [15:0] a2,
    input  [15:0] b1,
    input  [15:0] b2,
    input  [15:0] x0,
    input  [15:0] y0,
    input  [15:0] m,
    // output [15:0] x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,
    output [5:1]  B
    );
    wire [15:0] x1, x2, x3, x4, x5, y1, y2, y3, y4, y5;
    LCG_Proposed_Adder l1(.a(a1), .x0(x0), .b(b1), .m(m), .x1(x1), .x2(x2), .x3(x3), .x4(x4), .x5(x5));
    LCG_Proposed_Adder l2(.a(a2), .x0(y0), .b(b2), .m(m), .x1(y1), .x2(y2), .x3(y3), .x4(y4), .x5(y5));
    // Magnitude comparators
    assign B[1] = x1 >= y1;
    assign B[2] = x2 >= y2;
    assign B[3] = x3 >= y3;
    assign B[4] = x4 >= y4;
    assign B[5] = x5 >= y5;
endmodule
module LCG_Proposed_Adder(
    input  [15:0] a,
    input  [15:0] x0,
    input  [15:0] b,
    input  [15:0] m,
    output [15:0] x1, x2, x3, x4, x5
    );
    wire [15:0] k0, k1, k2, k3, k4, c, sum0, sum1, sum2, sum3, sum4;
    wire        c1, c2, c3, c4, c5;   // one carry-out per adder instance
    wire        cin;
    assign cin = 1'b0;
    assign c   = 16'd0;               // third operand is held at zero
    // x(i) = (a * x(i-1) + b) mod 2^m, unrolled for five iterations
    assign k0 = (a * x0);
    proposed_adder16 l1(.a(k0), .b(b), .c(c), .cin(cin), .sum(sum0), .cout(c1));
    assign x1 = sum0 % (2 ** m);
    assign k1 = (a * x1);
    proposed_adder16 l2(.a(k1), .b(b), .c(c), .cin(cin), .sum(sum1), .cout(c2));
    assign x2 = sum1 % (2 ** m);
    assign k2 = (a * x2);
    proposed_adder16 l3(.a(k2), .b(b), .c(c), .cin(cin), .sum(sum2), .cout(c3));
    assign x3 = sum2 % (2 ** m);
    assign k3 = (a * x3);
    proposed_adder16 l4(.a(k3), .b(b), .c(c), .cin(cin), .sum(sum3), .cout(c4));
    assign x4 = sum3 % (2 ** m);
    assign k4 = (a * x4);
    proposed_adder16 l5(.a(k4), .b(b), .c(c), .cin(cin), .sum(sum4), .cout(c5));
    assign x5 = sum4 % (2 ** m);
endmodule

8.2. REFERENCES:

1. M. M. Islam, M. S. Hossain, M. K. Hasan, M. Shahjalal, and Y. M. Jang, “FPGA
implementation of high-speed area-efficient processor for elliptic curve point multiplication over
prime field,” IEEE Access, vol. 7, pp. 178811–178826, 2019.

2. Z. Liu, J. Großschädl, Z. Hu, K. Järvinen, H. Wang, and I. Verbauwhede, “Elliptic curve
cryptography with efficiently computable endomorphisms and its hardware implementations for
the Internet of Things,” IEEE Trans. Comput., vol. 66, no. 5, pp. 773–785, May 2017.

3. Z. Liu, D. Liu, and X. Zou, “An efficient and flexible hardware implementation of the dual-
field elliptic curve cryptographic processor,” IEEE Trans. Ind. Electron., vol. 64, no. 3, pp.
2353–2362, Mar. 2017.

4. B. Parhami, Computer Arithmetic: Algorithms and Hardware Design. New York, NY, USA:
Oxford Univ. Press, 2000.

5. P. L. Montgomery, “Modular multiplication without trial division,” Math. Comput., vol. 44,
no. 170, pp. 519–521, Apr. 1985.

6. S.-R. Kuang, K.-Y. Wu, and R.-Y. Lu, “Low-cost high-performance VLSI architecture for
Montgomery modular multiplication,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol.
24, no. 2, pp. 434–443, Feb. 2016.

7. S.-R. Kuang, J.-P. Wang, K.-C. Chang, and H.-W. Hsu, “Energy-efficient high-throughput
Montgomery modular multipliers for RSA cryptosystems,” IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 21, no. 11, pp. 1999–2009, Nov. 2013.

8. S. S. Erdem, T. Yanik, and A. Celebi, “A general digit-serial architecture for Montgomery
modular multiplication,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 25, no. 5, pp.
1658–1668, May 2017.

9. R. S. Katti and S. K. Srinivasan, “Efficient hardware implementation of a new pseudo-
random bit sequence generator,” in Proc. IEEE Int. Symp. Circuits Syst., Taipei, Taiwan, May
2009, pp. 1393–1396.

10. A. K. Panda and K. C. Ray, “Modified dual-CLCG method and its VLSI architecture for
pseudorandom bit generation,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 66, no. 3, pp. 989–
1002, Mar. 2019.

11. A. Kumar Panda and K. Chandra Ray, “A coupled variable input LCG method and its VLSI
architecture for pseudorandom bit generation,” IEEE Trans. Instrum. Meas., vol. 69, no. 4, pp.
1011–1019, Apr. 2020.

12. N. Weste and K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective.
Reading, MA, USA: Addison-Wesley, 1985.

13. T. Kim, W. Jao, and S. Tjiang, “Circuit optimization using carry-save-adder cells,” IEEE
Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 17, no. 10, pp. 974–984, Oct. 1998.

14. A. Rezai and P. Keshavarzi, “High-throughput modular multiplication and exponentiation
algorithms using multibit-scan–multibit-shift technique,” IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 23, no. 9, pp. 1710–1719, Sep. 2015.

15. A. K. Panda and K. C. Ray, “Design and FPGA prototype of 1024-bit Blum-Blum-Shub
PRBG architecture,” in Proc. IEEE Int. Conf. Inf. Commun. Signal Process. (ICICSP),
Singapore, Sep. 2018, pp. 38–43.
