1 Introduction
1.1 Preamble
The National Institute of Standards and Technology (NIST), U.S. Department of
Commerce, invited proposals in 1997 from research and academic groups for
developing a new symmetric-key encryption standard. The evaluation criteria were
the security, cost and implementation characteristics of the algorithm. Security was
the most important criterion; it encompassed features such as the resistance of the
algorithm to cryptanalysis, the soundness of its mathematical basis and the randomness
of the algorithm output [1] (James Nechvatal, 2000). Cost was the second most
important criterion and encompassed licensing requirements, computational efficiency
on various platforms, memory requirements and hardware implementations. The third
criterion was implementation characteristics such as flexibility, hardware and
software suitability, and algorithm simplicity. After reviewing the results of the
preliminary research and analysis by the cryptographic research community, NIST
decided to propose the Rijndael algorithm, developed by Joan Daemen and Vincent
Rijmen, as the Advanced Encryption Standard (AES). In that evaluation the Rijndael
algorithm demonstrated the best performance on both hardware and software platforms.
It also had the shortest encryption/decryption time and is known to be resistant to
all known linear and differential cryptanalysis attacks.
(Akashi Satoh, 2001). The hardware implementations of AES can be optimized for
speed, size and power consumption. Two major targeted platforms for AES
implementations are Field Programmable Gate Arrays (FPGA) and Application Specific
Integrated Circuits (ASIC). There are many possible architectural options for the
hardware design of AES on FPGAs and ASICs. These options include internal
pipelining, external pipelining, rolling and loop unrolling. The selection among
these options depends on the speed/area trade-off required by the particular
application of the AES algorithm.
like USB pen drives, inductively powered RF identification (RFID), smart cards and
wireless sensor networks (WSNs). The optimization methodology may include
resource sharing between the encryptor and decryptor units and on-the-fly computation
to reduce area. Computational complexity can be reduced through the use of look-up
tables (LUTs), but at the cost of more memory. Duplicating and pipelining the
hardware required for the round units can achieve higher speed, while folding the
architecture can achieve a smaller implementation area.
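As a rough illustration of the speed side of this trade-off, the sketch below estimates steady-state throughput from the clock frequency and the number of cycles spent per 128-bit block. It assumes, for illustration only, that a rolled design reuses a single round unit and spends one clock cycle per round, and that a fully pipelined design delivers one block per cycle once the pipeline is full. The function names and the 100 MHz clock are hypothetical and are not results of this work.

```python
# Number of rounds per AES key size (10, 12 and 14 rounds respectively).
AES_ROUNDS = {128: 10, 192: 12, 256: 14}
BLOCK_BITS = 128  # AES always processes 128-bit blocks


def throughput_mbps(f_clk_mhz, cycles_per_block):
    """Steady-state throughput in Mbit/s for a given clock and cycle count."""
    return BLOCK_BITS * f_clk_mhz / cycles_per_block


def rolled_throughput_mbps(f_clk_mhz, key_bits):
    # Assumption: one shared round unit, one round per clock cycle.
    return throughput_mbps(f_clk_mhz, AES_ROUNDS[key_bits])


def pipelined_throughput_mbps(f_clk_mhz):
    # Assumption: one round unit per stage, one block completes every cycle
    # once the pipeline has filled.
    return throughput_mbps(f_clk_mhz, 1)


if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(bits, round(rolled_throughput_mbps(100, bits), 1))
    print("pipelined", pipelined_throughput_mbps(100))
```

Under these assumptions the pipelined estimate is higher by a factor equal to the number of rounds, which is paid for by replicating the round hardware for every stage.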
The motivation for our research is to develop architectures that offer high throughput
and low area with minimal trade-off between the two, so that the cryptographic
hardware can be integrated as an IP core into application chipsets. This allows
data to be encrypted right at its point of origin.
1.3 Objectives
Detailed objectives are set to achieve the above-mentioned principal objectives and to
define a methodology for pursuing them. They are as follows:
1. To develop on-the-fly computation for a memoryless architecture, for low-area
and low-power implementation of the AES sub-processes listed below:
a. Substitute Byte operation
b. MixColumn operation
c. Key expansion
2. To reduce the memory requirement of the Look-Up-Table (LUT) approach to
implementing SubstituteByte, MixColumn and Key Expansion for
encryption and decryption.
3. To develop a combined architecture for encryption and decryption, sharing
maximum resources between them.
4. To design and verify a higher-throughput rolled architecture for all key sizes,
sharing maximum hardware resources between encryption and decryption.
5. To design and verify a higher-throughput pipelined architecture for all key
sizes, sharing maximum hardware resources between encryption and
decryption.
6. To develop a combined systolic architecture for the encryption and decryption
data path.
7. To develop a systolic architecture for the Key Expansion unit.
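The difference between the on-the-fly and Look-Up-Table approaches named in objectives 1 and 2 can be illustrated in software for the Substitute Byte operation. The following is a minimal sketch, not the architecture developed in this work: it computes the AES S-box value of a byte directly (multiplicative inverse in GF(2^8) followed by the affine transformation) and, for contrast, precomputes the same values into a 256-entry table. All function names are hypothetical, and the inverse is obtained by repeated multiplication purely for brevity.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B  # reduce by the AES polynomial
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254; maps 0 to 0 as AES requires."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_otf(byte):
    """On-the-fly S-box: field inverse followed by the affine transformation."""
    inv = gf_inv(byte)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


# LUT approach: the same function stored as a 256-byte table.
SBOX_LUT = [sbox_otf(x) for x in range(256)]

if __name__ == "__main__":
    print(hex(sbox_otf(0x53)), hex(SBOX_LUT[0x53]))
```

The table lookup costs a single memory access per byte but needs 256 bytes of storage per S-box instance, whereas the computed version needs no storage but more logic per evaluation.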
The research is organized primarily in two directions: one investigates high-speed
architectures and the other investigates minimum-area architectures for
implementing AES. The AES architecture has two concurrent data paths, namely the
Encryption/Decryption (ED) data path and the Key Expansion (KE) data path. The round
key generated in the KE data path is required in the corresponding round of the ED
data path. Architectural transformations are employed to achieve the two objectives,
high throughput and low area. In addition to the concurrency between the ED and KE
data paths, the design must account for the concurrency and scheduling of the
sub-processes within each data path when AES is implemented on hardware platforms.
Our investigations are divided into two stages. The first stage is the design of the
architecture itself, supporting either a high-throughput or a low-area implementation.
The second stage is layout and physical design using 180 nm standard cell libraries,
followed by computation of the throughput, area and power consumption of the
implementations.
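The dependency between the KE and ED data paths can be illustrated in software. The sketch below, restricted to 128-bit keys, derives each round key from the previous one at the moment it is needed instead of storing the whole key schedule, which mirrors the idea of a KE path running alongside the rounds. It is an illustration under stated assumptions rather than the KE unit designed in this work: the helper names are hypothetical, and the S-box table is generated at start-up only to keep the example short.

```python
def _gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a = ((a << 1) ^ 0x1B) & 0xFF if a & 0x80 else a << 1
        b >>= 1
    return result


def _sbox(byte):
    """AES S-box value: field inverse (a^254) then affine transformation."""
    inv = 1
    for _ in range(254):
        inv = _gf_mul(inv, byte)
    out = inv
    for n in range(1, 5):
        out ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
    return out ^ 0x63


SBOX = [_sbox(x) for x in range(256)]


def next_round_key(prev_key, rcon):
    """Derive the next 16-byte round key and the next round constant."""
    w = [list(prev_key[4 * i:4 * i + 4]) for i in range(4)]
    # RotWord and SubWord on the last word, then add the round constant.
    temp = [SBOX[b] for b in w[3][1:] + w[3][:1]]
    temp[0] ^= rcon
    out = []
    for i in range(4):
        temp = [x ^ y for x, y in zip(w[i], temp)]
        out.extend(temp)
    return bytes(out), _gf_mul(rcon, 2)


def round_keys_128(key):
    """Yield the 11 round keys of AES-128 one at a time."""
    round_key, rcon = bytes(key), 0x01
    yield round_key
    for _ in range(10):
        round_key, rcon = next_round_key(round_key, rcon)
        yield round_key


if __name__ == "__main__":
    key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
    for i, rk in enumerate(round_keys_128(key)):
        print(i, rk.hex())
```

Because only the previous round key and the round constant are kept between steps, the storage requirement is one round key rather than eleven.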
The architectures designed are first implemented on a Virtex-4 FPGA, prior to physical
layout. An oversized device has been selected intentionally, so that the logic and
routing resource constraints of the device do not affect the functional verification of
the design. A timing simulation verifies the functionality of the design after
synthesis and after technology mapping. The physical layout is done in 180 nm
technology using Taiwan Semiconductor Manufacturing Company (TSMC) standard
cell libraries.
The chart in Figure 1.1 shows the development stages of the investigation
carried out to meet the goals. The memory-based design employs LUTs, requires
minimal computation and is less complex to implement in hardware, whereas the
on-the-fly (OTF) computation methodology requires no memory blocks but is more
complex to implement in hardware. Either the LUT or the OTF design strategy can be
optimized further: for low area by using a rolled architecture, and for high
throughput by using a pipelined architecture. The upper part of the chart illustrates
the two strategies, whereas the lower part gives the options for the performance of
the design in terms of throughput and area. The systolic array architecture is a
midway solution and can also be regarded as an optimal balance between the low-area
and high-throughput implementations.
[Figure 1.1: Development stages of the investigation. AES Architecture branches into
Low Area Implementations (Rolled Architecture) and High Speed Implementations
(Pipelined Architecture), with Optimal Implementations (Systolic Architecture)
between them.]
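The contrast between the LUT and OTF strategies can be made concrete for the MixColumn operation. The sketch below is an illustration only, with hypothetical function names: the OTF version multiplies by 2 in GF(2^8) using a shift and a conditional XOR, while the LUT version reads the same product from a 256-entry table. Multiplication by 3 is formed in both cases as the product by 2 XORed with the original byte.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8) using a shift and a conditional XOR."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B  # reduce by x^8 + x^4 + x^3 + x + 1
    return a & 0xFF


# LUT strategy: the products by 2 are stored instead of being computed.
MUL2_LUT = [xtime(x) for x in range(256)]


def _mix(col, mul2):
    """Apply the MixColumn matrix (2 3 1 1 / 1 2 3 1 / 1 1 2 3 / 3 1 1 2)."""
    a0, a1, a2, a3 = col
    return [
        mul2(a0) ^ (mul2(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ mul2(a1) ^ (mul2(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ mul2(a2) ^ (mul2(a3) ^ a3),
        (mul2(a0) ^ a0) ^ a1 ^ a2 ^ mul2(a3),
    ]


def mix_column_otf(col):
    """MixColumn on one 4-byte column with on-the-fly multiplication."""
    return _mix(col, xtime)


def mix_column_lut(col):
    """MixColumn on one 4-byte column with table-based multiplication."""
    return _mix(col, MUL2_LUT.__getitem__)


if __name__ == "__main__":
    column = [0xDB, 0x13, 0x53, 0x45]
    print([hex(b) for b in mix_column_otf(column)])
    print([hex(b) for b in mix_column_lut(column)])
```

Both functions give the same result for every column; they differ only in whether the field multiplication is stored or recomputed, which is the memory versus logic choice described above.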
All three architectures (rolled, pipelined and systolic) are developed and first
verified in Xilinx ISE 9.1 on Virtex-4 FPGA devices. All simulations (functional,
post-synthesis and post-routing) are performed using ModelSim. The verified
architecture is then synthesized with Cadence's RTL Compiler 12.1, using 180 nm TSMC
standard cell libraries. The first estimate of the clock frequency is obtained from
the post-synthesis netlist. The design is then optimized to minimize the hardware
used in the post-synthesis design, on the basis of the synthesis report. Sufficient
timing slack is allowed when writing the synthesis constraints, to accommodate
eventualities that may arise during physical layout. The final post-synthesis netlist
is taken to physical layout, which is performed using Cadence's SoC Encounter 12.1.
Multiple iterations of clock synthesis and routing are carried out to determine the
maximum clock frequency attainable by the design.
The thesis is organized as follows. Chapter 2 presents the literature survey
and highlights the work of various researchers in the field of hardware
implementation of AES. Chapter 3 describes the AES algorithm, its specifications and
key expansion. Chapter 4 covers the Look-Up-Table and On-The-Fly computation
approaches used for implementing AES. The design of the rolled, pipelined and
systolic architectures, their performance, comparison and conclusions are described
in Chapters 5 and 6. The concluding chapter summarizes the contributions of our
research work, the conclusions and the scope for further investigation.
The thesis ends with references, publications and information regarding the Indian
patent filed on the basis of the work carried out.