
Module 3: Pipelining

Pipelining: basic concepts, data hazards, instruction hazards, influence on
instruction sets, data path and control considerations, performance
considerations, exception handling

Pipelining
To improve the performance of a CPU we have two options:
1) Improve the hardware by introducing faster circuits.
2) Arrange the hardware such that more than one operation can be performed at the
same time.
Since there is a limit on the speed of hardware and the cost of faster circuits is quite high, we
have to adopt the second option.
Pipelining is a process of arrangement of hardware elements of the CPU such that its overall
performance is increased. Simultaneous execution of more than one instruction takes place in
a pipelined processor.
Design of a basic pipeline
• In a pipelined processor, a pipeline has two ends, the input end and the
output end.
• Between these ends, there are multiple stages/segments such that the
output of one stage is connected to the input of the next stage and each
stage performs a specific operation.
• Interface registers are used to hold the intermediate output between two
stages. These interface registers are also called latches or buffers.
• All the stages in the pipeline along with the interface registers are
controlled by a common clock.
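The arrangement above can be sketched in a few lines of Python; the stage names and the latch structure below are illustrative, not tied to any particular processor:

```python
# A minimal sketch of the stage/latch arrangement described above, assuming
# a hypothetical 4-stage pipeline (fetch, decode, execute, write-back).
# latch[i] is the interface register holding the output of stage i; one
# iteration of the outer loop is one tick of the common clock.

def fetch(instr):
    return {"instr": instr}

def decode(data):
    return {**data, "decoded": True}

def execute(data):
    return {**data, "result": f"result({data['instr']})"}

def write_back(data):
    return data["result"]

stages = [fetch, decode, execute, write_back]
program = ["I1", "I2", "I3"]
latch = [None] * len(stages)   # interface registers (latches/buffers)
completed = []

for tick in range(len(program) + len(stages) - 1):
    # Update back-to-front so each stage reads its input latch before
    # the earlier stage overwrites it on this clock tick.
    for i in range(len(stages) - 1, 0, -1):
        latch[i] = stages[i](latch[i - 1]) if latch[i - 1] is not None else None
    latch[0] = fetch(program[tick]) if tick < len(program) else None
    if latch[-1] is not None:
        completed.append(latch[-1])

# All three instructions finish in 6 ticks, versus 12 ticks if each
# instruction had to leave the pipeline before the next one entered.
```

Note how each instruction occupies a different stage on the same tick; that overlap is the whole point of the pipeline.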
Basic concepts include: role of cache memory & pipeline performance

Role of cache memory:
• Cache memory plays a crucial role in improving the efficiency and
performance of pipelining in computer architectures.
• Its role is to reduce memory latency and ensure that the pipeline stages
have fast and efficient access to the data and instructions they need.
Here are the key roles of cache memory in pipelining:
Reducing Memory Access Latency:
• Pipelining involves the execution of multiple instructions or stages in parallel.
• Each stage may require access to memory (for fetching instructions or data).
• Cache memory serves as a high-speed buffer between the CPU and main
memory (RAM).
• It stores frequently accessed instructions and data, allowing the pipeline stages
to access them quickly without waiting for slower main memory access.
• This helps in reducing memory access latency, which is critical for maintaining
the pipeline's efficiency.
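The benefit can be quantified with the standard average-memory-access-time (AMAT) relation; the latency and miss-rate figures below are made-up round numbers for illustration:

```python
# Average memory access time (AMAT) = hit time + miss rate * miss penalty.
# The cycle counts here are illustrative, not from any real machine.

def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles."""
    return hit_time + miss_rate * miss_penalty

no_cache = 100  # every access pays the full main-memory latency (cycles)
with_cache = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)

# with_cache == 6.0: with a 95% hit rate, the average access costs
# 6 cycles instead of 100, so pipeline stages wait far less often.
```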
Instruction Fetch:
• In a pipelined processor, the instruction fetch stage is crucial.
• Cache memory holds frequently used program instructions, which
means that the instruction fetch stage can often access instructions
from the cache instead of going to main memory.
• This reduces the time needed to fetch instructions and keeps the
pipeline filled with instructions to execute.
Data Access:
• Pipelining often involves stages that require data from memory, such as the execution stage.
• Cache memory can store frequently used data elements, allowing the execution stage to
access data quickly.
• This reduces the chances of pipeline stalls due to data dependencies.
Avoiding Memory Bottlenecks:
• Without cache memory, the pipeline may suffer from frequent stalls as it waits for data or
instructions to be fetched from the slower main memory.
• Cache memory helps in avoiding these bottlenecks by providing faster access to frequently
used items.
Increasing Throughput:
• Cache memory's ability to supply data and instructions quickly to the pipeline stages
increases the overall throughput of the processor.
• The pipeline can continue processing instructions without significant delays due to memory
accesses, resulting in higher performance.
Pipeline Interlocks and Hazard Handling:
• Cache memory can also aid in handling pipeline hazards, such as data hazards or control
hazards.
• By providing quick access to frequently used data and instructions, cache memory helps in
resolving hazards and ensuring that the pipeline stages can continue processing without
stalls or delays.
Pipeline performance:
• It refers to the efficiency and speed at which a pipelined processor or system can execute
tasks or instructions.
• Pipelining is a technique used in computer architecture to increase the throughput and
overall performance of a processor by breaking down the execution of instructions into a
series of sequential stages.
• Each stage executes in parallel with the other stages, allowing multiple instructions to be
in various stages of processing at the same time.

Several factors contribute to pipeline performance:
Instruction Throughput:
• Pipeline performance is often measured by the number of instructions completed per
unit of time.
• A well-designed pipeline can achieve a high instruction throughput, meaning it can
process a large number of instructions per clock cycle.
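Ideal throughput can be sketched with the usual cycle-count relation; the values of n and k below are illustrative:

```python
# With a k-stage pipeline issuing one instruction per cycle, n
# instructions take (k + n - 1) cycles: k cycles to fill the pipeline,
# then one completion per cycle. Without pipelining they take n * k.

def pipelined_cycles(n, k):
    return k + n - 1

def nonpipelined_cycles(n, k):
    return n * k

n, k = 1000, 5  # illustrative instruction count and stage count
speedup = nonpipelined_cycles(n, k) / pipelined_cycles(n, k)
# For large n the speedup approaches k (here roughly 4.98 of an ideal 5).
```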
Clock Speed:
• The clock speed of the pipeline determines how quickly each stage of the pipeline can
execute its portion of the instruction.
• Higher clock speeds generally result in better pipeline performance, but they can also
lead to increased power consumption and heat generation.
Pipeline Depth:
• The depth of the pipeline, which is the number of stages it consists of, affects
performance.
• A deeper pipeline allows for finer-grained instruction processing but can introduce
additional latency.
• Balancing pipeline depth is crucial to optimizing performance.
Pipeline Hazards:
• Hazards occur when one instruction depends on the result of a previous instruction that
has not yet completed.
• Pipeline hazards can stall the pipeline, reducing performance.
• Techniques such as forwarding and hazard detection units are used to mitigate these
issues.
Branch Prediction:
• Efficient branch prediction mechanisms are essential to pipeline performance.
• Accurate predictions reduce the impact of branch instructions on pipeline stalls.
• Branch target buffers and branch history tables are common components for improving
branch prediction.
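As a rough illustration, a branch-history-table entry is often a 2-bit saturating counter; the sketch below shows that mechanism (the initial state and the outcome sequence are arbitrary choices):

```python
# A minimal 2-bit saturating-counter branch predictor, the mechanism a
# branch history table entry commonly uses. States 0-1 predict
# not-taken, states 2-3 predict taken; each outcome moves the counter
# one step, so a single anomalous outcome does not flip the prediction.

class TwoBitPredictor:
    def __init__(self):
        self.state = 2  # start in "weakly taken" (a common convention)

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
outcomes = [True, True, False, True, True]  # loop-like branch behaviour
correct = 0
for taken in outcomes:
    if p.predict() == taken:
        correct += 1
    p.update(taken)
# The single not-taken outcome causes only one misprediction (4 of 5
# predictions are correct), illustrating the counter's hysteresis.
```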
Data Dependencies:
• Data dependencies between instructions can also affect pipeline performance.
• Forwarding, speculative execution, and out-of-order execution are techniques used to
address data dependencies and keep the pipeline flowing efficiently.
Cache Memory:
• As mentioned earlier, cache memory plays a significant role in pipeline performance by
reducing memory access latency.
• Faster and more efficient caches lead to better performance.
Instruction Mix:
• The mix of instructions in a program can impact pipeline performance.
• Programs with many data hazards, branches, or memory accesses may not pipeline as
efficiently as programs with a simpler instruction mix.

Pipeline Balancing: Ensuring that each pipeline stage takes approximately the same amount
of time to execute is critical. Imbalanced pipelines can lead to underutilized resources or
bottlenecks.

Efficiency of Stages: The efficiency and resource utilization of each pipeline stage are
important. If a stage is frequently idle or underutilized, it can limit the overall pipeline
performance.

Overall System Design: Pipeline performance is not solely determined by the CPU's design.
The memory hierarchy, system architecture, and peripheral devices all play a role in overall
system performance.
Hazards:
• In computer architecture and pipelining, hazards are situations that can
potentially lead to incorrect or unpredictable behaviour in a pipelined
processor.
• Hazards can occur due to the overlapping execution of instructions in
different stages of the pipeline.
• There are three main types of hazards in pipelining:
 Structural hazard
 Data hazard
 Control hazard
Data Hazards:
Read after Write (RAW)
Read after Write (RAW) is also known as true dependency or flow dependency. A read-after-write (RAW)
data hazard occurs when an instruction refers to a result that has not yet been computed. This can happen
because, even though an instruction is issued after another, the earlier instruction may have progressed only
partway through the pipeline.
For example, consider the two instructions:
I1. R2 <- R5 + R3
I2. R4 <- R2 + R3
The first instruction computes a value to be saved in register R2, and the second uses this value to compute a
result for register R4. However, in a pipeline, when the operands for the second operation are retrieved, the
results of the first operation have not yet been stored, resulting in a data dependence.
Instruction I2 has a data dependence since it is dependent on the execution of instruction I1.
Write after Read (WAR)
Write after Read (WAR) is also known as anti-dependency. These data hazards
arise when an instruction writes to a register that a previous instruction
reads, and the write could occur before the read.

As an example, consider the two instructions:

I1. R4 <- R1 + R5
I2. R5 <- R1 + R2

When there is a chance that I2 will finish before I1 (i.e., when there is
concurrent execution), it must be ensured that the result of register R5 is not
saved before I1 has had a chance to obtain its operands.
Write after Write (WAW)
Write after Write (WAW) is also known as output dependency. These data
hazards arise when two instructions write to the same register, and the
later instruction's write could complete before the earlier one's.
As an example, consider the two instructions:
I1. R2 <- R4 + R7
I2. R2 <- R1 + R3
The write-back (WB) of I2 must be postponed until I1 has completed its
execution.
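The three dependency types above can be checked mechanically. The sketch below is illustrative, assuming each instruction is reduced to its destination register and its set of source registers:

```python
# Classify the dependencies a later instruction I2 has on an earlier
# instruction I1. Each instruction is given as a pair:
#   (destination_register, set_of_source_registers)

def classify(i1, i2):
    """Return the dependency types I2 has on the earlier I1."""
    d1, srcs1 = i1
    d2, srcs2 = i2
    deps = []
    if d1 in srcs2:
        deps.append("RAW")  # I2 reads what I1 writes (true dependency)
    if d2 in srcs1:
        deps.append("WAR")  # I2 writes what I1 reads (anti-dependency)
    if d1 == d2:
        deps.append("WAW")  # both write the same register (output dep.)
    return deps

# The register pairs from the three examples above:
print(classify(("R2", {"R5", "R3"}), ("R4", {"R2", "R3"})))  # ['RAW']
print(classify(("R4", {"R1", "R5"}), ("R5", {"R1", "R2"})))  # ['WAR']
print(classify(("R2", {"R4", "R7"}), ("R2", {"R1", "R3"})))  # ['WAW']
```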
Handling Data Hazard
There are various methods to handle data hazards that occur in a program. Some of
these methods are as follows:
• Forwarding is the addition of specific circuitry to the pipeline. This approach works because the needed
values take less time to travel via a wire than it does for a pipeline segment to compute its result.
• Code reordering - Reordering refers to the practice of rearranging instructions in a program to optimize
performance, particularly in the context of pipelined processors. It involves changing the order of instructions
in the source code or machine code to reduce data hazards and improve instruction throughput. Code
reordering can be done at the assembly or compiler level and is typically a software-level optimization
technique.
• Stall insertion inserts one or more stalls (no-op instructions, also called bubbles) into the pipeline, delaying
execution of the current instruction until the needed operand is written to the register file; unfortunately, this
method reduces pipeline efficiency and throughput.
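As a rough sketch of the trade-off, the bubble count for a RAW pair like the earlier R2 example can be estimated under a classic 5-stage pipeline (IF, ID, EX, MEM, WB), assuming the register file can be written and read in the same cycle; the stage numbers are illustrative:

```python
# Estimate the bubbles needed between a producer and a dependent
# consumer. Stage positions assume IF=1, ID=2, EX=3, MEM=4, WB=5:
# registers are written in WB and read in ID, and a write and a read
# of the register file can happen in the same cycle.

def stalls_needed(producer_wb_stage=5, consumer_read_stage=2,
                  distance=1, forwarding=False):
    """Bubbles to insert; distance=1 means the instructions are adjacent."""
    if forwarding:
        # EX->EX forwarding delivers an ALU result directly to the next
        # instruction's execute stage, so no bubble is needed here.
        return 0
    # Without forwarding, the consumer's register read (ID) must not
    # come before the producer's register write (WB).
    gap = producer_wb_stage - consumer_read_stage - distance
    return max(0, gap)

print(stalls_needed(forwarding=False))  # 2 bubbles for adjacent instrs
print(stalls_needed(forwarding=True))   # 0 bubbles with forwarding
```

This is why forwarding hardware is usually preferred over stall insertion: it removes bubbles entirely for the common ALU-to-ALU case, at the cost of extra wiring and multiplexers.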