
What is superscalar architecture?

Superscalar architecture is a method of parallel computing used in many processors. In a
superscalar computer, the central processing unit (CPU) manages multiple instruction pipelines to
execute several instructions concurrently during a clock cycle. This is achieved by feeding the
different pipelines through a number of execution units within the processor. To successfully
implement a superscalar architecture, the CPU's instruction fetching mechanism must intelligently
retrieve and delegate instructions. Otherwise, pipeline stalls may occur, resulting in execution units
that are often idle.

To visualize how this works, consider a hospital surgical unit that consists of areas for admittance,
surgery, and recovery. Patients can move in only one direction, from admittance to recovery, and it
takes the same amount of time to go through each of the areas. Assume the admitting area can
handle three patients at a time and there are three surgical teams, each of which can work on a
single patient. Also assume the recovery area has an indeterminate number of beds, but can
accommodate only one person per bed. When the unit is working correctly, the admitting area
processes three patients at a time, sends one to each of the teams, and immediately processes
another three patients. Even though the surgical teams can handle only one patient at a time,
because there are three of them, they will have passed their charges on by the time the new ones
arrive. The paths the three patients take are analogous to instructions flowing through three
pipelines in a CPU clock cycle. The admitting area is like a fetching mechanism, the surgery teams
are like execution units, and the recovery room is like the registers or cache to which the units
write their results.

To illustrate the kind of problems that can occur in superscalar architectures, consider what would
happen if the staff of the admitting area in the example were not very competent. For example, if
they passed a patient in need of a kidney transplant to a surgical team before the donor kidney
was available, the team wouldn't be able to go to work. Suddenly, there would be a bottleneck at
the admitting area because only two surgical teams would be available for new patients. Another
bottleneck could occur if a surgical team tried to assign a patient to an already occupied bed in the
recovery area. Again, a bottleneck would appear because the team would not be available until
the bed was emptied and the team could move the current patient into it. Stalls like this happen in
processors when an execution unit tries to perform a task that depends on the results of
instructions that have not yet completed. This is why it is important that CPUs carefully manage
the order in which they process instructions.
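
The stall described above can be sketched as a toy in-order issue model. The instruction format (one destination register, a tuple of source registers), the one-cycle latency, and the issue width of three are assumptions chosen to mirror the three surgical teams, not a description of any real CPU.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction

def schedule(program, width=3):
    """Return a list of cycles; each cycle lists the instruction indices issued.

    Toy model (assumed): in-order issue, every instruction takes one cycle,
    and an instruction may not issue in the same cycle as an earlier
    instruction whose result it reads.
    """
    cycles = []
    i = 0
    while i < len(program):
        issued = []
        written = set()  # registers produced by instructions issued this cycle
        while i < len(program) and len(issued) < width:
            instr = program[i]
            if any(src in written for src in instr.srcs):
                break    # dependency: the remaining slots this cycle stay idle
            issued.append(i)
            written.add(instr.dest)
            i += 1
        cycles.append(issued)
    return cycles

independent = [Instr("r1", ("a",)), Instr("r2", ("b",)), Instr("r3", ("c",))]
dependent = [Instr("r1", ("a",)), Instr("r2", ("r1",)), Instr("r3", ("r2",))]

print(schedule(independent))  # [[0, 1, 2]]
print(schedule(dependent))    # [[0], [1], [2]]
```

With independent instructions all three slots are filled in one cycle. With a chain of dependent instructions the same hardware issues only one instruction per cycle, which is the idle-team situation in the analogy.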

A superscalar processor is a CPU that implements a form
of parallelism called instruction-level parallelism within a single processor. In contrast to
a scalar processor, which can execute at most one instruction per clock cycle, a
superscalar processor can execute more than one instruction during a clock cycle by
simultaneously dispatching multiple instructions to different execution units on the processor. It
therefore allows for more throughput (the number of instructions that can be executed in a unit
of time) than would otherwise be possible at a given clock rate. Each execution unit is not a
separate processor (or a core if the processor is a multi-core processor), but an execution
resource within a single CPU such as an arithmetic logic unit.
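
The throughput claim above can be made concrete with a small calculation: throughput is the number of instructions completed per cycle multiplied by the clock rate. The 2 GHz clock and the issue width of two are assumed values for illustration, and the results are upper bounds that real code rarely sustains.

```python
def throughput(instructions_per_cycle, clock_hz):
    """Instructions executed per second at a given per-cycle rate and clock rate."""
    return instructions_per_cycle * clock_hz

clock_hz = 2_000_000_000            # assumed 2 GHz clock, for illustration only
scalar = throughput(1, clock_hz)    # scalar upper bound: one instruction per cycle
two_wide = throughput(2, clock_hz)  # upper bound for an assumed 2-wide superscalar design

print(two_wide / scalar)  # 2.0
```
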

In Flynn's taxonomy, a single-core superscalar processor is classified as an SISD processor
(Single Instruction stream, Single Data stream), though many superscalar processors support
short vector operations and so could be classified as SIMD (Single Instruction stream, Multiple
Data streams). A multi-core superscalar processor is classified as an MIMD processor
(Multiple Instruction streams, Multiple Data streams).

While a superscalar CPU is typically also pipelined, superscalar execution and pipelining are
considered different performance enhancement techniques. The former executes multiple
instructions in parallel by using multiple execution units, whereas the latter overlaps multiple
instructions in the same execution unit by dividing the work of that unit into successive
stages.

The superscalar technique is traditionally associated with several identifying characteristics
(within a given CPU):

• Instructions are issued from a sequential instruction stream
• The CPU dynamically checks for data dependencies between instructions at run time
  (versus software checking at compile time)
• The CPU can execute multiple instructions per clock cycle
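
The run-time dependency check in the list above can be illustrated with a minimal sketch that compares the registers of two adjacent instructions. The (dest, srcs) tuple encoding and the function names are hypothetical. Treating every hazard as blocking is a simplification, since real designs can often remove some of them.

```python
def hazards(first, second):
    """Return the set of data hazards that `second` has on the earlier `first`.

    Each instruction is a (dest, srcs) pair: one destination register name
    and a tuple of source register names (an assumed encoding).
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    found = set()
    if dest1 in srcs2:
        found.add("RAW")  # read after write: second reads what first produces
    if dest2 in srcs1:
        found.add("WAR")  # write after read: second overwrites an input of first
    if dest1 == dest2:
        found.add("WAW")  # write after write: both write the same register
    return found

def can_issue_together(first, second):
    """Conservative rule: dual-issue only when no hazard is detected."""
    return not hazards(first, second)

add = ("r1", ("r2", "r3"))  # r1 = r2 + r3
mul = ("r4", ("r1", "r5"))  # r4 = r1 * r5, needs the result of add
sub = ("r6", ("r7", "r8"))  # r6 = r7 - r8, independent of add

print(hazards(add, mul))             # {'RAW'}
print(can_issue_together(add, sub))  # True
```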

Figure: Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions
per cycle can be completed. (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory
access, WB = Register write back, i = Instruction number, t = Clock cycle [i.e., time])
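
The timing that the caption describes can be reproduced with a short sketch. It assumes the five stages named in the caption, a fetch width of two, no stalls, and cycles numbered from 0. The function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_table(num_instructions, width=2):
    """Map each instruction index to {cycle: stage}, assuming no stalls."""
    table = {}
    for i in range(num_instructions):
        start = i // width  # instructions are fetched `width` at a time
        table[i] = {start + s: name for s, name in enumerate(STAGES)}
    return table

def total_cycles(num_instructions, width=2):
    """Cycles until the last instruction finishes write-back (no stalls)."""
    last_start = (num_instructions - 1) // width
    return last_start + len(STAGES)

def render(table):
    """Format the table as text: one row per instruction, one column per cycle."""
    cycles = max(max(row) for row in table.values()) + 1
    lines = []
    for i, row in table.items():
        cells = [row.get(t, "").ljust(3) for t in range(cycles)]
        lines.append(f"i{i}: " + " ".join(cells).rstrip())
    return "\n".join(lines)

print(render(pipeline_table(4)))
print(total_cycles(10, width=1), total_cycles(10, width=2))  # 14 9
```

Under these assumptions, ten instructions take 14 cycles on a one-wide pipeline and 9 cycles on a two-wide one, because two instructions enter each stage together.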

Superscalar
Superscalar describes a microprocessor design that makes it possible for more than
one instruction to be executed during a single clock cycle. In a superscalar
design, the processor or the instruction compiler is able to determine whether an
instruction can be carried out independently of other sequential instructions, or whether
it has a dependency on another instruction and must be executed in sequence with it.
The processor then uses multiple execution units to carry out two or
more independent instructions simultaneously. Superscalar design is sometimes called
"second generation RISC."
