Unit 4 Lecture Notes

NEHRU INSTITUTE OF ENGINEERING AND TECHNOLOGY
An ISO 9001: 2015 & 14001: 2015 Certified Institution, Affiliated to Anna University, Chennai
Approved by AICTE, New Delhi, Recognized by UGC with Section 2(f) and 12(B),
Re-accredited by NAAC “A+”, NBA Accredited (UG Courses: AERO & CSE)
“Nehru Garden”, Thirumalayampalayam, Coimbatore - 641 105.
UNIT IV PROCESSOR
Faculty Name : Mrs.jeni narayanan L A
Branch : IT
Sub Code : CS3351
Year : II
Subject Name : Digital Principles & Computer Organisation
Semester : III
INSTRUCTION EXECUTION:
Instructions are part of a program that is stored in memory. To execute an instruction, the
processor first fetches it from memory, then decodes it, and then executes it. This whole process
is known as the instruction cycle.
Instruction cycle state transition diagram
Instruction execution :
Executing an instruction involves the following steps:
· The PC (program counter) register of the processor supplies the address of the instruction to
be fetched from memory.
· Once the instruction has been fetched, its opcode is decoded. On decoding, the processor
identifies the number of operands. If any operand has to be fetched from memory, the address of
that operand is calculated.
· The operands are fetched from memory. If there is more than one operand, the address
calculation and operand fetch are repeated for each one.
· The data operation is then performed on the operands, and a result is generated.
· If the result is to be stored in a register, the instruction ends here.
· If the destination is memory, the destination address is calculated first and the result is then
stored in memory. If several results have to be stored in memory, the destination address
calculation and the store are repeated for each one.
· The current instruction has now been executed. In parallel, the PC is incremented so that it
holds the address of the next instruction.
· The instruction cycle then repeats for the following instructions.
Straight line sequencing:
· Straight line sequencing means that the instructions of a program are executed sequentially
(i.e. the PC is incremented by a fixed offset after every instruction).
· No branch address is loaded into the PC.
Example –
· Here, program and data are stored in the same memory, i.e. a von Neumann architecture.
· The first instruction of the program is stored at address i. The PC supplies address i, and the
instruction stored at that address is fetched from memory and decoded. Operand A is then fetched
from memory into a temporary register, and the instruction is executed (i.e. the content of
address A is copied into processor register R0).
· In parallel with decoding or execution, the PC is incremented by 4 (i.e. it now contains the
address of the next instruction), because each instruction occupies 4 bytes of memory. This
completes the instruction at address i.
· The PC is incremented by 4 after every instruction, so the program executes sequentially. This
process is called straight line sequencing.
Example 2 –
Straight line sequencing program for adding n numbers.
· The addresses of the memory locations containing the n numbers are represented symbolically as
NUM1, NUM2, ..., NUMn (i.e. location NUM1 holds the first number).
· The first number is loaded into processor register R0, and every other number is added to
register R0 in turn. When the program ends (i.e. all n numbers have been added), the result is
placed in memory location SUM.
Straight line sequencing program for adding n numbers
· The second way is to use a loop to add the n numbers. This is not straight line sequencing,
because at the end of every loop iteration except the last, the branch address is loaded into the
PC and execution continues from that address.
· Location N stores the value of n. Processor register R1 is used as a counter that determines
how many times the loop is executed.
· The contents of location N are moved into R1 at the start of program execution.
· After that, register R0 is cleared.
· Every time a number is added to R0, R1 is decremented. The branch back to address LOOP is taken
again and again until R1 becomes 0 (which means that all the numbers have been added).
· When R1 becomes 0, execution leaves the loop and the result, which is held in R0, is copied into
memory location SUM.
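The difference between the two programs can be seen by tracing the PC. The following is only a minimal Python sketch of a toy machine, not the instruction set used in these notes: the opcode names (LOAD, ADD, ADD_NEXT, CLEAR, DEC, BGTZ, STORE), the use of R2 as an index to the next number, and the 4-byte instruction length are all assumptions made for illustration.

```python
WORD = 4  # assumed instruction length in bytes


def run(program, memory):
    """Run a toy program and return (registers, list of fetched PC values)."""
    regs = {"R0": 0, "R1": 0, "R2": 1}   # R2: hypothetical index of the next NUMi
    pc, trace = 0, []
    while pc // WORD < len(program):
        op, *args = program[pc // WORD]  # fetch and decode
        trace.append(pc)
        pc += WORD                       # default: next instruction in sequence
        if op == "LOAD":                 # LOAD mem, reg
            regs[args[1]] = memory[args[0]]
        elif op == "ADD":                # ADD mem, reg
            regs[args[1]] += memory[args[0]]
        elif op == "ADD_NEXT":           # add NUM<R2> to reg, then advance R2
            regs[args[0]] += memory[f"NUM{regs['R2']}"]
            regs["R2"] += 1
        elif op == "CLEAR":              # CLEAR reg
            regs[args[0]] = 0
        elif op == "DEC":                # DEC reg
            regs[args[0]] -= 1
        elif op == "BGTZ":               # BGTZ reg, instruction index
            if regs[args[0]] > 0:
                pc = args[1] * WORD      # branch address is loaded into the PC
        elif op == "STORE":              # STORE reg, mem
            memory[args[1]] = regs[args[0]]
    return regs, trace


numbers = [5, 7, 9]

# Straight line version: one instruction per number, no branches.
mem_a = {f"NUM{i + 1}": v for i, v in enumerate(numbers)}
straight = [("LOAD", "NUM1", "R0"), ("ADD", "NUM2", "R0"),
            ("ADD", "NUM3", "R0"), ("STORE", "R0", "SUM")]
_, trace_a = run(straight, mem_a)

# Loop version: R1 counts down from n, R0 accumulates, LOOP is instruction 2.
mem_b = {f"NUM{i + 1}": v for i, v in enumerate(numbers)}
mem_b["N"] = len(numbers)
loop = [("LOAD", "N", "R1"), ("CLEAR", "R0"),
        ("ADD_NEXT", "R0"), ("DEC", "R1"), ("BGTZ", "R1", 2),
        ("STORE", "R0", "SUM")]
_, trace_b = run(loop, mem_b)

print(trace_a)  # PC only ever increases by WORD
print(trace_b)  # PC jumps back to the LOOP address while R1 > 0
```

In the straight line trace the PC only ever grows by the fixed offset. In the loop trace it returns to the address of LOOP after each iteration, and the program length no longer depends on n.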
Memory consists of a large array of words or bytes, each with its own address. The CPU fetches
instructions from memory according to the value of the program counter. These instructions may
cause additional loading from and storing to specific memory addresses. A typical instruction
execution cycle, for example, first fetches an instruction from memory. The instruction is then
decoded and may cause operands to be fetched from memory.
After the instruction has been executed on the operands, results may be stored back in memory. The
memory unit sees only a stream of memory addresses; it does not know how they are generated (by
the instruction counter, indexing, indirection, literal addresses, and so on) or what they are for
(instructions or data). Accordingly, we can ignore how a program generates a memory address. We are
interested only in the sequence of memory addresses generated by the running program.
The following is a summary of the six steps used to execute a single instruction.
Step 1: Fetch instruction.
Step 2: Decode instruction and Fetch Operands.
Step 3: Perform ALU operation.
Step 4: Access memory.
Step 5: Write back result to register file.
Step 6: Update the PC.
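The six steps can be sketched in Python for a handful of MIPS-style instructions. This is an illustrative model only: the tuple encoding of instructions, the function name `execute_one`, and the dictionaries standing in for the register file and memories are assumptions, not part of the notes.

```python
def execute_one(pc, regs, data_mem, instr_mem):
    """Run one instruction through the six steps and return the next PC."""
    op, f1, f2, f3 = instr_mem[pc]            # Step 1: fetch instruction

    # Step 2: decode instruction and fetch operands from the registers
    dest = store_value = None
    if op == "add":                           # add rd, rs, rt
        dest, x, y = f1, regs[f2], regs[f3]
    elif op == "lw":                          # lw rt, offset(rs)
        dest, x, y = f1, regs[f3], f2
    elif op == "sw":                          # sw rt, offset(rs)
        store_value, x, y = regs[f1], regs[f3], f2
    elif op == "beq":                         # beq rs, rt, offset
        x, y = regs[f1], regs[f2]
    else:
        raise ValueError(f"unknown opcode: {op}")

    # Step 3: perform ALU operation (subtract to compare, add otherwise)
    alu_result = x - y if op == "beq" else x + y

    # Step 4: access memory (loads and stores only)
    loaded = None
    if op == "lw":
        loaded = data_mem[alu_result]
    elif op == "sw":
        data_mem[alu_result] = store_value

    # Step 5: write back result to the register file
    if op == "add":
        regs[dest] = alu_result
    elif op == "lw":
        regs[dest] = loaded

    # Step 6: update the PC (branch offset is counted in instructions)
    next_pc = pc + 4
    if op == "beq" and alu_result == 0:
        next_pc += 4 * f3
    return next_pc


regs = {"$zero": 0, "$t0": 0, "$t1": 0, "$t2": 5, "$t3": 7}
data_mem = {100: 30}
instr_mem = {
    0: ("lw", "$t0", 100, "$zero"),
    4: ("add", "$t1", "$t2", "$t3"),
    8: ("sw", "$t1", 104, "$zero"),
    12: ("beq", "$t1", "$t1", 1),   # always taken: skips the word at address 16
}

pc, visited = 0, []
while pc in instr_mem:
    visited.append(pc)
    pc = execute_one(pc, regs, data_mem, instr_mem)
```

Not every instruction uses every step: `add` skips the memory access, `sw` and `beq` skip the write back, and only `beq` changes the PC by anything other than 4.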
BUILDING DATA PATH AND CONTROL IMPLEMENTATION SCHEME:
Datapath
· Components of the processor that perform arithmetic operations and hold data.
Control
· Components of the processor that command the datapath, memory, and I/O devices according to
the instructions of the program.
Building a Datapath
· Datapath elements process data and addresses in the CPU: memories, registers, ALUs.
· The MIPS datapath can be built incrementally by considering only a subset of instructions at a
time.
· The 3 main elements needed to fetch instructions are:
Fig. 3.1 Datapath
· A memory unit that stores the instructions of a program and supplies an instruction given its
address. It needs to provide only read access (once the program is loaded), so no control signal
is needed.
· The PC (Program Counter, or instruction address register), a register that holds the address of
the current instruction.
Ø A new value is written to it every clock cycle, so no control signal is required to enable the
write.
· An adder that increments the PC to the address of the next instruction.
Ø This is an ALU permanently wired to do only addition, so no extra control signal is required.
Fig. 3.2 Datapath portion for Instruction Fetch
Types of Elements in the Datapath
State element:
· A memory element, i.e., it contains state
· E.g., program counter, instruction memory
Combinational element:
· Elements that operate on values
· E.g., adder, ALU
Elements required by the different classes of instructions
· Arithmetic and logical instructions
· Data transfer instructions
· Branch instructions
R-Format ALU Instructions
· E.g., add $t1, $t2, $t3
· Perform an arithmetic/logical operation
· Read two register operands and write the result to a register
Register file:
· A collection of registers
· Any register can be read or written by specifying the number of the register
· Contains the register state of the computer
Read from register
· 2 inputs to the register file specifying the register numbers
• 5 bits wide, to select among the 32 registers
· 2 outputs from the register file carrying the values read
• 32 bits wide
· Reads happen for all instructions, so no control signal is required.
Write to register file
· 1 input to the register file specifying the register number
• 5 bits wide, to select among the 32 registers
· 1 input to the register file carrying the value to be written
• 32 bits wide
· Writes happen only for some instructions, so they are enabled by the RegWrite control signal.

ALU
· Takes two 32-bit inputs and produces a 32-bit output
· Also sets a one-bit signal (Zero) if the result is 0
· The operation performed by the ALU is selected by a 4-bit control input, which is set according
to the instruction.
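The register file and ALU described above can be modelled in a few lines of Python. This is a sketch under assumptions: the class and function names are made up for illustration, and the four ALU control codes shown are the ones commonly used in textbook MIPS datapaths, which these notes do not list explicitly.

```python
MASK32 = 0xFFFFFFFF  # keep every value 32 bits wide

# Assumed 4-bit ALU control codes (textbook MIPS convention)
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB = 0b0000, 0b0001, 0b0010, 0b0110


class RegisterFile:
    """32 registers, two read ports, one write port gated by RegWrite."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Two 5-bit register numbers in, two 32-bit values out; no control signal.
        return self.regs[read_reg1 & 0x1F], self.regs[read_reg2 & 0x1F]

    def write(self, write_reg, write_data, reg_write):
        # The write happens only when RegWrite is asserted.
        # Register 0 is hard-wired to zero in MIPS, so writes to it are ignored.
        if reg_write and (write_reg & 0x1F) != 0:
            self.regs[write_reg & 0x1F] = write_data & MASK32


def alu(a, b, control):
    """Return (32-bit result, Zero flag) for the selected operation."""
    if control == ALU_AND:
        result = a & b
    elif control == ALU_OR:
        result = a | b
    elif control == ALU_ADD:
        result = a + b
    elif control == ALU_SUB:
        result = a - b
    else:
        raise ValueError(f"unsupported ALU control: {control:04b}")
    result &= MASK32
    return result, int(result == 0)


# add $t1, $t2, $t3  ($t1 = register 9, $t2 = register 10, $t3 = register 11)
rf = RegisterFile()
rf.write(10, 5, True)
rf.write(11, 7, True)
a, b = rf.read(10, 11)
result, zero = alu(a, b, ALU_ADD)
rf.write(9, result, True)
rf.write(9, 99, False)   # RegWrite not asserted: register 9 keeps its value
```

The Zero output is what a branch-on-equal instruction uses: subtracting two equal operands gives a result of 0 and sets Zero to 1.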
DESIGN OF CONTROL UNIT:
The Control Unit is classified into two major categories:
1. Hardwired Control
2. Microprogrammed Control
HARDWIRED CONTROL:
In the Hardwired Control organization, the control logic is implemented with gates, flip-flops,
decoders, and other digital circuits.
The following image shows the block diagram of a Hardwired Control organization.
o A hardwired control unit consists of two decoders, a sequence counter, and a number of control
logic gates.
o An instruction fetched from the memory unit is placed in the instruction register (IR).
o The instruction register is divided into three parts: the I bit (bit 15), the operation code
(bits 12 through 14), and bits 0 through 11.
o The operation code in bits 12 through 14 is decoded with a 3 x 8 decoder.
o The outputs of the decoder are designated by the symbols D0 through D7.
o Bit 15 of the instruction is transferred to a flip-flop designated by the symbol I.
o Bits 0 through 11 are applied to the control logic gates.
o The sequence counter (SC) can count in binary from 0 through 15.
o Because the control logic is built from gates, flip-flops, decoders, and other digital circuits,
it can be optimized for fast operation. In the microprogrammed organization, by contrast, the
control information is stored in a control memory, which is programmed to initiate the required
sequence of micro-operations.
o A hardwired control requires changes in the wiring among the various components if the design
has to be modified.
An instruction read from memory is placed in the instruction register (IR). The IR is divided into
three parts: the I bit, the opcode, and bits 0 through 11. The opcode is decoded with a 3 x 8
decoder whose outputs are designated by the symbols D0 through D7. The subscript in each symbol is
the binary value of the corresponding opcode. Bit 15 of the instruction is transferred to a
flip-flop designated by the symbol I. Bits 0 through 11 are applied to the control logic gates.
The sequence counter is a 4-bit counter that counts in binary from 0 through 15. It can be
incremented or cleared synchronously. Its outputs are decoded by a 4 x 16 decoder into the timing
signals T0 through T15.
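The field extraction and decoding described above can be sketched in Python. The function and class names are hypothetical, and the one-hot lists stand in for the decoder output lines; this is a software illustration of the signals, not a gate-level design.

```python
def decode_instruction(ir):
    """Split a 16-bit IR value into the I bit, D0..D7, and bits 0 through 11."""
    i_bit = (ir >> 15) & 0x1                      # bit 15 goes to flip-flop I
    opcode = (ir >> 12) & 0x7                     # bits 12 through 14
    low_bits = ir & 0x0FFF                        # bits 0 through 11
    d = [int(opcode == k) for k in range(8)]      # 3 x 8 decoder outputs D0..D7
    return i_bit, d, low_bits


def timing_signals(sc):
    """4 x 16 decoder on the sequence counter value: one-hot T0..T15."""
    return [int((sc & 0xF) == k) for k in range(16)]


class SequenceCounter:
    """4-bit counter that can be incremented or cleared."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value = (self.value + 1) & 0xF       # wraps from 15 back to 0

    def clear(self):
        self.value = 0


# 0xA123 = 1 010 000100100011 in binary: I = 1, opcode = 2, bits 0-11 = 0x123
i_bit, d, low_bits = decode_instruction(0xA123)

sc = SequenceCounter()
sc.increment()
sc.increment()
t = timing_signals(sc.value)                      # T2 is the active timing signal
```

A control signal is then a Boolean function of these lines, for example "D2 AND T4" for a micro-operation that belongs to opcode 2 and happens at time T4 (that particular combination is only an example).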
MICROPROGRAMMED CONTROL:
The Microprogrammed Control organization is implemented by using the programming approach.
In Microprogrammed Control, the micro-operations are performed by executing a program
consisting of micro-instructions.
The following image shows the block diagram of a Microprogrammed Control organization.
o The Control memory address register specifies the address of the micro-instruction.
o The Control memory is assumed to be a ROM, within which all control information is
permanently stored.
o The control register holds the microinstruction fetched from the memory.
o The micro-instruction contains a control word that specifies one or more micro-operations for
the data processor.
o While the micro-operations are being executed, the next address is computed in the next
address generator circuit and then transferred into the control address register to read the next
microinstruction.
o The next address generator is often referred to as a micro-program sequencer, as it determines
the address sequence that is read from control memory.
The microprogrammed control stores its control information in the control memory, which is
programmed to initiate the required sequence of micro-operations. Changes and modifications to a
microprogrammed control unit can be made by updating the microprogram in the control memory.
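The interplay of control memory, control address register, control register, and next address generator can be sketched as a small Python loop. Everything concrete here is an assumption for illustration: the microinstruction format (a tuple of micro-operation names plus a next-address field), the micro-operation names, and the three-word microprogram.

```python
# Control memory modelled as a read-only tuple (a ROM): each word holds a
# control word ("ops") and a hypothetical next-address field ("next").
CONTROL_MEMORY = (
    {"ops": ("AR <- PC",), "next": 1},
    {"ops": ("IR <- M[AR]", "PC <- PC + 1"), "next": 2},
    {"ops": ("decode IR",), "next": 0},
)


def next_address(microinstruction, car):
    """Next address generator (microprogram sequencer)."""
    nxt = microinstruction["next"]
    return car + 1 if nxt is None else nxt    # no field given: fall through


def run_microprogram(steps, start=0):
    """Issue `steps` microinstructions; return the control words and final CAR."""
    car = start                                # control address register
    issued = []
    for _ in range(steps):
        control_register = CONTROL_MEMORY[car]         # read the microinstruction
        issued.append(control_register["ops"])         # micro-operations to perform
        car = next_address(control_register, car)      # address of the next one
    return issued, car


issued, car = run_microprogram(4)
```

Changing the behaviour of this control unit means editing the contents of `CONTROL_MEMORY`, not the loop, which mirrors the point that a microprogrammed design is modified by updating the microprogram.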
INSTRUCTION PIPELINING:
As computer systems evolve, greater performance can be achieved by taking advantage of
improvements in technology, such as faster circuitry, use of multiple registers rather than a
single accumulator, and the use of a cache memory. Another organizational approach is instruction
pipelining, in which new inputs are accepted at one end before previously accepted inputs appear as
outputs at the other end.
Figure 3.1a depicts this approach. The pipeline has two independent stages. The first stage
fetches an instruction and buffers it. When the second stage is free, the first stage passes it the
buffered instruction. While the second stage is executing the instruction, the first stage takes
advantage of any unused memory cycles to fetch and buffer the next instruction. This is called
instruction prefetch or fetch overlap.
This process speeds up instruction execution. If the fetch and execute stages were of equal
duration, the instruction cycle time would be halved. However, if we look more closely at this
pipeline (Figure 3.1b), we will see that this doubling of execution rate is unlikely for 3 reasons:
1 The execution time will generally be longer than the fetch time. Thus, the fetch stage may have
to wait for some time before it can empty its buffer.
2 A conditional branch instruction makes the address of the next instruction to be fetched
unknown. Thus, the fetch stage must wait until it receives the next instruction address from the
execute stage. The execute stage may then have to wait while the next instruction is fetched.
3 When a conditional branch instruction is passed on from the fetch to the execute stage, the
fetch stage fetches the next instruction in memory after the branch instruction. Then, if the
branch is not taken, no time is lost. If the branch is taken, the fetched instruction must be
discarded and a new instruction fetched.
To gain further speedup, the pipeline must have more stages. Let us consider the following
decomposition of the instruction processing.
1. Fetch instruction (FI): Read the next expected instruction into a buffer.
2. Decode instruction (DI): Determine the opcode and the operand specifiers.
3. Calculate operands (CO): Calculate the effective address of each source operand. This may
involve displacement, register indirect, indirect, or other forms of address calculation.
4. Fetch operands (FO): Fetch each operand from memory.
5. Execute instruction (EI): Perform the indicated operation and store the result, if any, in the
specified destination operand location.
6. Write operand (WO): Store the result in memory.
Figure 3.2 shows that a six-stage pipeline can reduce the execution time for 9 instructions from
54 time units to 14 time units.
3.2 Timing Diagram for Instruction Pipeline Operation
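The timing diagram can be regenerated with a short Python sketch, which also checks the 54 versus 14 time unit figures. The function names are illustrative, and the model assumes the ideal case: every stage takes one time unit and there are no branches or stalls.

```python
STAGES = ("FI", "DI", "CO", "FO", "EI", "WO")


def timing_diagram(n_instructions, stages=STAGES):
    """Return {instruction: {stage: time unit}} for an ideal pipeline."""
    # Instruction i enters the first stage at time i and advances one stage per unit.
    return {i: {stage: i + s for s, stage in enumerate(stages)}
            for i in range(1, n_instructions + 1)}


def render(diagram, stages=STAGES):
    """Draw one row per instruction and one column per time unit."""
    last = max(times[stages[-1]] for times in diagram.values())
    lines = []
    for i, times in diagram.items():
        row = ["  "] * last
        for stage, t in times.items():
            row[t - 1] = stage
        lines.append(f"I{i:<2} " + " ".join(row))
    return "\n".join(lines)


diagram = timing_diagram(9)
pipelined_time = max(times["WO"] for times in diagram.values())
sequential_time = 9 * len(STAGES)   # each instruction runs all six stages alone

print(render(diagram))
print(pipelined_time, sequential_time)
```

Each printed row is shifted one column to the right of the row above it, which is the staircase shape of the timing diagram.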
The FI, FO, and WO stages involve a memory access. If the six stages are not of equal duration,
there will be some waiting involved at various pipeline stages. Another difficulty is the
conditional branch instruction, which can invalidate several instruction fetches. A similar
unpredictable event is an interrupt.
3.3 Timing Diagram for Instruction Pipeline Operation with interrupts

Figure 3.3 illustrates the effects of the conditional branch, using the same program as Figure
3.2. Assume that instruction 3 is a conditional branch to instruction 15. Until the instruction is
executed, there is no way of knowing which instruction will come next. The pipeline, in this
example, simply loads the next instruction in sequence (instruction 4) and proceeds.
In Figure 3.2, the branch is not taken. In Figure 3.3, the branch is taken. This is not determined
until the end of time unit 7. At this point, the pipeline must be cleared of instructions that are
not useful. During time unit 8, instruction 15 enters the pipeline.
No instructions complete during time units 9 through 12; this is the performance penalty incurred
because we could not anticipate the branch. Figure 3.4 indicates the logic needed for pipelining to
account for branches and interrupts.
3.4 Six-stage CPU Instruction Pipeline
Figure 3.5 shows the same sequence of events, with time progressing vertically down the figure,
and each row showing the state of the pipeline at a given point in time. In Figure 3.5a (which
corresponds to Figure 3.2), the pipeline is full at time 6, with 6 different instructions in
various stages of execution, and remains full through time 9; we assume that instruction I9 is the
last instruction to be executed. In Figure 3.5b (which corresponds to Figure 3.3), the pipeline is
full at times 6 and 7. At time 7, instruction 3 is in the execute stage and executes a branch to
instruction 15. At this point, instructions I4 through I7 are flushed from the pipeline, so that at
time 8, only two instructions are in the pipeline, I3 and I15.
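The branch penalty in this example can be reproduced with a small calculation. The Python sketch below is a simplified model under stated assumptions: one time unit per stage, the branch outcome known at the end of the branch's EI stage, and sequential fetching until then. The function name and its parameters are made up for illustration.

```python
STAGES = ("FI", "DI", "CO", "FO", "EI", "WO")


def taken_branch_timeline(branch, target, count_after):
    """Model a taken branch in a six-stage pipeline.

    branch: number of the branch instruction (instruction i enters FI at time i)
    target: number of the instruction branched to
    count_after: how many instructions to follow starting at the target
    Returns (completion time per instruction, flushed instructions, idle time units).
    """
    k = len(STAGES)
    resolve_time = branch + STAGES.index("EI")     # time unit of the branch's EI stage

    # Instructions fetched sequentially after the branch, up to the time it
    # resolves, are on the wrong path and must be flushed.
    flushed = list(range(branch + 1, resolve_time + 1))

    entry = {i: i for i in range(1, branch + 1)}   # instructions up to the branch
    for j in range(count_after):                   # target enters one unit after resolve
        entry[target + j] = resolve_time + 1 + j

    completion = {i: t + k - 1 for i, t in entry.items()}
    done = set(completion.values())
    idle = [t for t in range(min(done), max(done) + 1) if t not in done]
    return completion, flushed, idle


completion, flushed, idle = taken_branch_timeline(branch=3, target=15, count_after=2)
print(completion)  # when each surviving instruction finishes its WO stage
print(flushed)     # instructions discarded when the branch is taken
print(idle)        # time units in which no instruction completes
```

With the branch at instruction 3, the model flushes I4 through I7 and finds no completions in time units 9 through 12, matching the description above.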
For high performance in pipelining, the designer must still consider the following:
1 At each stage of the pipeline, there is some overhead involved in moving data from buffer to
buffer and in performing various preparation and delivery functions. This overhead can appreciably
lengthen the total execution time of a single instruction.
2 The amount of control logic required to handle memory and register dependencies and to optimize
the use of the pipeline increases enormously with the number of stages. This can lead to a
situation where the logic controlling the gating between stages is more complex than the stages
being controlled.
3 Latching delay: It takes time for pipeline buffers to operate and this adds to instruction cycle
time.
3.5 An Alternative Pipeline Depiction
Pipelining Performance
Measures of pipeline performance and relative speedup:
The cycle time $\tau$ of an instruction pipeline is the time needed to advance a set of
instructions one stage through the pipeline; each column in Figures 3.2 and 3.3 represents one
cycle time. The cycle time can be determined as
$$\tau = \max_{1 \le i \le k}[\tau_i] + d = \tau_m + d$$
where
$\tau_i$ = time delay of the circuitry in the ith stage of the pipeline
$\tau_m$ = maximum stage delay (delay through the stage which experiences the largest delay)
$k$ = number of stages in the instruction pipeline
$d$ = time delay of a latch, needed to advance signals and data from one stage to the next
In general, the time delay d is equivalent to a clock pulse and $\tau_m \gg d$. Now suppose that n
instructions are processed, with no branches. Let $T_{k,n}$ be the total time required for a
pipeline with k stages to execute n instructions. Then
$$T_{k,n} = [k + (n - 1)]\tau$$
This equation is easily verified from Figure 3.2: the ninth instruction completes at time cycle
14, and 14 = [6 + (9 - 1)]. Now consider a processor with equivalent functions but no pipeline, and
assume that the instruction cycle time is $k\tau$. The speedup factor for the instruction pipeline
compared to execution without the pipeline is defined as
$$S_k = \frac{T_{1,n}}{T_{k,n}} = \frac{nk\tau}{[k + (n - 1)]\tau} = \frac{nk}{k + (n - 1)}$$
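As a worked example, take the speedup factor to be the ratio of the time without a pipeline (n instructions, each taking k cycle times) to the time with a k-stage pipeline, and plug in the six-stage, nine-instruction case from the timing diagram. The symbol $\tau$ for the cycle time and the name $S_k$ for the speedup are notational assumptions here.

```latex
\begin{aligned}
T_{6,9} &= [6 + (9 - 1)]\,\tau = 14\tau \\
T_{1,9} &= 9 \cdot 6\,\tau = 54\tau \\
S_6 &= \frac{T_{1,9}}{T_{6,9}} = \frac{54\tau}{14\tau} \approx 3.86 \\
\lim_{n \to \infty} S_k &= \lim_{n \to \infty} \frac{nk}{k + (n - 1)} = k
\end{aligned}
```

So nine instructions gain a factor of about 3.86 from six stages, and the speedup approaches the number of stages only as the number of instructions grows large.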
Pipeline Hazards
A pipeline hazard occurs when the pipeline, or some portion of the pipeline, must stall because
conditions do not permit continued execution. Such a pipeline stall is also referred to as a
pipeline bubble. There are three types of hazards: resource, data, and control.
RESOURCE HAZARDS: A resource hazard occurs when two (or more) instructions that are already in
the pipeline need the same resource. The result is that the instructions must be executed in serial
rather than parallel for a portion of the pipeline. A resource hazard is sometimes referred to as a
structural hazard.
Let us consider a simple example of a resource hazard. Assume a simplified five-stage pipeline, in
which each stage takes one clock cycle. Figure 3.6a shows the ideal case, in which a new
instruction enters the pipeline each clock cycle. Now assume that main memory has a single port and
that all instruction fetches and data reads and writes must be performed one at a time. In this
case, an operand read from or write to memory cannot be performed in parallel with an instruction
fetch. This is illustrated in Figure 3.6b, which assumes that the source operand for instruction I1
is in memory, rather than a register. Therefore, the fetch instruction stage of the pipeline must
idle for one cycle before beginning the instruction fetch for instruction I3. The figure assumes
that all other operands are in registers.
3.6 Example of Resource Hazard
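The single-port memory conflict can be sketched in Python by tracking the cycles in which the memory port is already taken by an operand fetch. This is a simplified model: the function name is made up, the five stages are assumed to be FI, DI, FO, EI, WO, and only the conflict between operand reads and instruction fetches is modelled.

```python
FO_OFFSET = 2   # assumed stage order FI, DI, FO, EI, WO: FO is two cycles after FI


def fetch_cycles(memory_operand, n):
    """Return the cycle in which each of n instructions performs its FI stage.

    memory_operand: set of instruction numbers whose source operand is read
    from memory during FO (all other operands are in registers).
    """
    busy = set()      # cycles in which an operand fetch occupies the memory port
    fetch = {}
    cycle = 1
    for i in range(1, n + 1):
        while cycle in busy:          # port in use: the fetch stage idles
            cycle += 1
        fetch[i] = cycle
        if i in memory_operand:
            busy.add(cycle + FO_OFFSET)
        cycle += 1
    return fetch


ideal = fetch_cycles(set(), 4)        # every operand in a register
hazard = fetch_cycles({1}, 4)         # I1 reads its source operand from memory
print(ideal)
print(hazard)
```

In the second run the operand fetch of I1 occupies the memory port in cycle 3, so the fetch of I3 slips to cycle 4 and every later instruction is delayed by one cycle.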
DATA HAZARDS: A data hazard occurs when two instructions in a program are to be executed in
sequence and both access a particular memory or register operand. If the two instructions are
executed in strict sequence, no problem occurs. But if the instructions are executed in a pipeline,
the operand value may be updated in such a way as to produce a different result than would occur
with strict sequential execution of the instructions. In other words, the program produces an
incorrect result because of the use of pipelining.
As an example, consider the following x86 machine instruction sequence:
    ADD EAX, EBX /* EAX = EAX + EBX */
    SUB ECX, EAX /* ECX = ECX - EAX */
The first instruction adds the contents of the 32-bit registers EAX and EBX and stores the result
in EAX. The second instruction subtracts the contents of EAX from ECX and stores the result in ECX.
Figure 3.7 shows the pipeline behaviour. The ADD instruction does not update register EAX until
the end of stage 5, which occurs at clock cycle 5. But the SUB instruction needs that value at the
beginning of its stage 2, which occurs at clock cycle 4. To maintain correct operation, the
pipeline must stall for two clock cycles. Thus, in the absence of special hardware and specific
avoidance algorithms, such a data hazard results in inefficient pipeline usage.
3.7 Example of Data Hazard
There are three types of data hazards:
• Read after write (RAW), or true dependency: A hazard occurs if the read takes place before the
write operation is complete.
• Write after read (WAR), or antidependency: A hazard occurs if the write operation completes
before the read operation takes place.
• Write after write (WAW), or output dependency: Two instructions both write to the same
location. A hazard occurs if the write operations take place in the reverse order of the intended
sequence.
The example of Figure 3.7 is a RAW hazard.
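The three dependency types can be identified by comparing the operands each instruction reads and writes. The Python sketch below uses a hypothetical representation (a dictionary with "reads" and "writes" sets per instruction). It reports the dependencies that could become hazards; whether a stall is actually needed depends on the pipeline timing.

```python
def classify_hazards(first, second):
    """Return the operands involved in each dependency type.

    first and second are instructions in program order, each given as a dict
    with 'reads' and 'writes' sets of register or memory operand names.
    """
    return {
        "RAW": first["writes"] & second["reads"],    # second reads what first writes
        "WAR": first["reads"] & second["writes"],    # second writes what first reads
        "WAW": first["writes"] & second["writes"],   # both write the same location
    }


add = {"reads": {"EAX", "EBX"}, "writes": {"EAX"}}   # ADD EAX, EBX
sub = {"reads": {"ECX", "EAX"}, "writes": {"ECX"}}   # SUB ECX, EAX

result = classify_hazards(add, sub)
print(result)
```

For the ADD/SUB pair only the RAW set is non-empty: SUB reads EAX, which ADD writes.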
CONTROL HAZARDS: A control hazard, also known as a branch hazard, occurs when the pipeline makes
the wrong decision on a branch prediction and therefore brings instructions into the pipeline that
must subsequently be discarded.
