1111

© All Rights Reserved

Terminology

task

subtask

stage

staging register

$T_{pl} = \sum_{i=1}^{k} (t_i + d_i)$, where $t_i$ is the processing time, $d_i$ is the delay by the staging register, and $k$ is the number of stages

(continued)

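$T_{seq} = \sum_{i=1}^{k} t_i$

pipeline cycle time, $t_{max} = \max(t_i + d_i)$, $1 \le i \le k$

clock frequency $= 1 / t_{max}$

with equal stage times and a common staging-register delay $d$, $t_{cyc} = T_{seq}/k + d$

speedup, $S = \dfrac{N \cdot T_{seq}}{(k + N - 1)\, t_{cyc}}$, where $N$ is the number of tasks.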

(continued)

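If the staging-register delay is neglected and the processing times of the stages are the same, $t_{cyc} = T_{seq} / k$.

Therefore, $S_{ideal}$ becomes $\dfrac{N \cdot k}{k + N - 1}$

If $N \to \infty$, $S_{ideal} \to k$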
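The speedup relations can be checked numerically. The following is a minimal sketch, assuming the cycle time is the slowest stage plus its staging-register delay; the function names and the sample stage times are illustrative and do not come from the slides.

```python
def pipeline_speedup(stage_times, register_delays, n_tasks):
    """Speedup S = N * Tseq / ((k + N - 1) * tcyc) for a k-stage pipeline."""
    k = len(stage_times)
    t_seq = sum(stage_times)  # sequential time for one task
    # Assumption: the cycle time is set by the slowest stage plus its register delay.
    t_cyc = max(t + d for t, d in zip(stage_times, register_delays))
    return (n_tasks * t_seq) / ((k + n_tasks - 1) * t_cyc)


def ideal_speedup(k, n_tasks):
    """Ideal speedup N * k / (k + N - 1): balanced stages, no register delay."""
    return (n_tasks * k) / (k + n_tasks - 1)


if __name__ == "__main__":
    balanced = pipeline_speedup([10, 10, 10, 10], [0, 0, 0, 0], 100)
    with_delay = pipeline_speedup([10, 10, 10, 10], [1, 1, 1, 1], 100)
    print(balanced, with_delay, ideal_speedup(4, 10**9))
```

With balanced stages and no register delay the two functions agree, and the ideal value approaches k as the number of tasks grows. Any staging-register delay lowers the speedup.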

(continued)

$C = L \cdot k + C_p$, where $C_p = \sum_{i=1}^{k} c_i$ ($c_i$ is the cost of stage $i$) and $L$ is the cost of each staging register.

To minimize the composite cost per computation rate, $k = \sqrt{\dfrac{C_p \, T_{seq}}{L \, d}}$
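The optimum stage count can be sanity-checked numerically. This sketch assumes that the composite cost per computation rate is the pipeline cost (L·k + Cp) multiplied by a cycle time of Tseq/k + d. Under that assumption, setting the derivative with respect to k to zero gives k = sqrt(Cp·Tseq / (L·d)). The function names and sample values are illustrative only.

```python
import math


def cost_per_rate(k, stage_cost, register_cost, t_seq, d):
    """Assumed composite cost per computation rate: (L*k + Cp) * (Tseq/k + d)."""
    return (register_cost * k + stage_cost) * (t_seq / k + d)


def optimal_stage_count(stage_cost, register_cost, t_seq, d):
    """Stage count that minimizes cost_per_rate under the assumption above."""
    return math.sqrt((stage_cost * t_seq) / (register_cost * d))


if __name__ == "__main__":
    # Illustrative values: Cp = 1000, L = 10, Tseq = 100, d = 1.
    k_opt = optimal_stage_count(1000, 10, 100, 1)
    print(k_opt)
    for k in (k_opt - 1, k_opt, k_opt + 1):
        print(k, cost_per_rate(k, 1000, 10, 100, 1))
```

In practice k must be an integer, so the neighbouring integers of the computed value would be compared.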

(continued)

Making the stage processing times equal is a complicated and time-consuming process.

It is essential to maximum performance that the stages be close to balanced.

It is done for commercial processors, although it is neither easy nor cheap to do.

A deep pipeline increases the overhead of handling exceptions or interrupts.

Pipeline Types

Instruction pipelines

Arithmetic pipelines

Processor pipelines: a cascade of processors, each executing a specific module in the application program.

Instruction pipeline

reservation table

Row : stages

Column : pipeline cycles

The pipeline cycle time is often determined by the stages requiring memory access.

Control Hazard

The target address of a branch will be known only after the evaluation of the condition.

The pipeline predicts that the branch will not be taken.

An alternative would be to start fetching the target instruction sequence into a buffer while the nonbranch sequence is being fed into the pipeline.

Arithmetic pipelines

Consider S = A + B, where A = (Ea, Ma), B = (Eb, Mb), and S = (Es, Ms)

Addition steps (Figure 3.5)

Add mantissas

Normalize Ms and adjust Es for the sum normalization

Round Ms

Renormalize Ms and adjust Es

(Figure 3.7)

Arithmetic pipelines(cont.)

Consider P = A x B, where A = (Ea, Ma), B = (Eb, Mb), and P = (Ep, Mp)

Multiplication steps (Figure 3.8)

Add exponents

Multiply mantissas

Normalize Mp and adjust Ep

Round Mp

Renormalize Mp and adjust Ep
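The multiplication steps can be sketched as a sequence of stage computations. This is a minimal illustration, not the circuit of Figure 3.8. It assumes each operand is an (exponent, mantissa) pair with value M × 2^E and a normalized mantissa 0.5 ≤ |M| < 1. The 24-bit rounding width and the function names are hypothetical choices.

```python
import math


def fp_multiply(a, b, mantissa_bits=24):
    """Multiply two (exponent, mantissa) pairs, one pipeline step at a time."""
    ea, ma = a
    eb, mb = b
    ep = ea + eb                      # Step 1: add exponents
    mp = ma * mb                      # Step 2: multiply mantissas
    if mp == 0.0:
        return (0, 0.0)               # zero product needs no normalization
    if abs(mp) < 0.5:                 # Step 3: normalize Mp and adjust Ep
        mp *= 2.0                     # (one shift suffices for normalized inputs)
        ep -= 1
    scale = 1 << mantissa_bits        # Step 4: round Mp to mantissa_bits bits
    mp = round(mp * scale) / scale
    if abs(mp) >= 1.0:                # Step 5: renormalize Mp and adjust Ep
        mp /= 2.0
        ep += 1
    return (ep, mp)


def to_pair(x):
    """Split a float into (exponent, mantissa) with 0.5 <= |mantissa| < 1."""
    mantissa, exponent = math.frexp(x)
    return (exponent, mantissa)


def to_float(pair):
    """Rebuild the float value M * 2**E from an (exponent, mantissa) pair."""
    exponent, mantissa = pair
    return math.ldexp(mantissa, exponent)


if __name__ == "__main__":
    print(to_float(fp_multiply(to_pair(6.0), to_pair(2.0))))
```

The renormalization step matters only when rounding carries the mantissa up to 1.0, which is why it follows the rounding step.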

Arithmetic pipelines(cont.)

Multifunction pipeline

To perform more than one operation

A control input is needed for proper

operation of the multifunction pipeline.

Figure 3.10 : floating point add/multiplier

Classification scheme by

Ramamoorthy and Li

Functionality

unifunctional

multifunctional

Configuration

static

dynamic

Mode of operation:

scalar

vector

Performance

The pipeline must be kept full and flowing smoothly.

Two conditions of smooth flow of a pipeline:

the rate of input of data

data interlocks between the stages

One operation per cycle (once the pipeline is full)

Example 3.2 : non-linear pipeline

Structural hazard

Due to the non-availability of appropriate hardware

One obvious way of avoiding a structural hazard is to insert additional hardware into the pipeline.

Example 3.3

pipeline

In cycles 3, 4, 5, and 6, simultaneous accesses are needed.

If we assume that the machine has separate data and instruction caches, the problems in cycles 5 and 6 are solved.

One way to solve the problem in cycle 4 is to stall the ADD instruction (Figure 3.13).

performance.

Collision vectors

pipeline

Latency: the number of cycles that elapse between two initiations.

Latency sequence: the latencies between successive initiations.

Collision: it occurs if a stage in the pipeline is required to perform more than one task at any time.

Collision vectors(cont.)

Forbidden set F: the set of distances between two entries on some row of the reservation table (RT).

The collision vector can be derived from the forbidden set F and can be utilized to control the initiation of operations in the pipeline.

$CV = (v_{n-1}, v_{n-2}, \ldots, v_2, v_1)$

$v_i = 1$ if $i$ is in the forbidden set, and $v_i = 0$ otherwise
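Deriving the forbidden set and collision vector from a reservation table can be sketched as follows. The table is given as one row per stage, with a 1 in each cycle where the stage is in use. The sample table is hypothetical and is not one of the figures referenced in the slides.

```python
def forbidden_set(table):
    """Collect the distances between any two used cycles on the same row."""
    forbidden = set()
    for row in table:
        used = [col for col, mark in enumerate(row) if mark]
        for i in range(len(used)):
            for j in range(i + 1, len(used)):
                forbidden.add(used[j] - used[i])
    return forbidden


def collision_vector(table):
    """Return CV = (v_{n-1}, ..., v_1) as a bit string, v_i = 1 if i is forbidden."""
    n = len(table[0])                 # number of columns (pipeline cycles)
    forbidden = forbidden_set(table)
    return "".join("1" if i in forbidden else "0" for i in range(n - 1, 0, -1))


if __name__ == "__main__":
    # Hypothetical 3-stage, 6-cycle reservation table.
    table = [
        [1, 1, 0, 1, 0, 0],   # stage 1 used in cycles 0, 1, 3
        [0, 0, 1, 0, 0, 0],   # stage 2 used in cycle 2
        [0, 0, 0, 0, 1, 1],   # stage 3 used in cycles 4, 5
    ]
    print(sorted(forbidden_set(table)), collision_vector(table))
```

The string is written most significant bit first, so its rightmost character is v1.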

Examples

Example 3.4

(a) Overlapped RT

(b) Collision Vector(CV)

Collision case and no collision case

Control

Place the CV in a shift register.

If the LSB of the shift register is 1, do not initiate an operation at that cycle; shift the CV right once, inserting 0 at the vacant MSB position.

If the LSB of the shift register is 0, initiate a new operation at that cycle; shift the CV right once, inserting 0 at the vacant MSB position. To reflect the superposed status due to the new initiation over the original one, perform a bit-by-bit OR of the original CV with the content of the shift register.
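The shift-register control can be simulated in a few lines. This sketch holds the register as a non-negative integer whose least significant bit is v1. It assumes an operation has just been initiated, so the register starts out holding the CV, and that a new operation is initiated at every cycle in which the LSB is 0 (a greedy strategy). The function name is illustrative.

```python
def greedy_latencies(cv, count):
    """Simulate the shift-register control and return the first `count` latencies.

    cv is the collision vector as an integer; bit 0 corresponds to v1.
    """
    state = cv            # register contents right after an initiation
    latencies = []
    elapsed = 0           # cycles since the previous initiation
    while len(latencies) < count:
        elapsed += 1
        if state & 1:
            # LSB is 1: do not initiate, shift right (0 enters at the MSB).
            state >>= 1
        else:
            # LSB is 0: initiate, shift right, then OR in the original CV.
            state = (state >> 1) | cv
            latencies.append(elapsed)
            elapsed = 0
    return latencies


if __name__ == "__main__":
    print(greedy_latencies(0b00111, 3))
    print(greedy_latencies(0b1011001, 4))
```

For the collision vector (00111), latencies 1, 2, and 3 are blocked, so under these assumptions every initiation follows the previous one by 4 cycles.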

3.2.3 Performance

Figure 3.15(a)

The CV of Figure 3.11 : (00111)

Figure 3.15(a) shows the state transitions.
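State transitions of this kind can also be enumerated programmatically. The sketch below assumes the usual construction: from a state, any latency whose bit is 0 is permissible, and the next state is the current state shifted right by that latency and ORed with the CV. Latencies longer than the vector length are summarized by a single edge back to the initial state. The function name and dictionary layout are illustrative choices.

```python
def state_diagram(cv, nbits):
    """Map each reachable state to {latency: next_state}.

    cv is the collision vector as an integer (bit 0 = v1), nbits its length.
    The key nbits + 1 stands for every latency greater than nbits.
    """
    transitions = {}
    pending = [cv]                        # the initial state is the CV itself
    while pending:
        state = pending.pop()
        if state in transitions:
            continue
        edges = {}
        for latency in range(1, nbits + 1):
            if not (state >> (latency - 1)) & 1:     # latency is permissible
                edges[latency] = (state >> latency) | cv
        edges[nbits + 1] = cv             # long latencies return to the start
        transitions[state] = edges
        pending.extend(edges.values())
    return transitions


if __name__ == "__main__":
    for state, edges in state_diagram(0b00111, 5).items():
        print(format(state, "05b"), {k: format(v, "05b") for k, v in edges.items()})
```

Applied to the collision vector (00111), this construction yields a single state with permissible latencies of 4 or more.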

3.2.3 Performance

Average latency

simple cycle

greedy cycle

MAL (Minimum Average Latency)

Figure 3.17

Vxx, Vxy, Vyx, Vyy

resources.

Data hazard

data forwarding

internal forwarding

write-read forwarding

read-read forwarding

write-write forwarding

memory/memory architectures

(continued)

Conditional Branches

branch prediction

delayed branch

branch-prediction buffer

branch history

multiple instruction buffers

Interrupts

precise interrupt scheme

Instruction deferral

scoreboard

Tomasulo's algorithm

Performance evaluation

per unit time

minimizing the total time required to handle a specific sequence of initiation table types

CDC Star-100

CDC 6600

MIPS R-4000

3.6 Summaries

Approaches to improve the performance beyond the ideal CPI case:

superpipeline

superscalar

VLIW (Very Long Instruction Word)

End of Chapter 3
