Welcome to Scribd!

Loop Unrolling: Replicate Loop Body To Expose More Parallelism

Uploaded by

0% found this document useful (0 votes)

33 views3 pages

Loop unrolling replicates the loop body to expose more parallelism and reduce loop control overhead. It uses different registers per replication to avoid loop-carried anti-dependencies, where a store is followed by a load of the same register. An example loop unrolling shows replicating the loop body operations over multiple cycles, improving instructions per cycle from 1 to 1.75 but increasing register and code size requirements.

Original Description:

cs6303

Original Title

UNit 33

Copyright

Available Formats

DOCX, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

33 views3 pages

Loop Unrolling: Replicate Loop Body To Expose More Parallelism

Uploaded by

mails2ranjith

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 3

Search inside document

Loop Unrolling

Replicate loop body to expose more parallelism

Reduces loop-control overhead

Use different registers per replication

Called register renaming
Avoid loop-carried anti-dependencies
Store followed by a load of the same register
Aka name dependence
Reuse of a register name

CS6303 COMPUTER ARCHITECTURE

Loop Unrolling Example

Loop:

ALU/branch

Load/store

cycle

addi $s1, $s1,16

$t0, 0($s1)

nop

$t1, 12($s1)

addu $t0, $t0, $s2

$t2, 8($s1)

addu $t1, $t1, $s2

$t3, 4($s1)

addu $t2, $t2, $s2

$t0, 16($s1)

addu $t3, $t4, $s2

$t1, 12($s1)

nop

$t2, 8($s1)

$t3, 4($s1)

bne

$s1, $zero, Loop

IPC = 14/ 8 = 1.75

Closer to 2, but at cost of registers and code size
CS6303 COMPUTER ARCHITECTURE

Dynamic Multiple Issue

Superscalarprocessors
CPU decides whether to issue 0, 1, 2, each cycle
Avoiding structural and data hazards

Avoids the need for compiler scheduling

Though it may still help

Code semantics ensured by the CPU

CS6303 COMPUTER ARCHITECTURE

Dynamic Pipeline Scheduling

Allow the CPU to execute instructions out of order to avoid

stalls
But commit result to registers in order

Example

lw
$t0, 20($s2)
addu $t1, $t0, $t2
sub
$s4, $s4, $t3
slti $t5, $s4, 20
Can start sub while addu is waiting for lw

CS6303 COMPUTER ARCHITECTURE

Dynamically Scheduled CPU

Preserves
dependencies

Hold pending
operands

Results also sent to any

waiting reservation
stations
Reorders buffer for
register writes
Can supply operands
for issued instructions
CS6303 COMPUTER ARCHITECTURE

Manual de Operacion Juniper
Document194 pages
Manual de Operacion Juniper
Pepo
100% (1)
Legato Networker
Document277 pages
Legato Networker
srevenco2318
100% (1)
3.design Patterns LAB Record
Document15 pages
3.design Patterns LAB Record
Ladi Pradeep Kumar
No ratings yet
Assel Mill SMS Meer
Document8 pages
Assel Mill SMS Meer
Bhavsar Kaushal
No ratings yet
Molten Metal Level Control: - Digital Camera Sensor
Document2 pages
Molten Metal Level Control: - Digital Camera Sensor
Tuyen Nguyen
No ratings yet
CS 352H: Computer Systems Architecture: Topic 10: Instruction Level Parallelism (ILP) October 6 - 8, 2009
Document25 pages
CS 352H: Computer Systems Architecture: Topic 10: Instruction Level Parallelism (ILP) October 6 - 8, 2009
Sudip Kumar Dey
No ratings yet
Using ACSLS: Last Updated On 8 August 2005. New Items Are Marked by
Document16 pages
Using ACSLS: Last Updated On 8 August 2005. New Items Are Marked by
Mujahid Ali Khan
No ratings yet
What Is Diameter: Interfaces: CX, DH, DX, GQ, Rf/Ro, SH
Document7 pages
What Is Diameter: Interfaces: CX, DH, DX, GQ, Rf/Ro, SH
Dian Agung Nugroho
No ratings yet
Snort Rule Cheat Sheet: Header Format
Document2 pages
Snort Rule Cheat Sheet: Header Format
alucian
No ratings yet
Eigrp: Proiectarea Rețelelor
Document30 pages
Eigrp: Proiectarea Rețelelor
user_iuli
No ratings yet
Type of Attack Description Methods and Tools: War Driving
Document5 pages
Type of Attack Description Methods and Tools: War Driving
jldrez
No ratings yet
FTX4056F Applied Investment First Class Test
Document7 pages
FTX4056F Applied Investment First Class Test
vrshaf001
No ratings yet
Sap Hana Speaks Sqlscript en 2014-01-08
Document5 pages
Sap Hana Speaks Sqlscript en 2014-01-08
Vivekananda Chary
No ratings yet
Sample PL/SQL Programs
Document29 pages
Sample PL/SQL Programs
sharatvemuri
100% (1)
A Technical Overview of Couchbase
Document96 pages
A Technical Overview of Couchbase
vultt
No ratings yet
Superpipelining
Document7 pages
Superpipelining
BravoYusuf
No ratings yet
Trabajo Protecciones
Document10 pages
Trabajo Protecciones
Daniel Gallego Toro
No ratings yet
Oracle MongoDB Replication
Document14 pages
Oracle MongoDB Replication
Saeed Meethal
No ratings yet
ToastMasters Investing
Document7 pages
ToastMasters Investing
James Peet
No ratings yet
File Archiver
Document1 page
File Archiver
api-235670933
No ratings yet
CA Unit 3 Answers
Document10 pages
CA Unit 3 Answers
shreya.ramya
No ratings yet
C C C CC C
Document9 pages
C C C CC C
CESHEMIRATES
No ratings yet
6.002x Week 3 Exercises Solutions: ST ND TH
Document4 pages
6.002x Week 3 Exercises Solutions: ST ND TH
uploaditnow29
No ratings yet
L 0 ILP Optional Extra Topic
Document44 pages
L 0 ILP Optional Extra Topic
Lekshmi
No ratings yet
Me 303 CH8
Document11 pages
Me 303 CH8
Osman Kutlu
No ratings yet
L11 Pipelined Datapath and
Document31 pages
L11 Pipelined Datapath and
Vidhi Kishor
100% (1)
Using ACSLS: Last Updated On 8 August 2005. New Items Are Marked by
Document15 pages
Using ACSLS: Last Updated On 8 August 2005. New Items Are Marked by
Srinivas Kumar
No ratings yet
PL SQL Functions
Document219 pages
PL SQL Functions
Saipratap Rajolla
No ratings yet
LTE PHY Fundamentals
Document9 pages
LTE PHY Fundamentals
Michael Snook
No ratings yet
Poc Guide For Vsan
Document36 pages
Poc Guide For Vsan
newbeone
No ratings yet
ACA Introduction Lect
Document27 pages
ACA Introduction Lect
WIKIWSW
No ratings yet
Pipelined Functions
Document21 pages
Pipelined Functions
Atif Khan
No ratings yet
Week 2 (Introduction To PL SQL)
Document53 pages
Week 2 (Introduction To PL SQL)
kaushik4end
No ratings yet
SQL Questions
Document3 pages
SQL Questions
Mahima Prasad
No ratings yet
Internals Latches
Document14 pages
Internals Latches
maleem
No ratings yet
An Introduction To 802.11ac: September 2011
Document10 pages
An Introduction To 802.11ac: September 2011
javierdb2012
No ratings yet
Electrical and Computer Engineering Computer Organization and Architecture CSE 332 Credits - 3 Prerequisites: CSE 231 Digital Logic Design
Document41 pages
Electrical and Computer Engineering Computer Organization and Architecture CSE 332 Credits - 3 Prerequisites: CSE 231 Digital Logic Design
Nz Saad
No ratings yet
Det FF PDF
Document5 pages
Det FF PDF
09mitrat
No ratings yet
PARALLELISM VIA INSTRUCTIONS: Pipelining Exploits The Potential Parallelism Among Instructions. This Parallelism Is
Document2 pages
PARALLELISM VIA INSTRUCTIONS: Pipelining Exploits The Potential Parallelism Among Instructions. This Parallelism Is
Berkay Özerbay
No ratings yet
De' (F9.g'%hf@@"5i%: JKH%L!"#$M% N"8++#%kohd% N"8++#%kohd%'e#+59'8% +G'5%JKH%
Document6 pages
De' (F9.g'%hf@@"5i%: JKH%L!"#$M% N"8++#%kohd% N"8++#%kohd%'e#+59'8% +G'5%JKH%
mani_id
No ratings yet
Estiamate Register
Document4 pages
Estiamate Register
Sandeep Ray
No ratings yet
Comparative Study On Low-Power High-Performance Standard-Cell Flip-Flops
Document9 pages
Comparative Study On Low-Power High-Performance Standard-Cell Flip-Flops
Gowtham Sp
No ratings yet
PLSQL Notes
Document19 pages
PLSQL Notes
gadhireddy
No ratings yet
Sky Spark White Paper
Document7 pages
Sky Spark White Paper
John Kabler
No ratings yet
Sample 1. FOR Loop: Input Table
Document12 pages
Sample 1. FOR Loop: Input Table
sridhar1717
No ratings yet
Axe - O - Ma (C DSP User Guide: v1.1.4, October 19, 2012
Document18 pages
Axe - O - Ma (C DSP User Guide: v1.1.4, October 19, 2012
Martin Moergenroet
No ratings yet
CS3350B Computer Architecture: Lecture 6.3: Instructional Level Parallelism: Advanced Techniques
Document24 pages
CS3350B Computer Architecture: Lecture 6.3: Instructional Level Parallelism: Advanced Techniques
AsHraf G. ElrawEi
No ratings yet
Linux Aplikasi: Dr. Sugeng Pribadi
Document75 pages
Linux Aplikasi: Dr. Sugeng Pribadi
Diego
No ratings yet
So, Ware: Instruc (On Set
Document16 pages
So, Ware: Instruc (On Set
Rohan Jain
No ratings yet
TX850, An 850W Unit With A Single Whopping 70A 12V Rail and Four 8 Pin PCI-E Connectors by Carlos Julio10
Document22 pages
TX850, An 850W Unit With A Single Whopping 70A 12V Rail and Four 8 Pin PCI-E Connectors by Carlos Julio10
iammia
No ratings yet
PL SQL Functions
Document219 pages
PL SQL Functions
boddu_raghunarayana
No ratings yet
Ccnpv6 Switch Lab3-1 Default STP Student
Document7 pages
Ccnpv6 Switch Lab3-1 Default STP Student
Jonas Torres
100% (1)
Mininet and Openflow Labs
Document14 pages
Mininet and Openflow Labs
Renu
No ratings yet
转 Event 'Utl - file I - O' - - CodeAntenna
Document4 pages
转 Event 'Utl - file I - O' - - CodeAntenna
kruemeL1969
No ratings yet
Understanding The Stack
Document9 pages
Understanding The Stack
李子
No ratings yet
User Tutorial For Aqua Sim 10 26
Document10 pages
User Tutorial For Aqua Sim 10 26
Aziz Ur Rehman
No ratings yet
C++ and Root For Physicists
Document55 pages
C++ and Root For Physicists
Mark Edmund
No ratings yet
3Rd Tutorial On Pspice: Simple Subcircuits in Pspice
Document7 pages
3Rd Tutorial On Pspice: Simple Subcircuits in Pspice
ganeshdhole16
No ratings yet
Hexapod Robot: Dalhousie Mechanical Engineering Senior Year Design Team 2
Document17 pages
Hexapod Robot: Dalhousie Mechanical Engineering Senior Year Design Team 2
Hồ Sỹ Nam
No ratings yet
IMDL 40-H: Nominal Airflow: 205 L/s Cooling Capacity (KW)
Document1 page
IMDL 40-H: Nominal Airflow: 205 L/s Cooling Capacity (KW)
Zaw Moe Khine
No ratings yet
ELEN 468 Advanced Logic Design: MIPS Microprocessor
Document19 pages
ELEN 468 Advanced Logic Design: MIPS Microprocessor
Mahendranath Cholleti
No ratings yet
PNEUMATICS AND AIR CIRCUITS UNDERSTANDING THE CASCADE VALVE AND PLC UNDERSTANDING
From Everand
PNEUMATICS AND AIR CIRCUITS UNDERSTANDING THE CASCADE VALVE AND PLC UNDERSTANDING
Mitch Edwards
No ratings yet
Pocket Guide to Chemical Engineering
From Everand
Pocket Guide to Chemical Engineering
Carl R. Branan
Rating: 3 out of 5 stars
3/5 (2)
Transport Phenomena II Essentials
From Everand
Transport Phenomena II Essentials
The Editors of REA
Rating: 4 out of 5 stars
4/5 (1)
Amazon Mostly Asked Interview Questions
Document2 pages
Amazon Mostly Asked Interview Questions
mails2ranjith
No ratings yet
Ramanujan Computing Centre Anna University: CHENNAI-600 025, INDIA
Document2 pages
Ramanujan Computing Centre Anna University: CHENNAI-600 025, INDIA
mails2ranjith
No ratings yet
Form Filling
Document1 page
Form Filling
mails2ranjith
No ratings yet
Unit - 4: Parallelism
Document5 pages
Unit - 4: Parallelism
mails2ranjith
No ratings yet
NCR Selfserv™ 25: The Ideal Fit For Your Upgr Ade Plans
Document2 pages
NCR Selfserv™ 25: The Ideal Fit For Your Upgr Ade Plans
Fire tv
No ratings yet
Win Hydro Pro 10
Document4 pages
Win Hydro Pro 10
ferdinanfirdaus
No ratings yet
2 - DarkZone 1111
Document12 pages
2 - DarkZone 1111
Ahmad Y Naufan
100% (1)
How Microsoft Lost Its Mojo: Steve Ballmer and Corporate America's Most Spectacular Decline - Vanity Fair
Document3 pages
How Microsoft Lost Its Mojo: Steve Ballmer and Corporate America's Most Spectacular Decline - Vanity Fair
nebukadnezar
No ratings yet
Langaugaes and Traslators
Document5 pages
Langaugaes and Traslators
Wakunoli Lubasi
No ratings yet
Online Judge PDF
Document17 pages
Online Judge PDF
Pedro
No ratings yet
COA Lecture 18-Fully Associative, Set Associative PDF
Document20 pages
COA Lecture 18-Fully Associative, Set Associative PDF
A3 Aashu
No ratings yet
Scan Cycle Load From Communication
Document2 pages
Scan Cycle Load From Communication
lindembm
No ratings yet
Lead Generation Tactics
Document19 pages
Lead Generation Tactics
Ar Hitesh Parmar
No ratings yet
1 ElasticBucklingandSupportConditions
Document4 pages
1 ElasticBucklingandSupportConditions
hoangduong260979
No ratings yet
Bolting Solutions Catalog
Document74 pages
Bolting Solutions Catalog
cyrano1091
No ratings yet
Computer Operator Practical Paper 2071
Document4 pages
Computer Operator Practical Paper 2071
amit
No ratings yet
KX-TDA 100-200 2-0 SC Card Install Upgrade Guide
Document16 pages
KX-TDA 100-200 2-0 SC Card Install Upgrade Guide
Zlatan Damyanov
No ratings yet
XBRL I.E Extensible Business Reporting Language
Document15 pages
XBRL I.E Extensible Business Reporting Language
meenakshi56
No ratings yet
Basics of PLC Programming PDF
Document24 pages
Basics of PLC Programming PDF
Claudia Meza
No ratings yet
Chose
Document82 pages
Chose
rita
No ratings yet
J17 P13 Q8 PDF
Document2 pages
J17 P13 Q8 PDF
binduann
No ratings yet
1991-02 The Computer Paper - BC Edition
Document72 pages
1991-02 The Computer Paper - BC Edition
thecomputerpaper
No ratings yet
Dbms-Unit1 Notes For Bca
Document18 pages
Dbms-Unit1 Notes For Bca
Yugendran S
No ratings yet
ICFES Case Study
Document2 pages
ICFES Case Study
Carlos Alvarez
No ratings yet
Multilin General Electric
Document16 pages
Multilin General Electric
Luis Alberto Serrano Mesa
No ratings yet
Operator's Guide: Advia
Document100 pages
Operator's Guide: Advia
nguyen minh
100% (1)
Android Apk
Document4 pages
Android Apk
Trujillo88George
No ratings yet
Kristen Urchell Resume
Document1 page
Kristen Urchell Resume
api-309607014
No ratings yet
FI - Consolidation
Document4 pages
FI - Consolidation
Ananthakumar A
No ratings yet
Opentext Imaging Enterprise Scan 1050 Installation Guide English Cles100500 Igd en PDF
Document46 pages
Opentext Imaging Enterprise Scan 1050 Installation Guide English Cles100500 Igd en PDF
Mihai Paunescu
No ratings yet