2.0 DD2356 DiscussingSpeedUp

Uploaded by

Daniel Araújo

0% found this document useful (0 votes)

2 views13 pages

Original Title

2.0-DD2356-DiscussingSpeedUp

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

2 views13 pages

2.0 DD2356 DiscussingSpeedUp

Uploaded by

Daniel Araújo

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 13

Search inside document

DD2356 – Discussing Speed-Up

Stefano Markidis
What is Speedup?
• In the simplest form,
• Speedup(code,sys,p) = TB/Tp
• Speedup measures the ratio of performance between two objects
• Versions of same code, with different number of processors
• Serial and vector versions
• C and Fortran
• Two algorithms computing the “same” result

2
Speedup Can Be Useful

• The key is choosing the correct baseline for comparison

• For our serial vs. vectorization examples, using compiler-provided
vectorization, the baseline is simple – the same code, with vectorization
turned off
• For parallel applications, this is much harder:
• Choice of algorithm, decomposition, performance of baseline case

3
Parallel Speedup
• For parallel applications, Speedup is typically
defined as
– Speedup(code,sys,p) = T1/Tp
– Where T1 is the time on one
processor and Tp is the time using p
processors
• Can Speedup(code,sys,p) > p?
– That means using p processors is
more than p times faster than using
one processor

4
Speedup and Memory
Super-linear speed-up due to memory

• Yes, speedup on p processors can be

greater than p.
• Consider the case of a memory-bound
computation with M words of memory
• If M/p fits into cache while M does not,
the time to access memory will be
different in the two cases:
– T1 uses the STREAM main memory
bandwidth
– Tp uses the appropriate cache
bandwidth

5
Are there Upper Bounds on Speedup?

• Let’s look at a simple code. Assume that almost

all of it is perfectly parallizable (fraction f). The
remainder, fraction (1-f) can’t be parallelized at
all.
– That is, there is work that takes time W
on 1 process; a fraction f of that work
will take time Wf/p on p processors
• What is the maximum possible speedup as a
function of f?

6
Question

Stop here and try to compute the maximum speedup by computing T1 and Tp in
terms of p and f.

7
Amdahl’s Law
• T1 = (1-f) W + fW = W
• Tp = (1-f) W + fW/p
• Speedup = T1/Tp = W / ((1-f)W+fW/p)
– As p goes to infinity, fW/p goes to zero
– the maximum speedup is 1/(1-f)
• So if f = 0.99 (all but 1% parallelizable)
– the maximum speedup is 1/(1-.99)=1/(.01)= 100

8
Notes on Amdahl’s Law

• Its pretty depressing – if any non- parallel code slips into the application,
the parallel performance is limited
• In many simulations, however, the fraction of non-parallelizable work
is 10-6 or less
– Due to large arrays or objects that are perfectly parallelizable

9
N1/2 – Another Measure
• When measuring performance as a function of a parameter (such as number of
processors) one question is:
• At what value of p is half of the possible performance achieved?
• For example, for parallel performance, how many processors are required
to achieve half of the possible performance?
• Answer depends on the specific situation

10
Example N1/2

• Consider the Amdahl’s law example

• Maximum possible speedup at an infinite number of processes is 1/(1-f)
• Question: At how many processes is half of the possible speedup achieved?

11
Answer for N1/2
• 1⁄2 of maximum speedup is 1/(2(1-f))
• Speedup(p) = 1/((1-f)+f/p)
• To find p, set these equal (use their inverses)
– 2(1-f) = (1-f) + f/p
• 1-f = f/p à p = f/(1-f)
• E.g., for f = .99, p = 9900 (for a speedup of only 50!)

12
Overhead and Performance
• N1/2 a convenient way to look at performance whenever
• T = overhead + cn
• In the Amdahl’s law case, the overhead is the serial (non-parallelizable) fraction, and
the number of processors is n
• In vectorization, n is the length of the vector and the overhead is any cost of starting up
a vector calculation
• Including checks on pointer aliasing, pipeline startup, alignment checks

5 ParConcepts
Document68 pages
5 ParConcepts
R
No ratings yet
Intro To Communication: - Advantages
Document13 pages
Intro To Communication: - Advantages
Xin Zhao
No ratings yet
Analytical Modeling of Parallel Systems: Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Document67 pages
Analytical Modeling of Parallel Systems: Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
zoravar
No ratings yet
Week 7
Document27 pages
Week 7
hussmalik69
No ratings yet
OOAD
Document67 pages
OOAD
Bobby Jasuja
No ratings yet
Week 7
Document27 pages
Week 7
Kamran Anwar
No ratings yet
Parallel Programming: Sathish S. Vadhiyar Course Web Page
Document36 pages
Parallel Programming: Sathish S. Vadhiyar Course Web Page
Anna Poorani
No ratings yet
Karp
Document5 pages
Karp
Magali Arellano
No ratings yet
Principles of Scalable Performance
Document61 pages
Principles of Scalable Performance
Parth Shah
No ratings yet
Module 4
Document12 pages
Module 4
Bijay Nag
No ratings yet
CS Chap7 Multicores Multiprocessors Clusters
Document65 pages
CS Chap7 Multicores Multiprocessors Clusters
Đỗ Trị
No ratings yet
Parallel Computing Chapter 7 Performance and Scalability: Jun Zhang Department of Computer Science University of Kentucky
Document26 pages
Parallel Computing Chapter 7 Performance and Scalability: Jun Zhang Department of Computer Science University of Kentucky
Widya Meiriska
No ratings yet
Unit 4
Document64 pages
Unit 4
rishisharma4201
No ratings yet
Amdahl's Law: S (N) T (1) /T (N)
Document46 pages
Amdahl's Law: S (N) T (1) /T (N)
thilaga
No ratings yet
Analytical Modeling of Parallel Systems: Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Document36 pages
Analytical Modeling of Parallel Systems: Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Priya Patel
No ratings yet
L11 Granularity
Document36 pages
L11 Granularity
Mohamadi Y
No ratings yet
Lecture 4 Analytical Modeling of Parallel Programs
Document11 pages
Lecture 4 Analytical Modeling of Parallel Programs
rahul
No ratings yet
Pc98 Lect5 Part1 Speedup
Document36 pages
Pc98 Lect5 Part1 Speedup
Debopriyo Banerjee
No ratings yet
Complexity Analysis Time-Space Trade-Off
Document51 pages
Complexity Analysis Time-Space Trade-Off
Jatin Singh
No ratings yet
Ece453 Lec 03 Handout
Document10 pages
Ece453 Lec 03 Handout
suresh s
No ratings yet
Week 2-3 - Analysis of Algorithms
Document7 pages
Week 2-3 - Analysis of Algorithms
Xuân Dương Vương
No ratings yet
Parallel Computing
Document30 pages
Parallel Computing
Martina Janeva
No ratings yet
Parallel & Distributed Computing:: Spring-2020 Lec#1
Document19 pages
Parallel & Distributed Computing:: Spring-2020 Lec#1
Hamzah Akhtar
No ratings yet
Complexity of Algorithms 1
Document27 pages
Complexity of Algorithms 1
gra
No ratings yet
CSE703 Module1
Document245 pages
CSE703 Module1
suryanshmishra425
No ratings yet
2014fa CS61C L31 DG PipelineII 6up
Document4 pages
2014fa CS61C L31 DG PipelineII 6up
hpss77
No ratings yet
Appendix A
Document93 pages
Appendix A
zeewox
No ratings yet
Lecture Week - 3 Amdahl Law 1
Document19 pages
Lecture Week - 3 Amdahl Law 1
imhafeezniazi
No ratings yet
Performance Analysis: PE PE
Document10 pages
Performance Analysis: PE PE
ManojSudarshan
No ratings yet
Design Ana of Algo Handbk
Document21 pages
Design Ana of Algo Handbk
Eboka Chukwuka
No ratings yet
12 MPIProgramPerformance
Document33 pages
12 MPIProgramPerformance
Dr. V. Padmavathi Associate Professor
No ratings yet
Advanced Computer Architecture
Document18 pages
Advanced Computer Architecture
johnleons
No ratings yet
Software Engineering
Document51 pages
Software Engineering
Hamza Khan
No ratings yet
Unit 1
Document44 pages
Unit 1
Rahil Khan Khan
No ratings yet
M116C 1 M116C 1 Lect02-Performance
Document23 pages
M116C 1 M116C 1 Lect02-Performance
tinhtrilac
No ratings yet
Performance Metrics For Parallel Programs: 8 March 2010
Document44 pages
Performance Metrics For Parallel Programs: 8 March 2010
tt_aljobory3911
No ratings yet
Ram, Pram, and Logp Models
Document72 pages
Ram, Pram, and Logp Models
Ruby Mehta Malhotra
No ratings yet
PC 2
Document44 pages
PC 2
ims
No ratings yet
Operating Systems: Scheduling
Document34 pages
Operating Systems: Scheduling
GES ISLAMIA PATTOKI
No ratings yet
Pipe Lining
Document66 pages
Pipe Lining
sir.erlan
No ratings yet
Complexity of An Algorithm Lect-2.pps
Document21 pages
Complexity of An Algorithm Lect-2.pps
Tanmay Baranwal
No ratings yet
Parallel Programming: in C With Mpi and Openmp Michael J. Quinn
Document73 pages
Parallel Programming: in C With Mpi and Openmp Michael J. Quinn
Rohan Patil
No ratings yet
CSCE626 Amato LN PerformanceAnalysisMethodology
Document19 pages
CSCE626 Amato LN PerformanceAnalysisMethodology
Raj Kumar Yadav
No ratings yet
Lecture 09
Document25 pages
Lecture 09
Syed Badshah
No ratings yet
Parallel Computation Models: Slide 1
Document28 pages
Parallel Computation Models: Slide 1
jyoti
No ratings yet
Unit 1 Algorithm Performance Analysis and Measurement
Document61 pages
Unit 1 Algorithm Performance Analysis and Measurement
None Other
No ratings yet
Uni1-2 Pipelining
Document12 pages
Uni1-2 Pipelining
Aditi Verma
No ratings yet
WINSEM2018-19 - CSE2003 - ETH - SJT311 - VL2018195002472 - Reference Material I - Overviewofalgorithm
Document12 pages
WINSEM2018-19 - CSE2003 - ETH - SJT311 - VL2018195002472 - Reference Material I - Overviewofalgorithm
kumarkl
No ratings yet
Apple T Notes
Document29 pages
Apple T Notes
8527479210
No ratings yet
Lec 01
Document2 pages
Lec 01
SNaveenMathew
No ratings yet
Num Tech
Document39 pages
Num Tech
andres python
No ratings yet
Lesson 2
Document33 pages
Lesson 2
Jhesson James Manuel
100% (1)
Instruction Set Architecture and Principles
Document42 pages
Instruction Set Architecture and Principles
Amna
No ratings yet
Algorithem Chapter One
Document43 pages
Algorithem Chapter One
Oz G
No ratings yet
Deeplearning Ai
Document57 pages
Deeplearning Ai
Jian Quan
No ratings yet
Two Forms of Pipelining: - E.g., Floating Point Operations
Document36 pages
Two Forms of Pipelining: - E.g., Floating Point Operations
slogeshwari
No ratings yet
AlgorithmChapter I
Document61 pages
AlgorithmChapter I
brooklightme
No ratings yet
Chapter One: Al-Khowarizmi, Is Defined As Follows: Roughly Speaking
Document20 pages
Chapter One: Al-Khowarizmi, Is Defined As Follows: Roughly Speaking
Yaffa
No ratings yet
Par Seq Algorithms
Document44 pages
Par Seq Algorithms
RAHUL KUMAR
No ratings yet
Digital Signal Processing: A Practical Guide for Engineers and Scientists
From Everand
Digital Signal Processing: A Practical Guide for Engineers and Scientists
Steven Smith
Rating: 4.5 out of 5 stars
4.5/5 (7)
Wansview IP Camera Manual
Document54 pages
Wansview IP Camera Manual
Maria Jose Parea
No ratings yet
UCLA - What Statistical Analysis Should I Use - SPSS
Document54 pages
UCLA - What Statistical Analysis Should I Use - SPSS
mllabate
No ratings yet
Quality
Document102 pages
Quality
Nathan Bukoski
No ratings yet
Brother DCP8070D Error Codes
Document2 pages
Brother DCP8070D Error Codes
Rafał Krzysztof Kowalski
No ratings yet
Config Cisco 881 Isla Partida
Document4 pages
Config Cisco 881 Isla Partida
Johan Frederick Foitzick
No ratings yet
AOC LE32H158I Service Manual
Document165 pages
AOC LE32H158I Service Manual
vmalvica
No ratings yet
Image Compression - SPECK 070711
Document37 pages
Image Compression - SPECK 070711
Md Rizwan Ahmad
No ratings yet
TQM Clause 45
Document4 pages
TQM Clause 45
MychaWong
No ratings yet
Demski2016 Open Data Models For Smart Health Interconnected Applications The Example of OpenEHR
Document9 pages
Demski2016 Open Data Models For Smart Health Interconnected Applications The Example of OpenEHR
Cristina Cazacu
No ratings yet
CDM Series Det SVC Man
Document109 pages
CDM Series Det SVC Man
RaulJoseOleagaFlores
No ratings yet
Resume 1
Document2 pages
Resume 1
Quick Learning
No ratings yet
Finall Edited Project - Hotel Management System
Document66 pages
Finall Edited Project - Hotel Management System
gere
No ratings yet
Yamaha NU1 MIDI Control Codes
Document6 pages
Yamaha NU1 MIDI Control Codes
isotherm
No ratings yet
MiniMate Manual
Document58 pages
MiniMate Manual
alexandly
100% (1)
Model Course 7.01 Master and Chief Mate
Document439 pages
Model Course 7.01 Master and Chief Mate
kemime
100% (1)
Operator Precedence
Document2 pages
Operator Precedence
Jaishree Chakare
No ratings yet
MAPEH 6 - Q1 Exam - 2019 - With Ans Key - BY JULCAIPS
Document4 pages
MAPEH 6 - Q1 Exam - 2019 - With Ans Key - BY JULCAIPS
gay basquinas
No ratings yet
Granqvist Svec PEVoC6 Microphones
Document14 pages
Granqvist Svec PEVoC6 Microphones
Beverly Paman
No ratings yet
VB Script
Document15 pages
VB Script
Neelima Madala
No ratings yet
Expert Systems Principles and Programming, Fourth Edition
Document35 pages
Expert Systems Principles and Programming, Fourth Edition
John Edokawabata
100% (3)
Data Checking PDF
Document9 pages
Data Checking PDF
denzelfawn
No ratings yet
Esim Programmer
Document198 pages
Esim Programmer
Weslley Moura
No ratings yet
Comprehensive Exam
Document4 pages
Comprehensive Exam
Monit Agrawal
No ratings yet
User Manual: Integrated Stall Rental E-Government Management System (ISRMS)
Document20 pages
User Manual: Integrated Stall Rental E-Government Management System (ISRMS)
Celesamae Tangub Vicente
No ratings yet
Mendocino College 2013 Spring Schedule
Document54 pages
Mendocino College 2013 Spring Schedule
LakeCoNews
No ratings yet
Parallelized Deep Neural Networks
Document34 pages
Parallelized Deep Neural Networks
Lance Legel
No ratings yet
Dreambox 500 For Newbies 6.1
Document140 pages
Dreambox 500 For Newbies 6.1
Jay McGovern
No ratings yet
Skimming and Scanning
Document17 pages
Skimming and Scanning
Desi Natalia
No ratings yet
Lesson 17-Optimization Problems (Maxima and Minima Problems)
Document18 pages
Lesson 17-Optimization Problems (Maxima and Minima Problems)
Jhonnel Capule
No ratings yet