• Element to block
– Better surface to volume
• Task merging
• Task reduction
– Reduces communication
• Dependencies?
• Speedup?
• Computing each element of y can be done independently
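A minimal sketch of this independence, assuming y = A x with A stored as a list of rows. The names `matvec_row` and `parallel_matvec` are illustrative and not from the slides. Each y[i] is the dot product of row i of A with x, so every row can be handed to a separate worker:

```python
from concurrent.futures import ThreadPoolExecutor

def matvec_row(row, x):
    """Compute one element of y: the dot product of one row of A with x."""
    return sum(a * b for a, b in zip(row, x))

def parallel_matvec(A, x, workers=4):
    """One task per row; no task reads another task's output."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the rows in the result
        return list(pool.map(lambda row: matvec_row(row, x), A))

A = [[1, 2], [3, 4], [5, 6]]
x = [10, 1]
y = parallel_matvec(A, x)
```

Threads are used here only to show the task structure. In standard CPython a process pool or a native numerical library would be needed to see a real speedup on CPU-bound work.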
Monday, April 17, 2023 BE_Comp_HPC_Algorithm Design... KNH 17
Matrix-Vector Multiplication (Limited Tasks)
• A x B = C
• A[i,:] · B[:,j] = C[i,j]
• Row partitioning
– N tasks
• Block partitioning
– N*N/B tasks
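The two partitionings can be sketched as follows. This is an illustration under assumptions: B in N*N/B is read as the number of elements of C per block, blocks are square b-by-b tiles with b dividing N, and the helper names are hypothetical.

```python
def row_tasks(n):
    """Row partitioning: one task per row of C, so N tasks."""
    return [[(i, j) for j in range(n)] for i in range(n)]

def block_tasks(n, b):
    """Block partitioning: C is cut into b-by-b tiles (assumes b divides n)."""
    return [[(i, j) for i in range(bi, bi + b) for j in range(bj, bj + b)]
            for bi in range(0, n, b) for bj in range(0, n, b)]

def run_task(A, B, C, cells):
    """Each task fills only its own cells: C[i][j] = A[i,:] . B[:,j]."""
    for i, j in cells:
        C[i][j] = sum(A[i][k] * B[k][j] for k in range(len(B)))

n = 4
A = [[i + j for j in range(n)] for i in range(n)]
B = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity matrix
C = [[0] * n for _ in range(n)]

tasks = block_tasks(n, 2)      # 2x2 tiles: 4 elements per block
for cells in tasks:            # tasks are independent, so any order works
    run_task(A, B, C, cells)
```

With N = 4 and 4 elements per block, this gives 16/4 = 4 block tasks, the same count as row partitioning, but each block task reads fewer distinct rows and columns than a row task does.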
[Figure: sequential vs. 4-way parallel]
Pipeline Performance
• N data and T tasks
• Each task takes unit time t
• Sequential time = N*T*t
• Parallel pipeline time = fill + steady state = T*t + (N-1)*t
= (N+T-1)*t = O(N) (for N>>T), so speedup approaches T
• Try to find a lot of data to pipeline
• Try to divide the computation into many pipeline tasks
– More tasks to do (longer pipelines)
– Shorter tasks to do
• Pipeline computation is a special form of
producer-consumer parallelism
– Producer tasks output data that consumer tasks take as input
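A small simulation can make the pipeline timing concrete. It is a sketch under stated assumptions: each of the T stages handles one item at a time, every stage takes the same time t, and items pass through the stages in order. The function names are illustrative.

```python
def sequential_time(n, stages, t=1):
    """Every item goes through every stage, one after another."""
    return n * stages * t

def pipeline_time(n, stages, t=1):
    """Item i may enter stage s once item i has left stage s-1
    and item i-1 has left stage s."""
    finish = [[0] * stages for _ in range(n)]
    for i in range(n):
        for s in range(stages):
            item_ready = finish[i][s - 1] if s > 0 else 0
            stage_free = finish[i - 1][s] if i > 0 else 0
            finish[i][s] = max(item_ready, stage_free) + t
    return finish[-1][-1]

n, stages = 100, 4
seq = sequential_time(n, stages)
par = pipeline_time(n, stages)
speedup = seq / par
```

Under these assumptions the simulated time equals (N + T - 1) * t, so the speedup stays below T and gets closer to it as N grows, which is why long data streams and many short stages both help.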
Task Graphs
• The computation in any parallel algorithm can be viewed as a task
dependency graph
• Task dependency graphs can be non-trivial
– Pipeline: Task 1 → Task 2 → Task 3 → Task 4
[Figure: task dependency graph; numbers are the time taken to perform each task]
Li Li, “Model-based Automatic Performance Diagnosis of Parallel Computations,” Ph.D. thesis, 2007.
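To illustrate how task times on a dependency graph bound parallel performance, here is a sketch on a hypothetical four-task graph. The task names, times, and dependencies are invented for the example and are not taken from the figure. The longest chain of dependent tasks (the critical path) is a lower bound on parallel time, however many processors are available.

```python
from functools import lru_cache

# Hypothetical task graph: task -> (time, tasks it depends on)
tasks = {
    "a": (10, []),
    "b": (6, ["a"]),
    "c": (11, ["a"]),
    "d": (7, ["b", "c"]),
}

@lru_cache(maxsize=None)
def earliest_finish(name):
    """A task can start only after all of its dependencies have finished."""
    time, deps = tasks[name]
    return time + max((earliest_finish(d) for d in deps), default=0)

total_work = sum(time for time, _ in tasks.values())      # sequential time
critical_path = max(earliest_finish(n) for n in tasks)    # best parallel time
max_speedup = total_work / critical_path
```

Here b and c can run concurrently after a, so the critical path is a, c, d, and the achievable speedup is limited to total work divided by that path length.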
• G = (V, E)
– V: finite set of points called vertices
– E: finite set of edges
– e ∈ E is a pair (u,v), where u,v ∈ V
– Unordered pairs give an undirected graph; ordered pairs give a directed graph
• Adjacency list
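As a minimal sketch of the adjacency-list representation, assuming the graph is given as a list of (u, v) edge pairs: each vertex maps to the list of its neighbors. The function name and the `directed` flag are illustrative choices.

```python
def adjacency_list(edges, directed=False):
    """Build {vertex: [neighbors]} from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])          # make sure v is present even with no out-edges
        if not directed:
            adj[v].append(u)           # an unordered pair is stored in both directions
    return adj

edges = [(1, 2), (1, 3), (2, 3)]
undirected = adjacency_list(edges)
directed = adjacency_list(edges, directed=True)
```

The storage is proportional to |V| + |E|, which suits sparse graphs better than an adjacency matrix with |V| * |V| entries.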
[Figure: flowchart with steps "Generate probability distributions" and "Check if more documents"]
Parallelization Approach
• Map: partition the data collection into subsets of
documents and process each subset in parallel
• Reduce: assemble the partial frequency tables to
derive final probability distribution
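The two steps above can be sketched as follows, assuming documents are plain strings split on whitespace. The function names and the way the collection is divided into subsets are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def count_words(documents):
    """Map step: build a partial frequency table for one subset of documents."""
    table = Counter()
    for doc in documents:
        table.update(doc.lower().split())
    return table

def word_distribution(collection, n_subsets=2):
    # Partition the collection into subsets (round-robin, an arbitrary choice)
    subsets = [collection[i::n_subsets] for i in range(n_subsets)]
    with ThreadPoolExecutor(max_workers=n_subsets) as pool:
        partial_tables = list(pool.map(count_words, subsets))
    # Reduce step: merge the partial tables, then normalize to probabilities
    totals = reduce(lambda a, b: a + b, partial_tables, Counter())
    n_words = sum(totals.values())
    return {word: count / n_words for word, count in totals.items()}

docs = ["web weed green", "sun moon land part", "web green sun"]
dist = word_distribution(docs)
```

Merging works in any order because adding frequency tables is associative and commutative, so the partial tables never need to be coordinated while they are being built.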
Parallel algorithm
[Figure: the data collection feeds the steps "Get next document", "Find and count words", and "Count words and update statistics", producing per-word count tables for web, weed, green, sun, moon, land, part]
Architectural issues
Flynn’s taxonomy (SIMD, MIMD, etc.), network
topology, bisection bandwidth, cache coherence, …
Different programming constructs
Mutexes, condition variables, barriers, …
masters/slaves, producers/consumers, work queues, …
Common problems
Livelock, deadlock, data starvation, priority inversion,
… dining philosophers, sleeping barbers, cigarette
smokers, …
[Figure: the data collection is split (split 1, split 2, …, split n) to supply multiple processors; each Map emits KEY/VALUE pairs such as (web, 1), (weed, 1), (green, 1), (sun, 1), (moon, 1), (land, 1), (part, 1)]
MapReduce
MAP: Input data → <key, value> pair
REDUCE: <key, value> pair → <result>
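The MAP and REDUCE signatures can be sketched as a small single-process driver. This is an illustration, not the API of any MapReduce framework: `map_reduce`, `emit_words`, and `sum_values` are hypothetical names, and the grouping of values by key between the two phases is made explicit.

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    """Run mapper on every input, group the emitted values by key,
    then run reducer once per key."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):       # MAP: input data -> <key, value> pairs
            groups[key].append(value)
    # REDUCE: <key, list of values> -> result
    return {key: reducer(key, values) for key, values in groups.items()}

def emit_words(split):
    """Mapper: emit (word, 1) for every word in one split of the data."""
    return [(word, 1) for word in split.split()]

def sum_values(key, values):
    """Reducer: add up all the values emitted for one key."""
    return sum(values)

splits = ["web weed green", "sun moon land part", "web green sun"]
result = map_reduce(splits, emit_words, sum_values)
```

Because each mapper call sees only its own split and each reducer call sees only one key, both phases can be spread across processors without shared state.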
[Figure: split the data to supply multiple processors; each split of the data collection (split 1, split 2, …, split n) goes to a Map task, and Map output feeds Reduce tasks]