OS Scalability
Multiprocessor Scheduling
[Figure: throughput vs. number of processors (0–24)]
[Diagram: service interface — "Add brick"]
[Figure: counter throughput — a single shared variable vs. an array of per-CPU counters]
• Solutions?
• Use padding so each per-CPU variable lies on a different cache line
[Figure: padded per-CPU counter throughput vs. CPU count (1–23)]
[Diagram: per-CPU scheduling — each CPU has its own runnable queue: CPU 0 → Queue 0, CPU 1 → Queue 1, …, CPU N → Queue N]
[Diagram: work stealing — an idle CPU steals runnable work from another CPU's queue]
[Diagram: gang scheduling — scheduler queue of jobs with (CPUs, time) demands: E(1,3), D(2,2), C(4,2), B(2,2), A(1,4)]
• Alternative 2: Only use gang scheduling for thread groups that benefit from it
• Fill holes in the schedule with any single runnable thread from those remaining
[Figure: results vs. number of CPUs (1–11); from Morris A. Jette, in Job Scheduling Strategies for Parallel Processing, LNCS, Volume 1459, 1998]
[Figure: model vs. measured runs (Sep-21-02, Nov-25-02) against # PEs (0–4096); lower is better. OS activity is the culprit!]
• But if an event happens on a single node, it can affect all the other nodes
• The probability of a random event occurring increases with the node count
• We can tolerate the noise by coscheduling the activities of the
system software on each node
• Graphic from
https://computing.llnl.gov/linux/slurm/slurm.sc08.bof.pdf