You are on page 1of 26

Load Balancing Guardrails

Keeping Your Heavy Traffic on the Road to Low Response


Times
Isaac Grosof (CMU)
Ziv Scully (CMU)
Mor Harchol-Balter (CMU)

1
Goal: Optimal Load Balancing
Assumptions:
Q1: How to Stochastic Arrivals
Dispatcher dispatch? Known Sizes
Preempt-Resume
Objective:
Minimize
Q2: How to
SRPT

SRPT
SRPT

mean
response schedule?
time E[T]
Theorem:
SRPT is optimal

Servers
2
Prior Work on Dispatching
SRPT: Very little prior work FCFS: Tons of prior work

Dispatcher Dispatcher
SRPT

SRPT
SRPT

FCFS

FCFS
FCFS
3
Prior Work on Dispatching - FCFS
Join-Shortest-Queue (JSQ): Winston, Weber, FCFS: Tons of prior work
Whitt, Lin, Raghavendra, Foley, McDonald, Bramson, Lu,
Prabhakar, Eschenfeldt, Gamarnik, …

Join-Shortest-of-d-Queues (JSQ-d): Dispatcher


Vvedenskaya, Dobrushin, Karpelevich, Mitzenmacher,
Bramson, Ying, Srikant, Kang, Muckherjee, Borst,
Leeuqaarden, ...

Least-Work-Left (LWL): Lee, Longton, Kingman,

FCFS

FCFS
FCFS
Takahashi, Daly, Tijms, Van Hoorn, Ma, Mark, Breur, Hokstad,
Kimura, Gupta, Harchol-Balter, Dai, Zwart, Osogami, Whitt,

Size-Interval-Task-Assignment (SITA):
Harchol-Balter, Crovella, Murta, Bachmat, Sarfati, Vesilo,
Scheller-Wolf, …
4
Prior Work on Dispatching - SRPT
SRPT: Very little prior work Random dispatch: Trivial
First Policy Iteration (FPI) Heuristic
[Hyytiä & Aalto ‘12]
Dispatcher
Multilayered Round Robin
SRPT [Down & Wu ‘06]

SRPT
SRPT

5
?
Good for FCFS ⇒ Good for SRPT
Random SITA-E LWL Dispatcher
200

150

SRPT

SRPT
SRPT
Mean
response
time E[T] 100
for SRPT
servers 50 Distribution:
Bounded Pareto
[1, 106], α=1.5.
0
0.7 0.8 0.9 1 10 servers
Load (ρ)
6
Good for FCFS ⇏ Good for SRPT
Random SITA-E LWL Dispatcher
200

150

SRPT

SRPT
SRPT
Mean
response
time E[T] 100
for SRPT
servers 50 Distribution:
Bounded Pareto
[1, 106], α=1.5.
0
0.7 0.8 0.9 1 10 servers
Load (ρ)
7
Our Contribution: Guardrails
Disp. Policy P Disp. Policy P

SRPT
+ Guardrails

→ ≈
SRPT

SRPT
SRPT

SRPT

SRPT
SRPT
Possibly
Guaranteed
very
heavy traffic
bad
optimal 8
Our Contribution: Guardrails
Disp. Policy P Disp. Policy P

SRPT
+ Guardrails

→ ≈
SRPT

SRPT
SRPT

SRPT

SRPT
SRPT
k
1 1 1

Possibly
Guaranteed
very
heavy traffic
bad
optimal 9
Good for FCFS ⇏ Good for SRPT
Random SITA-E LWL Dispatcher
200
G-Random G-SITA-E G-LWL

150

SRPT

SRPT
SRPT
Mean
response
time E[T] 100
for SRPT
servers 50 Distribution:
Bounded Pareto
[1, 106], α=1.5.
0
0.7 0.8 0.9 1 10 servers
Load (ρ)
10
Dispatching to SRPT Servers

Dispatcher

SRPT

SRPT

SRPT

SRPT
11
Dispatching to SRPT Servers

SRPT

SRPT 12
Dispatching to SRPT Servers

SRPT

SRPT
Leads to
bad E[T]
A small job
needs me!

13
Problem: Small Job Imbalance
Balanced small jobs

More small More small


jobs left jobs right
Dispatcher

SRPT

SRPT
14
Problem: Small Job Imbalance
Balanced small jobs

More small More small


jobs left jobs right
Dispatcher

SRPT

SRPT
A small job
needs me!
15
Guardrails
Balanced small jobs

More small More small


jobs left jobs right
Dispatcher

SRPT

SRPT
16
Guardrails
Balanced small jobs

More small More small


jobs left jobs right
Dispatcher

SRPT

SRPT
17
Guardrails
Balanced small jobs

More small More small


jobs left jobs right
Dispatcher

SRPT

SRPT
18
Guardrails
Balanced small jobs

More small More small


jobs left jobs right
Dispatcher

SRPT

SRPT
19
Guardrails

To Do: Dispatcher
• >2 job sizes?
• >2 servers?

SRPT

SRPT
20
Guardrails

>2 job sizes? Dispatcher

Prob.

SRPT

SRPT
Size

21
Guardrails: Bucketing

Dispatcher

SRPT

SRPT
22
Guardrails: Bucketing
[1, 10] [10, 100] [100, 1000]

… …
SRPT

SRPT
23
Guardrails: Bucketing
[1, 10] [10, 100] [100, 1000]

… …
SRPT

SRPT

SRPT
SRPT
24
Precise Dispatching Requirement
•Job
  of size has rank ↔ *
Volume of rank work dispatched to server by time .
Guardrail requirement:
ranks servers times
 

 * is chosen as a function of load .

25
The Guardrail Theorem
  𝐸 [Resp .  Time  of   Disp .  Policy   P   with   Guardrails]  
lim ∀ 𝑃,
=1
𝜌 →1 𝐸 [Resp.  Time  of  Single   SRPT   Superserver ]

Disp. Policy P Disp. Policy P +

SRPT
Guardrails



SRPT

SRPT
SRPT

SRPT

SRPT
SRPT
w.r.t. k
1 1 1 E[T]
Possibly
Guaranteed heavy
Very bad
traffic optimal 26

You might also like