
Congestion

Control

1
Principles of Congestion Control

Congestion:
❒ informally: “too many sources sending too much
data too fast for network to handle”
❒ manifestations:
❍ lost packets (buffer overflow at routers)
❍ long delays (queuing in router buffers)
❒ a highly important problem!

2
Causes/costs of congestion: scenario 1

❒ two senders, two receivers


❒ one router,
❒ infinite buffers
❒ no retransmission

3
Causes/costs of congestion: scenario 1

❒ Throughput increases with load


❒ Maximum total load C (Each session C/2)
❒ Large delays when congested
❍ The load is stochastic

4
Causes/costs of congestion: scenario 2
❒ one router, finite buffers
❒ sender retransmission of lost packet

5
Causes/costs of congestion: scenario 2
❒ always: λin = λout (goodput)
❍ Like to maximize goodput!

❒ “perfect” retransmission:
❍ retransmit only when loss: λ'in > λout

❒ retransmission of delayed (not lost) packet
makes λ'in larger (than perfect case) for same λout

6
Causes/costs of congestion: scenario 2

[figure: goodput λout versus offered load λ'in for the three retransmission cases]

“costs” of congestion:
❒ more work (retrans) for given “goodput”
❒ unneeded retransmissions: link carries multiple copies of
pkt

7
Causes/costs of congestion: scenario 3
❒ four senders
❒ multihop paths
Q: what happens as λin
and λ'in (due to timeout/retransmit) increase?

8
Causes/costs of congestion: scenario 3

Another “cost” of congestion:


❒ when packet dropped, any “upstream” transmission capacity used
for that packet was wasted!

9
Approaches towards congestion control
Two broad approaches towards congestion control:

End-end congestion control:
❒ no explicit feedback from network to end systems
❒ congestion inferred from end-system observed loss, delay
❒ approach taken by TCP

Network-assisted congestion control:
❒ routers provide feedback
❍ single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
❍ explicit rate sender should send at
10
Goals of congestion control
❒ Throughput:
❍ Maximize goodput
❍ the total number of bits end-end

❒ Fairness:
❍ Give different sessions “equal” share.
❍ Max-min fairness
• Maximize the rate of the minimum-rate session.
❍ Single link:
• Capacity R
• sessions m
• Each sessions: R/m
11
Max-min fairness
❒ Model: Graph G(V,e) and sessions s1 … sm
❒ For each session si a rate ri is selected.
❒ The rates are a Max-Min fair allocation:
❍ The allocation is maximal
• No ri can be simply increased
❍ Increasing allocation ri requires reducing rj
• for some session j with rj ≤ ri
❒ Maximize the rate of the minimum-rate session.

12
Max-min fairness: Algorithm
❒ Model: Graph G(V,e) and sessions s1 … sm
❒ Algorithmic view:
❍ For each link compute its fair share f(e).
• Capacity / # sessions
❍ Select the minimal fair share link.
❍ Allocate f(e) to each session passing on it.
❍ Subtract the allocated capacities and delete the satisfied sessions.
❍ Continue recursively.

❒ Fluid view.
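The link-by-link algorithm above amounts to water-filling, which can be sketched in Python. The link names, capacities, and session routes below are made-up inputs for illustration.

```python
# Water-filling computation of a max-min fair allocation.
def max_min_fair(capacity, sessions):
    """capacity: link -> capacity; sessions: session -> list of links used."""
    rates = {}
    cap = dict(capacity)                          # remaining capacity per link
    active = {s: set(links) for s, links in sessions.items()}
    while active:
        # fair share of each link = remaining capacity / # active sessions on it
        share = {}
        for e in cap:
            n = sum(e in links for links in active.values())
            if n:
                share[e] = cap[e] / n
        bottleneck = min(share, key=share.get)    # minimal fair share link
        f = share[bottleneck]
        for s in [s for s, ls in active.items() if bottleneck in ls]:
            rates[s] = f                          # allocate f(e) to each session on it
            for e in active[s]:
                cap[e] -= f                       # subtract the capacities
            del active[s]                         # delete the satisfied session
    return rates

# s1 crosses both unit-capacity links; s2 and s3 use one link each.
print(max_min_fair({"A": 1.0, "B": 1.0},
                   {"s1": ["A", "B"], "s2": ["A"], "s3": ["B"]}))
```

Each iteration freezes the sessions crossing the current bottleneck link, exactly as in the algorithmic view above.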

13
Max-min fairness

❒ Example

❒ Throughput versus fairness.

14
Case study: ATM ABR congestion control

ABR: available bit rate:


❒ “elastic service”
❒ if sender’s path “underloaded”:
❍ sender can use available bandwidth
❒ if sender’s path congested:
❍ sender lowers rate
❍ a minimum guaranteed rate

❒ Aim:
❍ coordinate increase/decrease rate
❍ avoid loss!

15
Case study: ATM ABR congestion control

RM (resource management) cells:


❒ sent by sender, in between data cells
❍ one out of every 32 cells.

❒ RM cells returned to sender by receiver


❒ Each router modifies the RM cell
❒ Info in RM cell set by switches
❍ “network-assisted”
❒ 2 bit info.
❍ NI bit: no increase in rate (mild congestion)
❍ CI bit: congestion indication (lower rate)

16
Case study: ATM ABR congestion control

❒ two-byte ER (explicit rate) field in RM cell


❍ congested switch may lower ER value in cell
❍ sender’s send rate is thus the minimum supportable rate on the path

❒ EFCI bit in data cells: set to 1 in congested switch


❍ if data cell preceding RM cell has EFCI set, sender sets CI
bit in returned RM cell

17
Case study: ATM ABR congestion control
❒ How does the router select its action:
❍ selects a rate

❍ Set congestion bits


❍ Vendor dependent functionality

❒ Advantages:
❍ fast response
❍ accurate response

❒ Disadvantages:
❍ network level design
❍ Increase router tasks (load).
❍ Interoperability issues.
18
End to end
control

19
End to end feedback
❒ Abstraction:
❍ Alarm flag.
❍ observable at the end stations

20
Simple Abstraction

21
Simple Abstraction

22
Simple feedback model
❒ Every RTT receive feedback
❍ High congestion: decrease rate
❍ Low congestion: increase rate

❒ A rate variable controls the sending rate.

23
Multiplicative Update
❒ Congestion:
❍ Rate = Rate / 2

❒ No Congestion:
❍ Rate = Rate * 2

❒ Performance
❍ Fast response
❍ Unfair:
Ratios remain unchanged

24
Additive Update
❒ Congestion:
❍ Rate = Rate - 1

❒ No Congestion:
❍ Rate = Rate + 1

❒ Performance
❍ Slow response

❒ Fairness:
❍ Divides spare BW equally
❍ Difference remains unchanged

25
AIMD Scheme

❒ Additive Increase
❍ Fairness: ratio improves

❒ Multiplicative Decrease
❍ Fairness: ratio unchanged
❍ Fast response to overflow

❒ Performance:
❍ Fast response to congestion
❍ Fairness
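A toy simulation shows why AIMD converges to a fair share; the link capacity, starting rates, and step sizes below are arbitrary illustrative values, not from the slides.

```python
# Two flows sharing a link of capacity 100 (arbitrary units): when total
# load exceeds capacity both halve; otherwise both add 1 per round.
def aimd(r1, r2, capacity=100.0, steps=2000):
    for _ in range(steps):
        if r1 + r2 > capacity:
            r1, r2 = r1 / 2, r2 / 2    # halving also halves the rate difference
        else:
            r1, r2 = r1 + 1, r2 + 1    # additive step leaves the difference unchanged
    return r1, r2

r1, r2 = aimd(5.0, 80.0)
print(round(r1, 6), round(r2, 6))      # the two rates end up (almost) equal
```

Every multiplicative decrease halves the gap between the flows while additive increase preserves it, so the rates drift toward the fairness line.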
26
AIMD: Two users, One link

[figure: Rate of User 1 vs. Rate of User 2, showing the BW limit and fairness lines]
27
TCP & AIMD: congestion
❒ Dynamic window size [Van Jacobson]
❍ Initialization: Slow start
❍ Steady state: AIMD

❒ Congestion = timeout
❍ TCP Tahoe

❒ Congestion = timeout || 3 duplicate ACKs
❍ TCP Reno & TCP New Reno

❒ Congestion = higher latency


❍ TCP Vegas

28
TCP Congestion Control
❒ end-end control (no network assistance)
❒ transmission rate limited by congestion window size, Congwin, over segments

❒ w segments, each with MSS bytes, sent in one RTT:

throughput = (w * MSS) / RTT  Bytes/sec

29
TCP congestion control:
❒ “probing” for usable bandwidth:
❍ ideally: transmit as fast as possible (Congwin as large as possible) without loss
❍ increase Congwin until loss (congestion)
❍ loss: decrease Congwin, then begin probing (increasing) again

❒ two “phases”
❍ slow start
❍ congestion avoidance

❒ important variables:
❍ Congwin
❍ threshold: defines threshold between the slow start phase and the congestion control phase
30
TCP Slowstart
Slowstart algorithm

initialize: Congwin = 1
for (each segment ACKed)
Congwin++
until (congestion event OR CongWin > threshold)

❒ exponential increase (per RTT) in window size (not so slow!)

[figure: Host A sends one, two, then four segments in successive RTTs]

31
TCP Tahoe Congestion Avoidance
Congestion avoidance
/* slowstart is over */
/* Congwin > threshold */
Until (timeout) { /* loss event */
every ACK:
Congwin += 1/Congwin
}
threshold = Congwin/2
Congwin = 1
perform slowstart

TCP Tahoe
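The Tahoe rules above can be sketched as a small state machine. This is an illustrative Python sketch: the class name, the initial threshold of 64 segments, and the per-ACK granularity are assumptions, and timers themselves are not modelled.

```python
# Sketch of TCP Tahoe window evolution (units: segments).
class Tahoe:
    def __init__(self):
        self.cwnd = 1.0          # Congwin starts at 1 segment
        self.ssthresh = 64.0     # assumed initial threshold

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1                 # slow start: doubles per RTT
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance: +1 per RTT

    def on_loss(self):                     # timeout = congestion signal
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1.0                    # back to slow start

tcp = Tahoe()
for _ in range(6):
    tcp.on_ack()
tcp.on_loss()
print(tcp.cwnd, tcp.ssthresh)   # 1.0 3.5
```

After a loss the threshold remembers half the window at which congestion occurred, and slow start restarts from one segment.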
32
TCP Reno
❒ Fast retransmit:
❍ After receiving 3 duplicate ACK
❍ Resend first packet in window.
• Try to avoid waiting for timeout
❒ Fast recovery:
❍ After retransmission do not enter slowstart.
❍ Threshold = Congwin/2
❍ Congwin = 3 + Congwin/2
❍ Each duplicate ACK received Congwin++
❍ After new ACK
• Congwin = Threshold
• return to congestion avoidance
❒ Single packet drop: great!
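The fast retransmit / fast recovery rules above can likewise be sketched in Python. Event handling is simplified to dup-ACK and new-ACK callbacks, and the starting window is an illustrative value.

```python
# Sketch of TCP Reno fast retransmit / fast recovery, per the rules above.
class Reno:
    def __init__(self, cwnd=10.0, ssthresh=64.0):
        self.cwnd, self.ssthresh = cwnd, ssthresh
        self.dup_acks = 0
        self.in_recovery = False

    def on_dup_ack(self):
        self.dup_acks += 1
        if self.in_recovery:
            self.cwnd += 1                 # Congwin++ per extra duplicate ACK
        elif self.dup_acks == 3:           # fast retransmit: resend, no slow start
            self.ssthresh = self.cwnd / 2
            self.cwnd = 3 + self.cwnd / 2
            self.in_recovery = True

    def on_new_ack(self):
        self.dup_acks = 0
        if self.in_recovery:               # deflate, resume congestion avoidance
            self.cwnd = self.ssthresh
            self.in_recovery = False
        else:
            self.cwnd += 1.0 / self.cwnd

r = Reno(cwnd=10.0)
for _ in range(3):
    r.on_dup_ack()       # third duplicate ACK triggers fast retransmit
r.on_new_ack()
print(r.cwnd)            # 5.0  (= old Congwin / 2)
```

A single packet drop thus costs only a halving of the window instead of a full slow-start restart.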
33
TCP Vegas:
❒ Idea: track the RTT
❍ Try to avoid packet loss
❍ latency increases: lower rate
❍ latency very low: increase rate

❒ Implementation:
❍ sample_RTT: current RTT
❍ Base_RTT: min. over sample_RTT
❍ Expected = Congwin / Base_RTT
❍ Actual = number of packets sent / sample_RTT
❍ ∆ =Expected - Actual

34
TCP Vegas
❒ ∆ = Expected - Actual
❒ Congestion Avoidance:
❍ two parameters: α and β , α <β
❍ If (∆ < α ) Congwin = Congwin +1
❍ If (∆ > β ) Congwin = Congwin -1
❍ Otherwise no change
❍ Note: Once per RTT

❒ Slowstart
❍ parameter γ
❍ If (∆ > γ ) then move to congestion avoidance

❒ Timeout: same as TCP Tahoe
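The once-per-RTT Vegas update can be written directly from the definitions above; the values of α, β, and the RTT samples below are illustrative assumptions.

```python
# Sketch of the Vegas congestion-avoidance update (applied once per RTT).
def vegas_update(congwin, base_rtt, sample_rtt, acked, alpha=1.0, beta=3.0):
    expected = congwin / base_rtt          # rate with no queuing on the path
    actual = acked / sample_rtt            # rate actually achieved this RTT
    diff = expected - actual               # ∆ = Expected - Actual
    if diff < alpha:
        return congwin + 1                 # path underused: increase rate
    if diff > beta:
        return congwin - 1                 # queues building: decrease rate
    return congwin                         # otherwise no change

print(vegas_update(congwin=10, base_rtt=0.1, sample_rtt=0.1, acked=10))  # 11
print(vegas_update(congwin=10, base_rtt=0.1, sample_rtt=0.2, acked=10))  # 9
```

The second call shows the key idea: a rising sample RTT lowers the actual rate, so ∆ grows and the window shrinks before any packet is lost.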


35
TCP Dynamics: Rate
❒ TCP Reno with NO Fast Retransmit or Recovery
❒ Sending rate: Congwin*MSS / RTT
❒ Assume fixed RTT

[figure: Congwin sawtooth oscillating between W/2 and W]

❒ Actual Sending rate:
❍ between (1/2) W*MSS / RTT and W*MSS / RTT
❍ Average: (3/4) W*MSS / RTT
36
TCP Dynamics: Loss
❒ Loss rate (TCP Reno)
❍ No Fast Retransmit or Recovery

❒ Consider a cycle

[figure: Congwin sawtooth between W/2 and W]

❒ Total packets sent per cycle:
❍ about (3/8) W² = O(W²)
❍ One packet loss

❒ Loss Probability: p = O(1/W²), i.e. W = O(1/√p)
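The (3/8)W² count can be checked numerically by summing the sawtooth; W = 1000 below is an arbitrary illustrative window.

```python
# The window climbs from W/2 to W, one increment per RTT, before a loss:
# total packets in a cycle is sum_{w=W/2}^{W} w, which is about (3/8) W^2.
def packets_per_cycle(W):
    return sum(range(W // 2, W + 1))

W = 1000
print(packets_per_cycle(W) / W**2)   # close to 3/8 = 0.375
```

With one loss per cycle, the loss probability is the reciprocal of this count, giving p = O(1/W²).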

37
TCP latency modeling
Q: How long does it take to receive an object from a Web server after sending a request?
❒ TCP connection establishment
❒ data transfer delay

Notation, assumptions:
❒ Assume one link between client and server of rate R
❒ Assume: fixed congestion window, W segments
❒ S: MSS (bits)
❒ O: object size (bits)
❒ no retransmissions
❍ no loss, no corruption
38
TCP latency modeling

Optimal Setting: Time = O/R

Two cases to consider:


❒ WS/R > RTT + S/R:
❍ ACK for first segment in window returns before
window’s worth of data sent
❒ WS/R < RTT + S/R:
❍ wait for ACK after sending a window’s worth of data

39
TCP latency Modeling (K := O/WS, the number of windows)

Case 1: latency = 2 RTT + O/R

Case 2: latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
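Both cases can be computed directly from these formulas; the object size, segment size, link rate, and RTT below are illustrative numbers, not from the slides.

```python
# Fixed-window latency for both cases.
def latency(O, S, R, RTT, W):
    K = O / (W * S)                        # number of windows covering the object
    stall = RTT + S / R - W * S / R        # idle time per window, if positive
    if stall <= 0:                         # case 1 (WS/R >= RTT + S/R): pipe stays full
        return 2 * RTT + O / R
    return 2 * RTT + O / R + (K - 1) * stall   # case 2: (K-1) stalls

# 100 kbit object, 10 kbit segments, 1 Mbps link, 100 ms RTT.
print(latency(O=100e3, S=10e3, R=1e6, RTT=0.1, W=2))    # small window: stalls
print(latency(O=100e3, S=10e3, R=1e6, RTT=0.1, W=20))   # large window: 2*RTT + O/R
```

With a large enough window the second call reaches the optimal 2·RTT + O/R.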

40
TCP Latency Modeling: Slow Start
❒ Now suppose window grows according to slow start.
❒ Will show that the latency of one object of size O is:

O  S S
Latency = 2 RTT + + P  RTT +  − ( 2 P − 1)
R  R R

where P is the number of times TCP stalls at server:

P = min{Q, K − 1}

- where Q is the number of times the server would stall
if the object were of infinite size.

- and K is the number of windows that cover the object.
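The closed form can be checked against a direct sum of the per-window stall times. In this sketch K and Q are computed from their definitions (the loop for Q encodes the stall-positivity condition, an assumption of this sketch), using the same O/S = 15 shape as the example that follows.

```python
import math

# Closed form from the slide, with K, Q, P computed from their definitions.
def slow_start_latency(O, S, R, RTT):
    K = math.ceil(math.log2(O / S + 1))    # windows (1, 2, 4, ...) covering object
    Q = 0                                  # stalls if the object were infinite:
    while 2**Q * S / R < S / R + RTT:      # window k stalls while 2^(k-1)S/R < S/R+RTT
        Q += 1
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2**P - 1) * S / R

# Direct version: just add up the positive stall times after windows 1..K-1.
def slow_start_latency_direct(O, S, R, RTT):
    K = math.ceil(math.log2(O / S + 1))
    stalls = sum(max(0.0, S / R + RTT - 2**(k - 1) * S / R) for k in range(1, K))
    return 2 * RTT + O / R + stalls

# Same shape as the slide's example: O/S = 15 segments, so K = 4 windows.
args = dict(O=15e4, S=1e4, R=1e6, RTT=0.1)
print(abs(slow_start_latency(**args) - slow_start_latency_direct(**args)) < 1e-9)
```

The two computations agree, which is exactly what the algebraic derivation of the closed form asserts.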

41
TCP Latency Modeling: Slow Start (cont.)

Example:

O/S = 15 segments
K = 4 windows
Q = 2
P = min{K-1, Q} = 2

Server stalls P = 2 times.

[figure: timing diagram of connection setup, request, and the four windows]
42
TCP Latency Modeling: Slow Start (cont.)
S/R + RTT = time from when server starts to send a segment
until server receives its acknowledgement

2^(k-1) * S/R = time to transmit the k-th window

[S/R + RTT - 2^(k-1) * S/R]+ = stall time after the k-th window

latency = O/R + 2 RTT + Σ_{p=1..P} stallTime_p
        = O/R + 2 RTT + Σ_{k=1..P} [S/R + RTT - 2^(k-1) * S/R]
        = O/R + 2 RTT + P [RTT + S/R] - (2^P - 1) S/R
43
Flow Control

44
TCP Flow Control
flow control: sender won’t overrun receiver’s buffers by transmitting too much, too fast

receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
❍ RcvWindow field in TCP segment

sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow

RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer

[figure: receiver buffering]
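The sender-side constraint can be stated in one line. The sketch keeps the slide's RcvWindow; the LastByteSent/LastByteAcked bookkeeping names are assumptions for illustration.

```python
# Sender-side flow control invariant:
# (LastByteSent - LastByteAcked) + new bytes must fit in RcvWindow.
def can_send(last_byte_sent, last_byte_acked, rcv_window, nbytes):
    return (last_byte_sent - last_byte_acked) + nbytes <= rcv_window

# 400 bytes already in flight against a 500-byte advertised window:
print(can_send(last_byte_sent=1000, last_byte_acked=600, rcv_window=500, nbytes=100))  # True
print(can_send(last_byte_sent=1000, last_byte_acked=600, rcv_window=500, nbytes=200))  # False
```

The check is re-evaluated as ACKs arrive, since each ACK both shrinks the in-flight amount and refreshes the advertised window.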
45
TCP: setting
timeouts

46
TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
❒ longer than RTT
❍ note: RTT will vary
❒ too short: premature timeout
❍ unnecessary retransmissions
❒ too long: slow reaction to segment loss

Q: how to estimate RTT?
❒ SampleRTT: measured time from segment transmission until ACK receipt
❍ ignore retransmissions, cumulatively ACKed segments
❒ SampleRTT will vary, want estimated RTT “smoother”
❍ use several recent measurements, not just current SampleRTT

47
TCP Round Trip Time and Timeout
EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
❒ Exponential weighted moving average
❒ influence of given sample decreases exponentially fast
❒ typical value of x: 0.1

Setting the timeout


❒ EstimatedRTT plus “safety margin”
❒ large variation in EstimatedRTT -> larger safety margin

Timeout = EstimatedRTT + 4*Deviation


Deviation = (1-x)*Deviation +
x*|SampleRTT-EstimatedRTT|
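The update equations can be applied directly. In this sketch the deviation is updated with the previous EstimatedRTT (a common convention, assumed here), and the RTT samples are hypothetical values in seconds.

```python
# EWMA RTT estimator and timeout from the equations above (x = 0.1 typical).
def update(est_rtt, deviation, sample_rtt, x=0.1):
    deviation = (1 - x) * deviation + x * abs(sample_rtt - est_rtt)
    est_rtt = (1 - x) * est_rtt + x * sample_rtt
    timeout = est_rtt + 4 * deviation      # safety margin of 4 deviations
    return est_rtt, deviation, timeout

est, dev = 0.100, 0.0                      # start: 100 ms estimate, no deviation
for sample in [0.105, 0.120, 0.095]:       # hypothetical RTT samples (seconds)
    est, dev, timeout = update(est, dev, sample)
print(round(est, 4), round(timeout, 4))
```

Because each sample carries weight x and older samples decay by (1 − x) per step, the estimate smooths out the jitter while the deviation term widens the timeout when samples vary.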

48
