Professional Documents
Culture Documents
EXECUTIVE SUMMARY.
Random Early Detection (RED) algorithm was first proposed by Sally Floyd and Van Jacobson in 1) for Active Queue Management (AQM) and then standardized as a recommendation from IETF in 2). It is claimed that RED is able to avoid global synchronization of TCP flows, maintain high throughput as well as a low delay and achieve fairness over multiple TCP connections, etc. The introduction of RED has stirred considerable research interest in understanding its fundamental mechanisms, analyzing its performance and configuring its parameters to fit in various working environments. This report first describes RED algorithm in section I and then explains several analytical models in section II and IV, respectively. Specifically, section II discusses analytic evaluation of RED performance, which is based upon the paper 3). Section IV examines a feedback control model for RED, which was first introduced in paper 4). In section V, the parameter tuning for RED is discussed. The report is ended with a further discussion of selected topics and possible future work. Note that this report only focuses on the original RED algorithm, although numerous variants of RED have been proposed,.
Initialization avg <- 0 count <- -1 For each packet arrival if the queue is non-empty
avg(1-q )avg+qq
else
m f (time-q_time) avg(1-q )mavg
with probability pa : mark the arriving packet count <- 0 Else if maxth < avg mark the arriving packet count <- 0 Else count <- -1 When queue become empty q_time <- time Notations: [1] Saved Variables: avg: average queue size q_time: start of the queue idle time count: packets since last marked packet [2] Fixed Parameters: q : queue weight minth: minimum threshold for queue maxth: maximum threshold for queue maxp: maximum value for pb [3] Other: pa: current packet-marking probability q: current queue size Time: current time
As avg varies from minth to maxth , the packet-marking probability pb varies linearly from 0 to maxp :
pb avg minth maxp max th minth
Equation 1
The final packet-marking probability pa increases slowly as the count increases since the last marked packet:
pa pb 1-count pb
Equation 2
As discussed in Section V, this ensures that the gateway does not wait too long before marking a packet. The gateway marks each packet that arrives at the gateway when the average queue size avg exceeds maxth.
d(avg) =
0 1
Drop function for Tail Drop scheme: 0 d(q) = 1 if q > maximum buffer size if q < maximum buffer size
Equation 4
P(drop) 100%
P(drop) 100%
A. A RED router with bursty input traffic. We first derive a model of a RED router with a single input stream of bursty traffic. Assume that the arrival process is a batched Poisson process with rate and bursts (or batches) of B packets. Let the service time be exponentially distributed with mean -1. The offered load is defined to be = B/. Hence, the number of packets buffered in the queue forms a Markov chain with stationary distribution . This model is depicted in Figure 3: Model of RED router with bursty input traffic. It is worthwhile to note that this model does not really match empirically derived models of TCP and other bursty traffic patterns. However, it is analytically tractable; furthermore, our purpose here is to compare the relative impact of RED on bursty and less bursty traffic. We can imagine that the difference between a smooth input traffic and a batch Poisson process (as examined here) would be a lower bound to that observed between a smooth input and an input process with long range dependence. B router
Poisson
drop
Approximation 1: The RED router uses the same drop probability d(q) on al packets in the same burst, where q is the instantaneous queue size at the time the first packet in the burst arrives at the router. Furthermore, we choose maxth = K. Using PASTA property, we get the drop probability seen by a newly arrival: Tail Drop router:
P TD = ( K ) + ( K 1)( B 1 )+ B
1 + ( K B +1)( ) B
Equation 5
RED router:
PRED = ( K ) + ( K 1)d ( K 1) + + (1)d (1)
Equation 6
Note: The stationary distribution p for RED router is different from that for Tail Drop router.
Figure 4: Drop probability vs. offered load for different values of the burst size.
Figure 4 shows the drop probability of an incoming packet as a function of offered load for different burst sizes, obtained by previous analysis (with Approximation 1) and by simulation (without Approximation 1). The figure clearly shows that the approximation is very accurate, even for large values of the burst size. In addition, for large offered load, the drop probability is very close to that suffered by a smooth Poisson traffic in a Tail Drop router, which is given by the loss probability for the M/M/1/k queue.
PM / M / 1/ K =1
1 k 1 k +1
Equation 7
Noting that the drop probability is always higher with RED than with Tail Drop, we conclude that whatever the burst size,
1 1 PRED P TD =1 + o ( )
Equation 8
for >>1. 5
B. A RED router with bursty and smooth input traffic Now we consider a router with two input flows, one bursty with batch Poisson arrivals as discussed in A., the other a smoother (non batch) Poisson stream. We denote by (b) and (s) the load of the bursty and the smooth traffic, and by = (b) + (s) the total offered load. The model is depicted in Figure 5. B
Poisson
router
drop Poisson
Figure 5: Model of RED router with a mix of bursty and smooth traffic
Again, the total number of packets buffered in the queue defines a Markov chain with stationary distribution of . Using PASTA property, we obtain the drop probability of a packet for the bursty flow and the smooth flow in a Tail Drop router:
P TD ( b ) = ( K ) + ( K 1)(
and
B 1 1 ) + + ( K B +1)( ) B B
Equation 9
P TD ( s ) = ( K )
Equation 10
Since PTD(b) > PTD(s), it means that there is a bias against bursty traffic with Tail Drop router. For the RED scheme, the drop probability is
K PRED ( b ) = ( k )d ( k ) = PRED ( s ) k =1
Equation 11
Clearly, there is no bias against bursty traffic with RED. Note that for Tail Drop scheme, the drop probability is:
P TD =
(b ) (s ) P ( b )+ P (s ) TD TD
Equation 12
C. Including queue size averaging in the model. So far, we have assumed that the drop probability for RED scheme only depends on the instantaneous queue size. Once the queue size averaging is taken into consideration, the complexity of the model is increased phenomenally. However, note that when the weight q of the moving average scheme is small, as being recommended in 1), the estimated average queue size avg varies slowly so that consecutive packets belonging to the same burst are likely to experience the same drop probability d(avg). Hence, the Approximation 1 used in previous analysis is still valid in this case. As an example, consider a buffer of size K = 40 and RED parameters minth = 10, maxth = 30, maxp = 0.1 and q = 0.002. Figure 6 shows the drop probability as a function of the fraction of bursty traffic in the input traffic, obtained using the analytic expressions above (continuous line for RED, dashed for Tail Drop), and using simulations (done with queue size averaging and without Approximation 1). Several key observations can be made from Figure 6. First, simulation result supports the conclusion that Tail Drop scheme has bias against the bursty traffic. For RED scheme, however, the drop probability for bursty traffic and smooth traffic is the same. Moreover, the average drop probability of a mix of bursty traffic and smooth traffic for Tail Drop scheme remains a constant and equals to the drop probability of RED scheme with the same traffic mix. This can be expressed as
P (b) = P (s ) RED RED
(b) (s ) P (b) + P (s ) TD TD
Equation 13
When the fraction of bursty traffic is large, Figure 6 indicates that the RED scheme avoids bias against bursty traffic by increasing the drop probability of smooth traffic, without improving the drop probability of bursty traffic. In all cases, the drop rate of a flow going through a RED router does not depend on the burstiness of this flow, but only on the load it generates (refer to Equation 13 and Figure 6).
Figure 6: Drop probability vs. fraction of bursty traffic for an offered load of =2.
D. An important observation about PASTA. It is important to note that the analysis above heavily relies on the PASTA property of Poisson processes. In general, it is not true that the stationary distribution of the number of packets k buffered in the queue immediately before the arrival of a busrt of packets coincides with , the continuous-time stationary distribution of k. For instance, Figure 7 shows the drop probabilities obtained in a RED router with both a bursty input traffic with Pareto inter-arrival times between bursts and a Poisson input traffic. The Pareto coefficient in the figure is 1.4 and the RED parameters are those of Figure 6. Unlike what we saw earlier in the case of the batch Poisson arrival process, the drop probability for the Pareto traffic is different from the drop probability for smooth traffic even for the RED router. A further discussion about how the traffic model impacts on the RED performance is provided in section VI. SYNCHRONIZATION OF TCP FLOWS As shown in the previous section, TCP mechanism that uses Tail Drop scheme has bias against bursty traffic. If these packets belong to different TCP connections, these connections then experience losses at about the same time, decrease their rates/windows synchronously, and then tend to stay in synchronization. This phenomenon is referred to as the synchronization of multiple TCP connections. It is claimed that RED algorithm is able to avoid TCP synchronization problem. We investigate this claim in this section.
Figure 7: Drop probability for RED and Tail Drop vs. offered load for bursty (batch arrivals and Pareto distributed inter-arrivals) and smooth (Poisson) traffic, and a high fraction of bursty traffic (90%).
A. Tail Drop Assume that a drop occurs at time t = 0 in a Tail Drop router. Due to the memory-less property of exponential distribution, the next incoming packet is dropped if and only if its arrival time is smaller
than the service time of a packet. Thus when a packet is dropped, the next packet is dropped with probability p, where
e x (1 e x )dx = = P = 0 + +1
Equation 14
Equation 15
Hence
E ( NTD ) = p +1 Var ( NTD ) = p( p +1)
Equation 16 Equation 17
B. RED with instantaneous queue size Approximation 2: Consecutively dropped packets are dropped with the same probability. is the stationary distribution of the number of packets in the queue, conditionally to the fact that a drop occurred,
( i drop )
Equation 18
n 0
Equation 19
Equation 20
Equation 21
Figure 8 compares the analytic result with simulation for an offered load of =2 and RED parameters as in Figure 4.
Figure 8: Distribution of the number of consecutive drops for an offered load of =2.
C. RED with average queue size Since the addition of average queue size complicates the modeling and analysis of RED algorithm, simulation is again used as the resort. However, Approximation 2 is still valid in this case, especially when is parameter q small. It follows from Equation 19 that the distribution of the number of consecutive drops satisfies:
K 1 (k ) th k = max P (N > n) >9 RED K 1 ( k )d ( k ) k =0
n 0
Equation 22
Since the lower bound of P(NRED > n) in this case does not depend on n, the number of consecutive drops becomes with positive probability! The phenomena is illustrated by the simulation results of Figure 9 and can be explained as follows With high load, avg slowly oscillates around maxth. This results in long ( when w ->0) periods of consecutive drops (when q > maxth) and long ( when w ->0) periods of random drops (when q < maxth). The result shows that RED significantly increases the mean and variance of the number of consecutive drops, especially when is close to its recommended value of 0.002 (1). This suggests that deploying RED may in fact contribute to the synchronization of TCP flows.
10
Note that the conclusion drawn above is based upon a different definition of drop probability for RED algorithm than the one originally proposed in (1). More discussion on this topic is provided at the end of this section.
QUEUEING DELAY To compare the delay through a router with both the RED and Tail Drop management schemes, same model described in previous discussion is used, where the input traffic is a Poisson process of intensity . A. Tail Drop Using the M/M/1/K model, the stationary distribution of the queue size in a Tail Drop router is given by:
TD ( k ) = k (1 )
1 k +1 k = 0, ,K ,
Equation 23
B. RED with instantaneous queue size The number of packets in the queue is a birth-death process, the stationary distribution of which is given by:
RED ( k ) =
l =0 K k k 1 [1 d ( l )] k =0 l =0
k [1 d ( l )]
k 1
k = 0, ,K ,
Equation 24
11
Figure 10 shows that RED reduces the mean delay, but increases the delay variance.
C. RED with average queue size Similar conclusion can be obtained when we use average queue size as a parameter to compute the drop rate. That is, RED reduces the mean delay but also increase the jitter in the delay.
12
TCP connection (T) depends on the packet drop probability (p) and average round trip time (Ri), the average packet size M (in bits), the average number of packets acknowledged by an ACK b (usually 2), the maximum congestion window size advertised by the flows TCP receiver Wmax (in packets) and the duration of a basic (non-backed-off) TCP timeout To, (which is typically 5R). For simplicity, we express this fairly complex relationship by using the following equation, where rt,i () is the sending rate of flow i. rt,i (p, Ri ) = T (p, Ri ) Equation 25 The purpose of the controlling element is to bring and keep the cumulative throughput (of all flows) below (or equal to) the links capacity c:
n c r j =1 t , j
Equation 26
In the following, we further simply the model, based upon the assumptions made below. Basically, we assume that the TCP flows are homogenous. That is, all TCP (Reno) flow fi (1 i n) have same average round trip time (RTT), Ri = R, same average packet size Mi = M So that rt,i (p, R) = rt,j (p, R) for 1 i, j n, and the objective function becomes rt,i (p, R) c/n 1in
Equation 27
Now the n-flow feedback system can be reduced to a single-flow feedback system, as shown in Figure 12.
13
To determine the steady state of this feedback system (i.e., the average value of rt, q and p when the system is in equilibrium), we need to determine the queue function (or queue law) q = G(p) and the control function p=H( q ). The control function H is given by the architecture of the drop module, which is RED in our case.
We first break the loop and study the open loop system shown in Figure 13. Note that p is an independent parameter in this open loop system. If R0 is the propagation and transmission time on the rest of round trip, the round trip delay R can be expressed as:
R = R0 + q / c
Equation 28
In this open loop case, since the drop probability is not directly affected by the average queue length, it is possible that router may keep dropping the packets even the average queue length does not exceed certain threshold. Hence, if drop probability p > p0, the link is underutilized and the average queue size approaches 0.
q ( p)0
Equation 29
So we have the following equations to describe the system behavior when p > p0 R = R0 , Equation 30 rt (p, R) c/n When p < p0, since the drop probability is low, the average queue length is considerably long and the objective function becomes
rt ( p ,R0 + q / c ) = c / n
q ( p ) = c [T 1( p ,c / n ) R0 ]= max{B ,c [T 1( p ,c / n ) R0 ]} R R
Equation 31 Equation 32
Where B is the max buffer size. The authors have conducted extensive simulation and the result support the equation 31 developed above. The relationship between average queue length and the drop probability is illustrated in Figure 14.
14
Now, let us return to the feedback control system in Figure 12. An expression for the long-term (steady-state) average queue size as a function of packet drop probability, denoted by, is just developed as in q ( p ) = G ( p ) Equation 32 and validated via simulation. However, Equation 32 is developed under the open loop scenario. If we assume that the drop module has a feedback control function denoted by p = H (qe ) , where qe is an estimate of the long-term average of the queue size. If the following system of equations
q = G( p) p = H (q )
Equation 33
has unique solution (ps, q s ), then the feedback system in Figure 12 has an equilibrium state (ps, q s). Moreover the system operates on average at (ps, q s ). If we use RED for queue management, the H function becomes: =
p = H (q ) q minth maxp max th minth
if minth< q <maxth
Equation 34
=0 = 1
if if
q<
q
minth >maxth
For different combinations of H function and G function, the whole system may work in a stable equilibrium state or in unstable state, which depends on how the two curves intersect with each other. In one case, for example, the G function and H function are as two curves illustrated in Figure 15. It can be seen easily that the system approaches the equilibrium point (ps, q s) . That is, the equilibrium point is an attractor for all states around it and once the system reaches the equilibrium state, it will stay there with only small statistical fluctuations, given that the number of flows n does not change.
In the other scenario, the G function and H function are as in Figure 16. In this case, the equilibrium point is situated beyond pmax, where the drop rate has a jump from 0.1 to 1, as given by Equation 34. Careful analysis shows that the system, although attracted by this point, cannot stay there, since the value of p is not defined. So, RED operating point oscillates in this case.
15
Equation 35
pb
Equation 36
So X is a geometric random variable (R.V.) with parameter pb and E[X] = 1/ pb . We intend to mark (drop) packets at fairly regular intervals. It is undesirable to have too many marked (dropped) packets close together, and it is also undesirable to have too long an interval between marked packets. Both of these events can occur when X is a geometric random variable, which can result in global synchronization, with several connections reducing their windows at the same time. Method 2: Uniform random variables. If we define drop probability pa as: pb / (1 count pb)
n2 Pb p (1 b ) = pb 1 ( n 1) pb i = 0 1 ipb
Equation 37
where count is the number of unmarked packets that have arrived since the last marked packet, then
Pr ob[ X = n ] =
Equation 38
Pr ob[ X = n ]= 0
which is a uniform random variable (R.V.) within [ 1, 2, ., 1/ pb ] with E[ X ] = 1 / ( 2 pb ) + 1 / 2 Method 2 is has obvious advantage over method 1, because we tend to spread out the packet drop as evenly as possible. Neither clustering nor large inter-dropping interval are desirable.
16
AVERAGE QUEUE LENGTH ANDq . The average queue length is defined as:
avg(1-q )avg+qq
Equation 39
where an exponential moving average is used to estimate the average queue length. It is apparent that if the q is too large, then the averaging procedure will not filter out transient congestion at the router. If q is set too low, then avg responds too slowly to changes in the actual queue size and then cant detect the initial stage of congestion. Assume that the queue is initially empty, with an average queue size of 0, and then the queue increases from 0 to L packets over L packet arrivals. After the Lth packet arrives at the router, avgL is:
L L 1 i avgL = i q (1 q )L i == q (1 q )L i ( ) 1 q i =1 i =1 L+1 L i x + (Lx L 1) x ix = , we obtain the upper bound (1 x )2 i =1
Equation 40
for q as:
avgL = L + 1+
(1 q )L + 1 1
Equation 41
Given a minimum threshold minth and a acceptable bursts of L packets arriving at the router, then q should be chosen to satisfy the following equation for avgL < minth to accommodate burstiness.
(1 q )L + 1 1
Equation 42
< minth
avgL = L + 1+
THRESHOLD VALUES. The parameter minth must be correspondingly large to allow the link utilization to be maintained at an acceptably high level, if the typical traffic is bursty. The maxth partly depends on the maximum average delay that can be allowed by the router. The rule of thumb is: maxth = 3 minth
Equation 43
APPLICATION OF FEEDBACK CONTROL MODEL ON PARAMETER CONFIGURATION. In section IV., a feedback control model for RED scheme has been developed. One of the main applications of this model is to configure the parameters for RED. 17
As shown in Figure 17, different H function and G function can result in different operating point and hence the characteristics (e.g. stability, etc) of the system. For example, a H function with a high slope results in a state with low drop rate but large average queue size. Conversely, a H function with a small slope give rise to a lower average queue size, but larger drop rate.
18
PARETO DISTRIBUTION AND RED PERFORMANCE As discussed in section II, RED scheme has a worse performance if the inter-arrival time for input traffic model has Pareto distribution instead of exponential distribution, which is shown in Figure 7. The simulation indicates that under the same traffic load, RED has higher drop probability for bursty traffic (whose inter-arrival time follows Pareto distribution) than the smooth traffic (whose inter-arrival time is exponentially distributed). The fact that inter-arrival time has Pareto distribution means that the probability that the inter-arrival time approaches infinity is bigger than 0. Apparently, an inter-arrival time with Pareto distribution is more likely to become infinity than that with exponential distribution. That is, the traffic that has interarrival time with Pareto distribution is more clustered than that with exponentially distributed interarrival time and becomes more possible to make the buffer full and packets dropped. RELATIONSHIP BETWEEN FUCNTION H(q) AND G(p) IN THE FEEDBACK CONTROL MODEL G(p) is derived from a model 6) that characterizes the behavior of end-to-end TCP connections with multiple routers in between. When drop probability at routers decreases, packet loss decreases and hence sending rate at end host increases. Higher sending rate, if high enough, results in higher buffer occupancy and larger average queue size at router. If drop probability increases, more packets are to be lost and the sending rate is to be slowed down. Then the buffer occupancy will be lowered accordingly. Meanwhile, H(q) describes the relationship between the average queue size and drop probability with regard to RED algorithm, which runs at the intermediate routers. In this case, the feedback controller tends to increase the drop probability as buffer occupancy increases. Apparently, G(p) and H(q) are not inverse function to each other. In fact, since both of them describe how the system behaves, the point the two corresponding curves intersect should be the place where the system enters equilibrium steady state. FUTURE WORK Although much research effort has been focused on understanding and utilizing RED algorithm to leverage the current network, some interesting research topics are yet to be investigated in more detail in future. For example, since it is widely accepted that Poisson model is not sufficient to characterize the traffic in current Internet, it is important to understand how RED and similar Active Queue Management (AQM) algorithm act when self-similar network traffic is applied. Further studies may produce more meaningful characterization of RED performance in the real-world network. Also note that T. Bonald et al. conclude in 3) that RED algorithm does not avoid TCP synchronization, using Equation 35 instead of
19
Equation 37 as the definition for drop probability. However, S. Floyd et al. have already shown in the 1) that Equation 37 yields better performance than Equation 35, in terms of avoiding TCP synchronization. Hence, the argument made by T. Bonald et al. in 3) may not be valid. One of the main reasons that Equation 37 is considered undesirable is that it brings more difficulty in performing the mathematical analysis. Hence, simulation approach may be appropriate for conducting further examination on this problem.
VIII. REFERENCES
1) S. Floyd, V. Jacobson. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking (TON) August 19932309 2) RFC: Recommendations on Queue Management and Congestion Avoidance in the Internet 3) T. Bonald, M. May, and J. C. Bolot. Analytic evaluation of RED performance. IEEE INFOCOM 2000 4) V. Firoiu, M. Borden. A Study of Active Queue Management for Congestion Control. IEEE INFOCOM 2000 5) J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP Throughput: A Simple Model and its Empirical Validation. ACM SIGCOMM '98. 6) J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. A Stchastic Model of TCP Reno Congestion Avoidance and Control. Technical Report CMPSCI TR 99-02, Univ. of Massachusetts, Amherst, 1999.
20