You are on page 1of 6

A CONTROL ENGINEER’S LOOK AT ATM CONGIESTION AVOIDANCE

Charles E. Rohrs, Randall A. Berry, Stephen J. O’Halek

C.E. Rohrs is a Fellow of Tellabs Research Center and a Visiting Scientist at MIT.
Tellabs Research Center, 3740 Edison Lakes Pkwy, Mishawaka, IN 46545 (charlie@trc.tellabs.com)
R.A. Berry and S.J. O’Halek are graduate students at MIT.

Abstract -- Algorithms for controlling congestion are critical possible beliaviors quickly becomes intractable. The
for the success of ATM networks. The various sample mathematics of dynamic system theory suggests that the
algorithms appearing in the literature have in common interaction of many non-linear feedback loops can produce
significant non-linearities that make these algorithms difficult unexpected and erratic behavior. While no such behavior has
to analyze using the tools of classical control theory. This been demonstrated for the carefully crafted binary algorithms,
paper reports on the beginnings of a research program that we feel that it is a u:seful endeavor to examine if simple linear
considers the ATM congestion control problem from the point feedback loops can be designed in the context of the ATM
of view of control theorist. A control scheme is developed that congestion avoidance problem. If adequate feedback loops can
can be designed and analyzed using well established linear be obtained, the expansion of the analysis to large
control theory. There is promise that as this approach is further interconnected systems becomes much simpler and more
developed it offers hope that analysis can assure reasonable fruitful.
behavior in the large scale system setting of ATM networks. In this pcxper, we address the creation of source and switch
algorithms that can be analyzed and designed using standard
linear control system techniques that can be found in any
I. INTRODUCTION standard undergraduate text in the subject, for example, [7].
We believe that thilj theory of control which has developed
A critical success factor for ATM networks is the resolution over many years has much to teach us about congestion
of a scheme to adequately control the flow of data into an ATM avoidance algorithms and that this theory has been largely
network so as to avoid congestion at the network switching ignored. We believe: manipulating the problem to make use of
points [I]. The ATM Forum is earnestly addressing this this theory can produce a number of benefits. Among the
question and has defined protocols that enable the switches and potential benefits are:
sources to communicate congestion information [2]. It has (1) simpler and/or more effective algorithms,
also produced a series of sample congestion avoidance (2) better understanding of the current non-linear
algorithms. These algorithms are not part of the standards algorithms,
being set but are included to demonstrate the viability of the (3) better analysis techniques for large systems of
protocols. The algorithms suggested to date fall into two interacting algorithms.
categories - binary algorithms, e.g. the PRCA (proportional In this paper, we consider only the simple case where two
rate control algorithm) [3], which use a single bit to sources are trying to send cells to the same output port of the
communicate congestion information from the switches to the same switch. Both sources are persistent in that they would
sources, and multivalued algorithms such as those presented in like to send cells as fast as possible but are limited by the
[4],[5] and [6], which exchange much more information in bandwidth of the output port of the switch. Each source should
both directions between the sources and the switches. While settle to a rate that is half the bandwidth of the output port.
more recent work in the ATM Forum has concentrated on The results of this paper provide only the first steps in using
multi-valued schemes, this paper addresses only the more control theory to analyze and design alternative congestion
restricted binary schemes. Our attention is restricted to these avoidance algorithms. We are not yet at the point where we
schemes because we are looking to more fundamentally advocate a particular alternative algorithm. We present our
understand the behavior of these schemes and the role classical thinking and. our results as a contribution to the general
control theory can play in suggesting alternative schemes that understanding of the behavior of rate control algorithms and an
can be more productively analyzed. opening to a different way of considering such algorithms. We
The binary schemes that have been presented to date as hope that the lessons learned in addressing these issues from
sample switch and source algorithms possess the characteristic the viewpoirit of a control engineer will enrich the work of
that they contain control loops with significant non-linearities those working on the problem of congestion control, most of
within their operating regions. Such schemes are difficult to whom have differen,t backgrounds from ours.
analyze even in a single loop setting. In a network setting with
many interacting loops, fully understanding the range of

0-7803-2509-5195US$4.00 0 1995 IEEE 1089


11. A NON-LINEAR ALGORITHM one of two events occurs:
1. It is time to send a new RM cell. At this time the source
In this section we present a binary algorithm that is decreases its rate by a multiplicative scale factor
representative of those being considered in the ATM Forum. 2. The source receives a returned RM cell with C=O. If this
As algorithms are developed in the Forum small changes are occurs the source raises its rate by more than enough to
made but, for our purposes, the algorithms given here overcome the decrease incurred during the last condition 1
summarizes the key ingredients of this class of algorithms. situation.
Again, note that we are only considering the binary algorithms There is an aspect of this algorithm that makes it
with single bit feedback from the switches to the sources and unamenable to analysis. The rate at which cells are being sent
not the multivalued algorithms with more sophsticated by a source is constantly changing; indeed, it is the variable
communication capabilities. The purpose of presenting this being controlled. Since the times at which changes are being
algorithm is to point out two characteristics that make the made in the rate depends on the rate itself, the time between
algorithm very difficult to analyze and that create behavior that successive sample instants is constantly changing making a
is cause of concern. The algorithms are usually presented in discrete-time analysis very difficult. Also, the feedback
pseudo-code as in [4]. Here, we present the algorithm mechanism is corrupted by the timing out aspect of condition 1
somewhat less precisely and we fix the values of some in the source algorithm. The RM cells are not being sent out
parameters in order to simplify the exposition. and returned at a constant rate since the returning RM cells
At each time each source i has a rate Ri at which it is were generated some time ago when the source's rate may have
sending cells. We assume that each source always has cells to been lower or higher. If the RM cells are going out faster than
send and would like to send them as fast as possible. The they are coming back there are times when rate decreases due
switch sees cells arriving for a particular output line at a rate R to condition 1 are not offset even though there is no congestion
where R is the sum of the rates of cells being sent from all in the switch.
sources to the particular output line. The sources share a single A simulation of the switch and Source algorithms described
switch output line of bandwidth B=3.54*105 cells/sec (150 above is presented in Fig. 1. Two persistent sources are
Mbps). Every 32 cells the sources insert an RM (Resource contending for a single switch's output port which has a
Management) cell which is returned by the destination. The bandwidth of 3.54*105 cells/sec. The cell rates of two sources
switch controls one bit in the RM cell to indicate congestion, are shown in Fig. la. Rather than converging to steady-state
C=l, or no congestion, C=0. We consider the case where the rates of 1.77* lo5 cells/sec each, both source rates oscillate in
switch marks the RM cell on its way back to the source unison between 0.5*105 and 3.0*105. The switch's queue
(Backward Explicit Congestion Notification or BECN). The length also oscillates between 300 and 600 cells as shown in
feedback may be essentially instantaneous or may be subject to Fig. Ib.
delay.
Switch Algorithm. The switch queues the cells until they
can be delivered. The length, q, of the queue at successive
time instants separated by A seconds is modeled by .
.... . ... .._.. .... . .I I

q(nA) = max{q( (n-1)A) (1)


+ (R((n-1)A)-B)A,0} +n
4

Equation (1) models the flow of cells through the queue as a


continuously varying quantity. Cells arrive and leave the actual
queue at discrete times and the actual queue always contains an Fig. l a The Rates of the sources.
integer number of cells. The difference between the model of
the queue in (1) and the real queue is accounted for by the c: c: received by source lx7OPi

noise source nq. The statistics of nq can be derived from 3 w e u e length lcellsl

consideration of the queue dynamics. The noise nq is modeled


as zero mean white noise with variance ,1875. jo3 . .... ~

603
The switch sets the congestion indicator of RM cells
returning to the source if its queue length surpasses some
threshold (in this case, q=500). 0.01 0.02 0.03 0.04
time 1sec1

C = threshold ( q - 500) (2)


Fig. l b The size of the queue and the C indicators
where the threshold function equals 1 if the argument is greater on returning RM cells.
than or equal to 0 and the function equals 0 otherwise. This, of
course, is a significant non-linearity in a control system. Fig.1 The simulation results for the non-linear
Source Algorithm. The source changes its rate when either algorithm .

1090
Also shown in Fig. l b is the timing of the C=O and C=l RM It can create the effect of a many valued feedback variable by
cells as received back at a source. The timing out problem is using probabilistic feedback.
seen most clearly at the top of the source's rate cycle. The Consider the situation where the switch would like to
source rate levels off while it is receiving only C=O (no communicate the parameter p , a number between 0 and 1 to the
congestion) indicators. At this point the rate is high and new source. It can use a pseudo-random number generator to set
outgoing RM cells are being generated much faster than RM C=l with probability p and C=O with probability 1-p. The
cells (which were generated at lower rate previously) are being source can then recreate a noisy estimate of p by averaging
returned. The result is many false slowdowns due to timing over a number of successive C's. This is the scheme we
out. Notice that this effect is in fact stabilizing but that is suggest to turn the rate control system into an (almost) linear
makes accurate analysis very difficult. control system.
Some analysis of this representative algorithm can be Source algorithm. In the interval between time (n-l)A and
performed if this timing out problem is assumed away by only nA accumulate as C I the number of RM's received with C=l
letting the source adjust at fixed time intervals. A fixed and as COthe number o i RM's received with C=O. Let
interval adjustment mechanism in the spirit of the ATM Forum
PRCA algorithm is
(4)

At nA , compute Ri(nA)as
where COis the number of C=O RM cells received in time
interval A, Ri(nA) = y R i ( ( n - - l ) A ) + (l-pi((n-l)A))p (5)
C I is the number of C=l RM cells received in time
interval A, -Pi( ( n - 1 ) A ) a
a is the multiplicative decrease factor, and is the = YRi((n-1)A) - ( a + P ) P i ( ( n - W ) +P
additive increase factor. Notice that this equation for the rate adjustment has an
A system very similar to this is analyzed in [8]. Even with effect very similar to that of Eq. (3) but it has been modified to
the timing out problem removed, the system contains two non- be linear in the control parameter pi. There is still an
linearities which cause it to exhibit steady-state oscillations
exponential decrease in rate which can be overcome by an
similar to those demonstrated in Fig. 1. A typical cycle in the
additive increase when there is little congestion, i.e.,when p i is
oscillation is described in [8] where it is made apparent that the near zero.
cycles are fundamentally caused by the threshold non-linearity
Switch Algorithm. The queue length is still given by (1).
in (2). The switch computes the feedback parameter p as the output of
In summary, the non-linearities in the algorithm of (1)-(3) a linear time invariant operator working on q. The loop
create a control system that results in large limit cycles. The
contains a summer amd another low frequency pole at y. The
binary control variable C does not anticipate changes in the fundamentals of control theory indicate that a simple controller
queue. By the time C is used to indicate congestion the rates that feeds back a term proportional to the queue size and
are large and growing. The congestion indication causes the another term indicating the change in the queue size is
rates to begin to decrease but the queue continues to grow until appropriate for a good, stable system response. Such a
the sum of the rates falls below the outgoing bandwidth. The controller is standard in the control theory literature [7]. It is
cycle is reversed on the downside. The two sources are referred to as a proportional-derivative (PD) or a lead network
synchronized in their oscillation since the nonlinear mapping controller. This controller can be expressed as
of the single queue is driving them simultaneously in the same
direction. The situation becomes worse when delay is added to
the feedback.

The parameter p is communicated to the source by marking


111. USING PROBABILISTIC FEEDBACK TO CREATE A C=l to all sources with probability p if O<p<l. The variable p
LINEAR CONTROL SYSTEM is allowed to saturate at 0 and 1. If the parameters a and b are
chosen small enough[,p will not saturate and the effect will be a
The most serious problem in the system discussed above is linear control system where each source receives a noisy
the restriction that only a single bit can be fed back from the estimate of the variablep. In particular, the parameter b relates
switch to the sources. If a many valued control variable could the steady-state queue size to p and is important in allowing
be used in the feedback loop then linear control could be operation to remain in the linear region.
applied and standard techniques such as simple lead (i.e., PD) Notice that if the: system operates in the saturation region,
control could be used to create a system that behaves better. the behavior is much the same as the original system. We
While we agree to live with the binary feedback restriction, we have, in effect replaced the hard switching non-linearity of (2)
believe that it can be used more wisely. The system actually and Fig. 2a with the softer non-linearity of Fig. 2b. The softer
has many chances to send a congestion bit back to the source. non-linearity allows a region of linear control which can

1091
produce better performance. The remaining non-linearity model by representing the sum of the two sources as a doubling
provides a safety net against any instability in the linear region of one source.
as any locally unstable trajectories are trapped in a stable limit R ( n A ) = 2R, (nA) (9)
cycle.
A slightly larger model with each source represented
separately may easily be substituted for more complex
situations.
Equations (I), (5)-(7), and (9) provide a model with which
the control system can be designed and analyzed. A block
diagram of the model appears in Fig. 3.
0 200 400 600 800
4

Fig.2a The hard non-linearity of the threshold


function
r-l
1
(a + b) (1-& 2.' )
1 I I

cz 0.5
PI

0 _II

0 500 1000 Fig. 3 Block diagram of the linear control system

Now that the model is specified, linear control theory can


be applied. Linear control theory provides guidance in
Fig.2b The softer non-linearity of probabilistic
choosing the parameters so that design trade-offs are clear
feedback
through analysis rather than the trial and error of simulation.
Fig. 2 Comparison of the non-linearities involved The theory offers techniques to use the parameters to
in the congestion control loops. manipulate derived quantities such as bandwidth and phase
margin to aid in the understanding of the following
fundamental performance trade-offs:
IV. ANALYSIS AND SIMULATION 1. A stable linearized controller can be designed.
2. The maximal round trip delay for which the linearized
The one switch, two sources cell transport system was controller remains stable can be determined using the
analyzed and simulated with the switch implementing (6) and control loop bandwidth and phase margin. The lower
probabilistic tagging and the source implementing (4) and (5). the bandwidth and the larger the phase margin, the more
As long as the system operates in the linear region a complete delay can be tolerated.
analysis is possible. Simulations confirm the expected results. 3. The speed and the shape of the transient response can be
For analysis and design of the linearized control scheme determined. The larger the phase margin, the nicer the
(1) is added to model the queue. The difference between the shape of the transient. The higher the bandwidth, the
variable p at the switch and the noisy estimates, p l , of this faster the transient response.
variable computed at the sources is represented by additive 4. Steady-state values of variables can be determined in
noise, ni. terms of the parameters.
5. The effects of the noise sources on the variables can be
pi(nA) = p(nA) +nj(nA) (7) analyzed. The smaller the bandwidth, the less effect the
noise sources have on the variables.
The noise, ni, represents the error made in estimating the mean
of N i.i.d. binomial random variables by their sample mean, We begin by examining the steady-state values. We
where N is the number of RM cells received in a A interval. analyze the case where the parameters are chosen so that the
The errors in each interval will be independent, and so ni can closed-loop system is stable and operates in the linear region,
be represented as white noise with a variance given by: i.e., that p remains between zero and one. Then, except for the
noise sources, ni and nq driving the system, the system reaches
a constant steady- state. With n,=O and nq = 0, the steady-state
values are:
For the range of tests considered here the two sources From(l), (9) R = B and Ri = 1 / 2 B (10)
remain synchronized. This fact is exploited to simplify the

1092
the actual congestion control system were run to verify
From (5)
performance. The results of a simulation are in given in Fig. 4.
The rates of the two sources are displayed in Fig 4a, where it is
From (6) q = p/b (12) seen that the rates converge to near their steady-state values.
The queue is well behaved as in steady-state it moves slightly
From ( l l ) , the parameter P/(a+p)must be chosen larger around its steady-state value. The presence of the noise which
than (l-y)B/(a+p) to allow a positive steady-state p . From (5), enters the system in communicating p to the sources prevents a
smaller y removes the effects of the initial rates more quickly complete quiescent steady-state. Notice, however, that the
and has the effect of forcing the two sources toward the same queue is kept to an (order of magnitude smaller than the non-
rate more effectively. We chose a=O and ~ 0 . 9 initially.
9 To linear simulation of Fig. 1 and that the oscillations are
keep the queue size small (less than IOO), (12) was used to drastically reduced.
guide the choice of b as b=0.01. Based on the expected steady-
state rates of the sources, we chose our sampling time to be
~ l = 9 * 1 0 secs.
- ~ This allows each source to see on the average
5 RM cells per sampling period. Increasing A changes the
frequency scale of the system and has effects similar to those
produced by lowering the bandwidth.
Control theory tells us that the bandwidth and phase margin
are determined by a transfer function called the control loop
gain. A larger loop gain means a larger bandwidth. The
control loop gain is given by Fig 4a. The ratl:s of the sources

2 ( a+ p) A ( a + b ) z-’( 1 - “z-’)
~ i , “ “ ’ ” ” ” l ’ ’

a+b
G(s) = (13)
( 1 - yz-l) ( 1 - z-’>

The parameters a and p are left to set the phase margin, the
bandwidth and the steady-state probability p. Choosing
a=0.0685 and p=2.95*103 creates an adequate phase margin of
45 degrees, a bandwidth of 500 rad/sec and a steady-state
Fig 4b. The size of the queue.
p=0.4. The steady-state q is 40.
Other quantities can be derived using the theory. The Fig. 4 Simulation of the probabilistic system with
linearized system can remain stable for round trip delays up to ~ 0 . 9 9a=0.0685,
, b=0.01, a=O,and P=2.95*103.
1.5 msecs. Larger delays could be handled by either lowering
the loop gain or increasing A. If the actual delay is greater than To demonstrate ithe trade-offs that are achievable using
that specified, the system will behave much as the original non- linear control theory the system was redesigned with the goal
linear system. of reducing the effect of the noise from the communication of
Linear system theory provides the means to predict the p . To achieve this ,goal we must reduce the bandwidth and
effect the noise sources will have on the steady state values. accept the accompanying slowing of the transient response.
Given the transfer function, H(dw),between a noise source and The important point is that because the system can be analyzed
a parameter of interest, then the variance in the parameter is we know the precisie nature of the trade-offs involved. To
related to the variance of the white noise source by decrease the bandwidth, the loop gain of (13) is lowered by
decreasing P. The parameter a must be increased to protect the
phase margin. Finally, in order to keep the steady state p in
(12) at p=0.4, the value of y must be increased making the rates
of the two sources converge more sluggishly. We set
We will call this scale factor the covariance response of the ~ 0 . 9 9 9 8 a=0.4,
, b=O.Ol, a=O, and P=59. The bandwidth of
transfer function. For the model parameters above, the the resulting system is now 50 radJsec with a 60 degree phase
covariance response from the probability noise, nito the queue margin resulting in robustness to delays in the loop up to 19
length, q is 840 and the covariance response from the queue msecs. The variance of the queue is now predicted to be 6.3
noise, ny to the queue length, q is 0.25. The variances of niand with a standard deviation of 2.5 cells. The simulation was run
nq are 0.05 and ,1875 respectively. Thus, the variance of the starting the rates rieax their steady-state values to avoid
queue is expected to be 40 and the standard deviation is 6.3. excessive simulation through the longer transient. The
This means that in the steady state the queue should usually simulation results appear in Fig. 5. The most noticeable
stay within +I- 6.3 cells of its steady state value. difference is the much reduced effect of the noise on the rates
As long as the variables remain in the linearized region, the and the queue as ]predicted from the smaller closed-loop
entire response can be determined analytically. Simulations of bandwidth.

1093
[ 5 ] Siu, K. and H. Tzeng, “Intelligent Congestion Control for
ABR Service in ATM Networks,” Computer Communication
Review, Vol. 24, No. 5 pp. 81-106, October. 1995.
[6] Barnhart, A.W., “Example Switch Algorithm for Section
5.4 of TM 5.4 of TM Spec.,” ATM Forud95-0195, February
10, 1995.
[7]Rohrs, C.E., J.L. Melsa, and D.G. Schultz, Linear Control
Systems, McGraw-Hill. 1993.
[8] Yin, N.and M.G. Hluchyj, “On Closed-Loop Rate Control
Fig 5a. The rates of the sources. for ATM Cell Relay Networks,” Proceedings, IEEE Infocom
’94,pp. 99-108. June 1994.

Fig 5b. The size of the queue.


Fig 5 Simulation of the probabilistic system with
~ 0 . 9 9 9 8a=0.4,
, b=0.01, GO, and p=59.

V. CONCLUSIONS

Probabilistic feedback has been used to create a congestion


control algorithm which can be designed and analyzed using
linear control theory. The viability of such an approach has
been demonstrated in simulations that match theoretical
predictions. The linear control technique can be used to
eliminate the large steady-state oscillations from the results of
the non-linear algorithms. The ability to understand the effects
of parameters in this new system has also been demonstrated
so that design trade-offs can be understood and enacted. The
linear control techniques employed here apply a fortiori to the
case where multivalued feedback is provided for directly.
Only a single simple case has been investigated to date.
Much work is yet to be done before the approach given here is
demonstrated as a viable alternative design. However, we feel
the promise is clear. If the design can be achieved using
analyzable linear systems, then a great deal more can be
understood about the behavior of large interconnections of
such systems.

REFERENCES

[I] Jain, R, “Congestion Control and Traffic Management in


ATM Networks: Recent Advances and A Survey”, to be
published in Computer Networks and ISDN Systems.
[2] Closed-Loop Rate-Based Traffic Management,” ATM
Forum/95-0013R3, editing version.
[3] Closed-Loop Rate-Based Traffic Management,” ATM
Forum/94-0438Rl, July 1994.
[4] Closed-Loop Rate-Based Traffic Management,” ATM
Forum/94-0438R2, September 1994.

1094

You might also like