
Resource Allocation in OFDMA Wireless Networks with Time-Varying Arrivals


Somsak Kittipiyakul and Tara Javidi, Member, IEEE

Abstract: This paper considers the problem of optimal subcarrier allocation in OFDMA wireless networks when the arrivals and channels are stochastic and time-varying. Our objective is to minimize the long-term average packet delay over multiple time epochs. We show by an example that a water-filling-based subcarrier allocation policy, which maximizes throughput at each time epoch, is not optimal in this case. This is because such a policy ignores the time-varying queue state information, while in fact such information is necessary to minimize the long-term average delay. We present an optimal policy for the special case of an On-Off channel and two homogeneous users. For a general channel, the optimal policy is complicated and unknown. However, based on the insights learned from the On-Off case, we provide heuristic policies that use different degrees of knowledge about the channel and queue state information. Through simulation, we show that the value of queue vs. channel information varies with the traffic load. For instance, in the low-to-moderate traffic regime, a policy which ignores much of the channel state information outperforms a water-filling policy which ignores much of the queue state information. The opposite is true in the heavy traffic regime.

Index Terms: OFDMA, subcarrier allocation, water-filling, load-balancing, queue state information

I. INTRODUCTION

ORTHOGONAL frequency-division multiple access (OFDMA) is a promising technique to provide multiple access control (MAC) in high-speed wireless applications (e.g. broadband wireless, 4G systems, LANs) in a hostile multi-path environment with frequency-selective fading. OFDMA achieves high spectral efficiency in a multiuser environment by dividing the total available bandwidth into orthogonal narrow sub-bands to be shared by users in an efficient manner [1]. By adaptively assigning subcarriers to the best users, multi-user water-filling achieves the instantaneous maximum throughput by taking advantage of channel diversity among users in different locations. Since it is unlikely that selective fading affects users in different locations on similar frequencies, there is an overall gain in selectively assigning subcarriers to users. This is known as multiuser diversity gain.

S. Kittipiyakul and T. Javidi are with University of California, San Diego. Email: tara@ece.ucsd.edu

When each user is assumed to have an infinite or sufficient amount of data in the buffer, water-filling is throughput optimal in both the short and long term. However, when considering finite time-varying packet arrivals and time-varying channels, water-filling is not long-term throughput optimal [2].

The problem of optimal real-time subcarrier allocation has been recently studied ([3]-[18]). The prior work can be categorized into two classes based on the optimization objectives. The objective in the first class of work ([3]-[9]) is to minimize the total transmit power given constraints on the Quality of Service (QoS) requirements of each user. These constraints include minimum data rate and/or acceptable bit error rate (BER). In [8], although random packet arrivals are considered as inputs to the system, there is a packet scheduler that controls the average transmission rate of data flows into the subcarrier allocation. The second class of papers ([10]-[13]) attempts to maximize the total throughput at each decision epoch given a constraint on transmit power per data stream. In [14], the authors generalize the objective to be the sum utility function of the throughputs subject to power constraints. Our work is similar to this class of papers in that we focus on maximizing the system throughput when considering the subcarrier allocation problem in an OFDMA system. We believe that, although there will always be numerous applications for which power minimization is critical, in many commercial applications of future wireless systems, achieving high connection speed (data rate) will be of primary concern rather than power.
However, the above works [10]-[13] differ from our work in that they all consider throughput maximization at each decision epoch. They provide solutions to a multi-user water-filling problem achieving Shannon capacity under a power constraint at each decision epoch (in this paper, we refer to this as instantaneous throughput maximization). The algorithm is then run every decision epoch to allow updates regarding varying channel states and a varying number of data streams. The problem with this technique is that, by assuming an infinite supply of data per user, the scheduler fails to anticipate the impact of its allocation decision on the future state of the system.

The present paper is part of an on-going effort to establish a systematic approach to a series of throughput-related problems which have been observed in OFDMA systems when considering realistic packet arrivals and queue occupancies ([15]-[18]). In [15], a subcarrier allocation based solely on queue backlogs for an MPEG-4 video transmission application is studied. In [16], the same authors study subcarrier allocation based on realistic packet arrivals for MPEG-4 video streaming with respect to a performance metric based on video quality. The authors in [17] propose a subcarrier allocation in an OFDM system with finite buffer space. They show through simulations that water-filling solutions perform poorly with respect to a long-run throughput criterion due to buffer overflow. The authors in [18] demonstrate that a subcarrier allocation that maximizes a network-wide utility function, determined by mean waiting time, significantly extends the stable region for incoming traffic. Furthermore, it provides low delay transmission and controls the level of fairness for high-speed, bursty, and delay-sensitive traffic.

In this paper, we focus on a long-term average performance, i.e. average queue backlog over T epochs, rather than an instantaneous optimization. We argue that the key factor in performance improvement in real systems with finite and time-varying arrivals is the queue backlog information. We show by an example that even without a constraint on buffer sizes, water-filling-based techniques perform poorly when considered over a long term. We argue that under realistic arrival patterns and moderate loads, maximizing instantaneous throughput (water-filling) is too myopic. Instead, we propose a long-term objective to minimize the average holding cost in the system and attempt to identify the optimal subcarrier allocation with respect to this objective. For a special case of

In general, an optimal long-term policy must trade off between two competing goals: the desire to get maximum throughput now and the desire to get the maximum throughput in the future. The second goal requires positioning the system to enjoy the highest multiuser diversity gain in the future by avoiding situations where some queues are empty while others are heavily backlogged.
In other words, the optimal long-term policy must take advantage of the knowledge of the current queue backlogs as well as the statistics of the channel and arrival processes to guarantee a balanced load in the future.

The paper is organized as follows. Section II provides the problem formulation and assumptions. In Section III, we discuss two major classes of policies: the instantaneous throughput maximizing policies (this class includes water-filling) and load-balancing policies. Via a simple example, we show that in general the intersection of these two classes of policies can be empty, making identification of the optimal policy difficult. However, in Section IV, we consider a special case of a two-state (ON-OFF) channel model. For this case, we identify a policy, called Maximum Throughput and Load-Balancing (MTLB), that belongs to both classes and show its optimality for the case of two homogeneous users. In Section V, for the general connectivity model where the optimal policy is unknown, we use the insight gained from the MTLB policy in the On-Off connectivity model, namely that queue information is vital for the optimal policy. We design heuristic algorithms derived from MTLB with varying emphasis on channel and queue state information. We then compare their performance via simulation under different arrival loads and traffic distributions. Section VI concludes the paper and discusses directions for future studies. The Appendices contain the construction of the MTLB allocation and the proofs of the discussed theorems and lemmas.

Here we would like to emphasize that our study shows that load balancing is essential in minimizing average delay. This fact is entirely independent of fairness considerations, even though MTLB happens to be a max-min fair policy when the users are assumed to be homogeneous. In other words, our interest in the MTLB policy is not motivated by concerns regarding fairness. Instead, we demonstrate the following fundamental point: under realistic arrival patterns, load balancing is an essential element of any delay-optimal policy.

II. PROBLEM FORMULATION AND ASSUMPTIONS

A. Model and Notations

We consider a downlink single-hop OFDMA system composed of a base station and N users with infinite buffers. There are K OFDM subcarriers which are time-slotted. There are N queues at the base station, one for each user, to buffer the data. The users are homogeneous, i.e. they see statistically symmetric arrival and channel connectivity processes. Furthermore, they have the same priority. Packets of fixed size arrive stochastically at each queue j and are transmitted to user j over a set of allocated subcarriers. At the beginning of each timeslot, the assignment of subcarriers to users is made by a centralized resource manager at the base station. The resource manager has perfect knowledge of the queue backlogs and the channel states, which are assumed constant during a timeslot but varying over timeslots (block fading model). We do not allow sharing of any subcarriers. The assignment is announced immediately to all users via a separate control channel.
Packets that arrive during the current timeslot are not transmitted in that timeslot. In this paper, we use adaptive QAM as an example of the modulation scheme. We use the fact that there is only a very small loss of channel capacity if a white power spectrum is used (i.e. each subcarrier receives equal power) instead of the optimal power spectrum [19].

Fig. 1. Mapping of a received SNR channel profile of a user to packet capacity for each subcarrier (SNR versus frequency, with SNR threshold S1 for 1 packet transmission/timeslot and S2 for 2 packet transmissions/timeslot). In this example, the user can potentially transmit 2, 1, and 0 packets on subcarrier 1, 2, and 3, respectively.

By allowing the users to be located at different distances from the base, the transmit power per subcarrier of each user is assumed to be pre-adjusted to compensate for the different path losses so that the average received power per subcarrier assigned to user $j$ is equalized for all users $j = 1, \ldots, N$ $^1$. For example, if the distance between user $j$ and the base is $d_j$ and subcarrier $i$ is assigned to user $j$, then the required transmit power on subcarrier $i$ is

$$\frac{P}{K} d_j^{\alpha},$$

where $d_j^{\alpha}$ is the path loss, $\alpha$ is a fixed path loss exponent and $P$ is a constant number $^2$. If $\frac{|h_{ij}|^2}{d_j^{\alpha}}$ denotes the channel gain of user $j$ at subcarrier $i$, then the signal power of this subcarrier received by user $j$ is

$$\left(\frac{P}{K} d_j^{\alpha}\right) \frac{|h_{ij}|^2}{d_j^{\alpha}},$$

which is equal to $\frac{P}{K} |h_{ij}|^2$, independent of the distance $d_j$. Under such an assumption,

the channel gain $h_{ij}$ can be mapped to the number of packets per time slot, $c_{ij}$, that subcarrier $i$ can potentially transmit for user $j$ as [19]:

$$c_{ij} = \left\lfloor \frac{D}{L} \max\left\{0,\; 0.31\left(10 \log_{10}\left(\frac{P |h_{ij}|^2}{K N_o}\right) - 6.7\right)\right\} \right\rfloor \qquad (1)$$

where $D$ is the number of QAM symbols per channel in a timeslot, $L$ is the fixed packet length (in bits), $N_o$ is the noise power in the subcarrier and $\lfloor \cdot \rfloor$ is the flooring operation. In other words, we assume that random fading channel conditions can be mapped into a matrix of connectivities $\{c_{ij}\}$ and hence we consider the random connectivity matrix instead of the random fading channels of the users.

Figure 1 shows such a procedure for an example user. In this example, we assume that there are two modulation and/or coding types: the first type requires a certain SNR (SNR > S1) and can transmit one packet per timeslot; the other transmission type requires a higher SNR (SNR > S2) and transmits two packets per timeslot. The figure shows that the user having this channel profile can receive 2, 1, and 0 packets on subcarrier 1, 2, and 3, respectively. As a result, we map the channel profile for this user given in Figure 1 to a connectivity profile of (2, 1, 0).

The following notations are used throughout the paper. Note that we use the following conventions:
$^1$ The pre-adjusted transmit power assumption is used to avoid the near-far problem, which would cause some fairness issues, as discussed in Section VI.
$^2$ With the symmetric channel and arrival assumption (described later in Assumptions (A1) and (A2)), a subcarrier is equally likely to be assigned to any user and hence the average total transmit power consumed at the base is $\frac{P}{N} \sum_{j=1}^{N} d_j^{\alpha}$.

lower case letters for scalars, bold face lower case letters for row vectors, upper case letters for matrices and scripted upper case letters for spaces of matrices.

$\mathbf{b}(n) = (b_1, \ldots, b_N)$: backlogs of each queue at the beginning of timeslot $n$.

$\mathbf{a}(n) = (a_1, \ldots, a_N)$: stochastic packet arrivals to each queue during timeslot $n$.

$C(n) = \{c_{ij}\}$: the K-by-N stochastic connectivity matrix at timeslot $n$, where $c_{ij}$ denotes the maximum number of packets subcarrier $i$ can serve from queue $j$. For example, if user 1 has the channel profile at time $n$ given in Figure 1, the first column of $C(n)$ will be the column vector (2, 1, 0).

$W(n) = \{w_{ij}\}$: the K-by-N allocation matrix at the beginning of timeslot $n$, with $w_{ij} \in \{0, 1\}$; $w_{ij} = 1$ denotes that subcarrier $i$ is assigned to serve queue $j$.
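As an illustration of how one column of $C(n)$ could be obtained from a received SNR profile through the mapping in (1), the following is a minimal Python sketch. The function name, the parameter name `L` for the packet length in bits, and all numerical values are assumptions made for this example only; the sketch also assumes that (1) scales the bits per QAM symbol by the number of symbols per timeslot and divides by the packet length before flooring.

```python
import math

def packets_per_slot(h_abs_sq, P, K, N0, D, L):
    """Packets one subcarrier can carry for one user in a timeslot.

    h_abs_sq: |h_ij|^2; P / K: transmit power per subcarrier; N0: noise power;
    D: QAM symbols per timeslot; L: packet length in bits (name assumed here).
    """
    if h_abs_sq <= 0:
        return 0  # no usable channel on this subcarrier
    snr_db = 10 * math.log10(P * h_abs_sq / (K * N0))
    # Adaptive-QAM approximation of bits per symbol, clipped at zero.
    bits_per_symbol = max(0.0, 0.31 * (snr_db - 6.7))
    return math.floor(D * bits_per_symbol / L)

# Hypothetical SNR profile (in dB) of one user over three subcarriers.
profile_db = [20.0, 12.0, 3.0]
column = [packets_per_slot(10 ** (s / 10), P=1.0, K=1, N0=1.0, D=100, L=100)
          for s in profile_db]
```

With `D` equal to `L`, the result is simply the floored number of bits per symbol, so a subcarrier whose SNR is below 6.7 dB maps to a connectivity of zero.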

Definition 1: For a row vector $\mathbf{x} = (x_1, \ldots, x_N)$ and a matrix $Y = (\mathbf{y}_1, \ldots, \mathbf{y}_N)$ where $\mathbf{y}_j$ is a column vector, a column-by-column matrix permutation corresponding to a permutation $\pi$ is defined as, for any $j$ and $k$,

$$\pi(x_j) = x_k \;\Leftrightarrow\; \pi(\mathbf{y}_j) = \mathbf{y}_k.$$

Using the above notations and definition, we make the following assumptions on the arrival and connectivity processes:

(A1) The packet arrival processes $\{\mathbf{a}(n)\}$ to the users' queues during each timeslot are independent across timeslots. The packet arrival processes are symmetric such that the joint probability mass function is permutation invariant, i.e. $P[\mathbf{a}(n) = \pi(\mathbf{x})] = P[\mathbf{a}(n) = \mathbf{x}]$ for any $n$, vector $\mathbf{x}$ and permutation $\pi$.

(A2) The connectivity profiles $\{C(n)\}$ are independent across timeslots and symmetric, i.e. the joint pmf for $\{C(n)\}$ is column-by-column permutation invariant: $P[C(n) = \pi(Y)] = P[C(n) = Y]$ for any $n$, matrix $Y$ and column-by-column permutation $\pi$.

Assumption (A2) is valid when the channel and mobility create a homogeneous environment for all users. Note that (A1) and (A2) imply independence across time but not across users, i.e. at a given time the arrivals to various queues need not be independent.

Fig. 2. Subcarrier Allocation Problem (N users/queues with arrivals $a_1, \ldots, a_N$ served by K subcarriers/servers with connectivities $c_{ij}$).

B. Problem Formulation

We now formulate an abstract problem that captures the essential features of the OFDMA problem described above.

Problem (P): Consider a discrete-time model of N queues served by K servers. At each time, each server can serve one queue, but a queue can be simultaneously served by multiple servers. Server $i$ can serve at most $c_{ij}$ packets from queue $j$ (see Figure 2). At each time, the connectivities $c_{ij}$ of all queue/server pairs are known. We allow for arrivals at each queue at each time and the arrivals are assumed to occur right before each time. The statistics of the arrival and connectivity processes are assumed to satisfy (A1) and (A2). We wish to determine a Markov server allocation policy $\pi$ that minimizes the cost function at the finite horizon $T$:

$$J_T^{\pi} = E\left[\Phi_T^{\pi} \mid I_0\right] \qquad (2)$$

where $I_0$ summarizes all information available at time zero and $\Phi_T^{\pi}$ is the cost under Markov policy $\pi$ over horizon $T$:

$$\Phi_T^{\pi} = \sum_{t=0}^{T} \phi(\mathbf{b}(t)) \qquad (3)$$

where the cost function is $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$ and $g$ is a convex and strictly increasing function.
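To make the cost in (3) concrete, the following minimal Python sketch evaluates it along a single, given backlog trajectory. It does not compute the expectation in (2); the function names and the sample trajectory are hypothetical and only serve the illustration.

```python
def stage_cost(b, g=lambda x: x):
    """Per-stage cost: the sum of g(b_j) over all queues."""
    return sum(g(bj) for bj in b)

def total_cost(trajectory, g=lambda x: x):
    """Cost accumulated along one sample path b(0), ..., b(T)."""
    return sum(stage_cost(b, g) for b in trajectory)

# Hypothetical backlog vectors of two queues over three timeslots.
trajectory = [(2, 1), (3, 0), (1, 1)]
linear = total_cost(trajectory)                        # g is the identity
quadratic = total_cost(trajectory, g=lambda x: x * x)  # convex, increasing on x >= 0
```

With the identity function the cost is the total backlog summed over time, while a strictly convex choice such as the square penalizes the unbalanced state (3, 0) more heavily than the balanced state (1, 1).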

We note that the restriction to Markov policies does not entail any loss of optimality because Problem (P) is a stochastic control problem with perfect observations [20]. Also, note that when $g$ is the identity function, Problem (P) reduces to an average total backlog ($E[\sum_{t=0}^{T} \sum_{j=1}^{N} b_j(t)]$) minimization problem over horizon $T$. From Little's theorem, the optimal policy that achieves the minimum average backlog achieves the minimum average packet delay as well. Thus, we can interchange the notions of minimizing average delay and average backlog.

C. Related Prior Work in Server Allocation

Our problem formulation is very similar to the problem of transmission scheduling for wireless and satellite networks where a limited number of transmitters (servers) or channels have to be allocated to competing users with varying connectivity. The authors in [21] consider the server allocation problem of a single server and N competing queues. At each time slot each queue may be connected or disconnected (ON-OFF) to the server, depending on a binary connectivity random variable. They show that the Longest Connected Queue (LCQ) policy stabilizes the system if the system is stabilizable and minimizes the delay for the special case of symmetric queues. The authors in [22] further show that, in the case of K servers = N queues and the constraint that at most C packets can be served in total in each time slot and fractional packets are allowed to be served to each queue, the optimal policy (the Most Balanced policy) is to serve the queues such that the resulting queue lengths are most balanced. The authors in [22] allow for sharing of the servers (serving a fraction of a packet from a set of queues). Furthermore, they do not allow servers to have distinct connectivity profiles, i.e. in their paper a user is either connected to all servers or to none at all. This is a special case of our problem where $C(n)$ is reduced to a vector.

In addition, the model used in [23] is similar to the one used in our paper. The authors in [23] consider the problem of batch allocation of bandwidth or servers to multiple queues. The study is focused on the delay in the observations of channel and queue lengths.
Again, the difference with our model is that [23] assumes identical connectivity profiles while we allow for distinct connectivity profiles across servers. However, we do not consider observation delay with respect to queue length nor do we address imperfect channel estimation.

III. INSTANTANEOUS THROUGHPUT MAXIMIZING VS. LOAD-BALANCING POLICIES

In this section, we consider two classes of server allocation policies: a class comprising instantaneous throughput maximizing (IMT) policies and another class of load-balancing (LB) policies. As discussed previously, each class represents one of the competing goals: an IMT policy maximizes the number of packets being served now, while an LB policy maximizes the number of non-empty queues (hence, the multiuser diversity gain and the number of packets served) in the future. To be precise, we first define the feasible allocation and the non-idling feasible allocation. Then, we describe the two classes of policies mentioned above.

A. Feasible Allocation

Assume that at the beginning of time slot $n$, the state of the system is $(\mathbf{b}, C)$. An allocation $W = \{w_{ij}\}$ is a feasible allocation for time slot $n$ if

(a) $c_{ij} = 0 \Rightarrow w_{ij} = 0$; and

(b) $\sum_{j=1}^{N} w_{ij} \le 1$, $i = 1, \ldots, K$.

The set of all feasible allocations is denoted by $\mathcal{W}(n, C)$. In addition, define $\mathcal{W}(n, \mathbf{b}, C) \subset \mathcal{W}(n, C)$ to be the set of all non-idling feasible allocations, i.e. feasible allocations $W$ that also satisfy

(c) $\sum_{i=1}^{K} w_{ij} c_{ij} \le b_j$, $j = 1, \ldots, N$.
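The conditions above can be checked mechanically. The following Python sketch is a hedged illustration: the function names are hypothetical, matrices are plain lists of rows (one row per subcarrier), and condition (c) is read as "the packets withdrawn from queue j do not exceed its backlog", which is an assumption about the relation in (c).

```python
def is_feasible(W, C):
    """Conditions (a) and (b): no assignment on a zero-connectivity pair,
    and each subcarrier serves at most one queue."""
    for w_row, c_row in zip(W, C):
        if sum(w_row) > 1:
            return False
        if any(w == 1 and c == 0 for w, c in zip(w_row, c_row)):
            return False
    return True

def is_non_idling_feasible(W, C, b):
    """Condition (c), as assumed here: withdrawals never exceed the backlog."""
    if not is_feasible(W, C):
        return False
    for j, backlog in enumerate(b):
        withdrawn = sum(W[i][j] * C[i][j] for i in range(len(C)))
        if withdrawn > backlog:
            return False
    return True

# Hypothetical state: two subcarriers, two users.
C = [[1, 2], [1, 0]]
W = [[0, 1], [1, 0]]  # subcarrier 1 -> user 2, subcarrier 2 -> user 1
```

For instance, `W` is feasible for `C`, and it satisfies the assumed condition (c) for backlogs (4, 2) but not for backlogs (4, 1), where user 2 holds fewer packets than subcarrier 1 would withdraw.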

B. Instantaneous Throughput Maximizing Policies (IMT)


Instantaneous Throughput Maximizing: An IMT allocation $W^*(n) = \{w^*_{ij}\} \in \mathcal{W}(n, C)$ is a feasible allocation that achieves the maximum throughput at time $n$, i.e. for all $W(n) = \{w_{ij}\} \in \mathcal{W}(n, C)$,

$$\sum_{j=1}^{N} \sum_{i=1}^{K} w^*_{ij} c_{ij} I_{\{b_j > 0\}} \ge \sum_{j=1}^{N} \sum_{i=1}^{K} w_{ij} c_{ij} I_{\{b_j > 0\}}, \qquad (4)$$

where the indicator function $I_E = 1$ if condition $E$ holds, and $0$ otherwise.

Note that the traditional water-filling is an IMT policy. Since water-filling maximizes instantaneous throughput by assigning subcarriers based only on the channel state information (CSI), it potentially empties or shortens some queues such that the queue lengths are significantly unbalanced.

C. Load-Balancing (LB) Policies

It is reasonable to maximize the expected future multiuser diversity gain under stochastic arrival and connectivity processes. For that reason, we consider a load-balancing policy: one that distributes the future work load among the queues as evenly as possible so that there are as many users as possible who have data waiting in the queues for transmission. Hence, this guarantees as much as possible the multiuser

diversity gain in the future. The future work load here is defined as the length of the queues after the assignment. The Longest Connected Queue policy [21] and the Most Balanced policy [22] are examples of LB policies. To introduce the LB policy, we need the following definitions to compare queue vectors in terms of their load distribution:

Definition 2: Consider the ordering function $\mathrm{ord}: \mathbb{R}^N \to \mathbb{R}^N$ such that, for all $\mathbf{x} \in \mathbb{R}^N$, $\mathbf{y} = \mathrm{ord}(\mathbf{x})$ has the elements of $\mathbf{x}$ in descending order, i.e. $y_i \ge y_{i+m}$ for all $m > 0$.

Definition 3: We say $\mathbf{x} \le_{LQO} \mathbf{y}$ ($\mathbf{x}$ is more balanced than $\mathbf{y}$) iff $\mathrm{ord}(\mathbf{x}) \le_{lex} \mathrm{ord}(\mathbf{y})$, where the relation $\le_{lex}$ on $\mathbb{R}^N$ is the lexicographic (i.e. dictionary or alphabetic) ordering (p.12 [26]).

Example: $(2, 2, 1) \le_{LQO} (1, 3, 1)$ because $\mathrm{ord}(2, 2, 1) = (2, 2, 1) \le_{lex} (3, 1, 1) = \mathrm{ord}(1, 3, 1)$.

Load Balancing: An LB allocation $W^*(n) = \{w^*_{ij}\} \in \mathcal{W}(n, C)$ is a feasible allocation that produces the most balanced queues after the assignment, i.e. for all $W(n) = \{w_{ij}\} \in \mathcal{W}(n, C)$,

$$[\mathbf{b} - \mathbf{1}(W^* \odot C)]^+ \le_{LQO} [\mathbf{b} - \mathbf{1}(W \odot C)]^+ \qquad (5)$$

where the element-wise product $W \odot C$ is the matrix $\{w_{ij} c_{ij}\}$ and $\mathbf{1}$ is a row vector of ones. Note that for a vector $\mathbf{v} \in \mathbb{R}^N$, $[\mathbf{v}]^+ = (v_1^+, \ldots, v_N^+)$ where $v_j^+ = v_j I_{\{v_j > 0\}}$.

Note that LB policies potentially sacrifice the current throughput (by giving priority to long queues) for the future throughput (by increasing the future multiuser diversity gain).
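The two classes can be compared on a small state by exhaustive search. The Python sketch below is only an illustration under stated assumptions: it enumerates every feasible allocation (practical only for tiny K and N), represents an allocation as one user index or `None` per subcarrier, and uses hypothetical function names. It relies on the fact that Python compares tuples lexicographically, which matches the ordering of Definition 3.

```python
from itertools import product

def ord_desc(x):
    """ord(x): the elements of x sorted in descending order."""
    return tuple(sorted(x, reverse=True))

def more_balanced(x, y):
    """True if x is at least as balanced as y in the sense of Definition 3."""
    return ord_desc(x) <= ord_desc(y)

def feasible_allocations(C):
    """Iterate over allocations: one user (or None) per subcarrier, c_ij > 0 only."""
    n_users = len(C[0])
    choices = [[None] + [j for j in range(n_users) if row[j] > 0] for row in C]
    return product(*choices)

def leftover(b, C, alloc):
    """Queue lengths after the assignment, clipped at zero."""
    served = [0] * len(b)
    for i, j in enumerate(alloc):
        if j is not None:
            served[j] += C[i][j]
    return tuple(max(bj - s, 0) for bj, s in zip(b, served))

def imt_allocation(b, C):
    """Maximize the throughput measure of (4) by brute force."""
    def value(alloc):
        return sum(C[i][j] for i, j in enumerate(alloc)
                   if j is not None and b[j] > 0)
    return max(feasible_allocations(C), key=value)

def lb_allocation(b, C):
    """Minimize the ordered leftover queue vector of (5) by brute force."""
    return min(feasible_allocations(C),
               key=lambda alloc: ord_desc(leftover(b, C, alloc)))

b, C = (4, 2), [[1, 2], [1, 0]]  # state at timeslot 1 of the example in this section
imt = imt_allocation(b, C)
lb = lb_allocation(b, C)
```

For this state the IMT search gives subcarrier 1 to user 2 and subcarrier 2 to user 1, leaving queues (3, 0), while the LB search gives both subcarriers to user 1, leaving queues (2, 2).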

D. Example

This example shows that an average-delay optimal policy (which is also an average-backlog optimal policy by Little's theorem, as noted earlier) could be a mixture of IMT and LB policies. Table I gives a simple example demonstrating IMT, LB and mixture policies. For illustration, we assume the arrival and connectivity processes to have a periodic structure with a period of six timeslots. The initial queue lengths are $\mathbf{b} = [2, 1]$. The subcarrier allocation for a timeslot is denoted by underlining elements of the connectivity matrix. For example, for timeslot 1, the choice of $\begin{bmatrix} 1 & \underline{2} \\ \underline{1} & 0 \end{bmatrix}$ for the IMT policy indicates that it allocates $W = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ and achieves 3 packets of throughput while leaving the queue lengths after the allocation at [3, 0], a highly unbalanced state. On the other hand, the choice of $\begin{bmatrix} \underline{1} & 2 \\ \underline{1} & 0 \end{bmatrix}$ indicates that the LB policy assigns $W = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}$ and has 2 packets of throughput but the leftover queue lengths are balanced at [2, 2].

TABLE I
AN EXAMPLE SHOWING IMT, LB, AND MIXTURE POLICIES.
Pairs are (user 1, user 2); connectivity matrices are written [c11 c12; c21 c22]; Backlog is the total of Queues (b + a).

IMT (total throughput 13, total backlog 30)
Timeslot         | 1          | 2          | 3          | 4          | 5          | 6
Arrivals (a)     | (2, 1)     | (0, 2)     | (2, 0)     | (0, 2)     | (1, 2)     | (2, 0)
Queues (b + a)   | (4, 2)     | (3, 2)     | (3, 1)     | (1, 3)     | (1, 4)     | (2, 4)
Connectivity (C) | [1 2; 1 0] | [2 1; 0 1] | [1 1; 1 1] | [2 1; 1 1] | [2 0; 0 0] | [0 2; 0 0]
Leftover Queues  | (3, 0)     | (1, 1)     | (1, 1)     | (0, 2)     | (0, 4)     | (2, 2)
Backlog          | 6          | 5          | 4          | 4          | 5          | 6

LB (total throughput 12, total backlog 38)
Timeslot         | 1          | 2          | 3          | 4          | 5          | 6
Arrivals (a)     | (2, 1)     | (0, 2)     | (2, 0)     | (0, 2)     | (1, 2)     | (2, 0)
Queues (b + a)   | (4, 2)     | (2, 4)     | (4, 2)     | (2, 4)     | (3, 4)     | (3, 4)
Connectivity (C) | [1 2; 1 0] | [2 1; 0 1] | [1 1; 1 1] | [2 1; 1 1] | [2 0; 0 0] | [0 2; 0 0]
Leftover Queues  | (2, 2)     | (2, 2)     | (2, 2)     | (2, 2)     | (1, 4)     | (3, 2)
Backlog          | 6          | 6          | 6          | 6          | 7          | 7

Mixture (total throughput 14, total backlog 29)
Timeslot         | 1          | 2          | 3          | 4          | 5          | 6
Arrivals (a)     | (2, 1)     | (0, 2)     | (2, 0)     | (0, 2)     | (1, 2)     | (2, 0)
Queues (b + a)   | (4, 2)     | (3, 2)     | (3, 1)     | (1, 3)     | (2, 3)     | (2, 3)
Connectivity (C) | [1 2; 1 0] | [2 1; 0 1] | [1 1; 1 1] | [2 1; 1 1] | [2 0; 0 0] | [0 2; 0 0]
Leftover Queues  | (3, 0)     | (1, 1)     | (1, 1)     | (1, 1)     | (0, 3)     | (2, 1)
Backlog          | 6          | 5          | 4          | 4          | 5          | 5

Which policy is better cannot be told at timeslot 1 since the overall result depends on the future arrivals and connectivities. Because of the periodic structure of the arrivals and connectivities in this particular example, the allocations repeat every six timeslots. In every 6-timeslot period, there are 14 packet arrivals but only 13 packet departures under the IMT policy and 12 under the LB policy, resulting in instability of the queue backlogs (growing to infinity).

Now we consider a third, mixture policy whose allocations are given in Table I. It results in a better delay performance. In contrast to the pure IMT and LB policies, the mixture policy sometimes maximizes instantaneous throughput (e.g. at timeslot 1), balances the load (e.g. at timeslot 4), or does IMT and LB simultaneously (e.g. at the other timeslots). We notice that this policy stabilizes the queues. This can be seen directly from the queue backlogs. The IMT and LB policies add 1 and 2 packets, respectively, to the queues every 6 timeslots. Hence, the queues grow unboundedly over time, while the queues under the mixture policy stay fixed at 2 and 1 packets at the end of every 6 timeslots. Although this is only a pathological example, it illustrates the difficulty of devising an optimal policy in the context of delay, where the loss of delay optimality can cause instability.

IV. SPECIAL CASE: ON-OFF CHANNEL

In this section, we consider a special case of the connectivity process where $c_{ij}$ only takes values 0 (OFF) or 1 (ON). Under this On-Off connectivity, we show that there exists a policy that meets both the instantaneous throughput maximizing and load balancing objectives for the case of two homogeneous


users. We then show that, in this special On-Off case, the Maximum Throughput and Load-Balancing (MTLB) policy is an optimal policy for Problem (P). At each timeslot, the MTLB allocation is computed by using a Maximum Weight Matching (MWM) algorithm (see Appendix I). In this section, we give a definition of the MTLB policy specific to the On-Off channel, discuss its existence, and prove its optimality.

A. MTLB Policy

Definition 4: Given state $(\mathbf{b}, C)$ at the beginning of time slot $n$, the MTLB policy chooses a non-idling feasible packet withdrawal matrix $W^*(n) = \{w^*_{ij}\} \in \mathcal{W}(n, \mathbf{b}, C)$ such that

(C1) Maximum Throughput: $W^*(n)$ achieves the maximum throughput, i.e. for all $W = \{w_{ij}\} \in \mathcal{W}(n, \mathbf{b}, C)$,

$$\sum_{j=1}^{N} \sum_{i=1}^{K} w^*_{ij} \ge \sum_{j=1}^{N} \sum_{i=1}^{K} w_{ij}. \qquad (6)$$

(C2) Load Balancing: $W^*(n)$ produces the most balanced queues, i.e. for all $W = \{w_{ij}\} \in \mathcal{W}(n, \mathbf{b}, C)$, $\mathbf{b} - \mathbf{1}W^* \le_{LQO} \mathbf{b} - \mathbf{1}W$.

B. Existence of MTLB Policy

We show the existence of the MTLB policy constructively in Appendix I by constructing an algorithm that results in an MTLB allocation at each decision epoch.

C. Optimality of MTLB Policy

We have the following main theorem of the paper:

Theorem 1: Consider Problem (P) with On-Off connectivity, $N = 2$ users, a finite horizon $T$, any initial state $I_0 = (\mathbf{b}, C)$ and a cost function

$$\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j) \qquad (7)$$

where $g$ is a strictly increasing and convex function. Then the MTLB policy is optimal at all times $n = 1, \ldots, T$.

The proof is given in Appendix II. We conjecture that Theorem 1 holds as well for any $N \ge 3$ users. The proof requires complicated and lengthy arguments and is left for future work.

Remark: Condition (C1) is a maximum instantaneous throughput or water-filling condition. In other words, the optimality of the MTLB policy shown in Theorem 1 implies that the water-filling criterion is not sufficient to guarantee long-term throughput optimality unless it is complemented by a load-balancing criterion (condition (C2)).
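Conditions (C1) and (C2) can be illustrated on small On-Off instances by exhaustive search. The Python sketch below is not the Maximum Weight Matching construction of Appendix I; it is a brute-force stand-in with hypothetical names that first maximizes the number of served packets and then, among those allocations, picks the most balanced leftover vector. It also assumes that a non-idling feasible allocation never withdraws more packets than a queue holds.

```python
from itertools import product

def mtlb_bruteforce(b, C):
    """Return (allocation, leftover queues) for an On-Off connectivity matrix C.

    The allocation lists one user index (or None) per subcarrier.
    """
    n_users = len(b)
    choices = [[None] + [j for j in range(n_users) if row[j] == 1] for row in C]
    best_key, best = None, None
    for alloc in product(*choices):
        served = [sum(1 for j in alloc if j == u) for u in range(n_users)]
        if any(s > q for s, q in zip(served, b)):
            continue  # would withdraw more packets than the queue holds
        left = tuple(q - s for q, s in zip(b, served))
        # Throughput first (C1), then most balanced leftover (C2).
        key = (-sum(served), tuple(sorted(left, reverse=True)))
        if best_key is None or key < best_key:
            best_key, best = key, (alloc, left)
    return best

# Hypothetical state: two users with backlogs (3, 1), two subcarriers ON for both.
alloc, left = mtlb_bruteforce((3, 1), [[1, 1], [1, 1]])
```

In this state, serving one packet from each queue and serving two packets from the longer queue both achieve a throughput of 2, but only the latter leaves the balanced vector (1, 1), which is what the search returns.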


Remark: Theorem 1, in addition, can be extended to the optimality of MTLB in an expected average cost sense for an infinite horizon problem.

Corollary 1: Consider an infinite horizon version of Problem (P), where the cost is modified to be the average expected cost at each stage. Then MTLB is optimal for any initial state $I_0 = (\mathbf{b}, C)$.

Proof: The theorem proves that there exists a stationary MTLB policy which is optimal for Problem (P) for any finite horizon $T$. Hence, our MTLB policy achieves the minimization of the average expected cost $J_T/T$ for any finite horizon $T$. Since the policy is independent of the horizon $T$, it is optimal with respect to an average expected cost criterion for the infinite horizon version of the problem.

V. GENERAL CONNECTIVITY MODEL

We have seen from the example in Table I that the optimal policy for the general connectivity model, i.e. when $c_{ij} \in \{0, 1, \ldots, c_{max}\}$, where $c_{max} \ge 2$, is rather complicated and unknown. Furthermore, we know that an MTLB policy does not, in general, exist in this more general case $^3$. Thus, in this section, we return to the two classes of policies, IMT and LB, defined in Section III. We see that each class can outperform the other depending on the system load. From the On-Off connectivity model, we have learned that queue information is vital for an average-delay optimal policy. We use the obtained insights to design several heuristic policies that are expected to be close to the optimal delay performance for a large subset of admissible loads. The policies use different degrees of knowledge of connectivity and queue information (called Channel and Queue State Information, CSI and QSI, respectively). We then compare the performance of all these policies in terms of average queue backlog (or equivalently, average packet delay) by simulation under different traffic loads and traffic types.

A. Heuristic Policies

1) Algo-I (full QSI, On-Off CSI): The subcarrier assignment uses full information about the queue lengths (full QSI) and minimal information about the channel. The channel quality is thresholded to be either ON or OFF (On-Off CSI), i.e. a subcarrier is considered ON if $h_{ij} \ge h_{threshold}$ $^4$. Then, the MTLB policy introduced in Section IV-A is used for subcarrier allocation. Note that the best allocation is found by using the maximum weight matching algorithm described in Appendix I.
$^3$ Moreover, we do not have any numerical solution to the general connectivity model, since all numerical solutions to the dynamic programming equation (shown in Appendix II-B) suffer from the curse of dimensionality (note that our state space is exponential in the number of subcarriers, users, and the maximum queue sizes).
$^4$ The threshold is arbitrarily chosen. It should be adjusted depending on the load of the system and the number of subcarriers and users. We assume here that the threshold is fixed in our simulation.

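Algo-I:

$$c_{ij} = \begin{cases} c_{th} = \left\lfloor \frac{D}{L} \, 0.31\left(10 \log\left(\frac{P}{K} \frac{h_{threshold}^2}{N_o}\right) - 6.7\right) \right\rfloor & \text{if } h_{ij} \ge h_{threshold}, \\ 0 & \text{otherwise,} \end{cases}$$

where $L$ is the fixed packet length (in bits) as in (1).

$b^p_j = b_j / c_{th}$, $j = 1, \ldots, N$.

For state $(\mathbf{b}^p, C)$ compute an MTLB allocation $W$ (see Appendix I).

Note that the ON subcarriers are treated for a worst-case scenario, so the throughput of user $j$ is equal to $\min\left(b_j, c_{th} \sum_{i=1}^{K} w_{ij}\right)$.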

2) Algo-II (full CSI, On-Off QSI): The subcarrier allocation uses full information about the channel (full CSI) and minimal information about the queue lengths (On-Off QSI). The subcarrier allocation considers only the queues which have some data to transmit. This allocation belongs to the class of IMT policies defined in (4).

Algo-II:
Assign $W = \{w_{ij}\}$ such that $w_{i j^*(i)} = 1$ where $j^*(i) = \arg\max_{j \in \{1, \ldots, N\}} c_{ij} I_{\{b_j > 0\}}$.
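The Algo-II rule can be sketched in a few lines of Python. This is an illustration rather than a reference implementation: the function name is hypothetical, ties are broken toward the lowest user index, and a subcarrier is left unassigned when no backlogged user has positive connectivity on it; neither of these two choices is specified in the text.

```python
def algo2(b, C):
    """Per-subcarrier argmax of c_ij over users with non-empty queues.

    Returns one user index (or None) per subcarrier.
    """
    alloc = []
    for row in C:
        # Connectivity counts only for users that have data to send.
        weights = [c if bj > 0 else 0 for c, bj in zip(row, b)]
        best = max(range(len(b)), key=lambda j: weights[j])
        alloc.append(best if weights[best] > 0 else None)
    return alloc

C = [[1, 2], [1, 0]]
with_backlog = algo2((4, 2), C)   # both users have data
one_empty = algo2((4, 0), C)      # user 2 has an empty queue
```

Note that the rule uses the queue lengths only through the indicator of a non-empty queue, so the allocation for backlogs (4, 2) would be the same for, say, (40, 1).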

3) Algo-III and Algo-IV (full CSI and full QSI): Algo-III is the Maximum-Weight policy proposed in [24], which achieves full throughput for all admissible traffic. Under the Maximum-Weight policy, each subcarrier is assigned to the queue with the highest weighted connectivity, where the weight is the queue backlog size. Thus, when the connectivities of a subcarrier for multiple queues are equal, the subcarrier is given to the longest queue. However, Algo-III may over-assign subcarriers to some users (i.e. the users do not have enough data to send over all assigned subcarriers). With an insight into the significance of load-balancing, we modify the known Algo-III to arrive at Algo-IV. Algo-IV is proposed to avoid this imbalance by accounting, in the weight, for the reduction in queue backlog that results from the packets that will be served by the already assigned subcarriers.

Algo-III:
Assign $W = \{w_{ij}\}$ such that $w_{i j^*(i)} = 1$ where $j^*(i) = \arg\max_{j \in \{1, \ldots, N\}} b_j c_{ij}$.

Algo-IV:

$X = \{1, \ldots, K\}$;
Loop (until stop):
    If $X = \emptyset$, then stop;
    $(i^*, j^*) = \arg\max_{i \in X,\, j \in \{1,\ldots,N\}} b_j c_{ij}$;
    If $b_{j^*} c_{i^* j^*} > 0$, then $w_{i^* j^*} = 1$ else stop;
    $b_{j^*} = b_{j^*} - c_{i^* j^*}$ and $X = X \setminus \{i^*\}$;
Assign $W = [w_{ij}]$.

For both algorithms, note that the throughput of user $j$ is $\min\left(b_j,\; \sum_{i=1}^{K} w_{ij} c_{ij}\right)$.
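To make the difference between Algo-III and Algo-IV concrete, the following Python sketch implements both rules and the throughput formula above. It is a minimal illustration, not the simulation code used in the paper; the function names, the tie-breaking order, and the small example instance are assumptions.

```python
def algo_iii(b, c):
    """Maximum-Weight rule: subcarrier i goes to argmax_j b[j] * c[i][j]."""
    K, N = len(c), len(b)
    W = [[0] * N for _ in range(K)]
    for i in range(K):
        j_star = max(range(N), key=lambda j: b[j] * c[i][j])
        if b[j_star] * c[i][j_star] > 0:  # assumption: idle if every weight is zero
            W[i][j_star] = 1
    return W


def algo_iv(b, c):
    """Greedy rule: after each assignment, reduce the chosen queue's backlog."""
    K, N = len(c), len(b)
    b = list(b)                      # work on a copy of the backlogs
    W = [[0] * N for _ in range(K)]
    X = set(range(K))                # unassigned subcarriers
    while X:
        i_star, j_star = max(((i, j) for i in sorted(X) for j in range(N)),
                             key=lambda ij: b[ij[1]] * c[ij[0]][ij[1]])
        if b[j_star] * c[i_star][j_star] <= 0:
            break                    # nothing left worth serving
        W[i_star][j_star] = 1
        b[j_star] -= c[i_star][j_star]
        X.remove(i_star)
    return W


def throughput(b, c, W):
    """Per-user throughput min(b_j, sum_i w_ij * c_ij)."""
    K = len(c)
    return [min(b[j], sum(W[i][j] * c[i][j] for i in range(K)))
            for j in range(len(b))]


# Hypothetical instance: three identical subcarriers, two queues.
b, c = [4, 3], [[2, 2], [2, 2], [2, 2]]
t3 = throughput(b, c, algo_iii(b, c))   # all subcarriers go to queue 0
t4 = throughput(b, c, algo_iv(b, c))    # the backlog update spreads them out
```

On this instance Algo-III gives every subcarrier to the longest queue and wasts none of the rule's simplicity, but serves only 4 packets, while Algo-IV serves 6 because it accounts for packets already covered by earlier assignments.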

B. Numerical Comparisons and Simulation

We consider a downlink OFDMA system in a single cell with one base station serving N = 32 statistically independent and identical users over K = 128 subcarriers. We generate a frequency-selective channel using a 26-tap multipath model with an exponential intensity profile and use adaptive QAM modulation. We set the parameters P, D, and No in (1) so that the allocation of subcarriers over a block is equivalent to the server scheduling problem where the connectivity $c_{ij} \in \{0, 1, 2, 3, \ldots\}$. All simulations are conducted over 6,000 timeslots. We consider arrivals of fixed-size packets where the number of arrivals per timeslot for each queue is a random variable having one of three distributions: heavy-tailed, Poisson and deterministic.5 The heavy-tailed distribution is said to model realistic packet traffic (e.g. in the Internet), which exhibits long-range dependency [35]. Here, we use the simplest heavy-tailed distribution, the Pareto distribution.6 The heavy-tailed, Poisson and deterministic traffic types represent decreasing degrees of burstiness, respectively.

Figures 3, 4, and 5 compare the performance of the proposed algorithms under the different traffic models in terms of the average total queue backlog (equivalently, in terms of the average delay, by Little's theorem). For all traffic types, Algo-III and Algo-IV, as expected, outperform Algo-I and Algo-II because Algo-III and Algo-IV use both CSI and QSI for the assignment decision. This observation is consistent with [17] and [18], as discussed before. Furthermore, Algo-IV outperforms Algo-III by balancing the queues (at the cost of a higher computational burden). However, the more interesting and important observation is the performance of Algo-I in the light-to-moderate traffic regime (below 6 packets/user/timeslot) for all traffic types. Although Algo-I ignores much of the CSI, it outperforms Algo-II and Algo-III significantly. This is because Algo-II and Algo-III
5 Since we allow only an integer number of packet arrivals in each timeslot, we assume arrivals of $\lfloor \lambda \rfloor$ and $\lceil \lambda \rceil$ packets to allow for a non-integer average number of arrivals $\lambda$.
6 We use a slightly modified Pareto distribution, i.e. its cumulative distribution function is $P(X < x) = 1 - (k/x)^{\alpha}$, where $k$ is the minimum value the random variable $X$ can take, and we set $\alpha = 2$.

Fig. 3. Average queue backlog for the truncated heavy-tailed distribution. [Plot: average queue backlog (pkts/timeslot/user) versus average load (pkts/timeslot/user) for Algo-I, Algo-II, Algo-III, and Algo-IV.]


Fig. 4. Average queue backlog for the Poisson distribution. [Plot: average queue backlog (pkts/timeslot/user) versus average load (pkts/timeslot/user) for Algo-I, Algo-II, Algo-III, and Algo-IV.]

over-assign the capacity. In addition, in this low-to-moderate traffic regime, the traffic load is light enough compared to the system capacity that the load can be sustained even with only On-Off knowledge of the CSI. However, as the traffic intensity increases, the performance of Algo-I degrades sharply, reflecting the policy's maximum stable rate (about 6 to 8 packets/timeslot/user, depending on traffic burstiness). In other words, Algo-II has a larger stability region than Algo-I since Algo-II uses full CSI and its over-assignment problem disappears at heavy traffic. This insight sheds light on the nature of delay performance versus throughput considerations and the benefit of using queue information. When considering light-to-moderate traffic intensity (resulting in reasonable delays), the value of QSI outweighs that of CSI. This means that CSI is critical for high-throughput, delay-insensitive applications, while QSI is vital for delay-sensitive traffic at low-to-moderate throughput.
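For readers who wish to reproduce the heavy-tailed arrivals, the modified Pareto model used in this section, with $P(X < x) = 1 - (k/x)^{\alpha}$, can be sampled by inverse transform. The sketch below is an assumption-laden illustration: the function name, the seed, and the choice $k = 1$ are hypothetical, and the truncation and rounding to integer packet counts used in the simulations are not specified in the text and are therefore omitted.

```python
import random


def sample_pareto(k, alpha, rng):
    """Draw X with P(X < x) = 1 - (k / x)**alpha for x >= k (inverse transform)."""
    u = rng.random()                        # uniform on [0, 1)
    return k / (1.0 - u) ** (1.0 / alpha)   # 1 - u lies in (0, 1], so no division by zero


rng = random.Random(0)                      # fixed seed, for repeatability only
samples = [sample_pareto(k=1.0, alpha=2.0, rng=rng) for _ in range(10000)]

# Empirical check of the CDF at x = 2: the model gives 1 - (1/2)**2 = 0.75.
empirical = sum(x < 2.0 for x in samples) / len(samples)
```

With $\alpha = 2$ the distribution has a finite mean but infinite variance, which is what makes this traffic burstier than the Poisson and deterministic cases.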

Fig. 5. Average queue backlog for the constant arrivals. [Plot: average queue backlog (pkts/timeslot/user) versus average load (pkts/timeslot/user) for Algo-I, Algo-II, Algo-III, and Algo-IV.]

VI. CONCLUSION AND FUTURE RESEARCH

In this paper, we considered the problem of subcarrier allocation in an OFDMA system. We argued that conventional water-filling policies based on maximizing instantaneous throughput are too myopic and ignore important information about the state variable (queue length). We identified a policy (MTLB) that achieves the instantaneous maximum throughput while also balancing the queue lengths. Such a policy always exists when the channel follows a symmetric ON/OFF model. In that case, we conjectured that MTLB achieves the minimum average delay (mean response time) at any time. We proved this when N = 2, while the proof for N ≥ 3 users remains open.

For more realistic channels (general connectivity matrix) but symmetric users, we proposed two heuristic policies (Algo-III and Algo-IV), based on the insights obtained from the On-Off connectivity case, that are expected to be close to the optimal delay performance for a large subset of admissible loads. The policies considered use different degrees of knowledge of the connectivity and queue information (called Channel and Queue State Information, CSI and QSI, respectively). We showed by simulation that the value of CSI and QSI in optimizing the performance depends heavily on the arrival statistics. We showed that, in the low-to-moderate traffic regime and from a delay optimality perspective, balancing the queues is more critical than opportunistically taking advantage of CSI. The opposite becomes true in the heavy traffic regime. Furthermore, the proposed heuristic algorithms (Algo-III and Algo-IV) use both sets of information and perform well in both regimes.

What remains is the extension of the above results to a network of heterogeneous users (in terms of priority, statistics of arrivals and channel quality). We believe that in such systems the fundamental tradeoff between instantaneous throughput and setting up the system for the highest future multi-user diversity remains. The challenge, though, is to understand how heterogeneity of users will impact the notion of balancing. It is useful to note that our symmetry assumption is a statistical notion. This is justified if we assume that variation in the channel and mobility creates a homogeneous environment for all users. Where this is not true, as is the case with low mobility or users in a large cell, the near-far problem and the consequent fairness-throughput issues are challenging problems by themselves, as is evident in [25] and the references therein. We note that our work is complementary to these papers, in that we address the significance of queue information in allocating subcarriers even in the absence of such a near-far problem. Therefore, combining the general connectivity matrix and heterogeneous users in our model to reflect practical systems is an important problem for future research.

ACKNOWLEDGMENT

This research was supported in part by the National Science Foundation ADVANCE Cooperative Agreement No. SBE-0123552. The authors would like to thank Professor Richard Ladner and Tami Tamir for their helpful suggestions.

REFERENCES
[1] R. S. Cheng and S. Verdu, "Gaussian multiaccess channels with ISI: capacity region and multiuser water-filling," IEEE Trans. on Inform. Theory, vol. 39, no. 3, pp. 773-785, May 1993.
[2] S. Kittipiyakul and T. Javidi, "Subcarrier allocation in OFDMA systems: beyond water-filling," 2004 Asilomar Conference on Signals, Systems, and Computers, Nov. 2004.
[3] D. Kivanc and H. Liu, "Computationally efficient bandwidth allocation and power control for OFDMA," IEEE Trans. on Wireless Comm., vol. 2, no. 6, Nov. 2003.
[4] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, "Multiuser OFDM with adaptive subcarrier, bit and power allocation," IEEE JSAC, vol. 17, no. 10, pp. 1747-1757, Oct. 1999.
[5] S. Pietrzyk and G. J. M. Janssen, "Multiuser subcarrier allocation for QoS provision in the OFDMA systems," IEEE VTC 2002 Fall, vol. 2, Sept. 2002.
[6] E. Bakhtiari and B. Khalaj, "A new joint power and subcarrier allocation scheme for multiuser OFDM systems," 14th IEEE Proc. on Personal, Indoor and Mobile Radio Comm., 2003.
[7] G. Zhang, "Subcarrier and bit allocation for real-time services in multiuser OFDM systems," 2004 IEEE International Conf. on Communications, 2004.
[8] Y. Zhang and K. Letaief, "Adaptive resource allocation and scheduling for multiuser packet-based OFDM networks," IEEE ICC 2004, Paris, June 2004.
[9] H. Yin and H. Liu, "An efficient multiuser loading algorithm for OFDM-based broadband wireless systems," Globecom '00, San Francisco, USA, 2000.
[10] M. Ergen, S. Coleri, and P. Varaiya, "QoS aware adaptive resource allocation techniques for fair scheduling in OFDMA based broadband wireless access systems," IEEE Transactions on Broadcasting, vol. 49, no. 4, Dec. 2003.
[11] J. Jang, K. B. Lee, and Y. H. Lee, "Transmit power and bit allocations for OFDM systems in a fading channel," Proc. IEEE GLOBECOM 2003, Dec. 2003.
[12] G. Munz, S. Pfletschinger, and J. Speidel, "An efficient water-filling algorithm for multiple access OFDM," IEEE Globecom '02, Taipei, Taiwan, November 2002.
[13] W. Rhee and J. M. Cioffi, "Increase in capacity of multiuser OFDM system using dynamic subchannel allocation," Proc. of Vehicular Tech. Conf. (VTC), 2000, vol. 2, pp. 1085-1089, May 2000.
[14] G. Song and Y. Li, "Cross-layer optimization for OFDM wireless networks - part I: theoretical framework," IEEE Trans. Wireless Comm., vol. 4, no. 2, March 2005.
[15] J. Gross, J. Klaue, H. Karl, and A. Wolisz, "Subcarrier allocation for variable bit rate video streams in wireless OFDM systems," IEEE VTC, Florida, USA, 2003.
[16] J. Gross, J. Klaue, H. Karl, and A. Wolisz, "Cross-layer optimization of OFDM transmission systems for MPEG-4 video streaming," Computer Communications, 2004.
[17] G. Li and H. Liu, "Dynamic resource allocation with finite buffer constraint in broadband OFDMA networks," IEEE Wireless Comm. and Networking, vol. 2, pp. 1037-1042, March 2003.
[18] G. Song, Y. Li, L. Cimini, Jr., and H. Zheng, "Joint channel-aware and queue-aware data scheduling in multiple shared wireless channels," Globecom 2003, San Francisco, 2003.
[19] A. Czylwik, "Adaptive OFDM for wideband radio channels," Proc. GLOBECOM '96, vol. 1, pp. 713-718.
[20] R. Kumar and P. Varaiya, Stochastic Control, Prentice-Hall, 1986.
[21] L. Tassiulas and A. Ephremides, "Dynamic server allocation to parallel queues with randomly varying connectivity," IEEE Trans. on Info. Theory, vol. 39, no. 2, pp. 466-478, 1993.
[22] A. Ganti, Transmission Scheduling for Multi-Beam Satellite Systems, Doctoral Thesis, Dept. of EECS, MIT, Cambridge, MA, 2003.
[23] N. Ehsan and M. Liu, "Properties of optimal resource sharing in a delay channel," IEEE CDC 2004, Dec. 2004.
[24] T. Javidi, "Rate stable resource allocation in OFDM systems: from waterfilling to queue-balancing," Allerton Conference on Communication, Control, and Computing, September 2004.
[25] Z. Shen, J. G. Andrews, and B. L. Evans, "Adaptive resource allocation in multiuser OFDM systems with proportional fairness," IEEE Trans. on Wireless Communications, Dec. 2005.
[26] D. M. Topkis, Supermodularity and Complementarity, Princeton University Press, USA, 1998.
[27] U. Manber, Introduction to Algorithms: A Creative Approach, Addison-Wesley Publishing Company, 1989.
[28] N. Harvey, R. Ladner, L. Lovasz, and T. Tamir, "Semi-matchings for bipartite graphs and load balancing," Proc. of the Workshop on Algorithms and Data Structures (WADS '03), Ottawa, Canada, July 2003.
[29] K. Mehlhorn and S. Näher, The LEDA Platform of Combinatorial and Geometric Computing, Cambridge University Press, 1999.
[30] B. Hajek, "Optimal control of two interacting service stations," IEEE Trans. Auto. Control, AC-29, pp. 491-499, 1984.
[31] G. Koole, "Convexity in tandem queues," Prob. in Engr. and Info. Sciences, 2004.
[32] E. Altman, B. Gaujal, and A. Hordijk, Discrete-Event Control of Stochastic Networks: Multimodularity and Regularity, Springer-Verlag, Germany, 2003.
[33] A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Applications, Academic Press, USA, 1979.
[34] E. M. Yeh and A. S. Cohen, "Delay optimal rate allocation in multiaccess fading communications," Proc. Allerton Conf. on Communication, Control, and Computing, Monticello, IL, 2004.
[35] Wikipedia, "Long-range dependency," http://en.wikipedia.org/wiki/Heavytail

APPENDIX I
COMPUTATION AND EXISTENCE OF MTLB POLICY

In this section, we prove the existence of MTLB policy under an On-Off channel model. The proof is constructive and uses ideas from graph matching literature. An algorithm to compute MTLB assignment is proposed. This algorithm is based on notions of alternating and balancing paths which are later used to prove the optimality of MTLB. Note that the discussion in this section (hence the existence proof) is valid for general N.

A. Alternating and Balancing Paths

Let $U$ be the set of all queues and $V$ the set of all servers. We have the following definitions:

Definition 5: The ordered interleaved sequence of user-servers $S(W, u_1, u_k) := (u_1, v_1, u_2, \ldots, u_{k-1}, v_{k-1}, u_k)$ is said to be an alternating path from node $u_1$ to node $u_k$ corresponding to allocation $W$ if
a) $u_l \ne u_j$ and $v_l \ne v_j$ for all $l = 1, \ldots, k$, $j = 1, \ldots, k$ and $l \ne j$;
b) both queues $u_l$ and $u_{l+1}$ have connectivity to server $v_l$, i.e. $c_{v_l, u_l} = c_{v_l, u_{l+1}} = 1$; and
c) $W$ assigns server $v_{l-1}$ to queue $u_l$, i.e. $w_{v_{l-1}, u_l} = 1$.
In addition, $S(W, u_1, u_k)$ is called a balancing path from node $u_1$ to node $u_k$ corresponding to allocation $W$ if it also meets condition
d) $b_{u_1} - w_{u_1} \ge b_{u_k} - w_{u_k} + 2$, where $w_{u_l} = \sum_{i=1}^{K} w_{i, u_l}$ for $l = 1, k$.

Definition 6: $W^a$ ($W^b$) is the alternating (balancing) allocation of allocation $W$ along an alternating (balancing) path $S(W, u_1, u_k)$ if queues $u_l$ are reassigned to servers $v_l$, $l = 1, \ldots, k-1$.

An example of an alternating path and an alternating allocation is shown in Figure 6. Notice that the above notions are conceptually similar to the notion of alternating path in the graph matching literature [27]. In particular, our definition of balancing path is related to the notion of cost-reducing path in [28], in that a balancing path is used to reduce the cost of unbalancedness of the queues.
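Definitions 5 and 6 can be illustrated with a short Python sketch that applies an alternating allocation along a path and tests the balancing condition d). The representation (a dictionary from server to queue), the function names, and the example backlogs are assumptions made for illustration; the connectivity condition b) is assumed to hold and is not checked.

```python
def alternating_allocation(assign, path):
    """Return the alternating allocation of `assign` along `path`.

    assign: dict mapping each assigned server to its queue.
    path:   [u1, v1, u2, ..., v_{k-1}, uk], queues and servers interleaved.
    """
    queues, servers = path[0::2], path[1::2]
    # Condition c): server v_l must currently serve queue u_{l+1}.
    for l, v in enumerate(servers):
        if assign.get(v) != queues[l + 1]:
            raise ValueError("not an alternating path for this allocation")
    new_assign = dict(assign)
    for l, v in enumerate(servers):
        new_assign[v] = queues[l]          # reassign server v_l to queue u_l
    return new_assign


def is_balancing(b, assign, path):
    """Condition d): b_{u1} - w_{u1} >= b_{uk} - w_{uk} + 2."""
    def load(u):
        return sum(1 for q in assign.values() if q == u)
    u1, uk = path[0], path[-1]
    return b[u1] - load(u1) >= b[uk] - load(uk) + 2


# The path of Fig. 6, with hypothetical backlogs.
assign = {"v1": "u2", "v2": "u3"}
path = ["u1", "v1", "u2", "v2", "u3"]
new_assign = alternating_allocation(assign, path)
balancing = is_balancing({"u1": 3, "u2": 1, "u3": 1}, assign, path)
```

The reassignment shifts one server from the last queue on the path to the first one while leaving the total number of assigned servers unchanged.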

Fig. 6. Example of an alternating path and the alternating allocation from queue $u_1$ to queue $u_3$. (a) Alternating path $(u_1, v_1, u_2, v_2, u_3)$. The dotted and solid lines show connectivity. The solid lines show the assignment of queues to servers. (b) Alternating allocation.

B. Existence of MTLB: Construction and Computation

Now we propose a graph algorithm to construct an MTLB assignment. We first convert the original graph of queues and servers (Figure 2) into the following Equivalent Bipartite Graph with proper weights on the edges. We then run Maximum Weight Matching (MWM) on the equivalent bipartite graph. In Theorem 2, we show that the resulting assignment satisfies conditions (C1) and (C2); hence, it is MTLB.

Equivalent Bipartite Graph Construction
1) Associated with each queue $j$, construct $m_j = \min\left(b_j, \sum_{i=1}^{K} c_{ij}\right)$ nodes labeled $a_{j1}, a_{j2}, \ldots, a_{j m_j}$.
2) Let $U^{eq} = \{a_{11}, a_{12}, \ldots, a_{1 m_1}, a_{21}, \ldots, a_{N m_N}\}$ be the set of all such nodes.
3) Let $V^{eq} = \{v_i\}_{i=1}^{K}$ be the set of servers.
4) Let $E^{eq} = \{(a_{jm}, v_i) : c_{ij} = 1\}$ be the set of edges representing connectivities.
5) Let $\omega : E^{eq} \to \mathbb{Z}_{++}$, $\omega(a_{jm}, \cdot) = b_j - m + 1$, be the positive integer weight of each edge in $E^{eq}$.
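The construction steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: a brute-force search stands in for a proper maximum weight matching routine (so it only suits tiny instances), the function names are hypothetical, and the example connectivity is read off the caption of Fig. 7 (queue 1 connected to servers A and B, queue 2 to A, B, and C), which is an assumption.

```python
def equivalent_nodes(b, C):
    """Steps 1)-2) and 5): nodes (j, m, weight) with weight b_j - m + 1."""
    K, N = len(C), len(b)
    nodes = []
    for j in range(N):
        m_j = min(b[j], sum(C[i][j] for i in range(K)))
        nodes += [(j, m, b[j] - m + 1) for m in range(1, m_j + 1)]
    return nodes


def mtlb_by_matching(b, C):
    """Exhaustive maximum weight matching between servers and subnodes."""
    K = len(C)
    nodes = equivalent_nodes(b, C)
    best = {"weight": -1, "assign": None}

    def search(i, used, weight, assign):
        if i == K:
            if weight > best["weight"]:
                best["weight"], best["assign"] = weight, dict(assign)
            return
        search(i + 1, used, weight, assign)        # server i stays unmatched
        for idx, (j, _m, w) in enumerate(nodes):
            if idx not in used and C[i][j] == 1:   # step 4): edge only if c_ij = 1
                used.add(idx)
                assign[i] = j
                search(i + 1, used, weight + w, assign)
                used.discard(idx)
                del assign[i]

    search(0, set(), 0, {})
    return best["weight"], best["assign"]


# Instance modeled on Fig. 7: backlogs (5, 4); rows of C are servers A, B, C.
b = [5, 4]
C = [[1, 1], [1, 1], [0, 1]]
weight, assign = mtlb_by_matching(b, C)
leftover = [b[j] - sum(1 for q in assign.values() if q == j) for j in range(len(b))]
```

On this instance the matching gives servers A and B to the first queue and server C to the second, leaving the balanced backlog {3, 3} reported in the caption of Fig. 7.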

Definition 7: [29] Consider a bipartite graph $(U, E, V)$ with weight function $\omega : E \to \mathbb{R}$. A matching $M$ is a subset of $E$ such that no two edges in $M$ share an endpoint. The weight of a matching $M$ is $\omega(M) = \sum_{e \in M} \omega(e)$. A matching $M$ is a maximum weight matching (MWM) if its weight is at least as large as the weight of any other matching.

Definition 8: A subcarrier allocation $W = \{w_{ij}\}$ and a matching $M$ are said to be equivalent when 1) $M$ is a matching on the equivalent bipartite graph, and 2) $w_{ij} = 1$ if and only if there exists $m$ such that $(a_{jm}, v_i)$ is a matching edge, i.e. $(a_{jm}, v_i) \in M$.

Theorem 2 uses the notion of MWM to constructively prove the existence of an MTLB allocation. For that, we need the following lemma:

Lemma 1: Any allocations $W$ and $W'$ in $\mathcal{W}(n, b, C)$, for $n = 1, \ldots, T$, that achieve the maximum throughput $L$ are derived from each other by a sequence of alternating allocations.

Proof: Let $w := \mathbf{1}W$. It is sufficient to show that if both $w + e_i$ and $w + e_k$ are feasible and both have the maximum throughput $L$, then there exists an alternating path $S(w + e_k, u_i, u_k)$. If this is not the case, then $w + e_i + e_k$ must also be feasible. But the throughput of $w + e_i + e_k$ is equal to $L + 1$, which is a contradiction.

Lemma 1 means that any throughput-optimal allocations relate to each other by re-assignment of some servers while the throughput is kept unchanged. This is true because, by the On-Off channel assumption, a subcarrier can serve one packet from any of the connected queues.

Corollary 2: An allocation $W \in \mathcal{W}(n, b, C)$, for $n = 1, \ldots, T$, which satisfies Condition (C1) also satisfies the Load-Balancing Condition (C2) if and only if it has no balancing path.

Proof: First observe that any allocation satisfying Condition (C2) must also satisfy Condition (C1) because, if not, there would be an idle server that could have been assigned to serve one more packet, and the queues would be more balanced using this idle server. With this observation, the corollary holds using Lemma 1.

Theorem 2: A maximum weight matching on the equivalent bipartite graph is equivalent to an MTLB allocation, i.e. the equivalent allocation satisfies both conditions (C1) and (C2).

Proof: Since all weights are positive, the MWM on the equivalent bipartite graph necessarily matches all possible servers and hence the equivalent allocation achieves maximum throughput (condition (C1)). We prove the load-balancing condition (C2) by contradiction. Suppose the maximum weight matching $M$ results in an allocation $W$ that achieves the maximum throughput but does not produce the most balanced queues. Then, from Corollary 2, there must exist a balancing path $S(w, u_j, u_i)$ from some queue $u_j$ to $u_i$ such that $b_j - w_j \ge b_i - w_i + 2$. Let us denote the balancing allocation of $W$ along $S(w, u_j, u_i)$ as $W^b$. Let $M^b$ be the equivalent matching to $W^b$. According to $M$, node $a_{i w_i}$ is matched and $a_{j(w_j + 1)}$ is not, while the reverse is true for $M^b$. Hence, $\omega(M^b) - \omega(M) = b_j - w_j - (b_i - w_i + 1) \ge 1$. But this contradicts the assumption that $M$ is the maximum weight matching.

An example of an MTLB assignment based on the proposed algorithm is shown in Figure 7. It is intuitive to see that maximum weight matching on the equivalent bipartite graph achieves an MTLB assignment. This is because the equivalent bipartite graph in effect expands the individual packets that can possibly be served into nodes and labels each packet with the number of packets waiting behind it (see Figure 7(b)). The maximum weight matching selects the matching that serves the packets with the largest number of

Fig. 7. Example of MTLB construction. (a) Queue lengths and connectivities. (b) The equivalent bipartite graph, with the weights shown at each subnode; e.g. the weights of the edges $(a_{11}, A)$ and $(a_{11}, B)$ are five. The thick edges indicate the maximum weight matching. (c) The edges indicate the resulting MTLB assignment. The leftover queue length after the allocation is {3, 3}.

packets waiting behind them. This achieves maximum throughput and load-balancing at the same time. Over-assignments are avoided since, in the equivalent bipartite graph, only $\min\left(b_j, \sum_{i=1}^{K} c_{ij}\right)$ packets from each node $j$ are expanded.

APPENDIX II
OPTIMALITY OF MTLB POLICY FOR ON-OFF CONNECTIVITY

In this section, we analyze the solution to Problem (P) and show that the MTLB policy is optimal for Problem (P) for symmetric On-Off connectivity when N = 2. We conjecture the optimality for N ≥ 3.

A. Basic Definitions and Notations

We first revisit the notion of the instantaneous cost function $\phi$, introduced in Problem (P). We use a similar framework as in [23], [30] and [31]. We first define a class of functions, $\mathcal{F}$, to which any cost function $\phi(b)$ of the form $\sum_{j=1}^{N} g(b_j)$ belongs, where $g$ is strictly increasing and convex. We then show that if the cost function belongs to $\mathcal{F}$, then the average optimal cost-to-go function (defined in (10)) also belongs to $\mathcal{F}$. The properties of $\mathcal{F}$ are then used to show the optimality of the MTLB policy. For notational convenience, we first define:

Definition 9: Define the function $R_{ij} : \mathbb{Z}^N \to \mathbb{Z}^N$ to be equivalent to a transfer of a packet from queue $u_i$ to queue $u_j$, i.e. $R_{ij}(b) = b - e_i + e_j$, where $e_m$ is a row vector of zeros except for the $m$th element, which is one.

Next we give the definition of $\mathcal{F}$, the class of cost functions we consider:

Definition 10: A function $f : \mathbb{Z}_+^N \to \mathbb{R}$ belongs to the set $\mathcal{F}$ if, for any $i, j \in \{1, \ldots, N\}$, $f$ satisfies:
(B.1) $f(b) \le f(b + e_i)$;
(B.2) $f(b) = f(\sigma(b))$ for any permutation $\sigma$;
(B.3) $f(b + e_i) - f(b) \le f(b + e_i + e_j) - f(b + e_j)$;
(B.4) $2f(b) \le f(b + e_i) + f(b - e_i)$;
(B.5) $2f(b) \le f(R_{ij}(b)) + f(R_{ji}(b))$; and
(B.6) $f(R_{ij}(b)) \le f(b)$ if and only if $b_i \ge b_j + 1$.

(B.1) is a monotonicity condition. (B.2) is permutation invariance. (B.3) is supermodularity [31]. (B.4) is convexity in $b_i$. (B.5) is directional convexity along the line $b_1 + b_2 = $ constant [31]. Conditions (B.3)-(B.5) are the second-order relations related to convexity for discrete functions.7 (B.6) is the balancing advantage and establishes the optimality of the MTLB policy.

Fact 1: Any function of the form $\sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex, belongs to $\mathcal{F}$.

The proof of this fact is based on simple testing of (B.1) to (B.6) and hence is left to the reader.
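The "simple testing" behind Fact 1 can be mimicked numerically. The Python sketch below checks (B.1) to (B.6) on a small grid for one particular choice, $g(x) = x^2$; this choice of $g$, the grid size, and the restriction of (B.4) to (B.6) to points where the shifted vectors stay non-negative (and to $i \ne j$ for the transfer conditions) are assumptions made for illustration. It is a sanity check on one example, not a proof.

```python
from itertools import permutations, product

N, B_MAX = 3, 3                       # hypothetical grid: b in {0, ..., 3}^3


def g(x):
    return x * x                      # one strictly increasing convex g on Z_+ (assumption)


def f(b):
    return sum(g(x) for x in b)


def shift(b, plus=(), minus=()):
    """Return b with +1 on each index in `plus` and -1 on each index in `minus`."""
    b = list(b)
    for i in plus:
        b[i] += 1
    for i in minus:
        b[i] -= 1
    return tuple(b)


def check_fact1():
    for b in product(range(B_MAX + 1), repeat=N):
        assert all(f(b) == f(p) for p in permutations(b))                  # (B.2)
        for i, j in product(range(N), repeat=2):
            assert f(b) <= f(shift(b, [i]))                                # (B.1)
            assert (f(shift(b, [i])) - f(b)
                    <= f(shift(b, [i, j])) - f(shift(b, [j])))             # (B.3)
            if b[i] >= 1:
                assert 2 * f(b) <= f(shift(b, [i])) + f(shift(b, (), [i]))  # (B.4)
            if i != j and b[i] >= 1:
                r_ij = shift(b, [j], [i])          # R_ij(b) = b - e_i + e_j
                assert (f(r_ij) <= f(b)) == (b[i] >= b[j] + 1)             # (B.6)
                if b[j] >= 1:
                    r_ji = shift(b, [i], [j])      # R_ji(b) = b - e_j + e_i
                    assert 2 * f(b) <= f(r_ij) + f(r_ji)                   # (B.5)
    return True


ok = check_fact1()
```

For this $g$, moving a packet from queue $i$ to queue $j$ changes the cost by $2(b_j - b_i + 1)$, which is non-positive exactly when $b_i \ge b_j + 1$, in agreement with (B.6).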

B. Dynamic Programming Formulation

Next we consider a dynamic programming formulation in which $V_n^{\pi}(b, C)$ is the expected cost-to-go at horizon $n$ under Markovian policy $\pi$. Let allocation $W^{\pi}(n, b, C) \in \mathcal{W}(n, b, C)$ denote the allocation at state $(b, C)$ prescribed by policy $\pi$ at time $n$. It is clear that:

$$V_n^{\pi}(b, C) = \phi(b) + E_{a,C}\left[V_{n-1}^{\pi}\left(b + a - \mathbf{1}W^{\pi}(n, b, C), C\right)\right]. \qquad (8)$$

The equivalence of our cost function in (2) and (8) is due to the validity of the dynamic programming theorem for a finite horizon Markov Decision Process (MDP) [20]. Define

$$V_n(b, C) := \min_{\pi \in U_n} V_n^{\pi}(b, C) \qquad (9)$$

to be the minimum cost over all Markovian policies at horizon $n$.


7 Due to the symmetric assumptions (i.e. condition (B.2)) in our model, conditions (B.3)-(B.5) are special cases of the multimodularity condition stated by Hajek [32].

Furthermore, we define the average optimal cost-to-go function as

$$v_n(b) := E_{a,C}\left[V_n(b + a, C)\right]. \qquad (10)$$

In the following proposition we show the iterative structure of $v_n$.

Proposition 1: Given a horizon $n$, the average optimal cost-to-go at time $n$, $v_n(b)$, satisfies the following recursions:

$$v_0(b) = \bar{\phi}(b) \qquad (11)$$
$$v_n(b) = \bar{\phi}(b) + E_{a,C}\left[\min_{w \in \mathcal{W}(n,C)} v_{n-1}\left([b + a - w]^+\right)\right] \qquad (12)$$

where $E_{a,C}$ denotes the expectation with regard to the statistics of arrivals and connectivity and $\bar{\phi}(b) := E_a[\phi(a + b)]$.

Proof: From dynamic programming, we have the following recursion for the optimal cost-to-go $V_n(b, C)$:

$$V_0(b, C) = \phi(b)$$
$$V_n(b, C) = \phi(b) + \inf_{w \in \mathcal{W}(n,b,C)} E_{a,C}\left[V_{n-1}(b + a - w, C)\right] \qquad (13)$$

Since $\mathcal{W}(n, b, C)$ is finite, there exists an optimal packet withdrawal $w^*(n, b, C)$ at time $n$ when the state of the queue backlogs is equal to the vector $b$ and the connectivity profile is $C$. The optimal cost-to-go (13) can then be rewritten as:

$$V_n(b, C) = \phi(b) + \min_{w \in \mathcal{W}(n,b,C)} E_{a,C}\left[V_{n-1}(b + a - w, C)\right] \qquad (14)$$
$$= \phi(b) + E_{a,C}\left[V_{n-1}(b + a - w^*(n, b, C), C)\right]$$
$$= \phi(b) + v_{n-1}\left(b - w^*(n, b, C)\right). \qquad (15)$$

Now taking the expectation of both sides we have:

$$v_n(b) = E_{a,C}\left[V_n(b + a, C)\right]$$
$$= \bar{\phi}(b) + E_{a,C}\left[\min_{w \in \mathcal{W}(n,b+a,C)} v_{n-1}(b + a - w)\right] \qquad (16)$$
$$= \bar{\phi}(b) + E_{a,C}\left[\min_{w \in \mathcal{W}(n,C)} v_{n-1}\left([b + a - w]^+\right)\right], \qquad (17)$$

where the last equality is a result of the fact that for any allocation $W \in \mathcal{W}(n+1, C)$, there exists $W' \in \mathcal{W}(n+1, b, C)$ such that $v_n(b - \mathbf{1}W') = v_n([b - \mathbf{1}W]^+)$. In addition, $v_0(b) = E_{a,C}[V_0(b + a, C)] = \bar{\phi}(b)$.

C. Optimality of the MTLB policy for N = 2 Users

Outline of the Proof

Consider Problem (P) with a finite horizon $T$ given any initial state $I_0 = (b, C)$ and a strictly increasing and convex cost function $\phi(b) = \sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex. We prove Theorem 1 via the following three statements:
(ST1) $v_n$ is strictly increasing, $n = 0, 1, \ldots, T$;
(ST2) $v_n \in \mathcal{F}$ and $v_n$ is strictly increasing $\Rightarrow$ MTLB is optimal at stage $n + 1$; and
(ST3) $v_n \in \mathcal{F}$ and MTLB is optimal at $n + 1$ $\Rightarrow$ $v_{n+1} \in \mathcal{F}$.
Notice that (ST2) and (ST3) allow for an inductive proof of the optimality when $\phi(b) = \sum_{j=1}^{N} g(b_j)$. We prove (ST1) in Lemma 2.

Lemma 2: If $\phi(b) = \sum_{j=1}^{N} g(b_j)$ where $g$ is strictly increasing and convex, then $v_n(b, C)$ is strictly increasing on $b$ for all $n = 0, \ldots, T$.

By using (ST1), we have that any optimal allocation at any stage $n$ must achieve the maximum throughput (condition (C1)) (see Lemma 3 below). Hence, Lemmas 3 and 4 below are sufficient to establish (ST2):

Lemma 3: If $v_n$ is strictly increasing for all $n = 0, \ldots, T - 1$, any optimal allocation satisfies the maximum throughput condition (C1) in the definition of the MTLB policy.

Lemma 4: If $v_n \in \mathcal{F}$ and $v_n$ is strictly increasing, then the MTLB policy is optimal at stage $n + 1$.

The last step is to prove (ST3). However, proving (ST3) requires a more complex development. Due to boundary considerations, it is more convenient to work with a function on $\mathbb{Z}^N$ rather than $\mathbb{Z}_+^N$. We define the following extension:

Definition 11: Consider $f : \mathbb{Z}_+^N \to \mathbb{R}$. We denote $\tilde{f} : \mathbb{Z}^N \to \mathbb{R}$ as an extension of $f$ on $\mathbb{Z}^N$ such that $\tilde{f}(b) = f([b]^+)$. Furthermore, we define an extension $\tilde{\mathcal{F}}$ of $\mathcal{F}$:

$$\tilde{\mathcal{F}} := \left\{ f : \mathbb{Z}^N \to \mathbb{R} : f \text{ meets (B.1) to (B.6)} \right\} \qquad (18)$$

The above extensions together with the optimality of MTLB at $n + 1$ facilitate the proof of (ST3) as follows:

$v_n \in \mathcal{F}$
$\Rightarrow \tilde{v}_n \in \tilde{\mathcal{F}}$ (by Fact 2) (19)
$\Rightarrow E_{a,C}\left[\min_{w \in \mathcal{W}(n+1,C)} \tilde{v}_n(b + a - w)\right]$ satisfies (B.3) to (B.6) (by Lemmas 6 and 7) (20)
$\Rightarrow \tilde{v}_{n+1} \in \tilde{\mathcal{F}}$ (by Lemmas 2, 5 and Fact 3) (21)
$\Rightarrow v_{n+1} \in \mathcal{F}$ (by Fact 4) (22)

where the facts and lemmas are listed below. All the facts are from [23] and [30], and the lemmas are proved in Appendix III.

Fact 2: If $f \in \mathcal{F}$, then the function $\tilde{f} : \mathbb{Z}^N \to \mathbb{R}$ defined as $\tilde{f}(b) = f([b]^+)$ is in $\tilde{\mathcal{F}}$.

Fact 3: If $f_1, f_2, \ldots$ are a sequence of functions that belong to $\tilde{\mathcal{F}}$, then $h(b) = \sum_l p_l f_l(b)$ also belongs to $\tilde{\mathcal{F}}$, where $p_l$ are constants.

Fact 4: If $f \in \tilde{\mathcal{F}}$, then the restriction of $f$ to the non-negative domain is in $\mathcal{F}$.

Fact 5: If $f_1, f_2, \ldots$ are a sequence of functions that belong to $\mathcal{F}$, then $h(b) = \sum_l p_l f_l(b)$ also belongs to $\mathcal{F}$, where $p_l$ are constants.

Lemma 5: If $\phi(b) = \sum_{j=1}^{N} g(b_j)$ where $g$ is a strictly increasing and convex function, then $v_n(b, C)$ is permutation invariant on $b$ for all $n = 0, \ldots, T$.

Lemma 6: If $v_n \in \mathcal{F}$ and MTLB is optimal at stage $n + 1$, then $E_{a,C}\left[\min_{w \in \mathcal{W}(n+1,C)} \tilde{v}_n(b + a - w)\right]$ satisfies (B.3) to (B.5).

Lemma 7: If $v_n \in \mathcal{F}$ and MTLB is optimal at stage $n + 1$, then $E_{a,C}\left[\min_{w \in \mathcal{W}(n+1,C)} \tilde{v}_n(b + a - w)\right]$ satisfies (B.6).

Proof of Theorem 1

Proof: With the above outline, we are ready to show that MTLB is optimal for all stages $n + 1$, $n = 0, \ldots, T - 1$. By (ST2), it suffices to show that $v_n \in \mathcal{F}$ and that $v_n$ is strictly increasing for all $n$. We show this by induction. Note that, by Lemma 2, we already have (ST1), i.e. $v_n$ is strictly increasing for all $n$.

Basis of Induction: By Fact 1, the cost function $\phi(b) = \sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex, belongs to $\mathcal{F}$. From (11), $v_0(b) = \bar{\phi}(b) = \sum_a P_a(a)\, \phi(b + a)$. By Fact 5, $v_0(b)$ belongs to $\mathcal{F}$.

Induction Step: Suppose $v_n \in \mathcal{F}$. By (ST2), MTLB is optimal at stage $n + 1$. Hence, by (ST3), we have $v_{n+1} \in \mathcal{F}$. Note that (ST3) is established by showing (19) to (22). Since the proofs of (19), (20) and (22) are straightforward from the stated lemmas and facts, here we focus on (21). Notice that

$$\tilde{v}_{n+1}(b) = \bar{\phi}([b]^+) + E_{a,C}\left[\min_{w \in \mathcal{W}(n+1,C)} \tilde{v}_n(b + a - w)\right].$$

Since $\bar{\phi} \in \mathcal{F}$, we have that $\bar{\phi}([b]^+)$ is in $\tilde{\mathcal{F}}$ by Fact 2. Because $E_{a,C}\left[\min_{w \in \mathcal{W}(n+1,C)} \tilde{v}_n(b + a - w)\right]$ meets (B.3) to (B.6) by (20), we have that $\tilde{v}_{n+1}$ also has the same properties (by Fact 3). In addition, by Lemmas 2 and 5, $\tilde{v}_{n+1}$ also has the properties (B.1) and (B.2). Therefore, $\tilde{v}_{n+1} \in \tilde{\mathcal{F}}$.

D. Optimality of the MTLB policy for N > 2

Proving the optimality of the MTLB policy for N > 2 users is more difficult and involved. We conjecture the following:

Conjecture 1: If the cost function $\phi$ is a strictly increasing Schur-convex8 function [33], then $v_n$ is a strictly increasing Schur-convex function for all $n$.

Given this conjecture, the proof of optimality for N > 2 is similar to Lemma 4. In the lemma, we would have $b - w^1 \succ b - w^2$ and hence $v_n(b - w^1) \ge v_n(b - w^2)$. We believe that the proof of the conjecture would need a combination of sample path arguments and majorization techniques (see [34]) and is a topic of further research.

APPENDIX III
SUPPORTING LEMMAS

In this appendix we establish the proofs for Lemmas 2 to 7 stated above. The first lemma establishes (ST1) and that $v_n$ satisfies condition (B.1) for all $n = 0, \ldots, T$.

Lemma 2: If $\phi(b) = \sum_{j=1}^{N} g(b_j)$ where $g$ is a strictly increasing and convex function, then $v_n(b, C)$ is strictly increasing on $b$ for all $n = 0, \ldots, T$, i.e. $b' > b \Rightarrow v_n(b') > v_n(b)$.


Proof: Since $v_n$ is linearly related to $V_n$ by (10), it suffices to show the strict monotonicity of $V_n(b, C)$. We show this by induction.

Induction Basis: $V_0(b, C) = \phi(b) = \sum_{j=1}^{N} g(b_j)$ is strictly increasing by the assumption on $g$.

8 For any $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$, let $x_{[1]} \ge \ldots \ge x_{[N]}$ denote the components of $x$ in decreasing order. For $x, y \in \mathbb{R}^N$, $x$ is said to be weakly majorized by $y$, written $x \prec_w y$, if $\sum_{i=1}^{k} x_{[i]} \le \sum_{i=1}^{k} y_{[i]}$, $k = 1, \ldots, N$. If, in addition, equality obtains for $k = N$, $x$ is said to be majorized by $y$, written $x \prec y$. A function $f : \mathbb{R}^N \to \mathbb{R}$ is Schur-convex if $x \prec y \Rightarrow f(x) \le f(y)$.

Induction Step: Assume $V_{n-1}(b', C) > V_{n-1}(b, C)$ for $b' > b$; then

$$V_n(b', C) = \phi(b') + \min_{w' \in \mathcal{W}(n,b',C)} E_{a,C}\left[V_{n-1}(b' + a - w', C)\right]$$
$$\ge \phi(b') + \min_{w' \in \mathcal{W}(n,b',C)} E_{a,C}\left[V_{n-1}(b + a - \hat{w}(w'), C)\right]$$
$$\ge \phi(b') + \min_{w \in \mathcal{W}(n,b,C)} E_{a,C}\left[V_{n-1}(b + a - w, C)\right]$$
$$> \phi(b) + \min_{w \in \mathcal{W}(n,b,C)} E_{a,C}\left[V_{n-1}(b + a - w, C)\right]$$
$$= V_n(b, C),$$

where, for $w' \in \mathcal{W}(n, b', C)$, we define $\hat{w}(w') \in \mathcal{W}(n, b, C)$ as the allocation that assigns to each queue $j$ the same number of servers as $w'$ unless the queue is empty, in which case it assigns $w'_j - (b'_j - b_j)$. In light of this, the first inequality holds by the induction hypothesis and the fact that $\hat{w}(w') \ge w' - (b' - b)$. The second inequality holds because $\hat{w}(w') \in \mathcal{W}(n, b, C)$. The third inequality is a result of the strict monotonicity of $\phi$.

Lemmas 3 and 4 provide (ST2).

Lemma 3: If $v_n$ is strictly increasing for all $n = 0, \ldots, T - 1$, any optimal allocation satisfies condition (C1) in the definition of the MTLB policy.

Proof: (by contradiction) Assume $W \in \mathcal{W}(n+1, b, C)$ is optimal but does not satisfy (C1), i.e. $\sum_{i,j} w_{ij} = l < L$, where $L$ is the maximum achievable throughput. Now assume, without loss of generality, that $b_1 - w_1 = 0$ and $b_2 - w_2 > 0$ and there is at least one idle server $p$ such that $c_{p1} > 0$. Note that if there is an idle server $i$ connected to $u_2$, we are done since $\mathbf{1}(W + E_{i2}) \in \mathcal{W}(n+1, b, C)$, where $E_{i2}$ is a matrix of all zeros except element $(i, 2)$, which is 1. In other words, $v_n(b - \mathbf{1}(W + E_{i2})) < v_n(b - \mathbf{1}W)$ by the strict monotonicity of $v_n$ (Lemma 2), a contradiction with the optimality of $W$.

Now we argue that there must exist a non-idle server $v$ that is connected to both $u_1$ and $u_2$ but assigned to $u_1$, i.e. $S(W, u_2, u_1) = \{u_2, v, u_1\}$ is an alternating path. Note that if such a server does not exist, $u_2$ has used up all its connected servers and thus $l$ is the maximum throughput, a contradiction. Now, let $W^a$ be the alternating allocation of $W$ along $S(W, u_2, u_1)$, i.e. $\mathbf{1}W^a = \mathbf{1}W - e_1 + e_2$. Notice that, under $W^a$, one packet remains in $u_1$ and thus the packet can be assigned to server $p$. Hence, $\mathbf{1}W^a + e_1 \in \mathcal{W}(n+1, b, C)$. But $\mathbf{1}W^a + e_1$ is nothing but $\mathbf{1}W + e_2$. In other words, we have shown that $\mathbf{1}W' = \mathbf{1}W + e_2 \in \mathcal{W}(n+1, b, C)$. Hence, $v_n(b - \mathbf{1}W') < v_n(b - \mathbf{1}W)$ by the strict monotonicity of $v_n$, a contradiction with the optimality of $W$.


Lemma 4: If $v_n(b) \in \mathcal{F}$ and $v_n(b)$ is strictly increasing in $b$, then the MTLB policy is optimal at stage $n+1$.

Proof: By Lemma 3, the optimal allocation $w^*(n+1, b, C) \in \mathcal{W}(n+1, b, C)$ achieves the maximum throughput (C1). However, among all allocations achieving the maximum throughput, we need to show that the most balanced allocation $w^{MTLB}$ achieves the minimum cost function $V_{n+1}(b, C)$ (Eqn. (15)). In other words, $v_n(b - w^{MTLB}) = \min_{w \in \mathcal{W}(n+1,b,C)} v_n(b - w)$. Assume there exist two possible allocations we can choose from, $w^1$ and $w^2 \in \mathcal{W}(n+1, b, C)$, such that $w^2 = R_{21}(w^1)$ and $b_1 - w^1_1 \ge b_2 - w^1_2 + 2$. Since $v_n$ meets condition (B.6), we have $v_n(b - w^1) \ge v_n(R_{12}(b - w^1)) = v_n(b - w^2)$. Thus, configuration $b - w^2$, which is more balanced than configuration $b - w^1$, achieves lower (or the same) average cost $v_n(\cdot)$ at stage $n+1$. Repeating this process in a finite number of steps, we arrive at an optimal allocation which is MTLB.

The rest of the appendix provides Lemmas 5 to 7 necessary to establish (ST3) as discussed in (19) to (22) in the outline of the proof of the main theorem. The next lemma shows that $v_n$ satisfies (B.2) for all $n = 0, \ldots, T$.

Lemma 5: If $\phi(b) = \sum_{j=1}^{N} g(b_j)$ where $g$ is a strictly increasing and convex function, then $v_n(b, C)$ is permutation invariant on $b$ for all $n = 0, \ldots, T$, i.e. $v_n(\pi(b)) = v_n(b)$.


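Proof: Again, it suffices to show the permutation invariance property of $V_n(b, C)$.

Induction Basis: $V_0(b, C) = \phi(b) = \sum_{j=1}^{N} g(b_j)$ is clearly permutation invariant.

Induction Step: Assume $V_{n-1}(\pi(b), \pi(C)) = V_{n-1}(b, C)$, then

$$\begin{aligned}
V_n(\pi(b), \pi(C)) &= \phi(\pi(b)) + \min_{w \in \mathcal{W}(n,\pi(b),\pi(C))} E_{a,C}\left[V_{n-1}(\pi(b) + a - w, C)\right] \\
&= \phi(b) + \min_{w \in \mathcal{W}(n,\pi(b),\pi(C))} E_{a,C}\left[V_{n-1}(\pi(b) + \pi(a) - w, \pi(C))\right] \\
&= \phi(b) + \min_{w \in \mathcal{W}(n,b,C)} E_{a,C}\left[V_{n-1}(\pi(b) + \pi(a) - \pi(w), \pi(C))\right] \\
&= \phi(b) + \min_{w \in \mathcal{W}(n,b,C)} E_{a,C}\left[V_{n-1}(b + a - w, C)\right] \\
&= V_n(b, C),
\end{aligned}$$

where the second equality is a direct result of Assumptions (A1) and (A2). The third equality holds since $W \in \mathcal{W}(n, b, C) \iff \pi(W) \in \mathcal{W}(n, \pi(b), \pi(C))$. The fourth equality follows from the induction hypothesis.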
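The relabeling argument behind the third equality can be illustrated numerically. The sketch below is a simplified, deterministic one-step version and is not the paper's recursion: it omits arrivals and the expectation, assumes each server serves at most one packet of one connected queue, and takes $g(x) = x^2$ as a hypothetical choice of a strictly increasing convex $g$. It checks that the one-step minimum cost is unchanged when the queue labels of both $b$ and $C$ are permuted together.

```python
from itertools import permutations, product


def service_vectors(b, C):
    """Per-queue service counts reachable under the assumed one-packet-per-server model."""
    n_queues = len(b)
    out = set()
    for choice in product(range(-1, n_queues), repeat=len(C)):  # -1 = idle server
        if any(q >= 0 and not C[i][q] for i, q in enumerate(choice)):
            continue  # server not connected to the chosen queue
        w = tuple(sum(1 for q in choice if q == j) for j in range(n_queues))
        if all(wj <= bj for wj, bj in zip(w, b)):
            out.add(w)
    return out


def phi(b):
    """Separable cost sum_j g(b_j) with the assumed g(x) = x^2."""
    return sum(x * x for x in b)


def one_step_cost(b, C):
    """Minimum of phi(b - w) over all feasible service vectors w."""
    return min(phi([bj - wj for bj, wj in zip(b, w)]) for w in service_vectors(b, C))


def permuted(b, C, perm):
    """Apply the same queue relabeling to the backlog vector and to the columns of C."""
    pb = tuple(b[j] for j in perm)
    pC = [tuple(row[j] for j in perm) for row in C]
    return pb, pC


def invariant_under_all_permutations(b, C):
    """True when the one-step cost is the same for every joint relabeling of (b, C)."""
    base = one_step_cost(b, C)
    return all(one_step_cost(*permuted(b, C, perm)) == base
               for perm in permutations(range(len(b))))
```

The invariance holds because relabeling maps the feasible set for $(b, C)$ one-to-one onto the feasible set for $(\pi(b), \pi(C))$, which is the same fact the proof uses.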


Next, we show that $E_{a,C}\left[\min_{w \in \mathcal{W}(n+1,C)} v_n(b + a - w)\right]$ satisfies (B.3) to (B.6) whenever $v_n \in \mathcal{F}$. Without loss of generality, we consider the case $i = 1$ and $j = 2$ in (B.3) to (B.6). But before we proceed, for notational simplicity, we write

$$T_n^{a,C}(b) := \min_{w \in \mathcal{W}(n+1,C)} v_n(b + a - w). \tag{23}$$

We also introduce the following definition.

Definition 12: Define $\mathcal{W}^*(n+1, b, C)$ to be the set of all optimal allocations at time $n+1$, when the state of the system at time $n+1$ is $(b, C)$. In other words,

$$\mathcal{W}^*(n+1, b, C) := \left\{ W \in \mathcal{W}(n+1, C) : v_n([b - \mathbf{1}W]^+) = \min_{W' \in \mathcal{W}(n+1,C)} v_n([b - \mathbf{1}W']^+) \right\}. \tag{24}$$

We are now ready to show the following lemma.

Lemma 6: If $v_n \in \mathcal{F}$ and MTLB is optimal at stage $n+1$, then, for any state $b$, $E_{a,C}\left[T_n^{a,C}(b)\right]$ satisfies (B.3), (B.4), and (B.5).

Proof: By using Fact 3 in Appendix II-C, it suffices to show that $T_n^{a,C}(b)$ satisfies conditions (B.3) to (B.5) for any realization $(a, C)$ of the arrival and connectivity processes. For short notation, let $b' := a + b$. There exists an MTLB allocation $w^* \in \mathcal{W}(n+1, C)$ such that $w^* \in \mathcal{W}^*(n+1, b', C) \cap \mathcal{W}^*(n+1, b' + e_1, C) \cap \mathcal{W}^*(n+1, b' + e_1 + e_2, C)$. This is because 1) adding one packet to each queue does not create any balancing paths, i.e. if $w^* \in \mathcal{W}^*(n+1, b', C)$, then $w^* \in \mathcal{W}^*(n+1, b' + e_1 + e_2, C)$; and 2) $w^* \in \mathcal{W}^*(n+1, b', C)$ can be chosen such that it gives priority to serving queue $u_1$, hence adding one packet to $u_1$ does not create any balancing paths, i.e. $w^* \in \mathcal{W}^*(n+1, b' + e_1, C)$. For notational simplicity, let $d := b' - w^* \in \mathbb{Z}^2$.

(i) $T_n^{a,C}(b)$ satisfies (B.3):

$$\begin{aligned}
& T_n^{a,C}(b + e_1 + e_2) + T_n^{a,C}(b) - T_n^{a,C}(b + e_1) - T_n^{a,C}(b + e_2) \\
&\quad = \min_{w \in \mathcal{W}(n+1,C)} v_n(b' + e_1 + e_2 - w) + \min_{w \in \mathcal{W}(n+1,C)} v_n(b' - w) - \min_{w \in \mathcal{W}(n+1,C)} v_n(b' + e_1 - w) - \min_{w \in \mathcal{W}(n+1,C)} v_n(b' + e_2 - w) \\
&\quad \ge v_n(d + e_1 + e_2) + v_n(d) - v_n(d + e_1) - v_n(d + e_2) \ge 0,
\end{aligned}$$

where the last inequality holds because $v_n \in \mathcal{F}$ and hence satisfies condition (B.3).


(ii) $T_n^{a,C}(b)$ satisfies (B.4): We need to show the non-negativity of

$$\begin{aligned}
T_n^{a,C}(b + e_1) - 2T_n^{a,C}(b) + T_n^{a,C}(b - e_1)
&= \min_{w \in \mathcal{W}(n+1,C)} v_n(b' + e_1 - w) - 2 \min_{w \in \mathcal{W}(n+1,C)} v_n(b' - w) + \min_{w \in \mathcal{W}(n+1,C)} v_n(b' - e_1 - w) \\
&= v_n(d + e_1) - 2v_n(d) + \min_{w \in \mathcal{W}(n+1,C)} v_n(b' - e_1 - w).
\end{aligned} \tag{25}$$

We consider two cases.

Case 1: If $w^* \in \mathcal{W}^*(n+1, b' - e_1, C)$, then we are done.

Case 2: $w^* \notin \mathcal{W}^*(n+1, b' - e_1, C)$. This means that a path $S(w^*, u_2, u_1)$ exists, $d_2 = d_1 + 1$, and $w^a \in \mathcal{W}^*(n+1, b' - e_1, C)$, where $w^a := \mathbf{1}W^a$ and $W^a$ is the alternating allocation of $w^*$ along $S(w^*, u_2, u_1)$. In other words, $w^a = R_{12}(w^*) = w^* - e_1 + e_2$. Now, (25) becomes

$$\begin{aligned}
v_n(d + e_1) - 2v_n(d) + \min_{w \in \mathcal{W}(n+1,C)} v_n(b' - e_1 - w)
&= v_n(d + e_1) - 2v_n(d) + v_n(b' - e_1 - w^a) \\
&= v_n(d + e_1) - 2v_n(d) + v_n(d - e_2) \\
&= v_n(d + e_1) - v_n(d) - v_n(\pi_{12}(d)) + v_n(d - e_2) \\
&= v_n(d + e_1) - v_n(d) - v_n(d + e_1 - e_2) + v_n(d - e_2) \ge 0,
\end{aligned}$$

where the second equality holds because $b' - e_1 - w^a = d - e_2$, the fourth equality holds because $d_2 = d_1 + 1$, and the last inequality holds because $v_n \in \mathcal{F}$ and hence satisfies condition (B.3).

(iii) $T_n^{a,C}(b)$ satisfies (B.5): We need to show the non-negativity of

$$\begin{aligned}
T_n^{a,C}(R_{12}(b)) - 2T_n^{a,C}(b) + T_n^{a,C}(R_{21}(b))
&= \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{12}(b') - w) - 2 \min_{w \in \mathcal{W}(n+1,C)} v_n(b' - w) + \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{21}(b') - w) \\
&= \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{12}(b') - w) - 2v_n(d) + \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{21}(b') - w).
\end{aligned} \tag{26}$$

Case 1: If $w^* \in \mathcal{W}^*(n+1, R_{21}(b'), C) \cap \mathcal{W}^*(n+1, R_{12}(b'), C)$, then we are done.


Case 2: If $w^* \notin \mathcal{W}^*(n+1, R_{21}(b'), C)$, then a path $S(w^*, u_1, u_2)$ exists. In addition, $d_1 = d_2$ because $w^*$ is chosen to give priority to serving $u_1$. After an alternating allocation of $w^*$ along $S(w^*, u_1, u_2)$, we have $R_{21}(w^*) \in \mathcal{W}^*(n+1, R_{21}(b'), C)$. Thus, for this case, (26) becomes:

$$\begin{aligned}
\min_{w \in \mathcal{W}(n+1,C)} v_n(R_{12}(b') - w) - 2v_n(d) + \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{21}(b') - w)
&= \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{12}(b') - w) - 2v_n(d) + v_n(d) \\
&= \min_{w \in \mathcal{W}(n+1,C)} v_n(R_{12}(b') - w) - v_n(d).
\end{aligned} \tag{27}$$

Next we consider the following two subcases:

Case 2.1: If $w^* \in \mathcal{W}^*(n+1, R_{12}(b'), C)$, then (27) is equal to $v_n(R_{12}(d)) - v_n(d) \ge 0$, because $d_1 = d_2$ and $v_n \in \mathcal{F}$ and hence satisfies condition (B.6).

Case 2.2: If $w^* \notin \mathcal{W}^*(n+1, R_{12}(b'), C)$, then a path $S(w^*, u_2, u_1)$ exists. After an alternating allocation of $w^*$ along $S(w^*, u_2, u_1)$, we have $R_{12}(w^*) \in \mathcal{W}^*(n+1, R_{12}(b'), C)$. Thus, (27) is equal to zero.

Case 3: If $w^* \in \mathcal{W}^*(n+1, R_{21}(b'), C)$ but $w^* \notin \mathcal{W}^*(n+1, R_{12}(b'), C)$, then, by the permutation invariance property of $v_n$, this is the same as Case 2.1, where $w^* \notin \mathcal{W}^*(n+1, R_{21}(b'), C)$ but $w^* \in \mathcal{W}^*(n+1, R_{12}(b'), C)$.

Lemma 7: Assume $v_n \in \mathcal{F}$ and MTLB is optimal at stage $n+1$. For any state $b$ such that $b_1 \ge b_2 + 1$, $E_{a,C}\left[T_n^{a,C}(b)\right]$ satisfies condition (B.6).

Proof: For notational simplicity, let us define, for any $(a, C)$,

$$Z^{a,C}(b) := T_n^{a,C}(b) - T_n^{a,C}(R_{12}(b)). \tag{28}$$

It is easy to see that, for some $(a, C)$, $Z^{a,C}(b)$ can be negative. However, because of the joint permutation invariance of the pmfs of the arrival and connectivity processes (Assumptions A1 and A2), showing that $E_{a,C}\left[T_n^{a,C}(b)\right]$ meets condition (B.6) is equivalent to showing the non-negativity of

$$\begin{aligned}
E_{a,C}\left[T_n^{a,C}(b)\right] - E_{a,C}\left[T_n^{a,C}(R_{12}(b))\right]
&= E_{a,C}\left[T_n^{a,C}(b) - T_n^{a,C}(R_{12}(b))\right] \\
&= E_{a,C}\left[Z^{a,C}(b)\right] \\
&= \tfrac{1}{2} E_{a,C}\left[Z^{a,C}(b) + Z^{\pi_{12}(a),\pi_{12}(C)}(b)\right].
\end{aligned}$$

Thus, it suffices to show that, for any $(a, C)$ and $b_1 \ge b_2 + 1$, $Z^{a,C}(b) + Z^{\pi_{12}(a),\pi_{12}(C)}(b) \ge 0$.


We show this by noticing that

$$\begin{aligned}
Z^{a,C}(b) + Z^{\pi_{12}(a),\pi_{12}(C)}(b)
&= Z^{a,C}(b) + T_n^{\pi_{12}(a),\pi_{12}(C)}(b) - T_n^{\pi_{12}(a),\pi_{12}(C)}(R_{12}(b)) \\
&= Z^{a,C}(b) + T_n^{a,C}(\pi_{12}(b)) - T_n^{a,C}(\pi_{12}(R_{12}(b))) \\
&= T_n^{a,C}(b^0) - T_n^{a,C}(b^1) + T_n^{a,C}(b^M) - T_n^{a,C}(b^{M-1}),
\end{aligned} \tag{29}$$

where $M := b_1 - b_2$ $(\ge 1)$ and $b^m := b - m e_1 + m e_2$, for $m = 0, \ldots, M$. The first and third equalities follow from (28) and the second equality from the permutation invariance property. Note that $\pi_{12}(b) = b_2 e_1 + b_1 e_2 = b - (b_1 - b_2)e_1 + (b_1 - b_2)e_2 = b^M$, $\pi_{12}(R_{12}(b)) = b^{M-1}$, $b^{m+1} = R_{12}(b^m)$, and $b^{m-1} = R_{21}(b^m)$. Now notice that if $M = 1$, then (29) is zero. If $M \ge 2$, we have

$$\begin{aligned}
T_n^{a,C}(b^0) - T_n^{a,C}(b^1) - T_n^{a,C}(b^{M-1}) + T_n^{a,C}(b^M)
&= \sum_{m=1}^{M-1} \left[ T_n^{a,C}(b^{m-1}) - 2T_n^{a,C}(b^m) + T_n^{a,C}(b^{m+1}) \right] \\
&= \sum_{m=1}^{M-1} \left\{ T_n^{a,C}(R_{21}(b^m)) - 2T_n^{a,C}(b^m) + T_n^{a,C}(R_{12}(b^m)) \right\} \\
&\ge 0,
\end{aligned}$$

where the inequality holds because $v_n \in \mathcal{F}$ and, from Lemma 6, $T_n^{a,C}(\cdot)$ satisfies condition (B.5).
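The telescoping step at the end of this proof is a purely algebraic identity, so it can be checked for any function on the two-queue lattice. The sketch below assumes $R_{12}(b) = b - e_1 + e_2$, $R_{21}(b) = b + e_1 - e_2$, and $\pi_{12}$ swapping the two components, as used above. The function `T_example` is an arbitrary hypothetical stand-in for $T_n^{a,C}$, not the paper's value function.

```python
def R12(b):
    """Move one packet from queue 1 to queue 2: b - e1 + e2."""
    return (b[0] - 1, b[1] + 1)


def R21(b):
    """Move one packet from queue 2 to queue 1: b + e1 - e2."""
    return (b[0] + 1, b[1] - 1)


def pi12(b):
    """Swap the two queue lengths."""
    return (b[1], b[0])


def chain(b):
    """Points b^m = b - m*e1 + m*e2 for m = 0..M, with M = b1 - b2 >= 1."""
    M = b[0] - b[1]
    return [(b[0] - m, b[1] + m) for m in range(M + 1)]


def boundary_terms(T, b):
    """T(b^0) - T(b^1) - T(b^(M-1)) + T(b^M), the left-hand side of the identity."""
    pts = chain(b)
    return T(pts[0]) - T(pts[1]) - T(pts[-2]) + T(pts[-1])


def summed_second_differences(T, b):
    """Sum over m = 1..M-1 of T(R21(b^m)) - 2*T(b^m) + T(R12(b^m))."""
    pts = chain(b)
    return sum(T(R21(p)) - 2 * T(p) + T(R12(p)) for p in pts[1:-1])


def T_example(x):
    """Arbitrary integer-valued test function (hypothetical)."""
    return x[0] ** 2 + 3 * x[1] ** 2 + x[0] * x[1]
```

With `b = (5, 1)` the chain has $M = 4$, and `boundary_terms(T_example, b)` agrees with `summed_second_differences(T_example, b)`. With `b = (3, 2)` the chain has $M = 1$, the sum is empty, and both sides are zero, matching the remark that (29) vanishes when $M = 1$. Non-negativity of the sum is a separate matter: it needs each summand to satisfy (B.5), which the identity alone does not provide.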
