2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom) | 978-1-6654-1485-2/20/$31.00 ©2020 IEEE |
for game interactions. Using this system, players can play high-end video games on common PCs, tablets and mobile phones without dedicated hardware equipped. Moreover, cloud gaming allows players to start playing games instantly without time-consuming software downloads and configurations. For all these benefits, cloud gaming has attracted attention from both academia and industry. Several cloud gaming services are currently in operation, such as GeForce NOW [1], LiquidSky [2], Simply [3] and Vortex [4].

As cloud gaming has dynamic workloads and intensive resource demands, it is naturally suitable to deploy the service on public cloud infrastructures such as Amazon EC2 [5] and Microsoft Azure [6], due to their elastic and on-demand nature of resource provisioning. The cloud gaming service providers (CGSPs) can dynamically rent resources from cloud providers,

video streaming services such as YouTube and Netflix, cloud gaming is more interactive and delay-sensitive. Therefore, the constraints to be satisfied during resource allocation are more stringent. Second, running games generally should not be allowed to be migrated among servers due to interruption to gameplay and therefore requires a sufficiently intelligent strategy to efficiently allocate resources. Third, the workload in cloud gaming is highly dynamic, and thus the resource allocation strategy should also be able to adapt to changes efficiently.

In this paper, we investigate the resource allocation issues in geo-distributed cloud gaming from the perspective of CGSPs. Specifically, we study the game request dispatching problem with the goal of minimizing the resource rental cost, while providing satisfactory services to players. The request dispatching
219
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY SURATHKAL. Downloaded on November 08,2023 at 17:50:24 UTC from IEEE Xplore. Restrictions apply.
including minimizing network traffic and latency, as well as maximizing throughput and load balancing [27], [28]. Stolyar et al. [29] applied a shadow routing approach to large-scale service systems, but without packing constraints. Although [8], [30] considered packing constraints when using shadow routing, their objectives are load balancing. In this paper, we instead aim to minimize the total server cost.

III. PROBLEM STATEMENT

A. System Model

Consider a cloud gaming platform that provides services for $I$ different games, which are indexed by $i \in \mathcal{I} = \{1, \dots, I\}$. For each game $i \in \mathcal{I}$, we assume that the play requests for game $i$ arrive at a rate of $\lambda_i$ (i.e., there will be $\lambda_i$ requests per minute on average). The period when a game is running is referred to as a session. The average session length of game $i$ is denoted by $1/\mu_i$. Running games consumes several types of resources such as CPU, GPU, memory, etc. Suppose we are concerned about $K$ types of resources, indexed by $k \in \mathcal{K} = \{1, \dots, K\}$. Let $a_{i,k}$ denote the resource demand of game $i$ for resource $k$. Streaming the encoded video consumes bandwidth. Let $b_i$ denote the bit rate for streaming the encoded video of game $i$. For brevity, we assume all the requests for the same game demand the same amount of resources. In practice, players may have different play settings (e.g., different resolutions), making the real resource consumption slightly different from each other, even if they play the same game. Our model can be applied to this case by conceptually considering different play settings as different games.

Suppose the cloud gaming platform is deployed on $J$ geographically distributed DCs, which are indexed by $j \in \mathcal{J} = \{1, \dots, J\}$. The CGSP rents cloud servers (virtual machines) from these DCs to run games and stream the corresponding output videos. For brevity, we assume the servers in the same DC have the same type. Our analysis and proposed methods can easily be applied to the case where each DC has multiple server types, by conceptually considering the servers of different types as servers in different DCs. As the cloud is elastic, we assume there is a larger supply of servers from each DC than our demand. Denote by $B$ the bandwidth capacity of each DC. Although we assume different DCs have the same $B$, our models can easily be extended to the scenario with different $B$ for different DCs.

For each cloud server in DC $j$, the capacity of resource $k$ is $A_{j,k}$. A server in DC $j$ can simultaneously run multiple game instances if the total resource demand does not exceed the server's capacity. We may place a server into a configuration denoted by a vector $s = (s_1, \dots, s_I)$. Such a server may simultaneously run up to $\sum_{i \in \mathcal{I}} s_i$ game instances, running at most $s_i$ instances of game $i$. Different servers will have different compatible vectors. The constraint $\sum_{i} s_i \cdot a_{i,k} \le A_{j,k}$ must be satisfied for each $k \in \mathcal{K}$. Denote by $\mathcal{S}_j$ the set of such vectors that are maximal (not dominated by others). Note that our model does not explicitly consider the interferences (e.g., competition for the same resources) when multiple games run on the same server. The interferences may cause performance degradation for games (e.g., the interaction delay grows or the frame rate drops) and thus hurt user experiences. However, our analysis and proposed approach can easily be applied to the scenario with interferences considered, as long as the interferences are considered in the definition of the vector $s = (s_1, \dots, s_I)$.

Cloud gaming is delay sensitive, requiring that the network delay from the player to the cloud server is not too large [31], [32]. Otherwise, the total delay (including the network delay and encoding/decoding delay, etc.) experienced by the player will be unacceptable. Different game genres usually have different delay requirements. For example, fast-paced games such as first-person shooters or car racing games have more stringent delay requirements than moderate-paced games such as third-person role-playing games [31], [32]. For each game $i \in \mathcal{I}$, we define $D_i$ as the maximum network delay that can be tolerated by players. For each play request of game $i$, we call DC $j$ eligible if the network delay from the player to DC $j$ is less than $D_i$.

B. Problem Statement

Our goal is to find the optimal strategy that dispatches new game requests to DCs such that the total combined rental cost of servers and bandwidth is minimized. When a new request arrives, the strategy should decide which DC the request should be dispatched to (request-to-DC assignment) and which server the request should be assigned to (request-to-server assignment). If there is no need to open a new server, the strategy should choose one of the currently running servers to accommodate the request. Once a game instance starts, it will run on the same server during the entire game session. Migrating game instances from one server to another is not allowed due to large overhead and interruption to gameplay.

Let $C_j$ denote the cost rate of renting a cloud server in DC $j$. Let $V_j$ denote the cost rate of using one unit of bandwidth in DC $j$. The rate at which requests of game $i$ arrive and are actually dispatched by the algorithm to DC $j$ may fluctuate. At any particular time, we take $L_{i,j}$ to be the maximum rate at which incoming requests for game $i$ can be dispatched to DC $j$ without violating latency requirements. Our algorithm will dispatch incoming requests of game $i$ to DC $j$ at a rate of $\lambda_{i,j}$. This rate will be changed over time by the algorithm as the number of incoming requests fluctuates. Also, let $N_{s,j}$ denote the number of servers in DC $j$ that are used in configuration $s \in \mathcal{S}_j$ according to our algorithm. Note that the variables $\lambda_{i,j}$ and $N_{s,j}$ are set by decisions made within the algorithm. For convenience, we define two dependent variables: the number of servers used in DC $j$, $\rho_j = \sum_{s \in \mathcal{S}_j} N_{s,j}$, and the total bandwidth utilization of DC $j$, $r_j = \sum_{i \in \mathcal{I}} b_i \lambda_{i,j} / (\mu_i B)$.

Suppose that some algorithm is running and has entered a stable period, where the incoming rates $\lambda_i$ and $L_{i,j}$ are constant. Then, we want an algorithm that produces rates $\lambda_{i,j}$ and values $N_{s,j}$ such that they solve (or are as close
as possible) to the solution of the following linear program (denoted by P1):

$$\min_{\{\lambda_{i,j}\},\{N_{s,j}\}} \; \sum_{j \in \mathcal{J}} C_j \rho_j + \sum_{j \in \mathcal{J}} V_j r_j B \tag{1}$$

subject to

$$0 \le \lambda_{i,j} \le L_{i,j}, \quad \forall i \in \mathcal{I}, \forall j \in \mathcal{J} \tag{2}$$
$$N_{s,j} \ge 0, \quad \forall s \in \mathcal{S}_j, \forall j \in \mathcal{J} \tag{3}$$
$$\sum_{j \in \mathcal{J}} \lambda_{i,j} = \lambda_i, \quad \forall i \in \mathcal{I} \tag{4}$$
$$\sum_{i \in \mathcal{I}} b_i \lambda_{i,j} / \mu_i \le B, \quad \forall j \in \mathcal{J} \tag{5}$$
$$\lambda_{i,j} / \mu_i \le \sum_{s \in \mathcal{S}_j} s_i N_{s,j}, \quad \forall j \in \mathcal{J}, \forall i \in \mathcal{I} \tag{6}$$

The first item in the objective function is the total cost of renting servers and the second item is the total cost of bandwidth consumption. Constraint (5) ensures that the total bandwidth consumption at each DC does not exceed the bandwidth capacity of the DC. Constraint (6) ensures that all the arrivals can be handled by the system, where the left part is the amount of processing volume arriving due to requests for game $i$ in DC $j$, and the right part is the total volume of game $i$'s requests that can be served in DC $j$ ($s_i$ denotes the number of requests for game $i$ in configuration $s$). Note that any algorithm that operates on this system model must obey constraints (2) through (6), and therefore the solution of this linear program represents a minimum possible cost of any algorithm solving this problem.

IV. REQUEST DISPATCHING ALGORITHM

A. Preliminary Analysis

In this section, we transform P1 to an equivalent problem P2. The intention in doing this is that problem P2 has a corresponding request dispatching algorithm that we use in this paper. The problem P1 is a linear (hence convex) optimization problem, which can be solved efficiently by many approaches [33]. Let $(\{\lambda^*_{i,j}\}, \{N^*_{s,j}\})$ denote an optimal solution of P1. In this optimal configuration, the server use at DC $j$ is $\rho^*_j = \sum_{s \in \mathcal{S}_j} N^*_{s,j}$ and the bandwidth use is $r^*_j = \sum_{i \in \mathcal{I}} b_i \lambda^*_{i,j} / (\mu_i B)$. Let $x_{s,j} = N_{s,j} / \rho^*_j$ and $y_j = \rho_j / \rho^*_j$. It follows that $\sum_{s \in \mathcal{S}_j} x_{s,j} = y_j$. Let $z_j = r_j / r^*_j$. If we use $x_{s,j}$, $y_j$ and $z_j$ to replace $N_{s,j}$, $\rho_j$ and $r_j$ respectively, P1 can be rewritten as the following problem (denoted by P1′):

$$\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \; \sum_{j \in \mathcal{J}} C_j \rho^*_j y_j + \sum_{j \in \mathcal{J}} V_j r^*_j z_j B \tag{7}$$

subject to

$$0 \le \lambda_{i,j} \le L_{i,j}, \quad \forall i \in \mathcal{I}, \forall j \in \mathcal{J} \tag{8}$$
$$x_{s,j} \ge 0, \quad \forall s \in \mathcal{S}_j, \forall j \in \mathcal{J} \tag{9}$$
$$\sum_{j \in \mathcal{J}} \lambda_{i,j} = \lambda_i, \quad \forall i \in \mathcal{I} \tag{10}$$
$$\sum_{i \in \mathcal{I}} b_i \lambda_{i,j} / \mu_i \le B, \quad \forall j \in \mathcal{J} \tag{11}$$
$$\lambda_{i,j} / \mu_i \le \rho^*_j \sum_{s \in \mathcal{S}_j} s_i x_{s,j}, \quad \forall j \in \mathcal{J}, \forall i \in \mathcal{I} \tag{12}$$

Consider the following optimization problem (denoted by P2):

$$\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \; \max_{j \in \mathcal{J}} \{y_j, z_j\} \tag{13}$$

subject to the constraints (8) to (12). We have the following theorem.

Theorem IV.1. P1′ is equivalent to P2, i.e., the optimal solution of P1′ is also the optimal solution of P2, and vice versa.

Proof. We first prove that an optimal solution of P1′ also minimizes the objective function of P2 (i.e., (13)). Let $(\{\lambda^*_{i,j}\}, \{N^*_{s,j}\})$ denote an optimal solution of P1. Recalling that $\rho^*_j = \sum_{s \in \mathcal{S}_j} N^*_{s,j}$, define $x^*_{s,j} = N^*_{s,j} / \rho^*_j$. It is easy to see that $(\{\lambda^*_{i,j}\}, \{x^*_{s,j}\})$ is the optimal solution of P1′. Let $y^*_j$ and $z^*_j$ denote the values of $y_j$ and $z_j$ when $\lambda_{i,j} = \lambda^*_{i,j}$ and $x_{s,j} = x^*_{s,j}$ for problem P1′. It is easy to see that $y^*_j = 1$ and $z^*_j = 1$ hold for all $j \in \mathcal{J}$, which implies that: (1) the value of the objective function (13) is equal to 1 when $\lambda_{i,j} = \lambda^*_{i,j}$ and $x_{s,j} = x^*_{s,j}$ for P2; and (2) the minimum value of the objective function (13) cannot be larger than 1. We must therefore show that $\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \max_{j \in \mathcal{J}} \{y_j, z_j\} < 1$ is impossible.

Suppose $(\{\lambda'_{i,j}\}, \{x'_{s,j}\})$ denotes an optimal solution of P2 that has $\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \max_{j \in \mathcal{J}} \{y_j, z_j\} < 1$. Let $y'_j$ and $z'_j$ denote the values of $y_j$ and $z_j$ at this optimal solution. We must therefore have $\max_{j \in \mathcal{J}} \{y'_j, z'_j\} < 1$, so it follows that $y'_j < 1$ and $z'_j < 1$ for all $j \in \mathcal{J}$. Therefore, we have $\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B < \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$. Recall that $\sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$ is the minimum value of the objective function (7). This implies that the cost produced by $(\{\lambda'_{i,j}\}, \{x'_{s,j}\})$ is less than the minimal cost of P1′, which is a contradiction. Therefore the optimal solution of P1′ is also an optimal solution for P2.

We next prove that an optimal solution of P2 also minimizes the objective function of P1′ (i.e., (7)). Based on the previous discussions, we know that the minimum value of the objective function (13) is equal to 1. Let $(\{\lambda'_{i,j}\}, \{x'_{s,j}\})$ denote an optimal solution of P2. Let $y'_j$ and $z'_j$ denote the values of $y_j$ and $z_j$ when $\lambda_{i,j} = \lambda'_{i,j}$ and $x_{s,j} = x'_{s,j}$ for P2. It follows that $\max_{j \in \mathcal{J}} \{y'_j, z'_j\} = 1$, implying that $y'_j \le 1$ and $z'_j \le 1$ for all $j \in \mathcal{J}$. If we set $\lambda_{i,j} = \lambda'_{i,j}$ and $x_{s,j} = x'_{s,j}$ in P1′, the objective function (7) will be equal to $\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B$. Since we have $y'_j \le 1$ and $z'_j \le 1$ for all $j \in \mathcal{J}$, it follows that $\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j \; +$
$\sum_{j \in \mathcal{J}} V_j r^*_j z'_j B \le \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$. Recall that the minimum value of the objective function (7) is $\sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$, implying that $\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B \ge \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$. Therefore, we have $\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B = \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$, which means that $(\{\lambda'_{i,j}\}, \{x'_{s,j}\})$ is the optimal solution of P1′.

B. Request Dispatching Algorithm

In this section, we present the details of the request dispatching algorithm. The algorithm consists of two subroutines: request-to-DC assignment and request-to-server assignment. When a new request arrives at the CGSP, the request-to-DC assignment subroutine determines which DC the request will be dispatched to and the request-to-server assignment subroutine chooses a server in the selected DC to place the request.

The algorithm is based on the shadow routing scheme in [8], which maintains several virtual queues. Each of the dispatching decisions is made according to the current states of the virtual queues. Specifically, for each pair of DC $j$ and game $i$, there is an associated virtual queue $(j, i)$ whose length is denoted by $Q_{j,i}$. For each DC $j$, there is also a virtual queue associated with the bandwidth capacity, whose length is denoted by $Q_j$. We emphasise that virtual queues are not buffers where actual requests are placed for waiting. Instead, they are just variables maintained by the algorithm. Therefore, the waiting times of actual requests are not associated with the lengths of virtual queues.

In addition to the given system parameters, the algorithm also requires positive parameters $\eta$, $c$, $\theta$ as well as $\{\rho^*_j\}$ and $\{r^*_j\}$ (which can be obtained by solving P1). Here we assume that all the system parameters are fixed. We will discuss later (in Section V) how the proposed algorithm adapts to dynamic parameter changes.

Algorithm 1 Request-to-DC Assignment
Require:
  A new request of game $i$
  $\{\rho^*_j\}$, $\{r^*_j\}$, $\eta$, $c$, $\theta$ and other given parameters
Ensure:
  Assignment of the new request to corresponding DC
Step 1: Determine the target DC $m$ for the new request
  $D \leftarrow$ the set of DCs that are eligible for the new request
  $m := \arg\min_{j \in D} [Q_{j,i} / (\rho^*_j \mu_i) + Q_j b_i / (r^*_j \mu_i B)]$
Step 2: Update the virtual queues
  $Q_{m,i} := Q_{m,i} + 1 / (\rho^*_m \mu_i)$
  $Q_m := Q_m + b_i / (r^*_m \mu_i B)$
  for each DC $j \in \mathcal{J}$ do
    $\sigma^j := \arg\max_{s \in \mathcal{S}_j} \sum_{i \in \mathcal{I}} s_i \cdot Q_{j,i}$
  end for
  if $\eta \sum_{j \in \mathcal{J}} [\sum_{i \in \mathcal{I}} \sigma^j_i Q_{j,i} + Q_j] > 1$ then
    $Q_{j,i} := \max\{Q_{j,i} - c \sigma^j_i, 0\}, \; \forall j \in \mathcal{J}, i \in \mathcal{I}$
    $Q_j := \max\{Q_j - c, 0\}, \; \forall j \in \mathcal{J}$
  end if
Step 3: Update configuration usage fractions
  for each DC $j \in \mathcal{J}$ do
    $\hat{x}_{s,j} := \theta I(s, \sigma^j) + (1 - \theta) \hat{x}_{s,j}$, for all $s \in \mathcal{S}_j$
  end for

Request-to-DC Assignment. The details of the request-to-DC assignment subroutine are shown in Algorithm 1, which has three steps. When a new request of game $i$ arrives, the first step immediately determines the DC (denoted by $m$) that the new request will be dispatched to. Let $D$ denote the set of DCs that are eligible for the new request (i.e., satisfying the network delay requirement). Then, the DC $m$ is determined according to

$$m := \arg\min_{j \in D} [Q_{j,i} / (\rho^*_j \mu_i) + Q_j b_i / (r^*_j \mu_i B)]. \tag{14}$$

In the second step, the algorithm updates the virtual queues. First, the algorithm increases $Q_{m,i}$ by $1 / (\rho^*_m \mu_i)$, and $Q_m$ by $b_i / (r^*_m \mu_i B)$. Then, the algorithm chooses a candidate configuration $\sigma^j \in \mathcal{S}_j$ for each DC $j$. After that, the algorithm needs to decide whether or not to activate a "super-server" for the virtual queues. If the super-server is activated, which occurs when the condition

$$\eta \sum_{j \in \mathcal{J}} \Big[ \sum_{i \in \mathcal{I}} \sigma^j_i Q_{j,i} + Q_j \Big] > 1 \tag{15}$$

is satisfied, the amount of "workload" $c \sigma^j_i$ ($\sigma^j_i$ refers to the number of requests of game $i$ in configuration $\sigma^j$) is removed from virtual queue $(j, i)$, namely, $Q_{j,i} := \max\{Q_{j,i} - c \sigma^j_i, 0\}$, and the amount of "workload" $c$ is removed from bandwidth virtual queue $(j)$, namely, $Q_j := \max\{Q_j - c, 0\}$. Here $c > 0$ is a fixed parameter such that

$$c > \max_{i,j} 1 / (\rho^*_j \mu_i) \quad \text{and} \quad c > \max_{i,j} b_i / (r^*_j \mu_i B). \tag{16}$$

In the third step, the algorithm updates the configuration usage fractions which will be used in the request-to-server assignment subroutine. Specifically, for each DC $j$, the configuration usage fractions are updated according to $\hat{x}_{s,j} := \theta I(s, \sigma^j) + (1 - \theta) \hat{x}_{s,j}$, where $I(s, \sigma^j) = 1$ if $s$ was the configuration $\sigma^j$ computed in step 2 and condition (15) holds, and $I(s, \sigma^j) = 0$ otherwise.

Next, we show the asymptotic optimality of the request-to-DC assignment algorithm. We have the following theorem:

Theorem IV.2. As $\eta \to 0$, the request-to-DC dispatching rates, as well as the configuration usage fractions produced by Algorithm 1, are close to optimal solutions of P2.

Proof. Suppose parameters $\{\rho^*_j\}$, $\{r^*_j\}$, $\{\mu_i\}$, $c$ and $B$ are fixed rational numbers, such that condition (16) holds. When parameter $\eta$ is close to 0, the virtual queuing process is a positive recurrent countable discrete-time Markov chain, which has a stable steady state. For a fixed $\eta$, denote by $\bar{x}^{(\eta)}_{s,j}$ the steady-state probability that configuration $s$ is chosen as $\sigma^j$ and condition (15) holds. Let $\bar{p}^{(\eta)}_{i,j}$ denote the steady-state probability that an arriving request of game $i$ is assigned to DC $j$. According to the proof of Proposition 1 in [8], as $\eta \to 0$, Algorithm 1 solves the problem of minimizing the frequency
of super-server activations, subject to the stability of virtual queues. That is, if we pick a stationary distribution for each $\eta$, then, as $\eta \to 0$, $(\{\bar{x}^{(\eta)}_{s,j}\}, \{\bar{p}^{(\eta)}_{i,j}\})$ converges to $(\{\bar{x}_{s,j}\}, \{\bar{p}_{i,j}\})$, which is the set of optimal solutions of the following problem:

$$\min_{\{\bar{p}_{i,j}\},\{\bar{x}_{s,j}\}} \; \max_{j \in \mathcal{J}} \{\bar{y}_j, \bar{z}_j\} \tag{17}$$

subject to

$$0 \le \bar{p}_{i,j} \le L_{i,j} / \lambda_i, \quad \forall i, j \tag{18}$$
$$\bar{x}_{s,j} \ge 0, \quad \forall s \in \mathcal{S}_j, \forall j \tag{19}$$
$$\sum_{i} \lambda_i \bar{p}_{i,j} b_i / \mu_i \le B, \quad \forall j \tag{20}$$
$$\sum_{j} \bar{p}_{i,j} = 1, \quad \forall i \tag{21}$$
$$(\lambda_i / \lambda) \bar{p}_{i,j} / (\rho^*_j \cdot \mu_i) \le \sum_{s \in \mathcal{S}_j} s_i c \bar{x}_{s,j}, \quad \forall i, j \tag{22}$$
$$\sum_{s \in \mathcal{S}_j} \bar{x}_{s,j} = \bar{y}_j, \quad \forall j, \tag{23}$$

where $\bar{y}_j = \sum_{s \in \mathcal{S}_j} \bar{x}_{s,j}$ and $\bar{z}_j = \sum_{i} (\lambda_i / \lambda) \bar{p}_{i,j} b_i / (B c r^*_j \mu_i)$. If we rewrite this problem in terms of variables $\lambda_{i,j} = \lambda_i \bar{p}_{i,j}$, $x_{s,j} = \lambda c \bar{x}_{s,j}$, $y_j = \lambda c \bar{y}_j$ and $z_j = \lambda c \bar{z}_j$, we obtain P2. Therefore, the proof of the theorem is complete.

Request-to-server Assignment. After the DC $m$ is determined, the new request will be assigned to a specific server within DC $m$ by the request-to-server assignment subroutine. In order to do that, for each non-empty server in each DC, a designation configuration is allocated when the server is started. If a server's designation is $s = (s_1, \dots, s_I)$, it means that we will never place more than $s_i$ requests of game $i$ on this server. A server with designation $s$ is referred to as an $s$-server. Empty servers do not have a designation. Once allocated, a server designation does not change until the server becomes empty.

Consider an $s$-server in DC $m$. If the current number of game $i$'s requests on the server is less than $s_i$, we say the server is feasible for a request of game $i$. The algorithm first finds all the feasible servers (denoted by $H$) for the new request. Let $z^m_i(s)$ denote the total number of game $i$'s requests in $s$-servers in DC $m$. Recall that the configuration usage fractions produced by Algorithm 1 are close to optimal solutions of P2. Based on this fact, the request-to-server assignment strives to make the real-time configuration usage fractions (which are proportional to $z^m_i(s) / s_i$) close to their optimal values (i.e., $\hat{x}_{s,m}$). For this, we assign the new request to the feasible server (if there is at least one feasible server found) whose designation is

$$s' := \arg\min_{s \in \mathcal{S}^H_m : s_i > 0} z^m_i(s) / [s_i \hat{x}_{s,m}], \tag{24}$$

where $\mathcal{S}^H_m$ denotes the set of designations of all the feasible servers in $H$. If there is more than one feasible server whose designation is $s'$, we assign the request in a best-fit manner, i.e., to the server (whose designation is $s'$) with the minimum residual capacity. The residual capacity of a server is measured by the weighted sum of the residual capacity of all resources, i.e., $\sum_{k} w_k r_k$, where $r_k$ is the residual capacity of the server for resource $k$ and $w_k$ is the weight for resource $k$. In our implementation, for DC $m$, we set $w_k = \sum_{i} a_{i,k} \lambda_i / (\lambda A_{m,k})$. If no feasible server is found, we start a new server with designation $s' := \arg\min_{s \in \mathcal{S}_m : s_i > 0} z^m_i(s) / [s_i \hat{x}_{s,m}]$, and assign the request to this server.

V. ADAPTATION TO DYNAMIC CHANGES

In the previous discussions, we have assumed that all the system parameters are fixed. In this case, the parameters used by Algorithm 1 do not need to be updated. In this section, we discuss how the proposed algorithm adapts to dynamic changes in the parameters. For cloud gaming systems, we note that: a) the request arrivals display a strong daily pattern of peaks and troughs, but the proportion of arrivals for any particular game does not change quickly; b) except for the request arrival rates, other system parameters change very infrequently. Based on these observations, we shall show that the proposed request dispatching algorithm can automatically adapt to the daily variation of request arrivals.

Let $\lambda_1, \cdots, \lambda_I$ denote the request arrival rates of games 1 through $I$ at some time point in a day. Denote by $\{\rho^*_j\}_{\lambda_1, \cdots, \lambda_I}$ and $\{r^*_j\}_{\lambda_1, \cdots, \lambda_I}$ the parameters $\{\rho^*_j\}$ and $\{r^*_j\}$ for arrival rates $\lambda_1, \cdots, \lambda_I$. As we know, the request arrival rates of different games change in a similar way in a daily period. Typically, the peak usage occurs during late evening and the trough usage occurs in the early morning. At any time point $t$ in a one-day period, it is reasonable to denote the arrival rates of games by $\gamma_t \lambda_1, \cdots, \gamma_t \lambda_I$, where $\gamma_t$ is a scalar dependent on $t$. Then, we have the following theorem.

Theorem V.1. Algorithm 1 with input parameters $\{\rho^*_j\}_{\lambda_1, \cdots, \lambda_I}$ and $\{r^*_j\}_{\lambda_1, \cdots, \lambda_I}$ is optimal at any time point $t$ in a daily period, as long as the bandwidth is not a limiting factor during peak time.

Proof. Let $(\{\lambda^*_{i,j}\}, \{N^*_{s,j}\})$ denote an optimal solution of P1 when the arrival rates are $\lambda_1, \cdots, \lambda_I$. It follows that for each $j \in \mathcal{J}$, $\rho^*_j = \sum_{s \in \mathcal{S}_j} N^*_{s,j}$. We wish to show that $(\{\gamma_t \lambda^*_{i,j}\}, \{\gamma_t N^*_{s,j}\})$ is a feasible solution of P1 when the arrival rates are $\gamma_t \lambda_1, \cdots, \gamma_t \lambda_I$. Suppose $(\{\gamma_t \lambda^*_{i,j}\}, \{\gamma_t N^*_{s,j}\})$ is not the optimal solution of P1. There must exist a solution $(\{\lambda'_{i,j}\}, \{N'_{s,j}\})$ of P1 such that

$$\sum_{j \in \mathcal{J}} C_j \rho'_j + \sum_{j \in \mathcal{J}} V_j B r'_j < \sum_{j \in \mathcal{J}} C_j \gamma_t \rho^*_j + \sum_{j \in \mathcal{J}} V_j B \gamma_t r^*_j. \tag{25}$$

It is easy to see that $(\{\lambda'_{i,j}\} / \gamma_t, \{N'_{s,j}\} / \gamma_t)$ is a feasible solution of P1 when the arrival rates are $\lambda_1, \dots, \lambda_I$. Since $(\{\lambda^*_{i,j}\}, \{N^*_{s,j}\})$ is an optimal solution of P1 when the arrival rates are $\lambda_1, \dots, \lambda_I$, it follows that $\sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j B r^*_j \le \sum_{j \in \mathcal{J}} C_j \rho'_j / \gamma_t + \sum_{j \in \mathcal{J}} V_j B r'_j / \gamma_t$, which contradicts (25). Therefore, $(\{\gamma_t \lambda^*_{i,j}\}, \{\gamma_t N^*_{s,j}\})$ is the optimal solution of P1 when the arrival rates are $\gamma_t \lambda_1, \dots, \gamma_t \lambda_I$. It follows that $\{\rho^*_j\}_{\gamma_t \lambda_1, \cdots, \gamma_t \lambda_I} = \gamma_t \{\rho^*_j\}_{\lambda_1, \cdots, \lambda_I}$ and $\{r^*_j\}_{\gamma_t \lambda_1, \cdots, \gamma_t \lambda_I} = \gamma_t \{r^*_j\}_{\lambda_1, \cdots, \lambda_I}$.
Let $\gamma^{\min}$ denote the minimum $\gamma_t$. If we set the parameter $c$ such that $c > \max_{i,j} 1 / (\gamma^{\min} \rho^*_j \mu_i)$ and $c > \max_{i,j} b_i / (\gamma^{\min} B r^*_j \mu_i)$, condition (16) is satisfied for all $\gamma_t$. In this case, Algorithm 1 using $\{\rho^*_j\}_{\lambda_1, \cdots, \lambda_I}$, $\{r^*_j\}_{\lambda_1, \cdots, \lambda_I}$ as the input parameters has the same behaviour as that using $\gamma_t \{\rho^*_j\}_{\lambda_1, \cdots, \lambda_I}$, $\gamma_t \{r^*_j\}_{\lambda_1, \cdots, \lambda_I}$, $1/\gamma_t$ as the input parameters. It implies that the algorithm using $\{\rho^*_j\}_{\lambda_1, \cdots, \lambda_I}$, $\{r^*_j\}_{\lambda_1, \cdots, \lambda_I}$ as input parameters also produces optimal configuration usage fractions for arrival rates $\gamma_t \lambda_1, \cdots, \gamma_t \lambda_I$. Therefore, the theorem is proven.

Theorem V.1 implies that against the daily variations of the request arrival rates, the proposed dispatching algorithm can run optimally without adjusting $\{\rho^*_j\}$ and $\{r^*_j\}$. For the

TABLE I
SUMMARY OF DCS

DCs             Server Type   Server Cost   Data Transfer Cost
EC2-Virginia    g3.4xlarge    $1.14/Hour    $0.090/GB
EC2-Oregon      g3.4xlarge    $1.14/Hour    $0.090/GB
EC2-Frankfurt   g3.4xlarge    $1.43/Hour    $0.090/GB
EC2-Singapore   p2.xlarge     $1.78/Hour    $0.120/GB
EC2-Tokyo       p2.xlarge     $1.52/Hour    $0.140/GB
EC2-Seoul       p2.xlarge     $1.47/Hour    $0.126/GB

TABLE II
SUMMARY OF GAMES

Games   (CPU, GPU, RAM)   Games   (CPU, GPU, RAM)
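The choice of $c$ used in the proof of Theorem V.1 can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function name `c_lower_bound` and all numeric values are hypothetical and chosen only for illustration. It evaluates the two right-hand sides of condition (16) with $\rho^*_j$ and $r^*_j$ scaled by a load factor $\gamma$, then confirms that a $c$ chosen for $\gamma^{\min}$ also satisfies the condition at higher loads.

```python
from itertools import product


def c_lower_bound(rho_star, r_star, mu, b, B, gamma=1.0):
    """Largest right-hand side of condition (16) when rho* and r* scale by gamma.

    rho_star[j], r_star[j]: optimal server count and bandwidth utilization of DC j.
    mu[i], b[i]: departure rate and streaming bit rate of game i.
    B: bandwidth capacity of each DC.
    """
    pairs = list(product(range(len(mu)), range(len(rho_star))))
    a1 = max(1.0 / (gamma * rho_star[j] * mu[i]) for i, j in pairs)
    a2 = max(b[i] / (gamma * B * r_star[j] * mu[i]) for i, j in pairs)
    return max(a1, a2)


# Hypothetical toy values (not taken from the paper): 2 games, 2 DCs.
rho_star = [10.0, 20.0]
r_star = [0.5, 0.8]
mu = [1 / 30, 1 / 60]   # mean session lengths of 30 and 60 minutes
b = [5.0, 8.0]
B = 1000.0
gamma_min = 0.25        # assumed lowest daily load factor

# Pick c slightly above the bound at the lowest load.
c = 1.01 * c_lower_bound(rho_star, r_star, mu, b, B, gamma_min)

# Condition (16) then also holds at every higher load level.
holds_for_all_loads = all(
    c > c_lower_bound(rho_star, r_star, mu, b, B, g) for g in (0.25, 0.5, 1.0)
)
```

Because both bounds are proportional to $1/\gamma$, the lowest load level is the binding case, which is why the proof only needs $\gamma^{\min}$.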
[Figure: three panels showing the normalized cost of the shadow routing algorithm against (a) the value of c (× max{A1, A2}), from 1.01 to 1.09, (b) the value of ε, from 0.25 to 4, and (c) the value of θ, from 0.1 to 0.9.]
Fig. 2. Parameter settings for Algorithm 1
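To make the roles of the parameters $c$, $\eta$ and $\theta$ concrete, the three steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the class name `ShadowRouter`, the list-based data layout and the toy values in the usage lines are all hypothetical, and the configuration sets $\mathcal{S}_j$ are assumed to be supplied as explicit lists of tuples.

```python
class ShadowRouter:
    """Sketch of the request-to-DC assignment (Algorithm 1); names are illustrative."""

    def __init__(self, rho_star, r_star, mu, b, B, configs, eta, c, theta):
        self.rho_star, self.r_star = rho_star, r_star  # rho*_j, r*_j from solving P1
        self.mu, self.b, self.B = mu, b, B              # mu_i, b_i, bandwidth capacity
        self.configs = configs                          # configs[j]: tuples s in S_j
        self.eta, self.c, self.theta = eta, c, theta
        J, I = len(rho_star), len(mu)
        self.Q = [[0.0] * I for _ in range(J)]          # Q[j][i]: virtual queue (j, i)
        self.Qbw = [0.0] * J                            # Q_j: bandwidth virtual queue
        self.x_hat = [{s: 0.0 for s in configs[j]} for j in range(J)]

    def _weight(self, j, s):
        # Sum over games of s_i * Q_{j,i} for configuration s in DC j.
        return sum(si * q for si, q in zip(s, self.Q[j]))

    def dispatch(self, i, eligible):
        # Step 1: rule (14), choose the eligible DC with the smallest weighted queue.
        m = min(
            eligible,
            key=lambda j: self.Q[j][i] / (self.rho_star[j] * self.mu[i])
            + self.Qbw[j] * self.b[i] / (self.r_star[j] * self.mu[i] * self.B),
        )
        # Step 2: add the request's work to the virtual queues of DC m.
        self.Q[m][i] += 1.0 / (self.rho_star[m] * self.mu[i])
        self.Qbw[m] += self.b[i] / (self.r_star[m] * self.mu[i] * self.B)
        J = len(self.rho_star)
        sigma = [
            max(self.configs[j], key=lambda s, j=j: self._weight(j, s))
            for j in range(J)
        ]
        total = sum(self._weight(j, sigma[j]) + self.Qbw[j] for j in range(J))
        activated = self.eta * total > 1.0              # condition (15)
        if activated:
            for j in range(J):
                self.Q[j] = [
                    max(q - self.c * si, 0.0) for q, si in zip(self.Q[j], sigma[j])
                ]
                self.Qbw[j] = max(self.Qbw[j] - self.c, 0.0)
        # Step 3: exponentially averaged configuration usage fractions.
        for j in range(J):
            for s in self.configs[j]:
                hit = 1.0 if (activated and s == sigma[j]) else 0.0
                self.x_hat[j][s] = self.theta * hit + (1 - self.theta) * self.x_hat[j][s]
        return m


# Hypothetical toy instance: one game, two identical DCs, one configuration each.
router = ShadowRouter(
    rho_star=[1.0, 1.0], r_star=[1.0, 1.0], mu=[1.0], b=[1.0], B=10.0,
    configs=[[(1,)], [(1,)]], eta=1e-6, c=2.0, theta=0.1,
)
first = router.dispatch(0, eligible=[0, 1])
second = router.dispatch(0, eligible=[0, 1])
```

In this toy run the second request goes to the other DC because the first request lengthened the virtual queues of the DC it was sent to. With a very small $\eta$ the super-server is rarely activated, so the queues drain slowly; a larger $\eta$ triggers the drain in step 2 more often.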
Given all the parameters, we use JOptimizer [38] to solve P1, yielding $\{\rho^*_j\}$ and $\{r^*_j\}$, and the optimal value of the objective function of P1, i.e., $\sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$, which we note is a lower bound of the total cost. Moreover, we compare the proposed dispatching algorithm with the following two alternatives:

Lowest Latency Assignment (LLA). In this algorithm, each request is dispatched to the nearest DC in terms of network latency among all its eligible DCs. If more than one DC attains the same minimum latency, the algorithm randomly chooses one of them.

Lowest Combined Price Assignment (LCPA). This algorithm dispatches each request to the eligible DC which has the lowest combined price for the request. The combined price of a DC $j$ for a request of game $i$ is defined as $\sum_{k} a_{i,k} w_k C_j + b_i V_j$, where $w_k$ is the weight for resource $k$, which is defined as in Section IV-B.

After the DC is determined, both LLA and LCPA assign the request to a server within the selected DC in a best-fit manner. That is, among all the running servers that can accommodate the request, they assign the request to the server with the minimum residual capacity. If no server can accommodate the request, a new server is started.

B. Parameter Settings for Algorithm 1

The dispatching algorithm uses several parameters which should be properly set. The parameter $c$ should satisfy (16), but it should not be too large. Otherwise, the increments of virtual queues in one step will be very large. Figure 2(a) shows the mean cost (normalized to the lower bound) produced by the shadow routing algorithm as $c$ varies from $c = 1.01 \max\{A_1, A_2\}$ to $c = 1.09 \max\{A_1, A_2\}$ ($A_1$ and $A_2$ are the right-hand sides of condition (16)) in the scenario with a static request arrival rate. The simulation is warmed up for a period of 24 hours and then runs for another 24 hours. The other parameters of Algorithm 1 are set by default (the default values will be presented later). As can be seen, the cost increases as $c$ grows. So, we set $c = 1.01 \max\{A_1, A_2\}$ in the rest of the experiments.

According to the results in [8], the smaller the value of $\eta$, the more stationary the system will be, but $\eta$ cannot be too small because the system will take time $O(1/\eta)$ to transform from the current steady state to a new steady state. As $\eta \to 0$, the scaled virtual lengths $\eta Q_{j,i}$ and $\eta Q_j$ in steady state converge to a set of optimal non-negative values (see the results in [8]), which implies that $\eta Q_{j,i}$ and $\eta Q_j$ are almost constants in condition (16) as $\eta \to 0$. It follows that $\eta [\sum_{j,i} Q_{j,i} + \sum_{j} Q_j] \approx 1$ if we set "crude" values to the parameters in (16). Then, we can obtain $\eta [J(1 + I)c] \approx 1$, where $J(1 + I)$ is the total number of virtual queues. Based on this conclusion, we use an adjustment parameter $\epsilon$ for $\eta$, such that $\eta = \epsilon / (J(1 + I)c)$. Figure 2(b) presents the mean cost produced by the shadow routing algorithm as $\epsilon$ varies from 0.25 to 4. As can be seen, the shadow routing algorithm performs better when $\epsilon$ is close to 1.0. In the rest of the experiments, we set $\epsilon = 1.0$.

The parameter $\theta$ should be between 0 and 1. Figure 2(c) presents the mean cost produced by the shadow routing algorithm as $\theta$ varies from 0.1 to 0.9. As can be seen, the cost for different $\theta$ is similar, indicating that the shadow algorithm is not very sensitive to $\theta$, as long as it is within a reasonable range. In the rest of the experiments, we set $\theta = 0.1$.

C. Results

1) Static Request Arrival Rates: We first evaluate the proposed algorithms in the scenario with static request arrival rates. The simulation is warmed up for a period of 24 hours and then runs for another 24 hours. Figure 3 shows the real-time cost produced by each algorithm over the simulation period for $\lambda = 400$ (requests per minute). Among all the algorithms, the shadow routing algorithm produces the lowest cost, which is relatively close to the optimal. It indicates that the dispatching rates of different types of requests to DCs and the configuration usage fractions produced by the shadow algorithm are close to optimal. The LCPA algorithm produces a higher cost than the shadow algorithm. This is because it does not consider the configuration usage fractions in the assignment of requests to servers. The LLA algorithm achieves the worst performance among all the algorithms. This is because neither the dispatching rates of requests nor the configuration usage fractions are considered in this algorithm.

Figure 4 shows the mean cost (normalized to the lower bound) produced by each algorithm over a period of 24 hours for different arrival rates. We can see that the shadow algorithm is very cost-efficient for large-scale systems with high arrival rates. However, the benefit is reduced when the arrival rate is small. This is because fewer servers are used with a small arrival rate, making the queuing system less stable and thus
[Figures: normalized cost of the LLA, LCPA and Shadow algorithms plotted against time (hours, 0 to 24), total arrival rate λ (50 to 1,600) and number of games (10 to 100); the remaining plots and the figure captions were not preserved.]
the configuration usage fractions produced by the shadow algorithm are not so close to optimal.

We next simulate more varieties of games to study the scalability of the proposed algorithm. The resource demands of the simulated games are synthetically generated based on the measurements of the 10 real games. Let $a_k^{\min}$ and $a_k^{\max}$ denote the minimum and the maximum demand for resource $k$ among the 10 games. We assume the demands of the simulated games for each resource $k$ are uniformly distributed in the range $[a_k^{\min}, a_k^{\max}]$. Note that as the number of games increases, the size of $S_j$ increases rapidly, which increases the computational complexity of solving P1. To address this issue, we divide the games into several groups. For each group, we run the shadow algorithm separately, i.e., the games from different groups cannot share servers.

Figure 5 shows the performance of the algorithms for different numbers of simulated games. In this experiment, games are randomly partitioned into equal-sized groups of 10 games each. We can see that the cost of the shadow algorithm stays relatively close to the lower bound as the number of games increases, and at times the shadow algorithm outperforms the other algorithms significantly. This indicates that the shadow algorithm is scalable and that the impact of partitioning the games in this way is insignificant.

Finally, we evaluate the impact of the parameter $\alpha$ of the Zipf distribution function, which controls the distribution of the arrival rates. Figure 6 shows the mean cost produced by each algorithm as $\alpha$ varies from 1 to 1.5. A small $\alpha$ makes the distribution more uniform, while a large $\alpha$ makes it more skewed. As can be seen, the performance of the shadow routing algorithm is similar for different $\alpha$ and it outperforms LLA and LCPA, indicating that the shadow routing algorithm is adaptive to various distributions of arrival rates.

2) Dynamic Arrival Rates: We next show how the proposed algorithms perform with dynamic request arrival rates. In this experiment, the request arrivals are generated according to the WoWAH dataset [39]. The WoWAH dataset contains 667,032 game sessions observed over a 3-year period in a World of Warcraft server in Taiwan. For each game session in the WoWAH dataset, we generate a play request whose start and end times are set according to the start and end times of the game session. Then, we choose game $i$ as the requested game with a probability of $(1/i^{\alpha}) / [\sum_{i} 1/i^{\alpha}]$.

Figure 7 shows the real-time cost produced by each algorithm over a randomly selected period of two days. Figure 8 shows the cost of each algorithm normalized to the lower bound. For each sampling time point $t$, the lower bound of the total cost is estimated as follows. Let $N_i(t)$ denote the number of game $i$'s request arrivals over the period $[t - 10, t + 10]$ ($t$ is measured in minutes). We estimate the request arrival rate of game $i$ at time $t$ as $\lambda_i(t) = N_i(t)/20$. Then, the lower bound of the total cost can be computed by solving P1 using $\lambda_i(t)$ as the inputs. Let $\lambda_i^{\min}$ denote the minimum arrival rate of game $i$ during the simulation period. The parameters $\{\rho_j^*\}$ and $\{r_j^*\}$ used in the shadow algorithm are computed using $\{\lambda_i^{\min}\}$ as the arrival rates of games.

From Figures 7 and 8 we can see that the shadow algorithm always outperforms the other two algorithms, and that the shadow algorithm can automatically adapt to the daily workload variations without adjusting the parameters $\{\rho_j^*\}$ and $\{r_j^*\}$. We also observe that for all the algorithms, the performance in the “declining” period of a day (i.e., from 08:00 to 18:00) is worse than the performance during the “climbing”
period (i.e., from 18:00 to the next day’s 08:00). This is because there are more play request arrivals than departures in the climbing period. In this case, the running servers are almost fully occupied by the continuously arriving requests, and thus the cost produced by the algorithms is close to optimal. In contrast, there are more player departures than arrivals in the declining period, which causes many servers to run at low levels of resource utilization so that the total cost produced by the algorithms is far from optimal. Compared with the other two algorithms, the performance degradation of the shadow algorithm during the declining period is much lower.

VII. CONCLUSIONS

In this paper, we studied the play request dispatching problem for distributed cloud gaming infrastructure. We employed a shadow routing based approach, which is asymptotically optimal, to jointly solve the request-to-DC assignment and request-to-server assignment problems for minimizing the total resource rental cost. Simulations show that our algorithm is more cost-efficient than existing heuristics, scalable to a large number of games and highly adaptive to dynamic changes. The proposed algorithms were evaluated using Amazon EC2's cloud resources only. In the future, we would like to evaluate them using resources from more cloud providers. We also wish to consider the performance interference among co-located games when packing the requests onto servers. Moreover, we would like to apply similar ideas to general datacenters.

REFERENCES

[1] (2018) GeForce Now. [Online]. Available: http://www.geforce.com/
[2] (2018) LiquidSky. [Online]. Available: https://liquidsky.com/
[3] (2018) Simplay. [Online]. Available: https://simplay.io/
[4] (2018) Vortex. [Online]. Available: https://vortex.gg/
[5] (2018) Amazon EC2. [Online]. Available: https://aws.amazon.com/ec2/
[6] (2018) Microsoft Azure. [Online]. Available: https://azure.microsoft.com/
[7] R. Shea, D. Fu, and J. Liu, “Rhizome: utilizing the public cloud to provide 3D gaming infrastructure,” in Proceedings of the 6th ACM Multimedia Systems Conference. ACM, 2015, pp. 97–100.
[8] G. Yang, A. L. Stolyar, and A. Walid, “Shadow-routing based dynamic algorithms for virtual machine placement in a network cloud,” in 2013 Proceedings IEEE INFOCOM, 2013, pp. 620–628.
[9] C.-Y. Huang, C.-H. Hsu, Y.-C. Chang, and K.-T. Chen, “GamingAnywhere: an open cloud gaming system,” in Proceedings of the 4th ACM Multimedia Systems Conference, 2013, pp. 36–47.
[10] M. Claypool, D. Finkel, A. Grant, and M. Solano, “On the performance of OnLive thin client games,” Multimedia Systems, pp. 1–14, 2014.
[11] Y.-T. Lee, K.-T. Chen, H.-I. Su, and C.-L. Lei, “Are all games equally cloud-gaming-friendly? an electromyographic approach,” in 11th Annual Workshop on Network and Systems Support for Games. IEEE, 2012, pp. 1–6.
[12] H. Ahmadi, S. Z. Tootaghaj, M. R. Hashemi, and S. Shirmohammadi, “A game attention model for efficient bit rate allocation in cloud gaming,” Multimedia Systems, pp. 1–17, 2014.
[13] M. Hemmati, A. Javadtalab, A. A. Nazari Shirehjini, S. Shirmohammadi, and T. Arici, “Game as video: bit rate reduction through adaptive object encoding,” in Proceeding of the 23rd ACM Workshop on Network and Operating Systems Support for Digital Audio and Video. ACM, 2013, pp. 7–12.
[14] H.-J. Hong, D.-Y. Chen, C.-Y. Huang, K.-T. Chen, and C.-H. Hsu, “Placing virtual machines to optimize cloud gaming experience,” IEEE Transactions on Cloud Computing, 2014.
[15] M. Basiri and A. Rasoolzadegan, “Delay-aware resource provisioning for cost-efficient cloud gaming,” IEEE Transactions on Circuits and Systems for Video Technology, vol. PP, no. 99, pp. 1–1, 2016.
[16] H. Tian, D. Wu, J. He, Y. Xu, and M. Chen, “On achieving cost-effective adaptive cloud gaming in geo-distributed data centers,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 12, pp. 2064–2077, 2015.
[17] D. Wu, X. Zheng, and J. He, “iCloudAccess: cost-effective streaming of video games from the cloud with low latency,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 8, 2014.
[18] Y. Li, X. Tang, and W. Cai, “On dynamic bin packing for resource allocation in the cloud,” in Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, 2014, pp. 2–11.
[19] Y. Li, X. Tang, and W. Cai, “Play request dispatching for efficient virtual machine usage in cloud gaming,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 12, pp. 2052–2063, 2015.
[20] Y. Deng, Y. Li, X. Tang, and W. Cai, “Server allocation for multiplayer cloud gaming,” in Proceedings of the 2016 ACM International Conference on Multimedia. ACM, 2016, pp. 918–927.
[21] Y. Deng, Y. Li, R. Seet, X. Tang, and W. Cai, “The server allocation problem for session-based multiplayer cloud gaming,” IEEE Transactions on Multimedia, vol. PP, no. 99, pp. 1–1, 2017.
[22] Y. Zhao, H. Jiang, K. Zhou, Z. Huang, and P. Huang, “Meeting service level agreement cost-effectively for video-on-demand applications in the cloud,” in IEEE Conference on Computer Communications 2014. IEEE, 2014, pp. 298–306.
[23] Y. Wu, C. Wu, B. Li, X. Qiu, and F. C. Lau, “Cloudmedia: When cloud on demand meets video on demand,” in 2011 31st International Conference on Distributed Computing Systems (ICDCS). IEEE, 2011, pp. 268–277.
[24] R. Panigrahy, K. Talwar, L. Uyeda, and U. Wieder, “Heuristics for vector bin packing,” 2011.
[25] E. G. Coffman, Jr, M. R. Garey, and D. S. Johnson, “Dynamic bin packing,” SIAM Journal on Computing, vol. 12, no. 2, pp. 227–258, 1983.
[26] A. L. Stolyar, “Large-scale heterogeneous service systems with general packing constraints,” Advances in Applied Probability, vol. 49, no. 1, pp. 61–83, 2015.
[27] S. T. Maguluri, R. Srikant, and L. Ying, “Stochastic models of load balancing and scheduling in cloud computing clusters,” Proceedings of IEEE INFOCOM, vol. 131, no. 5, pp. 702–710, 2012.
[28] J. W. Jiang, T. Lan, S. Ha, M. Chen, and M. Chiang, “Joint VM placement and routing for data center traffic engineering,” Proceedings of IEEE INFOCOM, vol. 131, no. 5, pp. 2876–2880, 2012.
[29] A. L. Stolyar and T. Tezcan, Control of systems with flexible multi-server pools: a shadow routing approach. J. C. Baltzer AG, Science Publishers, 2010.
[30] X. Fei, F. Liu, H. Xu, and H. Jin, “Towards load-balanced VNF assignment in geo-distributed NFV infrastructure,” in IEEE/ACM International Symposium on Quality of Service, 2017.
[31] M. Claypool and K. Claypool, “Latency and player actions in online games,” Communications of the ACM, vol. 49, no. 11, pp. 40–45, 2006.
[32] M. Jarschel, D. Schlosser, S. Scheuring, and T. Hoßfeld, “Gaming in the clouds: QoE and the users' perspective,” Mathematical and Computer Modelling, vol. 57, no. 11–12, pp. 2883–2894, 2013.
[33] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[34] D. Finkel, M. Claypool, S. Jaffe, T. Nguyen, and B. Stephen, “Assignment of games to servers in the OnLive cloud game system,” in 2014 13th Annual Workshop on Network and Systems Support for Games. IEEE, 2014, pp. 1–3.
[35] (2018) Steam & Game Stats. [Online]. Available: http://store.steampowered.com/stats/
[36] K. T. Chen, Y. C. Chang, H. J. Hsu, D. Y. Chen, C. Y. Huang, and C. H. Hsu, “On the quality of service of cloud gaming systems,” IEEE Transactions on Multimedia, vol. 16, no. 2, pp. 480–495, 2014.
[37] (2018) Google data centers. [Online]. Available: https://en.wikipedia.org/wiki/Google_Data_Centers
[38] (2018) JOptimizer. [Online]. Available: http://www.joptimizer.com/
[39] Y.-T. Lee, K.-T. Chen, Y.-M. Cheng, and C.-L. Lei, “World of Warcraft avatar history dataset,” in Proceedings of the Second Annual ACM Conference on Multimedia Systems. ACM, 2011, pp. 123–128.
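
Illustrative sketch. Two steps of the dynamic arrival rate experiment lend themselves to a short example: choosing the requested game with a Zipf-distributed probability, and estimating the arrival rate of a game from the number of arrivals in the window [t − 10, t + 10] minutes. The Python below is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`zipf_probabilities`, `pick_game`, `estimate_arrival_rate`) and the representation of arrivals as a sorted list of minute timestamps are hypothetical choices made only for illustration.

```python
import random
from bisect import bisect_left, bisect_right


def zipf_probabilities(num_games, alpha):
    """Return p_i = (1/i^alpha) / sum_i (1/i^alpha) for games i = 1..num_games."""
    weights = [1.0 / (i ** alpha) for i in range(1, num_games + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def pick_game(probabilities, rng):
    """Draw a 1-based game index according to the given probabilities."""
    games = range(1, len(probabilities) + 1)
    return rng.choices(games, weights=probabilities, k=1)[0]


def estimate_arrival_rate(sorted_arrival_minutes, t, half_window=10):
    """Estimate lambda_i(t) = N_i(t) / 20 for the default 10-minute half window.

    N_i(t) is the number of arrivals in the closed interval
    [t - half_window, t + half_window]; the input list must be sorted
    (assumed data layout, not specified in the paper).
    """
    lo = bisect_left(sorted_arrival_minutes, t - half_window)
    hi = bisect_right(sorted_arrival_minutes, t + half_window)
    return (hi - lo) / (2 * half_window)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same games
    probs = zipf_probabilities(num_games=10, alpha=1.0)
    requested = [pick_game(probs, rng) for _ in range(5)]
    print("requested games:", requested)
    print("rate at t=10:", estimate_arrival_rate([0, 5, 10, 15, 20, 25], 10))
```

The estimated rates would then serve as the inputs to P1 when computing the lower bound at each sampling point; solving P1 itself is outside the scope of this sketch.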