Cost-Efficient Request Dispatching in Geo-Distributed Cloud Gaming Infrastructure

2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)
978-1-6654-1485-2/20/$31.00 ©2020 IEEE
Yusen Li, Jinping Wu, Bingzheng Ma, Gang Wang, Xiaoguang Liu
College of Computer Science
Nankai University, Tianjin, China
{liyusen, wujp, mbz, wgzwp, liuxg}@nbjl.nankai.edu.cn
DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom51426.2020.00053

Abstract—Cloud gaming is a recent approach to gaming in which games are processed on well-equipped servers within the cloud and a video stream is returned to the user, so that users can play high-end games on devices that lack computational power. Different games require different amounts of computational resources and computation times. It would be desirable to efficiently pack a number of servers with multiple games at once; however, this is complicated within a geo-distributed cloud system, as not every data center can fulfil every game request due to latency requirements. In this work, we present shadow routing algorithms that distribute game requests to cloud data centers and pack the servers within the data centers with these game requests. These algorithms are designed to minimize the total cost of server rental and bandwidth usage, and we prove that their performance is asymptotically close to optimal. An experiment using realistic arrival rates verifies our theory within a realistic context. We also show, by proof and experimentation, that the algorithms adapt to periodic changes as demand rises and falls while remaining close to optimal, which is a particular weak point of other schemes.

Index Terms—Cloud Gaming, Geo-distributed, Resource Cost, Request Dispatching, Shadow Routing.

I. INTRODUCTION

Cloud gaming has been a popular topic within the tech industry for a number of years. The main idea of cloud gaming is to use a cloud server to render gameplay before encoding and sending the resulting video stream to thin clients over the network. The thin clients decode and display the video streams, and send the players' input actions to the cloud servers for game interactions. Using this system, players can play high-end video games on common PCs, tablets and mobile phones without dedicated hardware. Moreover, cloud gaming allows players to start playing games instantly without time-consuming software downloads and configurations. For all these benefits, cloud gaming has attracted attention from both academia and industry. Several cloud gaming services are currently in operation, such as GeForce NOW [1], LiquidSky [2], Simply [3] and Vortex [4].

As cloud gaming has dynamic workloads and intensive resource demands, it is naturally suitable to deploy the service on public cloud infrastructures such as Amazon EC2 [5] and Microsoft Azure [6], due to their elastic and on-demand nature of resource provisioning. The cloud gaming service providers (CGSPs) can dynamically rent resources from cloud providers, e.g., virtual machines and bandwidth, according to the actual demand, which can significantly improve the utilization of resources. As the players could come from different locations, existing CGSPs generally deploy their services on geographically distributed data centers (DCs) [2], [7]. Figure 1 shows a high-level overview of the geo-distributed cloud gaming infrastructure. When a play request is received, the cloud gaming platform forwards the request to a particular DC and allocates resources within the DC for running the requested game and streaming the encoded video.

Fig. 1. Geo-Distributed Cloud Gaming Infrastructure (players interacting with game servers hosted in data centers DC1, DC2 and DC3).

To maximize revenue, the CGSPs must attempt to minimize the rented resources used while providing satisfactory service to players. The prices of cloud resources across DCs exhibit location diversity, which can be exploited to save costs. However, due to the intrinsic properties of cloud gaming, it is challenging to efficiently utilize the resources in the complex and heterogeneous cloud environment. First, compared with video streaming services such as YouTube and Netflix, cloud gaming is more interactive and delay-sensitive. Therefore, the constraints to be satisfied during resource allocation are more stringent. Second, running games generally should not be migrated among servers because migration interrupts gameplay, so a sufficiently intelligent strategy is required to allocate resources efficiently. Third, the workload in cloud gaming is highly dynamic, and thus the resource allocation strategy should also be able to adapt to changes efficiently.

In this paper, we investigate the resource allocation issues in geo-distributed cloud gaming from the perspective of CGSPs. Specifically, we study the game request dispatching problem with the goal of minimizing the resource rental cost, while providing satisfactory services to players. The request dispatching
problem consists of two phases: request-to-DC assignment and request-to-server assignment. The request-to-DC assignment decides which DC an incoming request should be dispatched to. The request-to-server assignment places the request on a specific server within the selected DC. As games consume large amounts of multiple types of resources, games must be assigned to servers in such a way that none of the resource capacities of the server is exceeded.

In this paper, we study the game request dispatching problem. For this, we formulate a constrained stochastic optimization problem, obtained by minimizing cost subject to a list of intuitive constraints. The solution to this optimization problem, which minimizes the total cost, yields both the rates of the configurations that the DCs' servers should be in and the rates at which we should direct a given game to each of the DCs. We also propose a shadow routing based dispatching algorithm, and prove that this algorithm, while assigning jobs to DCs, produces rates that are asymptotically optimal within our optimization problem, on the basis of results derived from [8]. Values obtained from the optimization problem are used to initialize the algorithm. The algorithm automatically adapts to changes in request arrival rates and can run continuously without frequently solving the optimization problem from scratch. The idea of this algorithm is to assign each arriving request within a specially designed virtual queuing system, where the request is assigned to one of a number of virtual queues. Note that a request is never actually placed into a virtual queue; instead, we maintain variables associated with each virtual queue that change when a request is assigned to the corresponding queue. Each routing decision is easily made by observing the values of these variables, and after making a decision we update the variables.

We conduct extensive experiments using real-world data to evaluate the performance of the proposed algorithm. These experiments show that the proposed algorithm achieves near-optimal performance and outperforms several alternatives. Moreover, the results also demonstrate the scalability and adaptivity of the proposed algorithm.

The remainder of this paper is organized as follows. Section II summarizes the related work. The system model and problem formulation are presented in Section III. The proposed dispatching algorithm is presented in Section IV. In Section V, we present how the proposed dispatching algorithm adapts to dynamic changes. The proposed algorithm is evaluated in Section VI. Finally, conclusions are made and future work is discussed in Section VII.

II. RELATED WORKS

As cloud gaming gains popularity, many cloud gaming systems have been developed for both research studies and commercial use. Some well-known platforms include GamingAnywhere [9], Rhizome [7], GeForce NOW [1], LiquidSky [2], Simply [3] and Vortex [4]. A large amount of research work has been conducted towards measuring and enhancing the performance of these systems. For example, the work in [10], [11] measured the latency and network traffic of some existing cloud gaming platforms. The work in [12], [13] optimized the video encoding and graphics rendering techniques for bit rate reduction in video transmission.

The resource provisioning issues in cloud gaming have been extensively studied. Hong et al. [14] studied the problem of how to consolidate multiple game servers (virtual machines) on physical machines to achieve a good balance between the player's quality of experience and the CGSP's profit. Basiri et al. [15] constructed detailed queuing and delay models to study cost-efficient resource provisioning for cloud gaming. However, both of the above works assume that the cloud gaming provider has fine-grained control over the placement of virtual machines on physical machines, which is normally available in proprietary cloud systems only and is not yet available in public cloud environments.

Recently, many cloud gaming systems have been deployed on public clouds [2], [7]. Significant effort has been devoted to understanding the resource allocation issues in cloud gaming using public cloud resources. Wu and Tian et al. [16], [17] studied the request dispatching, server provisioning and video streaming bit rate setting problems jointly in a geo-distributed cloud gaming platform, with the goal of minimizing the resource cost while guaranteeing a good quality of experience for players. However, their model assumes each server can only run one game at a time, and the packing constraints are not considered in the assignment of games. Li et al. [18], [19] formulated the request dispatching problem in cloud gaming as a variant of dynamic bin packing, with the objective of minimizing the total server running hours. They theoretically analyzed the worst-case performance of the classical bin packing algorithms. Deng et al. [20], [21] considered the server allocation problem in the context of multiplayer cloud gaming over geographically distributed DCs for minimizing the total server cost. However, they assume different games have the same resource consumption.

Cloud gaming shares some similarities with video-on-demand and live streaming applications [22], [23]. However, it is different in two important aspects. First, cloud gaming is more latency-sensitive, which makes the resource provisioning more challenging. Second, streaming videos consumes far fewer computational resources than running games, which makes the request-to-server assignment trivial in video streaming applications. Our model is related to the vector packing problem [24], which aims to use the minimum number of bins to pack a given set of items under multi-dimensional resource constraints. However, the items in vector packing stay in the system permanently, while the requests in our model leave the system after a random amount of time. Dynamic bin packing [25] considers item departures, but studies on this topic seldom consider multi-dimensional resource constraints.

Related work also includes the virtual machine (VM) placement problems, which in particular include packing constraints [26]–[28]. However, the studies on VM placement either consider a single DC [26] or focus on different aspects,
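including minimizing network traffic and latency, as well as maximizing throughput and load balancing [27], [28]. Stolyar et al. [29] applied a shadow routing approach to large-scale service systems, but without packing constraints. Although [8], [30] considered packing constraints when using shadow routing, their objective is load balancing. In this paper, we instead aim to minimize the total server cost.

III. PROBLEM STATEMENT

A. System Model

Consider a cloud gaming platform that provides services for $I$ different games, which are indexed by $i \in \mathcal{I} = \{1, \ldots, I\}$. For each game $i \in \mathcal{I}$, we assume that the play requests for game $i$ arrive at a rate of $\lambda_i$ (i.e., there will be $\lambda_i$ requests per minute on average). The period when a game is running is referred to as a session. The average session length of game $i$ is denoted by $1/\mu_i$. Running games consumes several types of resources such as CPU, GPU, memory, etc. Suppose we are concerned about $K$ types of resources, indexed by $k \in \mathcal{K} = \{1, \ldots, K\}$. Let $a_{i,k}$ denote the resource demand of game $i$ for resource $k$. Streaming the encoded video consumes bandwidth. Let $b_i$ denote the bit rate for streaming the encoded video of game $i$. For brevity, we assume all the requests for the same game demand the same amount of resources. In practice, players may have different play settings (e.g., different resolutions), making the real resource consumption slightly different from each other, even if they play the same game. Our model can be applied to this case by conceptually considering different play settings as different games.

Suppose the cloud gaming platform is deployed on $J$ geographically distributed DCs, which are indexed by $j \in \mathcal{J} = \{1, \ldots, J\}$. The CGSP rents cloud servers (virtual machines) from these DCs to run games and stream the corresponding output videos. For brevity, we assume the servers in the same DC have the same type. Our analysis and proposed methods can easily be applied to the case where each DC has multiple server types, by conceptually considering the servers of different types as servers in different DCs. As the cloud is elastic, we assume the supply of servers from each DC is larger than our demand. Denote by $B$ the bandwidth capacity of each DC. Although we assume different DCs have the same $B$, our models can easily be extended to the scenario with different $B$ for different DCs.

For each cloud server in DC $j$, the capacity of resource $k$ is $A_{j,k}$. A server in DC $j$ can simultaneously run multiple game instances if the total resource demand does not exceed the server's capacity. We may place a server into a configuration denoted by a vector $s = (s_1, \ldots, s_I)$. Such a server may simultaneously run up to $\sum_{i \in \mathcal{I}} s_i$ game instances, running at most $s_i$ instances of game $i$. Different servers will have different compatible vectors. The constraint $\sum_{i} s_i \cdot a_{i,k} \le A_{j,k}$ must be satisfied for each $k \in \mathcal{K}$. Denote by $\mathcal{S}_j$ the set of such vectors that are maximal (not dominated by others). Note that our model does not explicitly consider the interferences (e.g., competition for the same resources) when multiple games run on the same server. The interferences may cause performance degradation for games (e.g., the interaction delay grows or the frame rate drops) and thus hurt user experience. However, our analysis and proposed approach can easily be applied to the scenario with interferences, as long as the interferences are taken into account in the definition of the vector $s = (s_1, \ldots, s_I)$.

Cloud gaming is delay sensitive, requiring that the network delay from the player to the cloud server is not too large [31], [32]. Otherwise, the total delay (including the network delay, the encoding/decoding delay, etc.) experienced by the player will be unacceptable. Different game genres usually have different delay requirements. For example, fast-paced games such as first-person shooters or car racing games have more stringent delay requirements than moderate-paced games such as third-person role-playing games [31], [32]. For each game $i \in \mathcal{I}$, we define $D_i$ as the maximum network delay that can be tolerated by players. For each play request of game $i$, we call DC $j$ eligible if the network delay from the player to DC $j$ is less than $D_i$.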
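The two definitions above, the maximal configuration set $\mathcal{S}_j$ and DC eligibility, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `maximal_configurations` and `eligible_dcs`, the brute-force enumeration, and the toy demand and delay numbers are assumptions made for illustration, and the sketch assumes non-negative integer demands with every game using at least one resource.

```python
from itertools import product


def maximal_configurations(demand, capacity):
    """Enumerate the maximal feasible configurations of one server type.

    demand[i][k] -- a_{i,k}: demand of game i for resource k (integers here)
    capacity[k]  -- A_{j,k}: capacity of a server in DC j for resource k
    """
    games = range(len(demand))
    resources = range(len(capacity))

    def feasible(s):
        # Packing constraint: sum_i s_i * a_{i,k} <= A_{j,k} for every k.
        return all(
            sum(s[i] * demand[i][k] for i in games) <= capacity[k]
            for k in resources
        )

    # Largest s_i that fits when game i runs alone on the server
    # (assumption: every game demands a positive amount of some resource).
    bounds = [
        min(capacity[k] // demand[i][k] for k in resources if demand[i][k] > 0)
        for i in games
    ]

    def is_maximal(s):
        # With non-negative demands, "not dominated" is equivalent to
        # "no single extra instance of any game still fits".
        return not any(
            feasible(s[:i] + (s[i] + 1,) + s[i + 1:]) for i in games
        )

    candidates = product(*(range(bound + 1) for bound in bounds))
    return [s for s in candidates if feasible(s) and is_maximal(s)]


def eligible_dcs(delay_ms, max_delay_ms):
    """Indices of DCs whose network delay to the player is below D_i."""
    return [j for j, delay in enumerate(delay_ms) if delay < max_delay_ms]


# Toy data (hypothetical): two games, two resource types, one server type.
print(maximal_configurations([[2, 1], [1, 2]], [4, 4]))
print(eligible_dcs([20, 80, 45], 50))
```

Brute-force enumeration is only practical for a handful of games; it is used here because it mirrors the definition directly.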
B. Problem Statement

Our goal is to find the optimal strategy that dispatches new game requests to DCs such that the total combined rental cost of servers and bandwidth is minimized. When a new request arrives, the strategy should decide which DC the request should be dispatched to (request-to-DC assignment) and which server the request should be assigned to (request-to-server assignment). If there is no need to open a new server, the strategy should choose one of the currently running servers to accommodate the request. Once a game instance starts, it will run on the same server during the entire game session. Migrating game instances from one server to another is not allowed due to the large overhead and the interruption to gameplay.

Let $C_j$ denote the cost rate of renting a cloud server in DC $j$. Let $V_j$ denote the cost rate of using one unit of bandwidth in DC $j$. The rate at which requests of game $i$ are dispatched by the algorithm to DC $j$ may fluctuate over time. At any particular time, we take $L_{i,j}$ to be the maximum rate at which incoming requests of game $i$ can be dispatched to DC $j$ without violating latency requirements. Our algorithm will dispatch incoming requests of game $i$ to DC $j$ at a rate of $\lambda_{i,j}$. This rate will be changed over time by the algorithm as the number of incoming requests fluctuates. Also, let $N_{s,j}$ denote the number of servers in DC $j$ that are used in configuration $s \in \mathcal{S}_j$ according to our algorithm. Note that the variables $\lambda_{i,j}$ and $N_{s,j}$ are set by decisions made within the algorithm. For convenience, we define two dependent variables: the number of servers used in DC $j$, $\rho_j = \sum_{s \in \mathcal{S}_j} N_{s,j}$, and the total bandwidth utilization of DC $j$, $r_j = \sum_{i \in \mathcal{I}} b_i \lambda_{i,j} / (\mu_i B)$.

Suppose that some algorithm is running and has entered a stable period, where the incoming rates $\lambda_i$ and $L_{i,j}$ are constant. Then, we want an algorithm that produces rates $\lambda_{i,j}$ and values $N_{s,j}$ such that they solve (or are as close
as possible) to the solution of the following linear program (denoted by P1):

$$\min_{\{\lambda_{i,j}\},\{N_{s,j}\}} \; \sum_{j \in \mathcal{J}} C_j \rho_j + \sum_{j \in \mathcal{J}} V_j r_j B \tag{1}$$

subject to

$$0 \le \lambda_{i,j} \le L_{i,j}, \quad \forall i \in \mathcal{I}, \forall j \in \mathcal{J} \tag{2}$$

$$N_{s,j} \ge 0, \quad \forall s \in \mathcal{S}_j, \forall j \in \mathcal{J} \tag{3}$$

$$\sum_{j \in \mathcal{J}} \lambda_{i,j} = \lambda_i, \quad \forall i \in \mathcal{I} \tag{4}$$

$$\sum_{i \in \mathcal{I}} b_i \lambda_{i,j} / \mu_i \le B, \quad \forall j \in \mathcal{J} \tag{5}$$

$$\lambda_{i,j} / \mu_i \le \sum_{s \in \mathcal{S}_j} s_i N_{s,j}, \quad \forall j \in \mathcal{J}, \forall i \in \mathcal{I} \tag{6}$$

The first term in the objective function is the total cost of renting servers and the second term is the total cost of bandwidth consumption. Constraint (5) ensures that the total bandwidth consumption at each DC does not exceed the bandwidth capacity of the DC. Constraint (6) ensures that all the arrivals can be handled by the system, where the left part is the amount of processing volume arriving due to requests for game $i$ in DC $j$, and the right part is the total volume of game $i$'s requests that can be served in DC $j$ ($s_i$ denotes the number of requests for game $i$ in configuration $s$). Note that any algorithm that operates on this system model must obey constraints (2) through (6), and therefore the solution of this linear program represents the minimum possible cost of any algorithm solving this problem.
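To make objective (1) and constraints (2) through (6) concrete, the following Python sketch evaluates a candidate solution of P1 on a toy instance. It does not solve the linear program; the function name `evaluate_p1`, the argument layout, the tolerance `tol`, and all numbers are assumptions chosen for illustration.

```python
def evaluate_p1(lam, N, configs, arrival, L, mu, b, B, C, V, tol=1e-9):
    """Return (cost, feasible) for a candidate solution of P1.

    lam[i][j]  -- dispatch rate lambda_{i,j}
    N[j][n]    -- number of servers in DC j that use configs[j][n]
    configs[j] -- list of configuration vectors s available in DC j
    """
    games, dcs = range(len(arrival)), range(len(C))
    rho = [sum(N[j]) for j in dcs]                                   # rho_j
    r = [sum(b[i] * lam[i][j] / (mu[i] * B) for i in games) for j in dcs]  # r_j
    cost = sum(C[j] * rho[j] + V[j] * r[j] * B for j in dcs)         # objective (1)

    ok = all(-tol <= lam[i][j] <= L[i][j] + tol
             for i in games for j in dcs)                            # (2)
    ok &= all(n >= -tol for j in dcs for n in N[j])                  # (3)
    ok &= all(abs(sum(lam[i][j] for j in dcs) - arrival[i]) <= tol
              for i in games)                                        # (4)
    ok &= all(r[j] <= 1 + tol for j in dcs)                          # (5), divided by B
    ok &= all(
        lam[i][j] / mu[i]
        <= sum(s[i] * n for s, n in zip(configs[j], N[j])) + tol
        for i in games for j in dcs
    )                                                                # (6)
    return cost, ok


# Toy instance (hypothetical numbers): one game, two DCs, one configuration each.
configs = [[(4,)], [(4,)]]
cost, ok = evaluate_p1(
    lam=[[4.0, 2.0]], N=[[2.0], [1.0]], configs=configs,
    arrival=[6.0], L=[[4.0, 6.0]], mu=[0.5], b=[10.0], B=1000.0,
    C=[1.0, 2.0], V=[0.01, 0.01],
)
print(cost, ok)
```

In this instance DC 0 receives 4 requests per minute with a mean session length of 2 minutes, so about 8 concurrent sessions must fit on its servers, which is exactly what constraint (6) checks against the 2 servers of 4 slots each.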
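IV. REQUEST DISPATCHING ALGORITHM

A. Preliminary Analysis

In this section, we transform P1 into an equivalent problem P2. The reason for doing this is that problem P2 has a corresponding request dispatching algorithm, which we use in this paper. The problem P1 is a linear (and hence convex) optimization problem, which can be solved efficiently by many approaches [33]. Let $\{\lambda^*_{i,j}\}, \{N^*_{s,j}\}$ denote an optimal solution of P1. In this optimal configuration, the server use at DC $j$ is $\rho^*_j = \sum_{s \in \mathcal{S}_j} N^*_{s,j}$ and the bandwidth use is $r^*_j = \sum_{i \in \mathcal{I}} b_i \lambda^*_{i,j} / (\mu_i B)$. Let $x_{s,j} = N_{s,j} / \rho^*_j$ and $y_j = \rho_j / \rho^*_j$. It follows that $\sum_{s \in \mathcal{S}_j} x_{s,j} = y_j$. Let $z_j = r_j / r^*_j$. If we use $x_{s,j}$, $y_j$ and $z_j$ to replace $N_{s,j}$, $\rho_j$ and $r_j$ respectively, P1 can be rewritten as the following problem (denoted by P1′):

$$\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \; \sum_{j \in \mathcal{J}} C_j \rho^*_j y_j + \sum_{j \in \mathcal{J}} V_j r^*_j z_j B \tag{7}$$

subject to

$$0 \le \lambda_{i,j} \le L_{i,j}, \quad \forall i \in \mathcal{I}, \forall j \in \mathcal{J} \tag{8}$$

$$x_{s,j} \ge 0, \quad \forall s \in \mathcal{S}_j, \forall j \in \mathcal{J} \tag{9}$$

$$\sum_{j \in \mathcal{J}} \lambda_{i,j} = \lambda_i, \quad \forall i \in \mathcal{I} \tag{10}$$

$$\sum_{i \in \mathcal{I}} b_i \lambda_{i,j} / \mu_i \le B, \quad \forall j \in \mathcal{J} \tag{11}$$

$$\lambda_{i,j} / \mu_i \le \rho^*_j \sum_{s \in \mathcal{S}_j} s_i x_{s,j}, \quad \forall j \in \mathcal{J}, \forall i \in \mathcal{I} \tag{12}$$

Consider the following optimization problem (denoted by P2):

$$\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \; \max_{j \in \mathcal{J}} \{y_j, z_j\} \tag{13}$$

subject to the constraints (8) to (12). We have the following theorem.

Theorem IV.1. P1′ is equivalent to P2, i.e., the optimal solution of P1′ is also the optimal solution of P2, and vice versa.

Proof. We first prove that an optimal solution of P1′ also minimizes the objective function of P2 (i.e., (13)). Let $\{\lambda^*_{i,j}\}, \{N^*_{s,j}\}$ denote an optimal solution of P1. Recalling that $\rho^*_j = \sum_{s \in \mathcal{S}_j} N^*_{s,j}$, define $x^*_{s,j} = N^*_{s,j} / \rho^*_j$. It is easy to see that $\{\lambda^*_{i,j}\}, \{x^*_{s,j}\}$ is an optimal solution of P1′. Let $y^*_j$ and $z^*_j$ denote the values of $y_j$ and $z_j$ when $\lambda_{i,j} = \lambda^*_{i,j}$ and $x_{s,j} = x^*_{s,j}$ for problem P1′. It is easy to see that $y^*_j = 1$ and $z^*_j = 1$ hold for all $j \in \mathcal{J}$, which implies that: (1) the value of the objective function (13) is equal to 1 when $\lambda_{i,j} = \lambda^*_{i,j}$ and $x_{s,j} = x^*_{s,j}$ for P2; and (2) the minimum value of the objective function (13) cannot be larger than 1. We must therefore show that $\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \max_{j \in \mathcal{J}} \{y_j, z_j\} < 1$ is impossible.

Suppose $\{\lambda'_{i,j}\}, \{x'_{s,j}\}$ denote an optimal solution of P2 that has $\min_{\{\lambda_{i,j}\},\{x_{s,j}\}} \max_{j \in \mathcal{J}} \{y_j, z_j\} < 1$. Let $y'_j$ and $z'_j$ denote the values of $y_j$ and $z_j$ at this optimal solution. We must therefore have $\max_{j \in \mathcal{J}} \{y'_j, z'_j\} < 1$, so it follows that $y'_j < 1$ and $z'_j < 1$ for all $j \in \mathcal{J}$. Therefore, we have

$$\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B < \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B.$$

Recall that $\sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$ is the minimum value of the objective function (7). This implies that the cost produced by $\{\lambda'_{i,j}\}, \{x'_{s,j}\}$ is less than the minimal cost of P1′, which is a contradiction. Therefore the optimal solution of P1′ is also an optimal solution of P2.

We next prove that an optimal solution of P2 also minimizes the objective function of P1′ (i.e., (7)). Based on the previous discussion, we know that the minimum value of the objective function (13) is equal to 1. Let $\{\lambda'_{i,j}\}, \{x'_{s,j}\}$ denote an optimal solution of P2. Let $y'_j$ and $z'_j$ denote the values of $y_j$ and $z_j$ when $\lambda_{i,j} = \lambda'_{i,j}$ and $x_{s,j} = x'_{s,j}$ for P2. It follows that $\max_{j \in \mathcal{J}} \{y'_j, z'_j\} = 1$, implying that $y'_j \le 1$ and $z'_j \le 1$ for all $j \in \mathcal{J}$. If we set $\lambda_{i,j} = \lambda'_{i,j}$ and $x_{s,j} = x'_{s,j}$ in P1′, the objective function (7) will be equal to $\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B$. Since we have $y'_j \le 1$ and $z'_j \le 1$ for all $j \in \mathcal{J}$, it follows that

$$\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B \le \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B.$$

Recall that the minimum value of the objective function (7) is $\sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B$, implying that

$$\sum_{j \in \mathcal{J}} C_j \rho^*_j y'_j + \sum_{j \in \mathcal{J}} V_j r^*_j z'_j B \ge \sum_{j \in \mathcal{J}} C_j \rho^*_j + \sum_{j \in \mathcal{J}} V_j r^*_j B.$$

Therefore, the two sides are equal, which means that $\{\lambda'_{i,j}\}, \{x'_{s,j}\}$ is an optimal solution of P1′.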
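The change of variables used in this subsection can be checked numerically. The sketch below is an illustration under assumed toy numbers, not part of the original analysis: it maps a candidate $(N, \lambda)$ to $(x, y, z)$ using a reference solution that is simply assumed to be optimal, and compares objective (1) with objective (7). The function names `usage`, `normalize`, `objective_1` and `objective_7` are hypothetical, and $\rho^*_j > 0$ and $r^*_j > 0$ are assumed.

```python
def usage(N, lam, mu, b, B):
    """Server use rho_j and bandwidth utilization r_j for each DC j."""
    dcs, games = range(len(N)), range(len(lam))
    rho = [sum(N[j]) for j in dcs]
    r = [sum(b[i] * lam[i][j] / (mu[i] * B) for i in games) for j in dcs]
    return rho, r


def normalize(N, lam, N_star, lam_star, mu, b, B):
    """Map (N, lambda) to (x, y, z), assuming rho*_j > 0 and r*_j > 0."""
    rho, r = usage(N, lam, mu, b, B)
    rho_star, r_star = usage(N_star, lam_star, mu, b, B)
    dcs = range(len(N))
    x = [[n / rho_star[j] for n in N[j]] for j in dcs]   # x_{s,j} = N_{s,j} / rho*_j
    y = [rho[j] / rho_star[j] for j in dcs]              # y_j = rho_j / rho*_j
    z = [r[j] / r_star[j] for j in dcs]                  # z_j = r_j / r*_j
    return x, y, z


def objective_1(N, lam, mu, b, B, C, V):
    rho, r = usage(N, lam, mu, b, B)
    return sum(C[j] * rho[j] + V[j] * r[j] * B for j in range(len(N)))


def objective_7(y, z, N_star, lam_star, mu, b, B, C, V):
    rho_star, r_star = usage(N_star, lam_star, mu, b, B)
    return sum(C[j] * rho_star[j] * y[j] + V[j] * r_star[j] * z[j] * B
               for j in range(len(y)))


# Hypothetical data: one game, two DCs; (N_star, lam_star) is assumed optimal.
mu, b, B, C, V = [0.5], [10.0], 1000.0, [1.0, 2.0], [0.01, 0.01]
N_star, lam_star = [[2.0], [1.0]], [[4.0, 2.0]]
N, lam = [[3.0], [1.0]], [[3.0, 3.0]]

x, y, z = normalize(N, lam, N_star, lam_star, mu, b, B)
print(y, z)
print(objective_1(N, lam, mu, b, B, C, V),
      objective_7(y, z, N_star, lam_star, mu, b, B, C, V))
```

The two printed objective values agree up to rounding, and substituting the reference solution itself gives $y_j = z_j = 1$ for every DC, which is the starting point of the proof of Theorem IV.1.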
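B. Request Dispatching Algorithm

In this section, we present the details of the request dispatching algorithm. The algorithm consists of two subroutines: request-to-DC assignment and request-to-server assignment. When a new request arrives at the CGSP, the request-to-DC assignment subroutine determines which DC the request will be dispatched to, and the request-to-server assignment subroutine chooses a server in the selected DC on which to place the request.

The algorithm is based on the shadow routing scheme in [8], which maintains several virtual queues. Each of the dispatching decisions is made according to the current states of the virtual queues. Specifically, for each pair of DC $j$ and game $i$, there is an associated virtual queue $(j, i)$ whose length is denoted by $Q_{j,i}$. For each DC $j$, there is also a virtual queue associated with the bandwidth capacity, whose length is denoted by $Q_j$. We emphasise that virtual queues are not buffers where actual requests are placed for waiting. Instead, they are just variables maintained by the algorithm. Therefore, the waiting times of actual requests are not associated with the lengths of virtual queues.

In addition to the given system parameters, the algorithm also requires positive parameters $\eta$, $c$, $\theta$ as well as $\{\rho^*_j\}$ and $\{r^*_j\}$ (which can be obtained by solving P1). Here we assume that all the system parameters are fixed. We will discuss later (in Section V) how the proposed algorithm adapts to dynamic parameter changes.

Request-to-DC Assignment. The details of the request-to-DC assignment subroutine, which has three steps, are shown in Algorithm 1. When a new request of game $i$ arrives, the first step immediately determines the DC (denoted by $m$) that the new request will be dispatched to. Let $\mathcal{D}$ denote the set of DCs that are eligible for the new request (i.e., satisfying the network delay requirement). Then, the DC $m$ is determined according to

$$m := \arg\min_{j \in \mathcal{D}} \left[ Q_{j,i} / (\rho^*_j \mu_i) + Q_j b_i / (r^*_j \mu_i B) \right]. \tag{14}$$

In the second step, the algorithm updates the virtual queues. First, the algorithm increases $Q_{m,i}$ by $1/(\rho^*_m \mu_i)$, and $Q_m$ by $b_i / (r^*_m \mu_i B)$. Then, the algorithm chooses a candidate configuration $\sigma^j \in \mathcal{S}_j$ for each DC $j$. After that, the algorithm needs to decide whether or not to activate a "super-server" for the virtual queues. If the super-server is activated, which occurs when the condition

$$\eta \sum_{j \in \mathcal{J}} \Big[ \sum_{i \in \mathcal{I}} \sigma^j_i Q_{j,i} + Q_j \Big] > 1 \tag{15}$$

is satisfied, the amount of "workload" $c\sigma^j_i$ ($\sigma^j_i$ refers to the number of requests of game $i$ in configuration $\sigma^j$) is removed from virtual queue $(j, i)$, namely, $Q_{j,i} := \max\{Q_{j,i} - c\sigma^j_i, 0\}$, and the amount of "workload" $c$ is removed from the bandwidth virtual queue $(j)$, namely, $Q_j := \max\{Q_j - c, 0\}$. Here $c > 0$ is a fixed parameter such that

$$c > \max_{i,j} 1/(\rho^*_j \mu_i) \quad \text{and} \quad c > \max_{i,j} b_i / (r^*_j \mu_i B). \tag{16}$$

In the third step, the algorithm updates the configuration usage fractions, which will be used in the request-to-server assignment subroutine. Specifically, for each DC $j$, the configuration usage fractions are updated according to $\hat{x}_{s,j} := \theta I(s, \sigma^j) + (1 - \theta)\hat{x}_{s,j}$, where $I(s, \sigma^j) = 1$ if $s$ is the configuration $\sigma^j$ computed in step 2 and condition (15) holds, and $I(s, \sigma^j) = 0$ otherwise.

Algorithm 1 Request-to-DC Assignment
Require:
    A new request of game $i$
    $\{\rho^*_j\}$, $\{r^*_j\}$, $\eta$, $c$, $\theta$ and other given parameters
Ensure:
    Assignment of the new request to the corresponding DC
Step 1: Determine the target DC $m$ for the new request
    $\mathcal{D} \leftarrow$ the set of DCs that are eligible for the new request
    $m := \arg\min_{j \in \mathcal{D}} [Q_{j,i}/(\rho^*_j \mu_i) + Q_j b_i/(r^*_j \mu_i B)]$
Step 2: Update the virtual queues
    $Q_{m,i} := Q_{m,i} + 1/(\rho^*_m \mu_i)$
    $Q_m := Q_m + b_i/(r^*_m \mu_i B)$
    for each DC $j \in \mathcal{J}$ do
        $\sigma^j := \arg\max_{s \in \mathcal{S}_j} \sum_{i \in \mathcal{I}} s_i \cdot Q_{j,i}$
    end for
    if $\eta \sum_{j \in \mathcal{J}} [\sum_{i \in \mathcal{I}} \sigma^j_i Q_{j,i} + Q_j] > 1$ then
        $Q_{j,i} := \max\{Q_{j,i} - c\sigma^j_i, 0\}$, $\forall j \in \mathcal{J}, i \in \mathcal{I}$
        $Q_j := \max\{Q_j - c, 0\}$, $\forall j \in \mathcal{J}$
    end if
Step 3: Update configuration usage fractions
    for each DC $j \in \mathcal{J}$ do
        $\hat{x}_{s,j} := \theta I(s, \sigma^j) + (1 - \theta)\hat{x}_{s,j}$, for all $s \in \mathcal{S}_j$
    end for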
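The three steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the class name `ShadowRouter`, the list-based data layout, the first-index tie-breaking, and the toy parameter values are assumptions; $\{\rho^*_j\}$ and $\{r^*_j\}$ are taken as given inputs (in the paper they come from solving P1).

```python
class ShadowRouter:
    """Virtual-queue state and one dispatch step of Algorithm 1 (sketch)."""

    def __init__(self, configs, rho_star, r_star, mu, b, B, eta, c, theta):
        self.configs = configs                  # configs[j]: list of vectors s in S_j
        self.rho_star, self.r_star = rho_star, r_star
        self.mu, self.b, self.B = mu, b, B
        self.eta, self.c, self.theta = eta, c, theta
        num_dcs, num_games = len(configs), len(mu)
        self.Q = [[0.0] * num_games for _ in range(num_dcs)]   # Q_{j,i}
        self.Q_bw = [0.0] * num_dcs                            # Q_j
        self.x_hat = [[0.0] * len(S) for S in configs]         # usage fractions

    def dispatch(self, i, eligible):
        """Assign a new request of game i to one of the eligible DCs."""
        dcs, games = range(len(self.configs)), range(len(self.mu))

        # Step 1: rule (14); ties go to the first eligible DC (assumption).
        def score(j):
            return (self.Q[j][i] / (self.rho_star[j] * self.mu[i])
                    + self.Q_bw[j] * self.b[i]
                    / (self.r_star[j] * self.mu[i] * self.B))
        m = min(eligible, key=score)

        # Step 2: grow the queues of DC m, then pick sigma^j for every DC.
        self.Q[m][i] += 1.0 / (self.rho_star[m] * self.mu[i])
        self.Q_bw[m] += self.b[i] / (self.r_star[m] * self.mu[i] * self.B)
        sigma = []
        for j in dcs:
            weights = [sum(s[g] * self.Q[j][g] for g in games)
                       for s in self.configs[j]]
            sigma.append(max(range(len(weights)), key=weights.__getitem__))
        total = sum(
            sum(self.configs[j][sigma[j]][g] * self.Q[j][g] for g in games)
            + self.Q_bw[j]
            for j in dcs
        )
        active = self.eta * total > 1.0          # condition (15)
        if active:                               # super-server removes workload
            for j in dcs:
                s = self.configs[j][sigma[j]]
                for g in games:
                    self.Q[j][g] = max(self.Q[j][g] - self.c * s[g], 0.0)
                self.Q_bw[j] = max(self.Q_bw[j] - self.c, 0.0)

        # Step 3: exponential averaging of the configuration usage fractions.
        for j in dcs:
            for n in range(len(self.configs[j])):
                hit = 1.0 if (active and n == sigma[j]) else 0.0
                self.x_hat[j][n] = (self.theta * hit
                                    + (1 - self.theta) * self.x_hat[j][n])
        return m


# Toy run (hypothetical numbers): one game, two DCs; c satisfies condition (16).
router = ShadowRouter(configs=[[(4,)], [(4,)]], rho_star=[2.0, 1.0],
                      r_star=[0.08, 0.04], mu=[0.5], b=[10.0], B=1000.0,
                      eta=0.01, c=3.0, theta=0.1)
print([router.dispatch(0, eligible=[0, 1]) for _ in range(2)])
```

Each call touches every DC and every configuration, so the per-request cost is proportional to the total number of configurations; a production system would likely cache or prune $\mathcal{S}_j$, which this sketch does not attempt.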
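Next, we show the asymptotic optimality of the request-to-DC assignment algorithm. We have the following theorem:

Theorem IV.2. As $\eta \to 0$, the request-to-DC dispatching rates, as well as the configuration usage fractions produced by Algorithm 1, are close to optimal solutions of P2.

Proof. Suppose parameters $\{\rho^*_j\}$, $\{r^*_j\}$, $\{\mu_i\}$, $c$ and $B$ are fixed rational numbers, such that condition (16) holds. When parameter $\eta$ is close to 0, the virtual queuing process is a positive recurrent countable discrete-time Markov chain, which has a stable steady state. For a fixed $\eta$, denote by $\bar{x}^{(\eta)}_{s,j}$ the steady-state probability that configuration $s$ is chosen as $\sigma^j$ and condition (15) holds. Let $\bar{p}^{(\eta)}_{i,j}$ denote the steady-state probability that an arriving request of game $i$ is assigned to DC $j$. According to the proof of Proposition 1 in [8], as $\eta \to 0$, Algorithm 1 solves the problem of minimizing the frequency of super-server activations, subject to the stability of virtual queues. That is, if we pick a stationary distribution for each $\eta$, then, as $\eta \to 0$, $(\{\bar{x}^{(\eta)}_{s,j}\}, \{\bar{p}^{(\eta)}_{i,j}\})$ converges to $(\{\bar{x}_{s,j}\}, \{\bar{p}_{i,j}\})$, which is the set of optimal solutions of the following problem:

$$\min_{\{\bar{p}_{i,j}\},\{\bar{x}_{s,j}\}} \; \max_{j \in \mathcal{J}} \{\bar{y}_j, \bar{z}_j\} \tag{17}$$

subject to

$$0 \le \bar{p}_{i,j} \le L_{i,j} / \lambda_i, \quad \forall i, j \tag{18}$$

$$\bar{x}_{s,j} \ge 0, \quad \forall s \in \mathcal{S}_j, \forall j \tag{19}$$

$$\sum_{i} \lambda_i \bar{p}_{i,j} b_i / \mu_i \le B, \quad \forall j \tag{20}$$

$$\sum_{j} \bar{p}_{i,j} = 1, \quad \forall i \tag{21}$$

$$(\lambda_i / \lambda)\, \bar{p}_{i,j} / (\rho^*_j \cdot \mu_i) \le \sum_{s \in \mathcal{S}_j} s_i c \bar{x}_{s,j}, \quad \forall i, j \tag{22}$$

$$\sum_{s \in \mathcal{S}_j} \bar{x}_{s,j} = \bar{y}_j, \quad \forall j, \tag{23}$$

where $\bar{y}_j = \sum_{s \in \mathcal{S}_j} \bar{x}_{s,j}$ and $\bar{z}_j = \sum_{i} (\lambda_i / \lambda)\, \bar{p}_{i,j} b_i / (B c r^*_j \mu_i)$. If we rewrite this problem in terms of variables $\lambda_{i,j} = \lambda_i \bar{p}_{i,j}$, $x_{s,j} = \lambda c \bar{x}_{s,j}$, $y_j = \lambda c \bar{y}_j$ and $z_j = \lambda c \bar{z}_j$, we obtain P2. Therefore, the proof of the theorem is complete.

Request-to-server Assignment. After the DC $m$ is determined, the new request will be assigned to a specific server within DC $m$ by the request-to-server assignment subroutine. In order to do that, for each non-empty server in each DC, a designation configuration is allocated when the server is started. If a server's designation is $s = (s_1, \ldots, s_I)$, it means that we will never place more than $s_i$ requests of game $i$ on this server. A server with designation $s$ is referred to as an $s$-server. Empty servers do not have a designation. Once allocated, a server designation does not change until the server becomes empty.

Consider an $s$-server in DC $m$. If the current number of game $i$'s requests on the server is less than $s_i$, we say the server is feasible for a request of game $i$. The algorithm first finds all the feasible servers (denoted by $\mathcal{H}$) for the new request. Let $z^m_i(s)$ denote the total number of game $i$'s requests in $s$-servers in DC $m$. Recall that the configuration usage fractions produced by Algorithm 1 are close to optimal solutions of P2. Based on this fact, the request-to-server assignment strives to make the real-time configuration usage fractions (which are proportional to $z^m_i(s)/s_i$) close to their optimal values (i.e., $\hat{x}_{s,m}$). For this, we assign the new request to a feasible server (if at least one feasible server is found) whose designation is

$$s^* := \arg\min_{s \in \mathcal{S}^{\mathcal{H}}_m : s_i > 0} z^m_i(s) / [s_i \hat{x}_{s,m}], \tag{24}$$

where $\mathcal{S}^{\mathcal{H}}_m$ denotes the set of designations of all the feasible servers in $\mathcal{H}$. If there is more than one feasible server whose designation is $s^*$, we assign the request in a best-fit manner, i.e., to the server (whose designation is $s^*$) with the minimum residual capacity. The residual capacity of a server is measured by the weighted sum of the residual capacity of all resources, i.e., $\sum_k w_k r_k$, where $r_k$ is the residual capacity of the server for resource $k$ and $w_k$ is the weight for resource $k$. In our implementation, for DC $m$, we set $w_k = \sum_i a_{i,k} \lambda_i / (\lambda A_{m,k})$. If no feasible server is found, we start a new server with designation $s^* := \arg\min_{s \in \mathcal{S}_m : s_i > 0} z^m_i(s) / [s_i \hat{x}_{s,m}]$, and assign the request to this server.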
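The request-to-server assignment described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: servers are plain dictionaries, the function name `assign_to_server` is hypothetical, a configuration with $\hat{x}_{s,m} = 0$ is treated as having an infinite ratio (the text does not say how a zero fraction is handled), and ties are broken by sorted order.

```python
import math


def assign_to_server(i, servers, configs, x_hat, demand, capacity, weights):
    """Place a request of game i on a server of the selected DC m.

    servers -- list of dicts {"designation": tuple s, "counts": requests per game}
    configs -- the set S_m as a list of tuples
    x_hat   -- dict mapping a configuration s to its usage fraction x_hat_{s,m}
    """
    def z(s):
        # z_i^m(s): game-i requests currently running on s-servers.
        return sum(srv["counts"][i] for srv in servers if srv["designation"] == s)

    def ratio(s):
        # Quantity minimized in rule (24); zero fractions rank last (assumption).
        frac = x_hat.get(s, 0.0)
        return z(s) / (s[i] * frac) if frac > 0 else math.inf

    def residual(srv):
        # Weighted residual capacity: sum_k w_k * r_k.
        used = [sum(srv["counts"][g] * demand[g][k] for g in range(len(demand)))
                for k in range(len(capacity))]
        return sum(w * (cap - u) for w, cap, u in zip(weights, capacity, used))

    # A server is feasible if it holds fewer than s_i requests of game i.
    feasible = [srv for srv in servers
                if srv["counts"][i] < srv["designation"][i]]
    if feasible:
        designations = sorted({srv["designation"] for srv in feasible})
        target = min(designations, key=ratio)
        # Best fit: the feasible target-server with minimum residual capacity.
        chosen = min((srv for srv in feasible if srv["designation"] == target),
                     key=residual)
    else:
        # No feasible server: start a new one with the best-ranked designation.
        target = min((s for s in configs if s[i] > 0), key=ratio)
        chosen = {"designation": target, "counts": [0] * len(target)}
        servers.append(chosen)
    chosen["counts"][i] += 1
    return chosen


# Toy run (hypothetical numbers): three requests of game 0 arrive in DC m.
configs = [(2, 0), (1, 1), (0, 2)]
x_hat = {(2, 0): 0.5, (1, 1): 0.5, (0, 2): 0.0}
servers = []
for _ in range(3):
    assign_to_server(0, servers, configs, x_hat,
                     demand=[[2, 1], [1, 2]], capacity=[4, 4], weights=[0.5, 0.5])
print(servers)
```

In the toy run the first two requests fill a (2, 0)-server; the third finds no feasible server and opens a (1, 1)-server, because that designation is furthest below its target usage fraction.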
0 ≤ p̄i,j ≤ Li,j /λi , ∀i, j (18) V. A DAPTATION TO DYNAMIC C HANGES
x̄s,j ≥ 0, ∀s ∈ Sj , ∀j (19) In the previous discussions, we have assumed that all the
system parameters are fixed. In this case, all the parameters
λi p̄i,j bi /μi ≤ B, ∀j (20)
used by Algorithm 1 do not need to be updated. In this section,
i
we discuss how the proposed algorithm adapts to dynamic
p̄i,j = 1, ∀i (21) changes in the parameters. For cloud gaming systems, we
j
note that: a) the request arrivals display a strong daily pattern
(λi /λ)p̄i,j /(ρ∗j · μi ) ≤ si cx̄s,j , ∀i, j (22) of peaks and troughs, but the proportion of arrivals for any
s∈Sj particular game do not change quickly; b) Except for the

x̄s,j = ȳj , ∀j, (23) request arrival rates, other system parameters change very
s∈Sj
infrequently. Based on these observations, we shall show that
the proposed request dispatching algorithm can automatically
where ȳj = s∈Sj x̄s,j and z̄j = i (λi /λ)p̄i,j vi /(Bcrj∗ μi ) adapt to the daily variation of request arrivals.
.
If we rewrite this problem in terms of variables λi,j = λi p̄i,j Let λ1 , · · · , λI denote the request arrival rates of games 1
,
$x_{s,j} = \lambda c \bar{x}_{s,j}$, $y_j = \lambda c \bar{y}_j$ and $z_j = \lambda c \bar{z}_j$, we obtain P2.

Therefore, the proof of the theorem is complete.

Request-to-server Assignment. After the DC $m$ is determined, the new request is assigned to a specific server within DC $m$ by the request-to-server assignment subroutine. To do that, for each non-empty server in each DC, a designation configuration is allocated when the server is started. If a server’s designation is $s = (s_1, \ldots, s_I)$, we will never place more than $s_i$ requests of game $i$ on this server. A server with designation $s$ is referred to as an $s$-server. Empty servers do not have a designation. Once allocated, a server designation does not change until the server becomes empty.

Consider an $s$-server in DC $m$. If the current number of game $i$’s requests on the server is less than $s_i$, we say the server is feasible for a request of game $i$. The algorithm first finds all the feasible servers (denoted by $H$) for the new request. Let $z_i^m(s)$ denote the total number of game $i$’s requests in $s$-servers in DC $m$. Recall that the configuration usage fractions produced by Algorithm 1 are close to optimal solutions of P2. Based on this fact, the request-to-server assignment strives to keep the real-time configuration usage fractions (which are proportional to $z_i^m(s)/s_i$) close to their optimal values (i.e., $\hat{x}_{s,m}$). For this, we assign the new request to the feasible server (if at least one feasible server is found) whose designation is

$$s' := \arg\min_{s \in S_m^H :\, s_i > 0} \frac{z_i^m(s)}{s_i \hat{x}_{s,m}}, \tag{24}$$

where $S_m^H$ denotes the set of designations of all the feasible servers in $H$. If there is more than one feasible server whose designation is $s'$, we assign the request in a best-fit manner, i.e., to the server (whose designation is $s'$) with the minimum residual capacity. The residual capacity of a server is measured

through I at some time point in a day. Denote by $\{\rho_j^*\}_{\lambda_1,\cdots,\lambda_I}$ and $\{r_j^*\}_{\lambda_1,\cdots,\lambda_I}$ the parameters $\{\rho_j^*\}$ and $\{r_j^*\}$ for arrival rates $\lambda_1, \cdots, \lambda_I$. The request arrival rates of different games change in a similar way over a daily period. Typically, the peak usage occurs during late evening and the trough usage occurs in the early morning. At any time point $t$ in a one-day period, it is reasonable to denote the arrival rates of games by $\gamma_t\lambda_1, \cdots, \gamma_t\lambda_I$, where $\gamma_t$ is a scalar dependent on $t$. Then, we have the following theorem.

Theorem V.1. Algorithm 1 with input parameters $\{\rho_j^*\}_{\lambda_1,\cdots,\lambda_I}$ and $\{r_j^*\}_{\lambda_1,\cdots,\lambda_I}$ is optimal at any time point $t$ in a daily period, as long as the bandwidth is not a limiting factor during peak time.

Proof. Let $\{\lambda_{i,j}^*\}, \{N_{s,j}^*\}$ denote an optimal solution of P1 when the arrival rates are $\lambda_1, \cdots, \lambda_I$. It follows that for each $j \in J$, $\rho_j^* = \sum_{s \in S_j} N_{s,j}^*$. The scaled solution $\{\gamma_t\lambda_{i,j}^*\}, \{\gamma_t N_{s,j}^*\}$ is a feasible solution of P1 when the arrival rates are $\gamma_t\lambda_1, \cdots, \gamma_t\lambda_I$; we wish to show that it is also optimal. Suppose $\{\gamma_t\lambda_{i,j}^*\}, \{\gamma_t N_{s,j}^*\}$ is not the optimal solution of P1. Then there must exist a solution $\{\lambda_{i,j}\}, \{N_{s,j}\}$ of P1 such that

$$\sum_{j \in J} C_j \rho_j + \sum_{j \in J} V_j B r_j < \sum_{j \in J} C_j \gamma_t \rho_j^* + \sum_{j \in J} V_j B \gamma_t r_j^*. \tag{25}$$

It is easy to see that $\{\lambda_{i,j}\}/\gamma_t, \{N_{s,j}\}/\gamma_t$ is a feasible solution of P1 when the arrival rates are $\lambda_1, \ldots, \lambda_I$. Since $\{\lambda_{i,j}^*\}, \{N_{s,j}^*\}$ is an optimal solution of P1 when the arrival rates are $\lambda_1, \ldots, \lambda_I$, it follows that

$$\sum_{j \in J} C_j \rho_j^* + \sum_{j \in J} V_j B r_j^* \le \sum_{j \in J} C_j \rho_j/\gamma_t + \sum_{j \in J} V_j B r_j/\gamma_t,$$

which contradicts (25) once both sides of (25) are divided by $\gamma_t$. Therefore, $\{\gamma_t\lambda_{i,j}^*\}, \{\gamma_t N_{s,j}^*\}$ is the optimal solution of P1 when the arrival rates are $\gamma_t\lambda_1, \ldots, \gamma_t\lambda_I$. It follows that $\{\rho_j^*\}_{\gamma_t\lambda_1,\cdots,\gamma_t\lambda_I} = \gamma_t\{\rho_j^*\}_{\lambda_1,\cdots,\lambda_I}$ and $\{r_j^*\}_{\gamma_t\lambda_1,\cdots,\gamma_t\lambda_I} = \gamma_t\{r_j^*\}_{\lambda_1,\cdots,\lambda_I}$.
Let $\gamma^{\min}$ denote the minimum $\gamma_t$. If we set the parameter $c$ such that $c > \max_{i,j} 1/(\gamma^{\min}\rho_j^*\mu_i)$ and $c > \max_{i,j} b_i/(\gamma^{\min} B r_j^* \mu_i)$, condition (16) is satisfied for all $\gamma_t$. In this case, Algorithm 1 using $\{\rho_j^*\}_{\lambda_1,\cdots,\lambda_I}$, $\{r_j^*\}_{\lambda_1,\cdots,\lambda_I}$ as the input parameters behaves the same as Algorithm 1 using $\gamma_t\{\rho_j^*\}_{\lambda_1,\cdots,\lambda_I}$, $\gamma_t\{r_j^*\}_{\lambda_1,\cdots,\lambda_I}$, $1/\gamma_t$ as the input parameters. This implies that the algorithm using $\{\rho_j^*\}_{\lambda_1,\cdots,\lambda_I}$, $\{r_j^*\}_{\lambda_1,\cdots,\lambda_I}$ as input parameters also produces optimal configuration usage fractions for arrival rates $\gamma_t\lambda_1, \cdots, \gamma_t\lambda_I$. Therefore, the theorem is proven.

Theorem V.1 implies that, against the daily variations of the request arrival rates, the proposed dispatching algorithm can run optimally without adjusting $\{\rho_j^*\}$ and $\{r_j^*\}$. For other parameter changes (such as the number of games and the number of server types), we can recalculate $\{\rho_j^*\}$ and $\{r_j^*\}$ by solving P1 in order to keep the algorithm close to optimal. This will not incur a huge overhead, as these changes do not happen very often.

TABLE I
SUMMARY OF DCS

DCs             Server Type   Server Cost   Data Transfer Cost
EC2-Virginia    g3.4xlarge    $1.14/Hour    $0.090/GB
EC2-Oregon      g3.4xlarge    $1.14/Hour    $0.090/GB
EC2-Frankfurt   g3.4xlarge    $1.43/Hour    $0.090/GB
EC2-Singapore   p2.xlarge     $1.78/Hour    $0.120/GB
EC2-Tokyo       p2.xlarge     $1.52/Hour    $0.140/GB
EC2-Seoul       p2.xlarge     $1.47/Hour    $0.126/GB

TABLE II
SUMMARY OF GAMES

Games             (CPU, GPU, RAM)       Games            (CPU, GPU, RAM)
Battlerite²       (0.42, 0.48, 0.23)    LOL²             (0.60, 0.25, 0.30)
Need For Speed¹   (0.32, 0.13, 0.29)    Pro Soccer¹      (0.57, 0.56, 0.17)
TankX¹            (0.78, 0.75, 0.27)    NBA Online¹      (0.30, 0.36, 0.20)
Warframe¹         (0.35, 0.57, 0.25)    Bandai Namco¹    (0.55, 0.56, 0.22)
XuanYuan²         (0.35, 0.25, 0.22)    Dota2²           (0.24, 0.68, 0.37)
After Dreams²     (0.13, 0.20, 0.24)    Borderland2¹     (0.58, 0.50, 0.52)
DDDA²             (0.45, 0.80, 0.42)    Destined¹        (0.22, 0.50, 0.28)
H1Z1²             (0.80, 0.80, 0.90)    PES 2017¹        (0.25, 0.20, 0.33)
PlanetSide2¹      (0.28, 0.70, 0.16)    RiME²            (0.40, 0.70, 0.50)
Logout²           (0.52, 0.80, 0.95)    GUNS UP!²        (0.10, 0.30, 0.13)
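To illustrate the request-to-server assignment rule (24) described in Section V, the following is a minimal Python sketch, not the authors' implementation. The names `Server`, `pick_server` and `x_hat` are hypothetical and introduced only for this example. The scalar `residual` field is an assumption as well, since the way residual capacity is measured is not reproduced here, and `x_hat` is assumed to hold a strictly positive value of x̂_{s,m} for every designation in use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Server:
    designation: Tuple[int, ...]  # s = (s_1, ..., s_I), fixed while the server is non-empty
    counts: List[int]             # current number of requests of each game on this server
    residual: float               # residual capacity (assumed to be a precomputed scalar)


def pick_server(servers: List[Server], game: int,
                x_hat: Dict[Tuple[int, ...], float]) -> Optional[Server]:
    """Choose a server in one DC for a new request of `game`, following rule (24)."""
    # A server is feasible for game i if it holds fewer than s_i requests of game i.
    feasible = [sv for sv in servers if sv.counts[game] < sv.designation[game]]
    if not feasible:
        return None  # the caller would have to start a new server

    # z_i^m(s): total number of game-i requests over all s-servers in the DC.
    z: Dict[Tuple[int, ...], int] = {}
    for sv in servers:
        z[sv.designation] = z.get(sv.designation, 0) + sv.counts[game]

    # Feasibility already implies s_i > 0, so the ratio below is well defined.
    candidates = {sv.designation for sv in feasible}
    best = min(candidates, key=lambda s: z[s] / (s[game] * x_hat[s]))

    # Best fit among feasible servers with the chosen designation.
    same = [sv for sv in feasible if sv.designation == best]
    return min(same, key=lambda sv: sv.residual)


a = Server(designation=(2, 1), counts=[1, 0], residual=0.3)
b = Server(designation=(2, 1), counts=[1, 1], residual=0.1)
chosen = pick_server([a, b], game=0, x_hat={(2, 1): 0.5})
```

In this usage example both servers share one designation and are feasible for game 0, so the tie is broken by best fit and the server with the smaller residual capacity is selected.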
VI. EVALUATIONS

We develop a discrete event-driven simulator to simulate a cloud gaming platform and conduct extensive experiments to evaluate the proposed dispatching algorithm.

A. Simulation Setup

In order to make the evaluations more realistic, most of the parameters are set according to real-world data. In the simulation, the cloud gaming platform is deployed on six geographically distributed DCs (the list of DCs is shown in Table I) and provides services for 20 popular games (the list of games is shown in Table II). We consider three types of resources: CPU, GPU and RAM. The resource demands of games are measured as follows. We set up a benchmark server configured with an Intel i5-7400 CPU, 8 GB RAM and an NVIDIA GeForce GTX1060 graphics card with 3 GB RAM. For every resource $k$, we normalize the total capacity of the benchmark server to 1. We run each game $i$ (at a resolution of 1280x720) on the benchmark server for a period of 10 minutes. The resource demand $a_{i,k}$ is defined to be the mean utilization of resource $k$ on the benchmark server during the running period. The resource demands of the games are summarized in Table II.

Two commonly used GPU servers in Amazon EC2 are g3.4xlarge and p2.xlarge. For simplicity, we assume the DCs in Virginia, Oregon and Frankfurt provide servers of type g3.4xlarge, and the DCs in Singapore, Tokyo and Seoul provide servers of type p2.xlarge. The resource capacity $A_{j,k}$ of each server type is defined in a normalized fashion as follows: we run game $i$ on cloud server $j$ for 10 minutes and write the average consumption of resource $k$ as $u_{i,k}$. Then we define $A_{j,k} = \frac{1}{I}\sum_{i} a_{i,k}/u_{i,k}$. The rental cost of different server types in different DCs, as well as the bandwidth cost rate, are set based on Amazon EC2’s pricing model. Table I summarizes the details of the DCs.

The arrival rates of games in cloud gaming generally follow Zipf’s law [34]. We index the games in descending order according to the current player count (which can be obtained from [35]), i.e., a smaller index indicates a larger player count. Then, we simulate the request arrivals of each game $i$ as a Poisson process with rate $\lambda_i = (\lambda/i^{\alpha})/[\sum_{i'} 1/(i')^{\alpha}]$, where $\alpha$ is the shape parameter of the Zipf distribution function and $\lambda$ is the total arrival rate of all the games. For each game $i$, the mean session length $1/\mu_i$ is randomly distributed between 10 minutes and 60 minutes. By default, we set $\lambda = 400$ and $\alpha = 1.2$ in the experiments. We will vary $\lambda$ and $\alpha$ to study the impact of different arrival rates in the experiments.

We roughly classify the games into two genres: fast-paced games (marked by superscript 1 in Table II) and moderate-paced games (marked by superscript 2 in Table II). We assume the games in the same genre have the same latency requirement. Following the work in [31], [32], we set the latency requirement at 100 milliseconds for the fast-paced games, and at 300 milliseconds for the moderate-paced games.

In order to generate the network delay from the player of a request to the DCs, we randomly assign the arriving request to one of a selection of 200 PlanetLab nodes, and assume that the player is playing from this location. We then compute the network delay from the player to each DC according to the network latency dataset collected by Wu et al. [48], which contains latency (RTT) measurements between PlanetLab nodes and DCs of Amazon.

We assume that the bitrate of video streaming for each game is uniformly distributed between 3 Mbps and 5 Mbps, which is set according to the measurements in [36]. We assume the bandwidth capacity of each DC is 1280 Gbps, which is set according to the network scale of Google’s DC [37]. In the simulation, the total request arrival rate $\lambda$ varies in the range from 50 to 1600 requests per minute. Other parameters used in the dispatching algorithm are set or computed as described in Section VI-B.
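As a rough illustration of the workload model in this subsection, the Python sketch below splits a total arrival rate across games according to Zipf's law and draws one Poisson inter-arrival time. It is only a sketch under the stated defaults (λ = 400, α = 1.2, 20 games); the function names `zipf_arrival_rates` and `sample_interarrival` are hypothetical and do not come from the simulator used in the paper.

```python
import random


def zipf_arrival_rates(total_rate, num_games, alpha):
    """Return per-game rates lambda_i = (lambda / i^alpha) / sum_{i'} (1 / i'^alpha)."""
    # Games are indexed from 1 in descending order of player count.
    weights = [1.0 / (i ** alpha) for i in range(1, num_games + 1)]
    normalizer = sum(weights)
    return [total_rate * w / normalizer for w in weights]


def sample_interarrival(rate, rng):
    """Draw the time to the next arrival of a Poisson process with the given rate."""
    # Inter-arrival times of a Poisson process are exponentially distributed.
    return rng.expovariate(rate)


rates = zipf_arrival_rates(total_rate=400.0, num_games=20, alpha=1.2)
rng = random.Random(0)  # fixed seed so that runs are repeatable
gap = sample_interarrival(rates[0], rng)  # in minutes if the rate is per minute
```

The per-game rates sum to the total rate by construction, and a larger α concentrates more of the load on the lowest-indexed (most popular) games.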
Fig. 2. Parameter settings for Algorithm 1: normalized cost of the shadow routing algorithm versus (a) the value of c (× max{A1, A2}), (b) the value of ε, and (c) the value of θ.
Given all the parameters, we use JOptimizer [38] to solve P1, yielding $\{\rho_j^*\}$ and $\{r_j^*\}$, and the optimal value of the objective function of P1, i.e., $\sum_{j \in J} C_j \rho_j^* + \sum_{j \in J} V_j r_j^* B$, which we note is a lower bound of the total cost. Moreover, we compare the proposed dispatching algorithm with the following two alternatives:

Lowest Latency Assignment (LLA). In this algorithm, each request is dispatched to the nearest DC in terms of network latency among all its eligible DCs. If more than one DC attains the same minimum latency, the algorithm randomly chooses one of them.

Lowest Combined Price Assignment (LCPA). This algorithm dispatches each request to the eligible DC which has the lowest combined price for the request. The combined price of a DC $j$ for a request of game $i$ is defined as $\sum_k a_{i,k} w_k C_j + b_i V_j$, where $w_k$ is the weight for resource $k$, which is defined as in Section IV-B.

After the DC is determined, both LLA and LCPA assign the request to a server within the selected DC in a best-fit manner. That is, among all the running servers that can accommodate the request, they assign the request to the server with the minimum residual capacity. If no server can accommodate the request, a new server is started.

B. Parameter Settings for Algorithm 1

The dispatching algorithm uses several parameters which should be properly set. The parameter $c$ should satisfy (16), but it should not be too large. Otherwise, the increments of virtual queues in one step will be very large. Figure 2(a) shows the mean cost (normalized to the lower bound) produced by the shadow routing algorithm as $c$ varies from $c = 1.01\max\{A_1, A_2\}$ to $c = 1.09\max\{A_1, A_2\}$ ($A_1$ and $A_2$ are the right-hand sides of condition (16)) in the scenario with a static request arrival rate. The simulation is warmed up for a period of 24 hours and then runs for another 24 hours. The other parameters of Algorithm 1 are set to their default values (presented later). As can be seen, the cost increases as $c$ grows. So, we set $c = 1.01\max\{A_1, A_2\}$ in the rest of the experiments.

According to the results in [8], the smaller the value of $\eta$, the more stationary the system will be, but $\eta$ cannot be too small because the system takes time $O(1/\eta)$ to transition from the current steady state to a new steady state. As $\eta \to 0$, the scaled virtual queue lengths $\eta Q_{j,i}$ and $\eta Q_j$ in steady state converge to a set of optimal non-negative values (see the results in [8]), which implies that $\eta Q_{j,i}$ and $\eta Q_j$ are almost constants in condition (16) as $\eta \to 0$. It follows that $\eta[\sum_{j,i} Q_{j,i} + \sum_j Q_j] \approx 1$ if we set “crude” values to the parameters in (16). Then, we can obtain $\eta[J(1 + I)c] \approx 1$, where $J(1 + I)$ is the total number of virtual queues. Based on this conclusion, we use an adjustment parameter $\epsilon$ for $\eta$, such that $\eta = \epsilon/(J(1 + I)c)$. Figure 2(b) presents the mean cost produced by the shadow routing algorithm as $\epsilon$ varies from 0.25 to 4. As can be seen, the shadow routing algorithm performs better when $\epsilon$ is close to 1.0. In the rest of the experiments, we set $\epsilon = 1.0$.

The parameter $\theta$ should be between 0 and 1. Figure 2(c) presents the mean cost produced by the shadow routing algorithm as $\theta$ varies from 0.1 to 0.9. As can be seen, the cost for different $\theta$ is similar, indicating that the shadow algorithm is not very sensitive to $\theta$, as long as it is within a reasonable range. In the rest of the experiments, we set $\theta = 0.1$.

C. Results

1) Static Request Arrival Rates: We first evaluate the proposed algorithms in the scenario with static request arrival rates. The simulation is warmed up for a period of 24 hours and then runs for another 24 hours. Figure 3 shows the real-time cost produced by each algorithm over the simulation period for $\lambda = 400$ (requests per minute). Among all the algorithms, the shadow routing algorithm produces the lowest cost, which is relatively close to the optimal. This indicates that the dispatching rates of different types of requests to DCs and the configuration usage fractions produced by the shadow algorithm are close to optimal. The LCPA algorithm produces a higher cost than the shadow algorithm, because it does not consider the configuration usage fractions in the assignment of requests to servers. The LLA algorithm achieves the worst performance among all the algorithms, because it considers neither the dispatching rates of requests nor the configuration usage fractions.

Figure 4 shows the mean cost (normalized to the lower bound) produced by each algorithm over a period of 24 hours for different arrival rates. We can see that the shadow algorithm is very cost-efficient for large-scale systems with high arrival rates. However, the benefit is reduced when the arrival rate is small. This is because fewer servers are used with a small arrival rate, making the queuing system less stable and thus
the configuration usage fractions produced by the shadow algorithm are not so close to optimal.

Fig. 3. Real cost produced by each algorithm during a 24-hour period (normalized cost versus time in hours; LLA, LCPA, Shadow).
Fig. 4. Impact of total arrival rate (λ) (normalized cost; LLA, LCPA, Shadow).
Fig. 5. Impact of number of games (normalized cost; LLA, LCPA, Shadow).
Fig. 6. Impact of Zipf’s shaping parameter α (normalized cost; LLA, LCPA, Shadow).
Fig. 7. Performance with dynamic request arrival rates (absolute cost, real cost ×10² versus time of day; LLA, LCPA, Shadow, Lower Bound).
Fig. 8. Performance with dynamic request arrival rates (normalized cost versus time of day; LLA, LCPA, Shadow).

We next simulate more varieties of games to study the scalability of the proposed algorithm. The resource demands of the simulated games are synthetically generated based on the measurements of the 10 real games. Let $a_k^{\min}$ and $a_k^{\max}$ denote the minimum and the maximum resource demand for resource $k$ of the 10 games. We assume the resource demands of the simulated games for each resource $k$ are uniformly distributed in the range $[a_k^{\min}, a_k^{\max}]$. Note that as the number of games increases, the size of $S_j$ increases rapidly, which increases the computational complexity of solving P1. To address this issue, we divide the games into several groups. For each group, we run the shadow algorithm separately, i.e., the games from different groups cannot share servers.

Figure 5 shows the performance of the algorithms for different numbers of simulated games. In this experiment, games are randomly partitioned into equal-sized groups with each group having 10 games. We can see that the performance of the shadow algorithm is always relatively close to the lower bound as the number of games increases, and at times outperforms the other algorithms significantly. This indicates that the shadow algorithm is scalable and that the impact of partitioning the games in this way is insignificant.

Finally, we evaluate the impact of the parameter $\alpha$ in the Zipf distribution function, which controls the distribution of the arrival rates. Figure 6 shows the mean cost produced by each algorithm as $\alpha$ varies from 1 to 1.5. A small $\alpha$ makes the distribution more uniform, while a large $\alpha$ makes the distribution more skewed. As can be seen, the performance of the shadow routing algorithm is similar for different $\alpha$ and better than that of LLA and LCPA, indicating that the shadow routing algorithm is adaptive to various distributions of arrival rates.

2) Dynamic Arrival Rates: We next show how the proposed algorithms perform with dynamic request arrival rates. In this experiment, the request arrivals are generated according to the WoWAH dataset [39]. The WoWAH dataset contains 667032 game sessions which were observed over a 3-year period on a World of Warcraft server in Taiwan. For each game session in the WoWAH dataset, we generate a play request whose start/ending time is set according to the start/ending time of the game session. Then, we choose game $i$ as the requested game with a probability of $(1/i^{\alpha})/[\sum_{i'} 1/(i')^{\alpha}]$.

Figure 7 shows the real-time cost produced by each algorithm over a randomly selected period of two days. Figure 8 shows the cost of each algorithm normalized to the lower bound. For each sampling time point $t$, the lower bound of the total cost is estimated as follows. Let $N_i(t)$ denote the number of game $i$’s request arrivals over the period $[t - 10, t + 10]$ (the unit of $t$ is minutes). We estimate the request arrival rate of game $i$ at time $t$ as $\lambda_i(t) = N_i(t)/20$. Then, the lower bound of the total cost can be computed by solving P1 using $\lambda_i(t)$ as the inputs. Let $\lambda_i^{\min}$ denote the minimum arrival rate of game $i$ during the simulation period. The parameters $\{\rho_j^*\}$ and $\{r_j^*\}$ used in the shadow algorithm are computed using $\{\lambda_i^{\min}\}$ as the arrival rates of games.

From Figures 7 and 8 we can see that the shadow algorithm always outperforms the other two algorithms, and that the shadow algorithm can automatically adapt to the daily workload variations without adjusting parameters $\{\rho_j^*\}$ and $\{r_j^*\}$. We also observe that for all the algorithms, the performance in the “declining” period in a daily period (i.e., from 08:00 to 18:00) is worse than the performance during the “climbing”
period (i.e., from 18:00 to the next day’s 08:00). This is because there are more play request arrivals than departures in the climbing period. In this case, all the running servers are almost fully occupied by the continuously arriving requests, so the servers are almost fully utilized and the cost produced by the algorithms is close to the optimal. In contrast, there are more player departures than arrivals in the declining period, which causes many servers to run at low levels of resource utilization, so that the total cost produced by the algorithms is far from optimal. Compared with the other two algorithms, the performance degradation of the shadow algorithm during the declining period is much lower.

VII. CONCLUSIONS

In this paper, we studied the play request dispatching problem for distributed cloud gaming infrastructure. We employ a shadow routing based approach, which is asymptotically optimal, to jointly solve the request-to-DC assignment and request-to-server assignment problems for minimizing the total resource rental cost. Simulations show that our algorithm is more cost-efficient than existing heuristics, scalable with a large number of games and highly adaptive to dynamic changes. In this paper, the proposed algorithms were evaluated using Amazon EC2’s cloud resources only. In the future, we would like to consider using resources from more cloud providers to evaluate the proposed algorithms. We also wish to consider the performance interference among co-located games when packing the requests onto servers. Moreover, we would like to apply similar ideas to general datacenters.

REFERENCES

[1] (2018) GeForce Now. [Online]. Available: http://www.geforce.com/
[2] (2018) LiquidSky. [Online]. Available: https://liquidsky.com/
[3] (2018) Simplay. [Online]. Available: https://simplay.io/
[4] (2018) Vortex. [Online]. Available: https://vortex.gg/
[5] (2018) Amazon EC2. [Online]. Available: https://aws.amazon.com/ec2/
[6] (2018) Microsoft Azure. [Online]. Available: https://azure.microsoft.com/
[7] R. Shea, D. Fu, and J. Liu, “Rhizome: utilizing the public cloud to provide 3D gaming infrastructure,” in Proceedings of the 6th ACM Multimedia Systems Conference. ACM, 2015, pp. 97–100.
[8] G. Yang, A. L. Stolyar, and A. Walid, “Shadow-routing based dynamic algorithms for virtual machine placement in a network cloud,” in 2013 Proceedings IEEE INFOCOM, 2013, pp. 620–628.
[9] C.-Y. Huang, C.-H. Hsu, Y.-C. Chang, and K.-T. Chen, “GamingAnywhere: an open cloud gaming system,” in Proceedings of the 4th ACM Multimedia Systems Conference, 2013, pp. 36–47.
[10] M. Claypool, D. Finkel, A. Grant, and M. Solano, “On the performance of OnLive thin client games,” Multimedia Systems, pp. 1–14, 2014.
[11] Y.-T. Lee, K.-T. Chen, H.-I. Su, and C.-L. Lei, “Are all games equally cloud-gaming-friendly? an electromyographic approach,” in 11th Annual Workshop on Network and Systems Support for Games. IEEE, 2012, pp. 1–6.
[12] H. Ahmadi, S. Z. Tootaghaj, M. R. Hashemi, and S. Shirmohammadi, “A game attention model for efficient bit rate allocation in cloud gaming,” Multimedia Systems, pp. 1–17, 2014.
[13] M. Hemmati, A. Javadtalab, A. A. Nazari Shirehjini, S. Shirmohammadi, and T. Arici, “Game as video: bit rate reduction through adaptive object encoding,” in Proceeding of the 23rd ACM Workshop on Network and Operating Systems Support for Digital Audio and Video. ACM, 2013, pp. 7–12.
[14] H.-J. Hong, D.-Y. Chen, C.-Y. Huang, K.-T. Chen, and C.-H. Hsu, “Placing virtual machines to optimize cloud gaming experience,” IEEE Transactions on Cloud Computing, 2014.
[15] M. Basiri and A. Rasoolzadegan, “Delay-aware resource provisioning for cost-efficient cloud gaming,” IEEE Transactions on Circuits & Systems for Video Technology, vol. PP, no. 99, pp. 1–1, 2016.
[16] H. Tian, D. Wu, J. He, Y. Xu, and M. Chen, “On achieving cost-effective adaptive cloud gaming in geo-distributed data centers,” IEEE Transactions on Circuits & Systems for Video Technology, vol. 25, no. 12, pp. 2064–2077, 2015.
[17] D. Wu, X. Zheng, and J. He, “iCloudAccess: cost-effective streaming of video games from the cloud with low latency,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 8, 2014.
[18] Y. Li, X. Tang, and W. Cai, “On dynamic bin packing for resource allocation in the cloud,” in Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, 2014, pp. 2–11.
[19] Y. Li, X. Tang, and W. Cai, “Play request dispatching for efficient virtual machine usage in cloud gaming,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 12, pp. 2052–2063, 2015.
[20] Y. Deng, Y. Li, X. Tang, and W. Cai, “Server allocation for multiplayer cloud gaming,” in Proceedings of the 2016 ACM International Conference on Multimedia. ACM, 2016, pp. 918–927.
[21] Y. Deng, Y. Li, R. Seet, X. Tang, and W. Cai, “The server allocation problem for session-based multiplayer cloud gaming,” IEEE Transactions on Multimedia, vol. PP, no. 99, pp. 1–1, 2017.
[22] Y. Zhao, H. Jiang, K. Zhou, Z. Huang, and P. Huang, “Meeting service level agreement cost-effectively for video-on-demand applications in the cloud,” in IEEE Conference on Computer Communications 2014. IEEE, 2014, pp. 298–306.
[23] Y. Wu, C. Wu, B. Li, X. Qiu, and F. C. Lau, “Cloudmedia: When cloud on demand meets video on demand,” in 2011 31st International Conference on Distributed Computing Systems (ICDCS). IEEE, 2011, pp. 268–277.
[24] R. Panigrahy, K. Talwar, L. Uyeda, and U. Wieder, “Heuristics for vector bin packing,” 2011.
[25] E. G. Coffman, Jr, M. R. Garey, and D. S. Johnson, “Dynamic bin packing,” SIAM Journal on Computing, vol. 12, no. 2, pp. 227–258, 1983.
[26] A. L. Stolyar, “Large-scale heterogeneous service systems with general packing constraints,” Advances in Applied Probability, vol. 49, no. 1, pp. 61–83, 2015.
[27] S. T. Maguluri, R. Srikant, and L. Ying, “Stochastic models of load balancing and scheduling in cloud computing clusters,” Proceedings of IEEE INFOCOM, vol. 131, no. 5, pp. 702–710, 2012.
[28] J. W. Jiang, T. Lan, S. Ha, M. Chen, and M. Chiang, “Joint VM placement and routing for data center traffic engineering,” Proceedings of IEEE INFOCOM, vol. 131, no. 5, pp. 2876–2880, 2012.
[29] A. L. Stolyar and T. Tezcan, Control of systems with flexible multi-server pools: a shadow routing approach. J. C. Baltzer AG, Science Publishers, 2010.
[30] X. Fei, F. Liu, H. Xu, and H. Jin, “Towards load-balanced VNF assignment in geo-distributed NFV infrastructure,” in IEEE/ACM International Symposium on Quality of Service, 2017.
[31] M. Claypool and K. Claypool, “Latency and player actions in online games,” Communications of the ACM, vol. 49, no. 49, pp. 40–45, 2006.
[32] M. Jarschel, D. Schlosser, S. Scheuring, and T. Hofeld, “Gaming in the clouds: QoE and the users perspective,” Mathematical & Computer Modelling, vol. 57, no. 11–12, pp. 2883–2894, 2013.
[33] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[34] D. Finkel, M. Claypool, S. Jaffe, T. Nguyen, and B. Stephen, “Assignment of games to servers in the OnLive cloud game system,” in 2014 13th Annual Workshop on Network and Systems Support for Games. IEEE, 2014, pp. 1–3.
[35] (2018) Steam & Game Stats. [Online]. Available: http://store.steampowered.com/stats/
[36] K. T. Chen, Y. C. Chang, H. J. Hsu, D. Y. Chen, C. Y. Huang, and C. H. Hsu, “On the quality of service of cloud gaming systems,” IEEE Transactions on Multimedia, vol. 16, no. 2, pp. 480–495, 2014.
[37] (2018) Google data centers. [Online]. Available: https://en.wikipedia.org/wiki/Google_Data_Centers
[38] (2018) JOptimizer. [Online]. Available: http://www.joptimizer.com/
[39] Y.-T. Lee, K.-T. Chen, Y.-M. Cheng, and C.-L. Lei, “World of Warcraft avatar history dataset,” in Proceedings of the Second Annual ACM Conference on Multimedia Systems. ACM, 2011, pp. 123–128.