JSA D 22 00243 Reviewer
Keywords: mobile edge computing; DNN service migration; multi-exit DNN; model predictive control
Abstract—Edge intelligence (EI) has become a trend to push the deep learning frontiers to the network edge, so that deep neural network (DNN) applications can be well leveraged at resource-constrained mobile devices with the benefits of edge computing. Due to the high user mobility among scattered edge servers in many scenarios, such as internet of vehicular applications, dynamic service migration is desired to maintain a reliable and efficient quality of service (QoS). However, the inevitable service downtime incurred by service migration would largely degrade the real-time performance of delay-sensitive DNN inference services. Fortunately, based on the characteristics of container-based DNN services, exit point selection and the layer sharing feature of the container technique can alleviate such performance degradation. Thus, we consider user-centric management of DNN inference service migration and exit point selection, aiming at maximizing overall user utility (e.g., DNN model inference accuracy) under various service downtimes. We first leverage dynamic programming to propose an optimal offline migration and exit point selection strategy (OMEPS) algorithm for the case when complete future information of user behaviors is available. For the more practical setting without complete future information, we incorporate the OMEPS algorithm into a model predictive control (MPC) framework and construct a mobility-aware service migration and DNN exit point selection (MOMEPS) algorithm, which improves long-term service utility using only limited predicted future information. However, the heavy computation overhead of the MOMEPS algorithm imposes a burden on mobile devices, so we further advocate a cost-efficient algorithm, named smart-MOMEPS, which introduces a smart migration judgement based on neural networks to control the invocation of the MOMEPS algorithm by estimating whether the DNN service should be migrated or not. Extensive trace-driven simulation results demonstrate the superior performance of our smart-MOMEPS algorithm, which achieves significant overall utility improvements with low computation overhead compared with other online algorithms.

Index Terms—DNN service migration, multi-exit DNN, service downtime, mobile edge computing, model predictive control.

I. INTRODUCTION

…constrained computing capability and energy. For example, real-time video analytic applications require processing tens of frames per second with computation-intensive DNN models (e.g., YOLO, ResNet), which are unsuitable for resource-constrained mobile devices [1]. To tackle this challenge, mobile edge computing (MEC) is proposed to push these services from local IoT devices to the network edges, i.e., servers in proximity to mobile devices (e.g., at base stations or WiFi hotspots). Mobile devices can collaborate with network edges to achieve a higher quality of service and low latency via task offloading [2].

However, maintaining reliable and efficient service performance of DNN applications at the network edge is nontrivial due to the mobility of users. To reduce the network latency, DNN services are generally deployed at a nearby edge server, where users can access their services through wireless communications to a local base station (BS), such as a WiFi hotspot. When a user moves away from the service scope of the nearby edge server, the connection will switch to a new BS for remote access to the service, and the QoS will inevitably degrade due to the longer transmission delay caused by the increasing network distance. Although methods such as service replication [3] can alleviate the service performance degradation caused by user mobility by backing up services on different edge servers, they bring issues such as user privacy and storage redundancy. Therefore, we focus on another widely adopted approach, dynamic service migration [4]–[7], to solve this problem.

Designing a suitable dynamic service migration scheme faces a few challenges. Specifically, frequent service migration would cause extensive service downtime [8], which is unacceptable for latency-sensitive DNN services, especially in internet of vehicular applications where users have high mobility. Besides, blindly letting the service follow user mobility (e.g., in Ping-pong loops, where the user goes back and forth between two BSs' coverages) incurs unnecessary delay. To combat the service outage, we could run the DNN service on the mobile device with a proper exit point during migration. However, exiting inference at an early DNN layer results in a loss of accuracy. As the network distance continues to increase, the benefits of early exit will gradually decrease, since the inference needs to exit at an earlier exit point.

As illustrated in Fig. 1, a mobile vehicle travels around three regions, each containing a BS endowed with an edge

[Fig. 1: a DNN service container image (layers 1 … n) is pulled from a container image registry and deployed across edge servers EDGE1–EDGE3; the containerized multi-exit DNN consists of conv layers with early exits (Exit 1, …) and FC layers.]
The rest of this paper is organized as follows. Section II briefly reviews related work on service migration. In Section III, we describe the system model and problem formulation. In Section IV, we propose an optimal offline service migration and exit point selection algorithm to maximize overall user utility. We explain the MPC-based online algorithm and further propose a more lightweight algorithm in Section V. In Section VI, we evaluate our online algorithms to demonstrate their effectiveness. Finally, we conclude this paper in Section VII.

II. RELATED WORK

With the rapid development of MEC, which allows low transmission delay and fast response, significant challenges are gradually emerging. The inherent characteristics of MEC services, which are distributed over different geographic regions, and the mobility of users bring dominating difficulties. Hence, edge service migration, which helps maintain QoS in a dynamic MEC environment, has become one of the significant research topics.

Service Migration without Future Prediction: Plenty of literature has focused on service migration in MEC to cope with the key challenge of user dynamics. A branch of such works accommodates arbitrary user mobility without prediction. Zhao et al. tackled a virtual machine migration strategy based on multiple attribute decision-making, aiming at minimizing a comprehensive cost given the current network situation and user location [17]. In [18], Ouyang et al. researched an adaptive service placement mechanism at MEC and formulated it as a contextual multi-armed bandit learning problem to optimize the user's perceived latency and service migration cost. Besides, many studies utilized Markov decision process (MDP) based methods to solve dynamic service placement under the assumption that user mobility follows or can be approximated by a Markov chain mobility model. Specifically, [19] worked on balancing the trade-off between migration cost and quality by modeling the service migration procedure using an MDP. In [20], the optimal policy of edge service migration formulated as an MDP was proved to be a threshold policy when user mobility follows a one-dimensional (1-D) asymmetric random walk mobility model.

Service Migration with Future Prediction: Accordingly, predicting user activities in the service migration scenario is also widely studied to improve user QoS. Ma et al. incorporated a two-timescale Lyapunov optimization method and limited user mobility prediction to find the optimal service placement decisions [21]. Zhang et al. in [22] overcame the challenges of an underlying dynamic rendering-module placement problem by leveraging model predictive control (MPC) to tackle user trajectory prediction at the edge. Both [23] and [24] took advantage of MPC to work out the dynamic placement of virtual network functions (VNFs) and achieved efficient resource scheduling. However, service downtime, which is widespread at the edge and has a great influence on migration policies, was not considered in these works. In this paper, we take service downtime fully into account and propose an MPC-based algorithm, which integrates an NN-based migration judgement to mitigate the heavy computation cost of MPC.

Container-based Service Migration: In addition, container-based service migration, which is more lightweight and saves storage space compared to traditional VMs, is specifically considered in our work. To reduce the migration overhead, [11] designed an efficient live migration system which ensures the integrity of components by leveraging the layered structure of containers. In [9], the authors proposed an edge computing platform architecture which supports seamless Docker container migration of offloading services while also keeping the moving mobile user with its nearest edge server. [12] presented Voyager, a just-in-time live container migration service which combines service memory state migration with local filesystem migration to minimize service downtime. However, these works mainly focus on leveraging container features to reduce service downtime, while our work incorporates the impact of various service downtimes, arising from the layer sharing feature of containers, into migration and improves long-term user utility by optimizing migration decisions.

DNN Inference Service Migration: Since DNNs have been extensively applied for intelligent applications at the edge, ongoing service downtime becomes extremely unaffordable for latency-sensitive DNN inference services. To address this difficulty, [25] developed a mobility-included DNN partition offloading algorithm to adapt to user movement. Wang et al. adopted DNN inference exit at earlier layers during service outages to shorten the inference delay by sacrificing an acceptable level of accuracy [26]. Different from [25] and [26], we address DNN inference service outages collaboratively via the early exit mechanism and dynamic service migration to maintain the user's QoS.

III. SYSTEM MODEL AND PROBLEM FORMULATION

In this section, we present the system model and problem formulation for DNN service migration with downtime and exit point selection in mobile edge computing.

A. Overview of Dynamic Migration for DNN Service

Fig. 1 illustrates a typical edge intelligence scenario, i.e., edge-assisted on-device object detection for video streaming analytics. More explicitly, a smart vehicle traveling around an urban city uses a DNN service to process surrounding information gathered by its camera. Based on existing virtualization technologies, the service profile and environment for a multi-exit DNN model (e.g., AlexNet) can run in a dedicated container. Thus, the real-time video stream can be collaboratively processed at both the local vehicle and a nearby edge server via task offloading [1]. To guarantee a reliable QoS for a moving vehicle, dynamic DNN service migration and exit point selection are adopted to accommodate the high dynamics within the required completion time.

With the presented edge-assisted architecture, we consider a set of base stations (BSs) $\mathcal{B} = \{1, \ldots, B\}$, each of which is equipped with an edge server to provide the DNN service to the user, and the user accesses the service through the nearest base station. In line with the recent work [27] on edge computing, we adopt a discrete time-slotted model to fully characterize the
Latency of offloading: When the user offloads its request to an edge server, the request delay $l^f(t)$ is jointly determined by the DNN inference delay and the communication delay. We denote the delay for the edge server to perform model inference at the $j$-th early exit point by $d^f_j$, and $l^c(t)$ denotes the communication delay at $t$. Apparently, $l^c(t)$ includes the delay for transmitting the request from the user's device to the BS that the user is connected to at time $t$, and the forwarding delay, which mainly depends on the hop distance along the shortest communication path among these BSs. The constraints on $l^f(t)$ can be expressed as:

$$l^f(t) = S(t)\left(l^c(t) + d^f_j\right), \qquad l^f(t) \le T_{\max}, \quad \forall t \in \mathcal{T}. \tag{9}$$

Latency of migration: During migration, DNN inference will be processed at the local device with a fixed early exit point $\gamma \in \mathcal{M}$. Hence the user request latency $l^e(t)$ is mainly determined by the computation delay. We denote the DNN inference delay while running at the local device by $d^e_\gamma$, and its constraints are as follows:

$$l^e(t) = (1 - S(t))\,d^e_\gamma, \qquad l^e(t) \le T_{\max}, \quad \forall t \in \mathcal{T}. \tag{10}$$

E. Problem Formulation

After introducing the user's decision variables and request latency, we are ready to present the user utility maximization problem over a given time horizon. Here we define the utility as the inference accuracy of the DNN model. The accumulated user utility contains the utility in migrating and in offloading. We define the utility at the $j$-th early exit point as $u_j$, and the user request rate at time $t$ as $r_t$. When requests are offloaded, the user utility $U_o(t)$ at time $t$ is the sum of the utility of all requests at that time, which can be expressed as:

$$U_o(t) = S(t)\,r_t\,u_j, \quad \forall t \in \mathcal{T}.$$

During migration, we set the utility at the fixed exit point $\gamma$ to $u_\gamma$, and the migration user utility $U_m(t)$ at time $t$ can be expressed as:

$$U_m(t) = (1 - S(t))\,r_t\,u_\gamma, \quad \forall t \in \mathcal{T}.$$

We formulate the accumulated user utility maximization problem over a given finite time horizon $T$ as follows:

$$\max_{x_i(t),\,y_j(t)} \; \sum_{t=1}^{T} \left(U_o(t) + U_m(t)\right), \qquad \text{s.t. } (1)\text{–}(10). \tag{11}$$

Solving the problem in (11) involves two challenges. First, as the objective is the accumulated utility over a time horizon, it is difficult to obtain the complete future information (e.g., user trajectory and request frequency). Second, the user's decisions are coupled across different time slots; that is, the decision at the current slot affects decisions in the future. To address these challenges, we begin with an ideal case where future information is available and develop an optimal offline solution. This helps provide some insights for solving the problem. Next, we focus on the challenging case without future information; we leverage advanced machine learning methods to predict limited information and devise an online algorithm.

IV. OFFLINE OPTIMAL SERVICE MIGRATION AND EXIT POINT SELECTION ALGORITHM

In this section, we present the offline optimal solution of the service migration and DNN exit point selection problem through dynamic programming, when the complete future information of the user activities is assumed to be exactly known.

First, we show that this offline optimal problem possesses the optimal substructure property. We define $(i, q, n)$ as the state of the user, in which $i \in \mathcal{T}$ represents time, $q \in \mathcal{B}$ represents the BS that hosts the user's service, and $n \in \mathcal{M}$ represents the selected DNN model exit point. Accordingly, $(i^-, q^-, n^-)$ is the previous state of $(i, q, n)$. We set $p \in \mathcal{B}$ and $m \in \mathcal{M}$ as the user's migration decision and exit point selection, respectively. Let $C((i, q, n), p, m)$ represent the number of time slots needed to complete $p$ and $m$. It can be expressed as

$$C((i, q, n), p, m) = \begin{cases} 1, & p = q, \\ s_p, & p \neq q,\ s_p + i \le T, \\ T - i, & p \neq q,\ s_p + i > T. \end{cases} \tag{12}$$

Here, $p = q$ indicates that the user decides to re-select the model exit point in the current state, which costs one time slot to complete. When the user chooses to migrate the service ($p \neq q$), the decision execution time is the service downtime $s_p$, and the portion beyond $T$ is discarded.

Let $U((i, q, n), p, m)$ represent the sum of the utilities during the completion of decisions $p$ and $m$. Note that DNN inference will be processed at the local device with a fixed early exit point $\gamma$ during service downtime ($p \neq q$). $f(i, q, n)$ represents the accumulated utility at state $(i, q, n)$. Then, given $(i, q, n)$ and the user's decisions $p$ and $m$, the state transits to $(i^+, q^+, n^+)$, where

$$i^+ = i + C((i, q, n), p, m), \tag{13}$$
$$q^+ = p, \tag{14}$$
$$n^+ = m. \tag{15}$$

And the user obtains the following utility:

$$U((i, q, n), p, m) = \begin{cases} u_m r_i, & \text{if } p = q, \\ u_\gamma r_i\, C((i, q, n), p, m), & \text{otherwise.} \end{cases} \tag{16}$$

Thus the accumulated utility at $(i^+, q^+, n^+)$ is

$$f(i^+, q^+, n^+) = f(i, q, n) + U((i, q, n), p, m). \tag{17}$$

Let $f^*(i, q, n)$ denote the optimal accumulated utility at $(i, q, n)$; then we have

$$f^*(i^+, q^+, n^+) = \max_{p \in \mathcal{B},\, m \in \mathcal{M}} \left(U((i, q, n), p, m) + f^*(i, q, n)\right), \quad \text{s.t. } (12)\text{–}(17). \tag{18}$$

Since the problem can be regarded as a Bellman equation, a dynamic programming approach can be adopted to derive an offline optimal solution of our service migration and DNN exit point selection problem. For ease of exposition, we transform the offline optimal problem into a longest-path problem by constructing a directed acyclic graph (DAG).

Fig. 2. Longest-path problem transformation of the offline service migration and DNN exit point selection problem over $T = 5$ time slots with three BSs $B_1$, $B_2$ and $B_3$.

As shown in Fig. 2, we construct a graph $G = (V, E)$ to represent all possible service migration and DNN exit point selection decisions within $T$ time slots. Each vertex represents a state $(i, q, n)$ that the user can reach. Since the future information (user trajectory and request frequency) is known, the exit point $n$ can be determined when $i$ and $q$ are given at each state. Note that the source vertex $S$ represents the initial state (we set it as $(0, 1, n)$). Each state (except the initial state) is transited from the previous state by performing the corresponding decision. The destination vertex $E$ is an auxiliary vertex that ensures a single longest path can be found. Each edge weight on the DAG between two states represents the sum of the request utilities of executing the decisions, and the edges connecting to $E$ have zero weight. It is worth noting that if a user decision can be completed before $T$, we draw a directed edge between the two states. However, if the decision completes at a time beyond $T$ (for example, when a user in state $B_1$ at $T_4$ performs the decision of transferring to state $B_2$), we draw a directed edge from $B_1$ at $T_4$ to the corresponding yellow auxiliary vertex $B_2$; accordingly, the weight of that edge represents the sum of the utilities that the user can obtain from $T_4$ to the end. The weight of the edge connecting each vertex of time $T$ to the yellow auxiliary vertices is zero. This completes the construction of the DAG.

We can derive the user's optimal strategy by finding the longest path from $S$ to $E$. Specifically, given all the information of user activities over $T$ time slots, the weights of all edges can be calculated, and the total weight of a path from source vertex $S$ to destination vertex $E$ hence represents the whole utility over the time horizon. Consequently, the optimal service migration and DNN model exit point selection strategy can be found by taking the longest path from $S$ to $E$. As shown in Fig. 2, we give the longest path for $T = 5$ with 3 BSs; each red vertex represents the state of the user at the corresponding time slot, and the vertex pointed to by the solid black edge is the user state after performing the decision. Obviously, since this longest-path problem has the optimal substructure property, it can be solved by the classical dynamic programming approach.

Algorithm 1 shows the pseudocode of our optimization algorithm, which uses dynamic programming with memoization to find the optimal strategies $O$ of each time slot for a given finite time horizon. In the algorithm, we can obtain the longest path (i.e., the optimal service migration and DNN model exit point selection strategy) for each state by solving the Bellman equation (i.e., line 6). Then we can pick the path that contains the state with the highest accumulated utility at $T$ as the longest path (i.e., line 15), which is the optimal solution to the problem. For searching the longest path, the algorithm needs to enumerate at most $B^2$ possible states at each time slot. Thus, for the $T$ time slots, the time complexity of Algorithm 1 is $O(B^2 T)$.

Algorithm 1 Offline Optimal Migration and Exit Point Selection Strategy (OMEPS) Algorithm
1: Parameter Notation
2: Vector $O_{(i,q,n)}$ is the optimal migration strategy; it contains a series of optimal states from the initial state to state $(i, q, n)$.
3: Initialization: Initialize the initial state $(i^-, q^-, n^-) = (0, 1, 1)$, optimal strategy $O_{(i,q,n)} = \emptyset$, $O_{(0,1,1)} = \{(0, 1, 1)\}$, and optimal accumulated utility of the initial state $f^*_{(0,1,1)} = 0$.
4: for each time slot $i = 1, \ldots, T$ do
5:    for all $q \in \mathcal{B}$ do
6:        Determine the optimal previous state according to decision $p$ by using (12)–(17), i.e., $(i^-, q^-, n^-)_{opt} = \arg\max_{(i^-, q^-, n^-)} \{U((i^-, q^-, n^-), p, m) + f^*_{(i^-, q^-, n^-)}\}$.
7:        if the optimal previous state $(i^-, q^-, n^-)_{opt}$ is found then
8:            Update the optimal decisions up to the current state: $O_{(i,q,n)} = \{(i, q, n)\} \cup O_{(i^-, q^-, n^-)_{opt}}$.
9:            Update the optimal accumulated utility at the current state: $f^*_{(i,q,n)} = U((i^-, q^-, n^-)_{opt}, p, m) + f^*_{(i^-, q^-, n^-)_{opt}}$.
10:       else
11:           Update the optimal accumulated utility at the current state: $f^*_{(i,q,n)} = 0$.
12:       end if
13:   end for
14: end for
15: Pick the maximum accumulated utility state $(i, q, n)_{opt} = \arg\max_{(i,q,n)} f^*_{(i,q,n)}$ and set its corresponding policy $O_{(i,q,n)}$ as the offline optimal service migration and DNN model exit point selection strategy $O_{off}$.

V. ONLINE SERVICE MIGRATION AND EXIT POINT SELECTION ALGORITHM

So far we have presented the solution of the problem in (11) under the complete-future-information scenario as a baseline. In practice, it is challenging to obtain complete information, which motivates an online algorithm that does not require it. To this end, in this section, we combine popular machine learning techniques (e.g., LSTM) to predict future information (e.g., mobility traces) so as to improve the long-term service performance with informed decision making. However, frequent prediction would incur large running costs
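To make the recursion in (12)–(18) concrete, the following Python sketch implements the forward dynamic program behind OMEPS under simplifying assumptions: BSs and exit points are integer indices, the inputs (per-BS downtime `s`, per-exit utilities `u`, per-slot request rates `r`, and the fixed migration-exit utility `u_gamma`) are toy values, and only the optimal accumulated utility is returned rather than the full strategy vector $O$. It is illustrative, not the authors' implementation.

```python
# Minimal sketch of the OMEPS dynamic program over states (i, q, n) =
# (time slot, hosting BS, exit point), following transitions (12)-(17)
# and the Bellman update (18). Illustrative only.

def omeps(T, num_bs, exits, s, u, r, u_gamma):
    """Return the optimal accumulated utility over T slots.

    s[p]: downtime (in slots) of migrating to BS p; u[m]: accuracy at
    exit m; r[i]: request rate in slot i; u_gamma: accuracy of the
    fixed local exit used during migration downtime."""
    f = {(0, 0, exits[0]): 0.0}          # f*(initial state) = 0
    for i in range(T):
        # snapshot of all states reachable at time slot i
        for (_, q, n), val in [x for x in f.items() if x[0][0] == i]:
            for p in range(num_bs):
                for m in exits:
                    if p == q:                       # re-select exit: one slot, eq. (12)
                        ni, gain = i + 1, u[m] * r[i]          # eq. (16), p == q
                    else:                            # migrate: downtime s[p], clipped at T
                        c = s[p] if s[p] + i <= T else T - i   # eq. (12)
                        ni, gain = i + c, u_gamma * r[i] * c   # eq. (16), p != q
                    key = (ni, p, m)
                    if val + gain > f.get(key, float("-inf")):
                        f[key] = val + gain          # Bellman update, eq. (18)
    return max(v for (st, v) in f.items() if st[0] == T)
```

For example, with $T = 3$, two BSs, exits $\{0, 1\}$ with utilities $\{0.7, 0.78\}$, unit request rates and the toy downtimes below, the optimum keeps the service in place and re-selects the deepest exit every slot, accumulating $3 \times 0.78 = 2.34$.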
Fig. 3. Overview of the MPC-based online migration and exit point selection strategy. [Figure: at each slot $t_0$, information for slots $t_0+1$ to $t_0+w$ is predicted and fed to OMEPS, which outputs the decision to migrate to another BS and/or select an exit point of the multi-exit DNN (CONV/FC blocks with Exit Point 1, Exit Point 2, …); the procedure then repeats.]

Algorithm 2 MPC-Based Online Migration and Exit Point Selection (MOMEPS) Algorithm
1: Parameter Notation
2: Vector $\pi_{(i,q,n)}$ is the optimal migration and exit point selection strategy of a prediction window; it contains a series of optimal states from the current time slot $t$ to $t + W$.
3: Initialization: $\tau = 1$, $t = 1$.
4: while current time slot $t \le T$ do
5:    $\tau = t$
6:    Predict the user mobility and request frequency of the future $W$ time slots $[\tau + 1, \ldots, \tau + W]$.
7:    Determine the optimal strategy $\pi_{(i,q,n)}$ of the prediction window $[\tau, \ldots, \tau + W]$ by using OMEPS.
8:    Select the first step $(i, q, n)$ of $\pi_{(i,q,n)}$ to execute.
9:    $t = i + 1$
10: end while
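The receding-horizon loop above can be sketched in a few lines of Python; `predict` and `solve_window` are placeholders for the learning-based predictor and the OMEPS solver, and the decision format `(i, q, n)` follows the algorithm's state notation. This is an illustrative skeleton, not the authors' implementation.

```python
# Skeleton of the MPC loop in MOMEPS: at each slot, forecast the next
# W slots, solve the window with the offline routine, commit only the
# first decision, then slide the window forward.

def momeps(T, W, predict, solve_window, execute):
    t = 1
    while t <= T:
        window = predict(t, W)           # mobility/request forecast for (t, t+W]
        plan = solve_window(t, window)   # OMEPS over the prediction window
        first = plan[0]                  # commit only the first step (i, q, n)
        execute(first)
        t = first[0] + 1                 # advance past the committed decision
```

Only the first step of each window plan is executed, so prediction errors deeper in the window never reach the system; this is the standard MPC trade-off between foresight and robustness.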
[Figure: overview of smart-MOMEPS. An NN-based migration judgement first estimates whether migration is worthwhile; only then is MOMEPS invoked, otherwise only the offloading exit point of the multi-exit DNN (CONV/FC blocks with Exit Point 1, Exit Point 2, …) is re-selected.]
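The gating idea can be illustrated with a tiny stand-in for the judgement network: a logistic scorer decides whether the costly MPC pipeline is worth invoking. The feature vector, weights and threshold below are hypothetical toy values; the actual judgement model in smart-MOMEPS is a trained neural network.

```python
# Illustrative smart-MOMEPS gate: a lightweight classifier estimates
# whether migration is likely to pay off; the expensive MPC pipeline
# (prediction + OMEPS) runs only when it is. Weights are toy values.

import math

def migration_score(features, weights, bias):
    """Logistic score standing in for the NN-based migration judgement."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def smart_step(features, run_momeps, reselect_exit,
               weights=(1.5, -0.8), bias=-0.2, threshold=0.5):
    """Invoke full MOMEPS only when migration looks worthwhile;
    otherwise just re-optimize the offloading exit point locally."""
    if migration_score(features, weights, bias) >= threshold:
        return run_momeps()
    return reselect_exit()
```

Because the gate itself is far cheaper than one MPC iteration, the expected per-slot computation drops whenever migrations are rare, which is exactly the regime the paper targets.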
TABLE II
LATENCY AND ACCURACY AT EACH EXIT POINT OF ALEXNET.

Exit point   |   1  |   2  |   3  |   4  |   5
Latency (ms) |  9.4 | 14.0 | 18.5 | 24.4 | 30.2
Accuracy (%) | 70.0 | 71.2 | 76.0 | 77.7 | 78.0

[Figure: computation overhead of Lazy-MPC, MOMEPS and Smart-MOMEPS over time slots 500–2000.]
[Figure: algorithm efficiency (%) of FHC, LM, AM, PLM, Lazy-MPC, MOMEPS and Smart-MOMEPS.]

TABLE III
LATENCY OF LSTM AND OMEPS ON DIFFERENT DEVICES.
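The latency/accuracy profile in Table II directly yields a simple rule for on-the-fly exit selection: pick the most accurate exit whose latency still meets the per-request deadline. The helper below illustrates that rule using Table II's AlexNet numbers; the deadline parameter is a hypothetical input, not a quantity defined in the paper.

```python
# Early-exit selection from Table II: choose the highest-accuracy exit
# point of AlexNet whose inference latency fits the deadline.

# (latency_ms, accuracy_%) per exit point, taken from Table II
ALEXNET_EXITS = {1: (9.4, 70.0), 2: (14.0, 71.2), 3: (18.5, 76.0),
                 4: (24.4, 77.7), 5: (30.2, 78.0)}

def best_exit(deadline_ms, profile=ALEXNET_EXITS):
    """Return (exit_point, accuracy) maximizing accuracy under the
    deadline, or None if even the earliest exit is too slow."""
    feasible = [(acc, ep) for ep, (lat, acc) in profile.items()
                if lat <= deadline_ms]
    if not feasible:
        return None
    acc, ep = max(feasible)
    return ep, acc
```

For instance, a 20 ms deadline admits exits 1–3, so exit 3 (76.0% accuracy) is chosen; relaxing the deadline beyond 30.2 ms allows the full network (exit 5, 78.0%).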
E. Impact of Prediction Error and Look-ahead Window Size

Furthermore, we investigate the impact of both the prediction error and how far to look into the future on algorithm efficiency. We use two prediction methods to obtain the future information: one is long short-term memory (LSTM) and the other is the autoregressive integrated moving average (ARIMA) model [40]. The prediction accuracies of these two methods are 93.3% and 82.5%, respectively. Intuitively, the more accurate the MPC model's predictions are, the better the algorithm's performance will be, and the experimental data confirm this intuition. As shown in Fig. 8, the efficiency of the LSTM-based algorithm is 5.3% higher than that of the ARIMA-based algorithm on average.

Fig. 8. Algorithm efficiency at different model accuracies and prediction window sizes (2–8) for lstm-MOMEPS, lstm-smart-MOMEPS, arima-MOMEPS and arima-smart-MOMEPS.

As for the influence of different prediction window size
Fig. 10. Algorithm efficiency at different base station densities (4.16–12.57 per km²).

…efficiency of each algorithm under different BS densities as its ratio to the offline optimal when the BS density is 12.57. Fig. 10 presents that there is a growth in algorithm efficiency as the density of BSs increases from 4.16 to 12.57. This is because more base stations bring more migration choices when the user migrates services, and it is also easier for the user to migrate services to a closer base station for higher utility. To summarize, the denser the base stations are, the better our algorithms perform.

VII. CONCLUSION

In this paper, we investigate a user-centric DNN service migration and exit point selection problem with various service downtimes in the mobile edge computing environment. We leverage exit point selection and the layer sharing feature of the container technique to alleviate the performance degradation caused by inevitable service downtime. To maximize long-term user utility, we first propose an offline optimal migration and exit point selection strategy (OMEPS) algorithm by leveraging dynamic programming when complete future information is available. To deal with uncertain user behavior, we incorporate a model predictive control framework into the OMEPS algorithm and then construct a proactive service migration and DNN exit point selection (MOMEPS) algorithm. To cope with the heavy computation overhead of MOMEPS, we propose a cost-efficient algorithm, smart-MOMEPS, which introduces a neural-network-based smart migration judgement to navigate the trade-off between performance and computation overhead. Finally, we conduct extensive trace-driven experiments to evaluate our online algorithms, and we also explore their performance under a variety of system settings with the corresponding analysis.

REFERENCES

[1] M. Hanyao, Y. Jin, Z. Qian, S. Zhang, and S. Lu, "Edge-assisted online on-device object detection for real-time video analytics," in IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, 2021, pp. 1–10.
[2] Z. Zhou, X. Chen, E. Li, L. Zeng, K. Luo, and J. Zhang, "Edge intelligence: Paving the last mile of artificial intelligence with edge computing," Proceedings of the IEEE, vol. 107, no. 8, pp. 1738–1762, 2019.
[4] … Journal on Selected Areas in Communications, vol. 31, no. 12, pp. 762–772, 2013.
[5] T. Ouyang, Z. Zhou, and X. Chen, "Follow me at the edge: Mobility-aware dynamic service placement for mobile edge computing," IEEE Journal on Selected Areas in Communications, vol. 36, no. 10, pp. 2333–2345, 2018.
[6] V. Farhadi, F. Mehmeti, T. He, T. F. L. Porta, H. Khamfroush, S. Wang, K. S. Chan, and K. Poularakis, "Service placement and request scheduling for data-intensive applications in edge clouds," IEEE/ACM Transactions on Networking, vol. 29, no. 2, pp. 779–792, 2021.
[7] R. Urgaonkar, S. Wang, T. He, M. Zafer, K. Chan, and K. K. Leung, "Dynamic service migration and workload scheduling in edge-clouds," Performance Evaluation, vol. 91, pp. 205–228, 2015.
[8] A. Machen, S. Wang, K. K. Leung, B. J. Ko, and T. Salonidis, "Live service migration in mobile edge clouds," IEEE Wireless Communications, vol. 25, no. 1, pp. 140–147, 2018.
[9] L. Ma, S. Yi, N. Carter, and Q. Li, "Efficient live migration of edge services leveraging container layered storage," IEEE Transactions on Mobile Computing, vol. 18, no. 9, pp. 2020–2033, 2019.
[10] L. Gu, D. Zeng, J. Hu, B. Li, and H. Jin, "Layer aware microservice placement and request scheduling at the edge," in IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, 2021, pp. 1–9.
[11] B. Xu, S. Wu, J. Xiao, H. Jin, Y. Zhang, G. Shi, T. Lin, J. Rao, L. Yi, and J. Jiang, "Sledge: Towards efficient live migration of docker containers," in 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 2020, pp. 321–328.
[12] S. Nadgowda, S. Suneja, N. Bila, and C. Isci, "Voyager: Complete container state migration," in 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), 2017, pp. 2137–2142.
[13] S. Fu, R. Mittal, L. Zhang, and S. Ratnasamy, "Fast and efficient container startup at the edge via dependency scheduling," in 3rd USENIX Workshop on Hot Topics in Edge Computing (HotEdge 20). USENIX Association, Jun. 2020. [Online]. Available: https://www.usenix.org/conference/hotedge20/presentation/fu
[14] TensorFlow, "TensorFlow docker images," https://hub.docker.com/r/tensorflow/tensorflow, accessed March 7, 2022.
[15] S. Teerapittayanon, B. McDanel, and H. Kung, "BranchyNet: Fast inference via early exiting from deep neural networks," in 2016 23rd International Conference on Pattern Recognition (ICPR), 2016, pp. 2464–2469.
[16] E. F. Camacho and C. Bordons, Model Predictive Control. Springer, 2007.
[17] D. Zhao, T. Yang, Y. Jin, and Y. Xu, "A service migration strategy based on multiple attribute decision in mobile edge computing," in 2017 IEEE 17th International Conference on Communication Technology (ICCT), 2017, pp. 986–990.
[18] T. Ouyang, R. Li, X. Chen, Z. Zhou, and X. Tang, "Adaptive user-managed service placement for mobile edge computing: An online learning approach," in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, 2019, pp. 1468–1476.
[19] A. Ksentini, T. Taleb, and M. Chen, "A Markov decision process-based service migration procedure for follow me cloud," in 2014 IEEE International Conference on Communications (ICC), 2014, pp. 1350–1354.
[20] S. Wang, R. Urgaonkar, T. He, M. Zafer, K. Chan, and K. K. Leung, "Mobility-induced service migration in mobile micro-clouds," in 2014 IEEE Military Communications Conference, 2014, pp. 835–840.
[21] H. Ma, Z. Zhou, and X. Chen, "Predictive service placement in mobile edge computing," in 2019 IEEE/CIC International Conference on Communications in China (ICCC), 2019, pp. 792–797.
[22] Y. Zhang, L. Jiao, J. Yan, and X. Lin, "Dynamic service placement for virtual reality group gaming on mobile edge cloudlets," IEEE Journal on Selected Areas in Communications, vol. 37, no. 8, pp. 1881–1897, 2019.
[23] K. Kawashima, T. Otoshi, Y. Ohsita, and M. Murata, "Dynamic placement of virtual network functions based on model predictive control," in NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, 2016, pp. 1037–1042.
[24] M. Kumazaki and T. Tachibana, "Optimal VNF placement and route selection with model predictive control for multiple service chains," in