net/publication/357149409
4 authors, including:
Lena Mashayekhy
University of Delaware
All content following this page was uploaded by Erfan Farhangi on 19 December 2021.
[Figure 1 (diagram; only labels recovered): UEs submit requests through gNB RU, Fronthaul, gNB DU, Midhaul, gNB CU, and Backhaul to the 5G Network, EAS, and Enterprise.]
reducing latency) for UEs. Furthermore, it reduces backhaul capacity demands by up to 35% [20]. Improving QoS for UEs brings enormous economic benefits to enterprises. Amazon reported that every 100 ms of latency costs 1% in sales. Google announced that an extra 0.5 seconds in search page generation time reduced traffic by 20%. Likewise, if a trading platform is 5 ms behind the competition, a broker could lose $4 million in revenues per millisecond [1]. Despite all these advantages, however, UEs’ high mobility causes severe uncertainty in deciding where to place the UEs’ most popular content and which delivery routes to use.

The integration of 5G and MEC, called 5G-enabled MEC, includes the 5G components and Edge Application Servers (EASs). The 3GPP 5G RAN architecture, known as NG-RAN, consists of radio base stations called gNBs connected to the 5G core network (5GC) and to each other. The gNB component incorporates three main functional modules: the Radio Unit (RU), the Distributed Unit (DU), and the Centralized Unit (CU), which can be located separately or together in various combinations. Furthermore, the PDU Session Anchor UPF (PSA UPF) plays a pivotal role in building the content routing path between each UE and the content at an EAS [26]. Each gNB connects to a PSA UPF to enable a UE’s access to the content cached at an EAS. As shown in Fig. 1, the component selection problem (the UE’s routing path to content) in 5G-enabled MEC consists of selecting the best gNB, PSA UPF, and EAS to determine each UE’s optimal routing path with minimum latency from a gNB to an EAS with the content. Optimal component selection requires considering the dynamics of UEs’ behavior (e.g., mobility). Many existing studies in 4G/5G networks mainly focus on collecting the most popular contents and caching them on the local EAS. However, to the best of our knowledge, the component selection problem has not been studied in 5G-enabled MEC. The high mobility of a UE results in multiple gNB allocations, as the connected gNB may change over time based on the UE’s new locations. Therefore, when a UE departs from the coverage of the currently connected gNB, the system needs to determine which EAS and PSA UPF to allocate to the UE to support service continuity while satisfying the latency requirement. Essentially, assigning a new EAS means migrating the content delivery service from the previous EAS to the new EAS, leading to a migration latency. On the other hand, allocating a new PSA UPF means relocating from the previous PSA UPF to the new PSA UPF, causing a relocation latency.

This paper addresses the problem of component selection for content delivery in a 5G-enabled MEC. We first formulate an optimal integer programming (IP) model for this problem. To tackle the NP-hardness and intractability of the IP model, we then design an online learning-based approach to solve the problem in a reasonable time and minimize the content delivery latency for UEs over their lifetime. Our proposed approach is formulated as a Multi-Armed Bandit (MAB) problem. We introduce a suitable online machine learning solution to learn the optimal routing path for UEs in a time-slotted format. In this regard, at each time slot, the system assigns a proper routing path for delivering requested contents to UEs based on the currently observed information (e.g., the requests, UEs’ mobility patterns, mean latency) to reduce the experienced latency.

The rest of the paper is organized as follows. In the next section, we provide an overview of existing studies in this domain. We formulate our problem in Section 3. We then describe our proposed approach, Q-CSCD, in Section 4. We evaluate the performance of our proposed approach by extensive experiments in Section 5. Finally, we summarize our results and present potential directions for future research.

2 RELATED WORK
We summarize the most relevant 5G-based studies in the literature based on their research direction.

Enhancing QoS. A group of studies has focused on enhancing QoS for users as the only objective. Solozabal et al. [23] introduced a Non-Standalone (i.e., disconnected from the Internet) 5G ETSI MEC-based architecture for Mission-Critical Push-to-Talk (MCPTT) services to achieve the delay requirement. Kiani et al. [15] proposed an edge computing-aware NOMA technique that utilizes
QoS-Aware 5G Component Selection for Content Delivery in Multi-access Edge Computing UCC’21, December 6–9, 2021, Leicester, United Kingdom
[Table residue (column headers only; rows not recovered): Hybrid Computing, Content Delivery, Edge Computing, Fog Computing, Online Learning, Multi Objective, Multi-User, Mobility, Latency, QoS, Real Dataset.]

and caching (3C) solution for the 5G environment so that content and service providers can deploy their functions, services, and contents close to UEs. Bastug et al. [5] proposed a predictive and proactive caching approach to reduce peak traffic demands, leveraging 5G functions, including D2D communication. Hou et al. [11]
content between PSA UPF $i$ and the targeted EAS $j$ is denoted by $d_{ij}$. To formulate the problem mathematically, we first define the decision variables as follows.

\[ x_{ng}^{t} = \begin{cases} 1 & \text{if gNB } g \text{ is allocated to UE } n \text{ at time slot } t \\ 0 & \text{otherwise.} \end{cases} \]

\[ y_{ni}^{t} = \begin{cases} 1 & \text{if PSA UPF } i \text{ is allocated to UE } n \text{ at time slot } t \\ 0 & \text{otherwise.} \end{cases} \]

\[ z_{nj}^{t} = \begin{cases} 1 & \text{if the targeted content for UE } n \text{ is located at EAS } j \text{ at time slot } t \\ 0 & \text{otherwise.} \end{cases} \]

\[ p_{nij}^{t} = \begin{cases} 1 & \text{if the allocated PSA UPF for UE } n \text{ is rerouted from EAS } i \text{ to EAS } j \text{ at time slot } t \\ 0 & \text{otherwise.} \end{cases} \]

\[ q_{nij}^{t} = \begin{cases} 1 & \text{if the targeted content for UE } n \text{ is migrated from EAS } i \text{ to EAS } j \text{ at time slot } t \\ 0 & \text{otherwise.} \end{cases} \]

\[ k_{ngij}^{t} = \begin{cases} 1 & \text{if gNB } g \text{ and PSA UPF } i \text{ are allocated to UE } n \text{ and the targeted content for UE } n \text{ is located at EAS } j \text{ at time slot } t \\ 0 & \text{otherwise.} \end{cases} \]

Now, the 5G component selection problem in MEC can be formulated as an Integer Program (IP) as follows:

\[ \text{Minimize } \mathcal{D} = \sum_{t \in \mathcal{T}} \sum_{n \in \mathcal{N}} \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{M}} \left( p_{nij}^{t} r_{ij} + q_{nij}^{t} s_{ij} \right) + \sum_{t \in \mathcal{T}} \sum_{n \in \mathcal{N}} \sum_{g \in \mathcal{G}} \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{M}} k_{ngij}^{t}\, c_{n}^{t} \left( \frac{1}{d_{ng}^{t}} + \frac{1}{d_{gd}} + \frac{1}{d_{gc}} + \frac{1}{d_{gi}} + \frac{1}{d_{ij}} \right) \tag{1} \]

Subject to:

\[ \sum_{g \in \mathcal{G}} x_{ng}^{t} = 1 \quad \forall n \in \mathcal{N},\ t \in \mathcal{T} \tag{2} \]
\[ \sum_{i \in \mathcal{M}} y_{ni}^{t} = 1 \quad \forall n \in \mathcal{N},\ t \in \mathcal{T} \tag{3} \]
\[ \sum_{j \in \mathcal{M}} z_{nj}^{t} = 1 \quad \forall n \in \mathcal{N},\ t \in \mathcal{T} \tag{4} \]
\[ \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{M}} p_{nij}^{t} \le 1 \quad \forall n \in \mathcal{N},\ t \in \mathcal{T} \tag{5} \]
\[ \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{M}} q_{nij}^{t} \le 1 \quad \forall n \in \mathcal{N},\ t \in \mathcal{T} \tag{6} \]
\[ y_{ni}^{t} + y_{ni'}^{t+1} - 1 \le p_{nii'}^{t+1} \quad \forall n \in \mathcal{N},\ i, i' \in \mathcal{M},\ t \in \mathcal{T} \tag{7} \]
\[ z_{nj}^{t} + z_{nj'}^{t+1} - 1 \le q_{njj'}^{t+1} \quad \forall n \in \mathcal{N},\ j, j' \in \mathcal{M},\ t \in \mathcal{T} \tag{8} \]
\[ x_{ng}^{t} + y_{ni}^{t} + z_{nj}^{t} - 2 \le k_{ngij}^{t} \quad \forall n \in \mathcal{N},\ g \in \mathcal{G},\ i, j \in \mathcal{M},\ t \in \mathcal{T} \tag{9} \]
\[ x_{ng}^{t} \in \{0, 1\} \quad \forall n \in \mathcal{N},\ g \in \mathcal{G},\ t \in \mathcal{T} \tag{10} \]
\[ y_{ni}^{t} \in \{0, 1\} \quad \forall n \in \mathcal{N},\ i \in \mathcal{M},\ t \in \mathcal{T} \tag{11} \]
\[ z_{nj}^{t} \in \{0, 1\} \quad \forall n \in \mathcal{N},\ j \in \mathcal{M},\ t \in \mathcal{T} \tag{12} \]
\[ p_{nij}^{t} \in \{0, 1\} \quad \forall n \in \mathcal{N},\ i, j \in \mathcal{M},\ t \in \mathcal{T} \tag{13} \]
\[ q_{nij}^{t} \in \{0, 1\} \quad \forall n \in \mathcal{N},\ i, j \in \mathcal{M},\ t \in \mathcal{T} \tag{14} \]
\[ k_{ngij}^{t} \in \{0, 1\} \quad \forall n \in \mathcal{N},\ i, j \in \mathcal{M},\ g \in \mathcal{G},\ t \in \mathcal{T} \tag{15} \]

The objective function in Eq. (1) minimizes the total sum of the relocation latency and migration latency along with the content delivery latency over the entire time horizon. The first term calculates the relocation and migration latencies. The second term calculates the content delivery latency when a specific routing path is chosen (indicated by $k_{ngij}^{t}$). The content delivery latency is the total sum of content delivery latencies over different components of the 5G network. For example, $c_{n}^{t} / d_{ng}^{t}$ denotes the content delivery latency when a requested content is passed through gNB $g$ to UE $n$. Constraints (2) ensure each UE connects to only one gNB for accessing the edge content at any time slot. Constraints (3) guarantee only one PSA UPF is selected for each UE at each time slot. Constraints (4) ensure the targeted content of a UE is located at only one EAS at each time slot. Constraints (5) guarantee the relocation between PSA UPFs for each UE happens at most once between two consecutive time slots. Constraints (6) guarantee migration of the targeted content happens at most once between two consecutive time slots. Constraints (7) set the relocation decision variables according to whether a PSA UPF relocation happens or not. Constraints (8) set the migration decision variables according to whether a content migration happens or not. Constraints (9) set the routing path decision variables according to whether a routing path composed of gNB, PSA UPF, and EAS is selected or not. Constraints (10)-(15) ensure the decision variables are binary.

4 QOS-AWARE 5G COMPONENT SELECTION FOR CONTENT DELIVERY
This section presents a scenario where several UEs submit their requests to the service provider (enterprise) to receive specific contents with minimum latency. In such a scenario, each UE $n$ owns only local information (e.g., its location) and does not have a priori information about other factors on the 5G side, such as the relocation latency between PSA UPFs, migration latency of content, or transmission speed of contents.

Online machine learning is a promising technique to deal with dynamics and uncertain information in real time. It is a proper technique for making decisions when not all information is available initially and data comes in sequential order. Therefore, online learning can efficiently cope with constant moves and changes in
massive datasets, and it can sustain changes for an extended period. Furthermore, it does not require a considerable amount of memory or data storage compared with traditional batch learning techniques. In addition, online learning saves time and is cost-efficient since it only processes a small portion of data instead of training over the entire dataset at once, which can be computationally infeasible for real-time MEC requirements.

We design a QoS-Aware 5G Component Selection for Content Delivery (Q-CSCD) approach to solve the component selection problem efficiently, achieving a bounded performance. Q-CSCD learns the optimal component selection online for UEs, including selection of gNB, PSA UPF, and EAS, to minimize latency for the content delivery over time. Q-CSCD is an online learning approach formulated as the multi-armed bandit problem [14]. Q-CSCD provides an online learning solution for each UE to learn its optimal routing path, denoted by $\mathcal{Z} : (gNB_g, PSA\ UPF_i, EAS_j)$, on the fly and independently from other UEs. Q-CSCD, as an MAB-based approach, is expected to provide a near real-time solution for highly mobile UEs, and it is also lightweight, making it suitable for real-time decision making in MEC. It includes two fundamental choices each time: 1) Exploration: collecting more information, leading to better choices in the future; 2) Exploitation: choosing the best option given current information. The core idea of our proposed approach is that Q-CSCD chooses an unselected routing path for every UE at each time slot and observes its average latency per unit of received content. During time slots with no new unselected routing path, UEs select the routing path that has the highest preference over others thus far. Q-CSCD also lowers the complexity of learning by considering only routes that could deliver the UE’s requested content and are in closer proximity. This approach applies a trade-off between exploration and exploitation, where Q-CSCD explores different routing paths for each UE to learn the range of possible new latencies. At the same time, it gives a chance to exploit a priori known optimal routing paths.

Several algorithms have been studied under the MAB framework. Q-CSCD is adapted from the UCB1 strategy, which sets a balance between exploration and exploitation [3, 4]. We present extensive comparisons of Q-CSCD with two other notable MAB strategies, Epsilon Greedy and Thompson Sampling, in Section 5. UCB1 allocates a counter for each route to specify the number of times that route has been selected. It decides the priority of every route based on its obtained mean latency and the number of times that route has been selected. However, due to the online learning characteristics of Q-CSCD, it is inevitable to choose suboptimal routing paths and to have more radio handovers and content migrations between EASs, especially at the beginning of the learning process.

Fig. 2 further depicts a general framework of our learning-based approach for the problem of 5G component selection for UEs in 5G-enabled MEC. Fig. 2a shows that each UE independently learns the optimal component selection using the MAB model based on the history of past observations. Once a new latency is observed (for the allocated content routing path), the history of information is updated. Then, the content provider is informed about the updated policy (or updated 5G component selection). Fig. 2b shows that each UE $n$ selects a single routing path at each time $t$ based on the up-to-now information.

4.1 Q-CSCD Algorithm
The pseudo-code of our proposed approach is presented in Algorithm 1. For every UE $n$, the current location $l_n^t$, the routing paths in close proximity of the UE $\mathcal{N}(l_n^t)$, the amount of received content $c_n^t$, and the value of $\beta$ specifying the weight of exploration are available at each time slot (Line 1). The history of the selected routing paths for every UE $n$ is denoted by $\mathcal{H}_n$, and it is initialized to an empty set (Line 2). Q-CSCD first checks the existence of an unselected routing path $\mathcal{Z}_m \in \mathcal{N}(l_n^t)$ (Line 4). If such a route exists, Q-CSCD selects it once (Line 5) and appends it to the history set (Line 6). It observes the perceived latency by UE $n$, when it is
UCC’21, December 6–9, 2021, Leicester, United Kingdom Farhangi Maleki, et al.
assigned to routing path $\mathcal{Z}_m$ at time slot $t$ (Line 7). Then, it updates the sample mean latency per unit of the received content by routing path $\mathcal{Z}_m$ until time slot $t$, denoted by $\bar{\mathcal{D}}_{n,m,t}$, to compare the average experienced latency per unit of content by each routing path (Line 8). The notation $\chi_{n,m,t}$ denotes the number of times that the newly selected routing path $\mathcal{Z}_m$ is selected until time slot $t$; it is initially set to 1 for every newly selected routing path and is used in the calculation of the average latency per unit of content by that routing path (Line 9). If no such routing path exists, Q-CSCD selects the routing path $\pi_n^t$ among the already explored routing paths that is the best path up to this point. The parameter $\beta$ determines the balance between exploitation and exploration and specifies the value given to less selected routing paths (Line 11). In other words, the first term in the formula gives a higher value to the already best routing paths with minimum average latency, which is equivalent to exploitation. However, as the routing path $\mathcal{Z}_m$ is selected less often, especially for larger values of time slot $t$, the second term gets a higher value, causing the less observed routing paths to be explored. The value of $\beta$ further controls the weight of the first term (exploitation) versus the second term (exploration). Q-CSCD then observes the new latency $\mathcal{D}_{n,\pi_n^t,t}$ (Line 12), updates $\bar{\mathcal{D}}_{n,\pi_n^t,t}$ by calculating the mean latency over the past and newly observed latencies, and then increments $\chi_{n,\pi_n^t,t}$ (Lines 13-14).

Algorithm 1 QoS-Aware 5G Component Selection for Content Delivery (Q-CSCD)
1: Input: $l_n^t$, $\mathcal{N}(l_n^t)$, $c_n^t$, $\beta$ at the beginning of each time slot for UE $n$, $\forall t \in T$
2: $\mathcal{H}_n = \emptyset$
3: for all $t \in T$ do
4:   if $\exists\, \mathcal{Z}_m : (gNB_g, PSA\ UPF_i, EAS_j) \in \mathcal{N}(l_n^t)$ such that $\mathcal{Z}_m \notin \mathcal{H}_n$ then
5:     Select $\mathcal{Z}_m$ once
6:     $\mathcal{H}_n = \mathcal{H}_n \cup \mathcal{Z}_m$
7:     Observe $\mathcal{D}_{n,m,t}$
8:     Update $\bar{\mathcal{D}}_{n,m,t} = \mathcal{D}_{n,m,t} / c_n^t$
9:     $\chi_{n,m,t} = 1$
10:  else
11:    Select $\pi_n^t = \arg\min_{m \in \mathcal{H}_n} \left( \bar{\mathcal{D}}_{n,m,t} - \beta \sqrt{\frac{2 \ln t}{\chi_{n,m,t}}} \right)$
12:    Observe $\mathcal{D}_{n,\pi_n^t,t}$
13:    Update $\bar{\mathcal{D}}_{n,\pi_n^t,t} = \dfrac{\chi_{n,\pi_n^t,t} \times \bar{\mathcal{D}}_{n,\pi_n^t,t} + \mathcal{D}_{n,\pi_n^t,t} / c_n^t}{\chi_{n,\pi_n^t,t} + 1}$
14:    $\chi_{n,\pi_n^t,t} = \chi_{n,\pi_n^t,t} + 1$
15:  end if
16: end for

4.2 Q-CSCD Regret Analysis
We measure the performance loss of every UE due to learning by applying the concept of learning regret. The learning regret is the combination of sampling regret and handover regret. The regret for UE $n$ until $T$ is denoted by $R_{n,T}$ and is defined as follows:

\[ R_{n,T} = \underbrace{\mathbb{E}\left[ \sum_{t=1}^{T} \left( \mathcal{D}_{\pi_n^t,t} - \mathcal{D}_{\pi_n^*,t} \right) \right]}_{\text{sampling regret}} + \underbrace{\mathbb{E}\left[ h_{n,\pi_n} \right]}_{\text{handover regret}} \]

where $\mathcal{D}_{\pi_n^t,t}$ and $\mathcal{D}_{\pi_n^*,t}$ represent the latency for UE $n$ by routing path $\pi_n$ and the optimal routing path $\pi_n^*$, respectively. Furthermore, $\mathbb{E}[h_{n,\pi_n}]$ indicates the expected relocation regret for UE $n$. We provide an upper bound on the learning regret of Q-CSCD in the following theorem.

Theorem 1. The upper bound of learning regret for UE $n$ is as follows:

\[ R_{n,T} \le \left[ 8 \sum_{m \ne \pi_n^*} \frac{\ln T}{\Delta_{n,m}} + \left( 1 + \frac{\pi^2}{3} \right) \sum_{m \ne \pi_n^*} \Delta_{n,m} \right] + \bar{C} \left[ 2 \sum_{m \ne \pi_n^*} \left( \frac{8 \ln T}{(\Delta_{n,m})^2} + 1 + \frac{\pi^2}{3} \right) + 1 \right], \]

where $\bar{C}$ represents the maximum handover latency that can be experienced, and $\Delta_{n,m}$ is defined as $\mathcal{D}_{n,m} - \mathcal{D}_n^*$ such that $\mathcal{D}_{n,m}$ is the expected latency when routing path $\mathcal{Z}_m$ is selected by UE $n$, and $\mathcal{D}_n^*$ is the minimum latency when an optimal routing path is selected by UE $n$. The proof is provided in the appendix.

5 EXPERIMENTAL RESULTS
In this section, we conduct and analyze a set of experiments that are designed to evaluate the effectiveness of our proposed approach, Q-CSCD.

5.1 Experimental Setup
We compare the performance of Q-CSCD with the following approaches:
• IP: We implement our IP model using the IBM ILOG Concert Technology API for C++ [13]. We use this IP, presented in equations (1)-(15), as a benchmark to compare our results with the optimal solutions when possible.
• Epsilon Greedy ($\epsilon$-Greedy): This approach is a multi-armed bandit strategy that balances exploration and exploitation by choosing between them randomly. The value of $\epsilon$ is a threshold that determines the probability of choosing to explore or to exploit the best routing path found so far.
• Thompson Sampling: This approach is a multi-armed bandit strategy that uses a Beta distribution to prioritize the choice between different routing paths. The shape of the Beta distribution is controlled by two positive shape parameters, denoted by $\alpha$ and $\beta$, such that if the selected routing path leads to less experienced latency than the current average

Table 2: Experiment Scenarios
Exp.   # UEs   # gNB   # PSA UPF   # EAS
1      10      2       2           2
2      12      4       2           2
3      50      7       2           2
4      100     8       3           3
5      150     8       4           4
6      200     10      5           5
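The selection and update steps of Algorithm 1 can be sketched in a few lines of Python for a single UE. This is a minimal illustration rather than the authors' C++ implementation: the function names, the dictionary-based history, and the tuple representation of a routing path are assumptions made here for readability.

```python
import math


def q_cscd_select(history, candidates, t, beta):
    """Choose a routing path for one UE at time slot t (t >= 1).

    history:    dict mapping path -> {"mean": float, "count": int} (hypothetical layout)
    candidates: routing paths near the UE, e.g. (gNB, PSA UPF, EAS) tuples
    """
    # Lines 4-5: any nearby path that was never selected is tried once.
    for path in candidates:
        if path not in history:
            return path
    # Line 11: among explored paths, minimize mean latency minus the
    # exploration bonus beta * sqrt(2 ln t / count).
    return min(
        history,
        key=lambda p: history[p]["mean"]
        - beta * math.sqrt(2.0 * math.log(t) / history[p]["count"]),
    )


def q_cscd_update(history, path, latency, content):
    """Record an observed latency, normalized per unit of received content."""
    per_unit = latency / content
    if path not in history:
        # Lines 6-9: first observation of this path.
        history[path] = {"mean": per_unit, "count": 1}
    else:
        # Lines 13-14: running mean, then increment the selection counter.
        stats = history[path]
        stats["mean"] = (stats["count"] * stats["mean"] + per_unit) / (stats["count"] + 1)
        stats["count"] += 1
```

As in Line 11 of the pseudo-code, the minimum is taken over the whole history set; a practical variant might restrict it to paths that are still in the UE's current neighborhood.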
[Figure 3: Performance Analysis - Latency Components (*IP was unable to determine any solution for Exp. 3-6 in feasible time, and thus, there are no bars in the plots for those cases). Three panels; x-axis: Experiments (Exp.1 to Exp.6).]
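The two baseline strategies compared in these plots can be sketched as follows. This is an illustrative reading of the descriptions in Section 5.1, not the authors' code: the function names, the `rng` argument, and the choice to pick the path with the largest Beta sample are assumptions. The Beta shape parameters are named `a` and `b` here to avoid confusion with the Q-CSCD weight $\beta$.

```python
import random


def epsilon_greedy_select(mean_latency, epsilon, rng):
    """mean_latency: dict mapping path -> observed mean latency per unit of content."""
    paths = list(mean_latency)
    if rng.random() < epsilon:
        # Explore: pick any known path uniformly at random.
        return rng.choice(paths)
    # Exploit: pick the path with the lowest observed mean latency.
    return min(paths, key=lambda p: mean_latency[p])


def thompson_select(beta_params, rng):
    """beta_params: dict mapping path -> [a, b]; draw from Beta(a, b) per path
    and return the path with the largest draw (assumed selection rule)."""
    return max(beta_params, key=lambda p: rng.betavariate(*beta_params[p]))


def thompson_update(beta_params, path, latency_per_unit, current_average):
    """Increment a when the observation beats the current average, else b."""
    if latency_per_unit < current_average:
        beta_params[path][0] += 1
    else:
        beta_params[path][1] += 1
```

Passing an explicit `random.Random` instance keeps runs reproducible; with `epsilon = 0.3`, the value used in the experiments, roughly three in ten selections would be exploratory.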
latency per unit of content for a UE, $\alpha$ is incremented by 1; otherwise, $\beta$ is incremented by 1. Incrementing $\alpha$ increases the route’s likelihood of selection in the next time slot. In contrast, incrementing $\beta$ reduces its likelihood of selection in the next time slot.

The algorithms are implemented in C++, and the experiments are conducted on a desktop PC with a 2.80 GHz 11th Gen Intel(R) Core(TM) i7-1165G7 and 16 GB RAM.

For our experiments, we consider a 5G-enabled MEC for content delivery in an environment where UEs are scattered uniformly, changing locations between time slots at various rates drawn from the distribution U(1, 10) to model different mobility patterns. We assume all components of 5G networks, including gNBs, PSA UPFs, and EASs, are randomly scattered over the geographical area according to a uniform distribution. In this network, the contents required by the UEs at each time slot are randomly taken from U(100, 200) Mbits.

ThousandEyes [2], a network intelligence company acquired by Cisco, uses an active monitoring technique to collect network metrics such as loss, latency, jitter, and comprehensive path metrics with detailed layer-3 hops. We use latency between hops obtained by ThousandEyes to model transmission speeds between different links in a 5G network. In general, 5G links are decomposed into Fronthaul, Midhaul, and Backhaul. Fronthaul represents the connectivity between gNB RUs and gNB DUs; Midhaul describes the communication link between gNB DUs and gNB CUs; and Backhaul connects gNB CUs with PSA UPFs and EASs. To model transmission speeds in these links, we first set 3 hops between source and target nodes; therefore, there are five nodes in the path from the source node to the target node. We model the source node as gNB RU, the second node as gNB DU, the third node as gNB CU, the fourth node as PSA UPF, and the fifth node as EAS to represent the 5G framework shown in Fig. 1.

Considering the latency and data transmission size between nodes, we calculate the respective data transmission speeds in links and scale them up to represent speeds in a 5G network. We use these paths as a sample, obtain the minimum and maximum speeds in each hop, and utilize a uniform distribution to simulate more paths for our experiments whenever necessary. Similarly, accounting for distance, we consider wireless transmission speeds of 1 to 10000 Mbps for connecting UEs with gNB RUs at each time slot. Finally, we set the relocation latency between PSA UPFs and the migration latency of content between EASs both to 10 ms.

We consider six scenarios for the experiments. Table 2 summarizes these scenarios. The values of $\beta$ for our approach and $\epsilon$ for Epsilon Greedy are set to $10^{-4}$ and 0.3, respectively. We perform a sensitivity analysis on the value of $\beta$ in the range from $10^{-5}$ to $10^{-2}$ to find the $\beta$ value that minimizes latency (see Fig. 5b).

5.2 Comparative Analysis
We use several metrics such as latency, handover time, content delivery time, handover ratio, regret, and runtime to compare our proposed approach Q-CSCD with IP, Epsilon Greedy, and Thompson Sampling. Due to the NP-hardness of our problem, solving the IP to get optimal results is intractable. We set 60 minutes as the maximum feasible time for the solver to return the IP results. However, the solver was unable to obtain optimal solutions within 60 minutes for most scenarios. It should be noted that we set the number of time slots $T$ to 3000 for Fig. 3a-3c and to 1000 for Fig. 4a-4b.

5.2.1 Comparative Analysis on Latency. Our optimization criterion is to minimize the experienced latency of UEs, composed of content delivery time and handover time (relocation time and migration time). The average latency of UEs per time slot, measured in milliseconds, is presented in Fig. 3a. The results show that IP was unable to find a solution for Exp. 3-6 due to the intractability of our NP-hard problem. Q-CSCD obtains close to optimal results in Exp. 1-2 and much lower latency in all experiments than the Epsilon Greedy and Thompson Sampling approaches. Epsilon Greedy utilizes a simple technique that trades off exploration and exploitation randomly. In contrast, Q-CSCD deliberately establishes this tradeoff by considering a repetition of the least observed routing paths and uses a more in-depth formula to minimize the latency. On the other hand, Thompson Sampling immediately increments parameters in each repetition only by comparing the currently experienced latency with the up-to-now average latency. This determines the shape of the distribution immediately. Then, those that can be better in the future but obtained larger beta parameters in the initial time
[Figure 4: Performance Analysis (*IP was unable to determine any solution for Exp. 4-6 in feasible time for Fig. 4a). Panels: Handover ratio vs. Experiments (Exp.1 to Exp.6); Runtime (milliseconds, logarithmic scale) vs. Experiments (Exp.1 to Exp.6); Regret vs. Time Slot (T=100 to T=500).]
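Two of the metrics plotted here can be computed as in the sketch below. The exact formulas are not given in the text, so both functions are assumptions: the handover ratio is read as the fraction of consecutive time slots in which a UE's PSA UPF or EAS changes, and regret as the percentage deviation of an obtained latency from the optimal one. All names are hypothetical.

```python
def handover_ratio(paths):
    """paths: one (gNB, PSA UPF, EAS) tuple per time slot for a single UE.

    A handover is counted when the PSA UPF or the EAS differs from the
    previous slot (relocation or migration); a gNB-only change is ignored.
    """
    if len(paths) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(paths, paths[1:]) if prev[1:] != cur[1:])
    return changes / (len(paths) - 1)


def regret_percent(obtained, optimal):
    """Deviation of the obtained result from the optimal result, in percent."""
    return 100.0 * (obtained - optimal) / optimal
```

For example, a UE whose path changes its PSA UPF once over four slots has a handover ratio of 1/3 under this reading, and an obtained latency of 110 ms against an optimum of 100 ms gives a regret of 10%.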
slots are less likely to be selected in the future, which causes a worse obtained latency.

5.2.2 Comparative Analysis on Handover Time. Fig. 3b shows the average handover time of UEs per time slot. Q-CSCD has results comparable to IP in Exp. 1-2 (close to zero) and significantly less handover time than the Epsilon Greedy and Thompson Sampling approaches in all experiments. This is due to choosing the best $\beta$ value for Q-CSCD, which leads to an appropriate balance between exploration and exploitation, resulting in minimum handover time and latency. Epsilon Greedy incurs a higher handover time due to its random nature determined by the value of epsilon. Since Thompson Sampling does not get adequate samples for the average latency and instantly starts incrementing parameters, it requires more handovers whenever a better beta distribution value is obtained.

5.2.3 Comparative Analysis on Content Delivery Time. We compare the content delivery time of the UEs per time slot obtained by each approach. The results are presented in Fig. 3c. UEs with Q-CSCD experience close to optimal results in Exp. 1-2 and better content delivery time than the other approaches. Because Q-CSCD gives a more reasonable value to choosing the already best routing paths or to exploring less selected routing paths, both the Epsilon Greedy and Thompson Sampling approaches result in worse content delivery time, as expected.

5.2.4 Comparative Analysis on Handover Ratio. We present the handover ratio (the percentage of handover for each UE at each time slot) obtained by each approach in Fig. 4a. IP can only solve Exp. 1-3 (with $T = 1000$) and cannot obtain a solution in a feasible time for the other experiments. As expected, IP has a lower handover ratio even though minimization of the handover ratio is not our objective. However, we believe approaches with lower handover ratios are expected to have a better total latency. As the size of experiments increases, Q-CSCD leads to a higher handover ratio because there are more routing paths to choose from. Epsilon Greedy and Thompson Sampling have much higher handover ratios for all experiments.

5.2.5 Comparative Analysis on Runtime. Fig. 4b shows the runtime of different approaches per UE in logarithmic scale. IP is the worst, as expected, and uses the maximum feasible time to obtain a solution. Thompson Sampling is second-worst because it has to calculate so many beta distributions that it is time-consuming. Q-CSCD and Epsilon Greedy are very close. Q-CSCD is faster than Epsilon Greedy for small scenarios because fewer routing paths exist to learn from. However, as the size of the experiments increases, Epsilon Greedy is faster because Q-CSCD has to learn among more routing paths.

5.2.6 Comparative Analysis on Regret. Regret is the deviation percentage of the obtained result from the optimal result. The regret for IP is 0 for all time slots. We choose Exp. 1 to assess the regret of our approach since the optimal results obtained by the IP are available for all time slots. As the number of time slots increases, Q-CSCD leads to less regret because it has more samples to learn from. This is due to the fact that Q-CSCD can learn and obtain closer to optimal paths, even though a higher number of handovers and far from optimal solutions are inevitable in the beginning. However, this pattern is not linear in Epsilon Greedy and Thompson Sampling due to their randomness and their weaker use of past observations.

5.3 Sensitivity Analysis
This section presents sensitivity analysis concerning the $\beta$ value and the number of time slots $T$ to show their impacts on the results. In each analysis, we fix other parameters to study sensitivity over a single parameter.

5.3.1 Sensitivity Analysis on Beta. In Q-CSCD, $\beta$ is a parameter that controls the value of exploration versus exploitation. We consider Exp. 6 and $T = 3000$. We conduct two sensitivity analyses on $\beta$ to study its impacts on handover ratio and latency. We present the results in Figs. 5a and 5b, respectively. Note that only Q-CSCD is sensitive to the value of $\beta$; the results of the other approaches are shown only for comparison. The results show that as $\beta$ increases, the handover ratio increases as well. This is because a higher value of $\beta$ leads to a higher chance of exploration, which results in more handovers. On the other hand, the best $\beta$ for the latency is the one that sets the best tradeoff between exploration and exploitation; in our case, 0.0001 obtains the best latency compared to other values.

5.3.2 Sensitivity Analysis on Time Slot. We consider Exp. 3 and perform sensitivity analysis on the number of time slots ($T$) with respect to handover ratio and runtime, and we present the results in Figs. 5c and 5d, respectively. As the number of iterations (time slots) increases, our
[Figure 5 (caption not recovered). Panels: handover ratio and latency vs. Beta (0.00001 to 0.01); handover ratio and runtime (milliseconds, logarithmic scale) vs. Time Slot (T=500 to T=3000).]
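As a rough numerical illustration of how $\beta$ scales exploration, consider the exploration term in Line 11 of Algorithm 1. The values $t = 3000$ and $\chi_{n,m,t} = 10$ below are assumed purely for illustration and are not taken from the experiments.

```latex
\[
\beta \sqrt{\frac{2 \ln t}{\chi_{n,m,t}}}
  = \beta \sqrt{\frac{2 \ln 3000}{10}}
  \approx \beta \sqrt{1.60}
  \approx 1.27\,\beta
\]
% beta = 10^{-5}  gives a bonus of about 1.27e-5
% beta = 10^{-4}  gives a bonus of about 1.27e-4
% beta = 10^{-2}  gives a bonus of about 1.27e-2
```

Under these assumed values, a less-selected path is preferred over the current best one only when its mean latency per unit of content is within about $1.27\,\beta$ of the best, so each tenfold increase in $\beta$ widens that window tenfold and makes path switches more likely.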
approach can make better decisions and learn which routing path has the best latency, thus leading to a lower handover ratio over time. Our approach obtains results close to those of the IP for the first two values of $T$, for which IP results are available. As expected, Epsilon Greedy and Thompson Sampling obtain the worst results as the duration of the experiments increases. For the runtime, as the number of time slots increases, the runtime increases with the value of $T$ for all approaches, except for IP, whose runtime is capped at a maximum of 60 minutes and which uses this entire time limit due to its intractability.

To sum up, the results show that our proposed approach, Q-CSCD, is efficient in finding the best routing path along with the representative 5G components for UEs quickly and with low latency, handover time, and content delivery time. It also obtains a lower handover ratio and less regret than the Epsilon Greedy and Thompson Sampling approaches.

6 CONCLUSION AND FUTURE WORK
5G-enabled Multi-access Edge Computing alleviates limited backhaul capacity and enhances QoS for UEs. This paper tackled the 5G component selection problem for content delivery, which is a critical problem in 5G-enabled MEC. We proposed a multi-armed bandit-based approach, called Q-CSCD, to learn the optimal 5G components, leading to minimum experienced latency for UEs. Q-CSCD learns the optimal routing paths for UEs over time. The results indicate that Q-CSCD is highly scalable, achieves efficient latency and handover ratio versus other strategies, and significantly reduces regret over time. For future work, we plan to enable and incentivize cooperation among UEs to share their content with nearby UEs, and therefore, reduce the routing path length to enhance QoS. We further plan to conduct more intensive experiments to compare Q-CSCD with different learning-based and prediction-based approaches.

7 ACKNOWLEDGMENTS
We gratefully acknowledge the support of Cisco grant CG#1935382.

REFERENCES
[1] 2009. Latency Is Everywhere And It Costs You Sales - How To Crush It. Available: http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it.
[2] 2021. Thousandeyes. Available: https://www.thousandeyes.com/.
[3] Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2 (2002), 235–256.
[4] Peter Auer, Nicolo Cesa-Bianchi, Yoav Freund, and Robert E Schapire. 2002. The nonstochastic multiarmed bandit problem. SIAM journal on computing 32, 1 (2002), 48–77.
[5] Ejder Bastug, Mehdi Bennis, and Mérouane Debbah. 2014. Living on the edge: The role of proactive caching in 5G wireless networks. IEEE Communications Magazine 52, 8 (2014), 82–89.
[6] Pol Blasco and Deniz Gündüz. 2014. Learning-based optimization of cache content in a small cell base station. In Proc. of the IEEE International Conference on Communications (ICC). 1897–1903.
[7] Cisco. 2016. Internet of Things. Available: https://www.cisco.com/c/en/us/products/collateral/se/internet-of-things/at-a-glance-c45-731471.pdf.
[8] GMDT Forecast. 2019. Cisco visual networking index: global mobile data traffic forecast update, 2017–2022. Update 2017 (2019), 2022.
[9] Shaoyong Guo, Sujie Shao, Yao Wang, and Hui Yang. 2017. Cross stratum resources protection in fog-computing-based radio over fiber networks for 5G services. Optical Fiber Technology 37 (2017), 61–68.
[10] Najmul Hassan, Kok-Lim Alvin Yau, and Celimuge Wu. 2019. Edge computing in 5G: A review. IEEE Access 7 (2019), 127276–127289.
[11] Tingting Hou, Gang Feng, Shuang Qin, and Wei Jiang. 2018. Proactive content caching by exploiting transfer learning for mobile edge computing. International Journal of Communication Systems 31, 11 (2018), e3706.
[12] Yun Chao Hu, Milan Patel, Dario Sabella, Nurit Sprecher, and Valerie Young. 2015. Mobile edge computing—A key technology towards 5G. ETSI white paper 11, 11