All content following this page was uploaded by Ailton Oliveira on 10 February 2022.
João Paulo Tavares Borges¹; Ailton Pinto de Oliveira¹; Felipe Henrique Bastos e Bastos¹; Daniel Takashi Né do Nascimento Suzuki¹; Emerson Santos de Oliveira Junior¹; Lucas Matni Bezerra²; Cleverson Veloso Nahum¹; Pedro dos Santos Batista³; Aldebaro Barreto da Rocha Klautau Júnior¹
¹ Universidade Federal do Pará, Belém 66075-110, Brazil
² Universidade Estácio de Sá, Belém 66055-260, Brazil
³ Ericsson Research, 164 80 Stockholm, Sweden
ABSTRACT
the RL agent to schedule a user and then choose the index of a beamforming codebook to serve it. A key aspect of this problem is that the simulation of the communication system and the artificial intelligence engine is based on a virtual world created with AirSim and the Unreal Engine. These components enable the so-called CAVIAR methodology, which leads to highly realistic 3D scenarios. This paper describes the communication and RL modeling adopted in the framework and also presents statistics concerning the implemented RL environment, such as data traffic, as well as results for three baseline systems.

Keywords – 5G, 6G, beam selection, MIMO, mmWave, RL

Figure 1 – CAVIAR simulation scenario, depicting the radiation pattern (in light green) corresponding to the chosen beamforming codebook index to serve a drone (at the right).
1. INTRODUCTION

Reinforcement Learning (RL) is a learning paradigm suitable for problems in which an agent has to maximize a given reward while interacting with an ever-changing environment. This class of problem appears in several points of interest in 5th Generation (5G) and 6th Generation (6G) mobile networks, such as congestion control [1], network slicing [2], resource allocation [3], and the 5G Physical Layer (PHY) [4]. However, the lack of freely available data sets or environments to train and assess RL agents is a practical obstacle that delays the widespread adoption of RL in 5G and future networks.

To address this challenge, some works explore the use of virtual worlds to generate data sets by creating environments for communications in general [5], and for Artificial Intelligence (AI) / Machine Learning (ML) applied to 5G/6G [6], leveraging the fact that 5G and beyond systems will benefit from rich contextual information to improve performance and reduce the loss of radio resources to support their services [4, 7, 8]. The key idea in this paper is therefore to use realistic representations of deployment sites, together with physics and sensor simulations, to generate a virtual representation that, combined with the communication network simulator, enables training RL agents for tasks such as beam selection.

Systems such as IEEE 802.11ad are usually designed for worst-case scenarios and, in most situations, continuously send signals that do not carry information (overhead) [9]. This overhead may represent a significant share of the channel capacity, and decreasing it is a fundamental problem whose solution can enable systems to improve the usage of physical resources (e.g., with lower latency and higher bit rates) [10, 11, 12].

In this work, the beam selection and user scheduling problems are posed as a game that must be solved with RL. The game is based on a simulation methodology named Communication Networks, Artificial Intelligence and Computer Vision with 3D Computer-Generated Imagery (CAVIAR), with a preliminary version proposed in [13]. The CAVIAR simulation integrates three subsystems: the communication system, the AI and ML models, and finally the virtual world components. In this paper, the problem is based on simulating a communication system immersed in a virtual world created with AirSim [14] and Unreal Engine [15].

More specifically, the goal is to schedule and allocate resources to Unmanned Aerial Vehicles (UAVs), cars and pedestrians, composing a scenario with aerial and terrestrial
Authorized licensed use limited to: UNIVERSIDADE FEDERAL DO PARA. Downloaded on February 10,2022 at 17:42:35 UTC from IEEE Xplore. Restrictions apply.
2021 ITU Kaleidoscope Academic Conference
User Equipment (UE). The RL agent is executed at the Base Station (BS) and periodically takes actions based on the information captured from the environment, which includes channel estimates, buffer status, and positions from a Global Navigation Satellite System, such as GPS. The RL agent receives a reward based on the service provided to the users. The training occurs "offline", without rendering the 3D scenes, but it is possible to render the output in a post-processing step and generate a video.

This work is organized as follows. In Section 2 we discuss CAVIAR simulations in general and the specific RL problem addressed in this paper. Sections 3 and 4 describe the communication and machine learning models, respectively. Simulation results are presented in Section 5, while Section 6 concludes the paper.

2. CAVIAR SIMULATIONS

As proposed in [6] and shown in Figure 2, the CAVIAR framework incorporates three subsystems: AI/ML, virtual world, and wireless communications. In the following paragraphs we describe the framework, focusing first on the overall description of the methodology and then on how it was realized in the user scheduling and beam selection environment.

RL tasks can be continuous or episodic; the latter is the setting adopted in this work. Figure 3 exemplifies the CAVIAR data generation pipeline: the data set is provided as a set of Comma-Separated Values (CSV) files, which are named episodes, containing the trajectory data of all moving objects within a simulation. To generate an episode, a waypoint file, which is a text file with reference points, must be executed by AirSim. During its execution, the information from the mobile elements is stored in the episode. Each episode lasts about three minutes, with a sampling interval of ten milliseconds, and is composed of columns related to position and orientation for pedestrians and cars, with the addition of acceleration and linear and angular velocities for UAVs. To use the episode files to obtain information from Multiple-Input Multiple-Output (MIMO) channels and data traffic, one must execute them within the CAVIAR simulation environment.

Figure 3 – CAVIAR data generation. (Diagram blocks: waypoint generator, Unreal/AirSim, episodes, and simulation environment; outputs: MIMO channels, combined channel magnitudes, and data traffic.)

2.1 Overall CAVIAR description

As previously mentioned, Figure 2 displays an overview of the expected components in a CAVIAR simulation. In summary, the blocks encompassing the proposed simulation strategy can be described as follows: the Communications Engine
Connecting physical and virtual worlds
handles all information regarding the communication aspect of the simulation, such as data traffic, buffer and channel information. In the CAVIAR simulations, channels can be pre-computed and the communication simulation decoupled from the physics engine, as is often done in AI/ML applied to beam selection [7, 4].

The 3D assets used in the Environment and as Mobile entities, such as UAVs, cars, buildings, etc., are either created or obtained online, as described in [4]. They compose the simulation environment as fixed or mobile objects, whose eventual movements and interactions are managed by the Mobility engine and by the Physics engine of the virtual world subsystem, respectively. The Sensors engine output constitutes the input to the AI/ML frontend engine.

Figure 4 – CAVIAR simulation flow.

Using a virtual scenario provided by CAVIAR, three mobile entities (a pedestrian, a car, and a UAV) are simulated in order to generate a data set of urban mobility. This data is organized in episodes that contain spatial information (position, orientation, acceleration, etc.) of each mobile entity. For this problem, the samples are collected every 10 ms and they contain information on 37 entities (34 pedestrians, 2 cars, and 1 UAV).

As shown in Figure 4, the spatial data generated by the virtual scenario is the input for the CAVIAR simulation environment, more specifically for the communication engine, which is responsible for computing the radio channels and other parameters related to the telecommunication system, such as buffer size. The output of the communication engine, along with the spatial data, is the input for an RL agent, which is trained to choose, in each time slot, a user to serve and the beam that should be used.

We do not take noise into account in order to isolate the impact of the beam selection procedure.

Ray Tracing (RT) was used in [4] to generate realistic communication channels H. For this paper, we have not used RT but a simpler procedure based on the geometric MIMO
The geometric channel model [16] is adopted with $L = 2$ Multipath Components (MPCs):

$$H = \sqrt{N_t N_r} \sum_{\ell=1}^{L} \alpha_\ell \, a_r(\phi_\ell^A, \theta_\ell^A) \, a_t^*(\phi_\ell^D, \theta_\ell^D). \qquad (3)$$

The parameters in Eq. (3) are obtained as follows. The phase of the complex gain $\alpha_\ell$ is drawn from a uniform distribution with support $[0, 2\pi]$. To generate the magnitude $|\alpha_\ell|$, the distance $d$ between the BS and the given receiver is first used to calculate the received power via the Friis equation [17]. The path loss obtained from this equation determines $|\alpha_\ell|$, which decreases with $d$. The elevation $\phi$ and azimuth $\theta$ angles, for departure (e.g., $\phi^D$) and arrival (e.g., $\phi^A$), are obtained from the orientation provided by the LoS path. The nominal LoS angles are slightly perturbed by adding to them zero-mean Gaussian random variables with a variance of 1 degree. These angles are used to compose the steering vectors $a_t$ and $a_r$.

3.1 Traffic model

The users' data traffic is defined as Poisson processes with time-varying mean $\lambda_u[t]$ for user $u$. We specified two

Figure 6 – Histogram of packets traffic received by the BS for each user.

4. MACHINE LEARNING MODEL

4.1 Evaluation of RL agents

To evaluate the RL agent, the return $G$ over the test episodes is used. The return $G_e$ for episode $e$ is

$$G_e = \sum_{t=1}^{N_s^e} r_e[t], \qquad (4)$$

where $N_s^e$ is the number of scenes in episode $e$. The corresponding reward $r_e[t]$ at discrete time $t$ is a weighted sum of transmitted and discarded packets given by

$$r_e[t] = \frac{P_{tx}[t] - 2 P_d[t]}{P_b[t]}, \qquad (5)$$

where $P_{tx}[t]$, $P_d[t]$, and $P_b[t]$ correspond, respectively, to the total amount (summation over all users) of transmitted, dropped and buffered packets at time $t$. The reward $r_e[t]$ is restricted to the range $-2 \le r_e[t] \le 1$. At each time $t$, a single user can be served, but $P_b[t]$ accounts for the number of packets in all three buffers. Hence, $r_e[t] = 1$ only if all buffered packets
of the scheduled user are transmitted, while the buffers of the other two users are empty.

4.2 Possible inputs to RL agents

The inputs (also known as states or observations) can be selected either from information provided in the CSV files (position (x, y, z), velocities, etc.) or obtained from the environment, such as the buffer state and the channel information for the specific beam index previously chosen.

4.3 Experiment description

We developed an experiment using CAVIAR for the problem of scheduling and beam selection. Given that a complete episode file contains information about all moving objects in a scene (all pedestrians, cars, etc.), we simplified the data generated by the simulation by assuming that the beam selection RL agent, named B-RL, only uses data from the three served users (uav1, simulation_car1 and simulation_pedestrian1).
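As a minimal sketch of the simplification described in Section 4.3, the snippet below keeps only the rows of an episode that belong to the three served users. The column names (`timestamp`, `obj`, `pos_x`, `pos_y`, `pos_z`), the helper `filter_served_users`, and the inline sample rows are assumptions made for illustration; the actual header of the CAVIAR episode files is not specified in the text.

```python
import csv
import io

# The three served users named in the text.
SERVED_USERS = ("uav1", "simulation_car1", "simulation_pedestrian1")


def filter_served_users(csv_text, id_column="obj", served=SERVED_USERS):
    """Return the episode rows (as dicts) that belong to the served users.

    `id_column` is a hypothetical column name holding the object identifier.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[id_column] in served]


# Hypothetical episode excerpt: one row per object and 10 ms sample.
SAMPLE_EPISODE = """timestamp,obj,pos_x,pos_y,pos_z
0.00,uav1,1.0,2.0,30.0
0.00,simulation_car1,5.0,0.0,0.0
0.00,simulation_pedestrian1,7.5,1.0,0.0
0.00,simulation_pedestrian2,9.0,3.0,0.0
0.01,uav1,1.1,2.0,30.0
0.01,simulation_car2,4.0,8.0,0.0
"""

rows = filter_served_users(SAMPLE_EPISODE)
served_ids = sorted({row["obj"] for row in rows})
```

Reading the file row by row keeps the sketch independent of any particular data-frame library; with real episode files, only `id_column` and the file handle would need to change.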
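The reward of Eq. (5) and the return of Eq. (4) in Section 4.1 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the per-user packet counts are hypothetical, and returning zero when all buffers are empty ($P_b[t] = 0$) is a convention chosen here, since the text does not define that case.

```python
def reward(transmitted, dropped, buffered):
    """Reward of Eq. (5): (P_tx - 2 * P_d) / P_b, with totals over all users.

    Each argument is a sequence with one packet count per user.
    """
    p_tx, p_d, p_b = sum(transmitted), sum(dropped), sum(buffered)
    if p_b == 0:
        # Assumption: with no buffered packets, the reward is taken as zero.
        return 0.0
    return (p_tx - 2 * p_d) / p_b


def episode_return(rewards):
    """Return of Eq. (4): sum of the rewards over the scenes of one episode."""
    return sum(rewards)


# Hypothetical counts for (uav1, simulation_car1, simulation_pedestrian1).
# Scene 1: the scheduled user empties its buffer and the others are idle.
r1 = reward(transmitted=[10, 0, 0], dropped=[0, 0, 0], buffered=[10, 0, 0])
# Scene 2: 4 of 20 buffered packets are sent and 2 are dropped.
r2 = reward(transmitted=[4, 0, 0], dropped=[0, 2, 0], buffered=[8, 8, 4])
# Scene 3: every buffered packet is dropped, which is the worst case.
r3 = reward(transmitted=[0, 0, 0], dropped=[5, 0, 0], buffered=[5, 0, 0])

g = episode_return([r1, r2, r3])
```

The three scenes yield rewards of 1, 0 and -2, which span the range stated in Section 4.1, and the resulting return is their sum.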
ACKNOWLEDGEMENTS
REFERENCES
“The most powerful real-time 3D creation tool,” https://www.unrealengine.com, accessed: 2021-10-19.
R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An Overview of Signal Processing Techniques for Millimeter Wave MIMO Systems,” vol. 10, no. 3, pp. 436–453, Apr. 2016.