Freq Domain BM Time Series

Efficient Cross-Frequency Beam Prediction in 6G
Wireless Using Time Series Data

1st Vaibhav Bhosale§ 2nd Navrati Saxena 3rd Ketan Bhardwaj
School of Computer Science Department of Computer Science School of Computer Science
Georgia Institute of Technology San Jose State University Georgia Institute of Technology
Atlanta, USA San Jose, USA Atlanta, USA
vbhosale6@gatech.edu navrati.saxena@sjsu.edu ketanbj@gatech.edu
4th Ada Gavrilovska 5th Abhishek Roy

School of Computer Science Advanced Communication Technology
Georgia Institute of Technology Mediatek Inc
Atlanta, USA San Jose, USA
ada@cc.gatech.edu Abhishek.Roy@mediatek.com
Abstract—Next generation 6G wireless envisions much higher beams between the next-generation base station nodes (gNB)
data rate and lower latency compared to 5G wireless networks. and mobile User Equipment (UE).
Directional antennas, with narrow beams, across high mmWave
frequencies, hold the key to achieving such high data rates. One major challenge [1] in BM lie in inefficient beam
However, Beam Management (BM), which is the process of sweeping for the downlink (DL)/uplink (UL) beam corre-
finding appropriate transmit and receive beams, offers significant spondence. Beam sweeping involves an exhaustive search to
challenges. Dynamic channel variation, user mobility, and narrow find the beam/beam-pair having the strongest signal strength.
beams in high frequency mmWave channels further complicate
Naturally, performing an exhaustive search in all possible
these challenges. Efficient Machine Learning (ML) strategies can
be used to alleviate this overhead. In spite of their underlying directions is quite complex and expensive (both in terms of
differences, sub-6 GHz and high frequency mmWave channels latency and power). This already complex problem become
share some similarities in array geometry, number of paths, and even more complicated in the mmWave bands, located in the
surrounding environment. As sub-6 GHz channel characteristics frequency spectrum ranging from 30 to 300 GHz. MmWave
are relatively easier to acquire and learn, we introduce a new
bands provide the benefits of a big chunk of available and
machine learning framework, using transformer and LSTM to
learn sub-6 GHz channel information over time for efficient beam un-explored frequencies. However, these benefits come with
prediction across high frequency mmWave channels. System a high attenuation cost in free space. To address the high
Level simulation results point out that when using time-series attenuation, a large number of highly directional MIMO
data-based learning of beam patterns with transformers or antennas are used, whose gain compensates for the path
LSTMs in sub-6 GHz channels, our proposed scheme achieves
loss. The use of these highly directional MIMO antennas
up to 99.5% top-5 beam prediction accuracy while reducing the
BM overhead by over 50% compared to existing work. demands precise and efficient beam selection methods to
Index Terms—6G wireless, beam management, machine learn- ensure the required application data rate and meet the strict
ing, transformer, LSTMs, channel measurements delay requirements. Another challenge such bands pose is
the low diffraction capacity and severe blocking caused by
I. I NTRODUCTION most materials. Furthermore, beam searching across these
The insatiable demand for high data rates and ubiquitous large number of narrow beams in high mmWave frequencies
connectivity is fueling the interests for upcoming sixth gener- incurs more delay and computational power. This makes BM
ation (6G) wireless. The vision of 6G wireless aims to provide in mmWave more complex and expensive compared to BM in
an order-of-magnitude more spectrum and ubiquitous connec- sub-6 GHz bands. Therefore, 6G wireless will continue using
tivity for emerging new applications, like eXtended Reality sub-6 GHz bands, also known as frequency range 1 (FR1)
(XR) and Metaverse, by exploring high-frequency millimeter for outdoor to indoor connectivity, while exploiting mmWave
wave (mmWave) bands that provide the benefits of a big chunk bands or frequency range 2 (FR2) for high data rates in indoor
of available and un-explored frequencies. However, mmWave and outdoor environments [2].
bands face challenges such as high path loss, blockage, Artificial intelligence (AI) and machine learning (ML) are
and oxygen absorption. Dense network deployment with a expected to play a key part in alleviating the complexity of
wide number of small-size, directional antennas is expected 6G wireless. Major wireless standard bodies, like the 3rd
to alleviate these challenges. This will be aided by beam Generation Partnership Project (3GPP), anticipate native AI-
management (BM) to find the (near) optimal transmit/receive ML support in 6G wireless. 3GPP’s efforts encompass AI-
ML applications for networks and radio interfaces, with an
§ This work was done as an intern at Mediatek Inc eventual goal of standardization [3]. More recently, 3GPP
has recognized the use of AI/ML techniques for efficient works are known to suffer from significant path loss and
BM as a key use-case. As many fundamental relationships blockage [14], and hence dense network deployments with
in wireless communications are often non-linear, deep neural large-scale antenna arrays are used to achieve high beam-
networks (DNNs) have recently been adopted in various fields forming gains to compensate this loss. However, the increased
of wireless communications including BM [4]–[12]. scale of antenna arrays introduces high overhead when UEs try
Interestingly, while different types of DNN models [11], to periodically measure the signal quality of different beams
[12] have been recently used for beam prediction across for reporting the beam measurements for beam alignment and
frequencies, time-series data for cross-frequency BM is not potential handovers [15].
yet explored. As BM in wireless involves non-linear mappings Adaptive learning of time varying wireless channels using
across time-varying wireless channels, we believe that explor- AI/ML techniques are recently used to address this challenge
ing time-series data for cross-frequency BM has sufficient and enable reliable and efficient beam management [16], [17].
potential for improving its efficiency and accuracy. This is The work in [4] uses reinforcement learning to develop a
where we believe that models, such as transformers [13] multi-armed bandit framework for beam tracking. A data-
and long short-term memory (LSTMs) [6] come into play§ . driven strategy that fuses the reference signal received power
Similar to recurrent neural networks (RNNs) [5], LSTM is (RSRP) with orientation information using an RNN is devel-
a deep learning model architecture capable of learning long- oped in [7]. The authors in [8] use LSTM to support large-
term dependencies for sequence prediction problems. A trans- scale antenna array enabled hybrid, directional BM in hotspot-
former, on the other hand, is another deep learning model based virtual small cells. [9] explores LSTM to utilize the
architecture that avoids recurrence and instead relies entirely past channel state information (CSI) for efficient prediction of
on an ‘attention’ mechanism, differentially weighing the sig- future channels in vehicular mmWave systems. Deep learning
nificance of each part of the input data for obtaining global techniques, similar to Natural Language Processing (NLP)
dependencies between input and output. Removing recurrence is used in [10] to predict the best serving beams of the
in favor of attention mechanisms endows transformers with mobile UEs. Interestingly, the work in [11] points out the
significantly higher parallelization, compared to other popular efficacy of fully-connected neural networks (FCNN) to closely
DNN methods like RNNs and LSTMs. approximate these mapping functions, required for predicting
In this paper, we introduce a new time-series data-based the optimal mmWave beams from sub-6 GHz channels. [12]
beam prediction using models, such as LSTMs and trans- also uses deep neural networks for exploring power delay
formers that exploits channel characteristics of a sub-6 GHz profile (PDP) of a sub-6 GHz channel to predict the optimal
channel to choose a mmWave beam. The use of LSTMs mmWave beams.
incurs lower number of FLOPs resulting in lower energy Broadly a lot of this work has focused on predicting beam
consumption, while transformers can serve inference faster due parameters by either (i) looking at fewer beam parameters, (ii)
to the presence of its underlying attention mechanism. Our parameters over time, (iii) parameters in a different frequency
framework lets 6G developers choose which model they need band, etc. Our focus is on the latter two classes of problems,
to use based on their specific requirements. Assuming a sub-6 where AI/ML has been recently used to share information
GHz connection is already established, we learn its underlying along the temporal and frequency domains individually. How-
channel information over time by using LSTM/transformer ever, to the best of our knowledge, exploiting time series data
learning. While learning over sub-6 GHz information aids for beam prediction in different frequency bands is not yet
in reducing the search space needed for the initial mmWave explored. As BM in high frequency mmWave bands are more
beam establishment, the use of LSTMs/transformers over time complex and expensive compared to BM in sub-6 GHz bands
series helps in the efficient capturing of time-varying wireless and wireless channels are inherently time-varying, we posit
channel characteristics. that there is an additional value in combining the information
The rest of the paper is organized as follows: In Sec- across the two domains (time and frequency) to use temporal
tion II we take a look into the major related works on beam information for improving the prediction accuracy in the
prediction and point out our motivation behind using time frequency domain beam prediction. Fortunately, LSTMs and
series information. Section III explains the system model and transformers provide an efficient tool to capture such time
problem formulation. Subsequently, we explain our proposed series information [18] Although originally developed for
time-series based beam prediction method in Section IV. Natural Language Processing (NLP) [6], [13], over the next
Simulation results in Section V show the efficacy of our few sections we will show how time-series based learning can
solution. Section VI concludes the paper with pointers to delve into time varying wireless channels information to assist
future research. and improve cross-frequency BM in 6G wireless.
II. R ELATED W ORKS AND M OTIVATION III. S YSTEM AND C HANNEL M ODELS
BM problem has recently raised an increased research
We consider a system where uplink communication happens
interest, especially for the high frequencies (e.g., mmWave
in FR1 and downlink in FR2 as shown in Figure 1. We assume
and THz) 6G wireless networks. These high frequency net-
MF R1 antennas for uplink and MF R2 antennas for downlink.
§ RNN was also a candidate, but since it experiences vanishing/exploding We follow a similar system operation and channel models as
gradients for longer sequences, it does not help here. defined in [11]. We use the uplink channel vector as hF R1 [k] ∈
[11], [12] have recently pointed out that it is feasible to
use the FR1 channel information to predict the beamforming
vector in FR2. Moreover, they have also shown that there exists
FR1 hFR1 FR1 a mapping from the FR1 channels to achievable rate in FR2
Transceiver Transceiver channel.
FR2 hFR2 FR2 Φn : hF R1 → R(hF R2 , fn ), n = 1, 2, . . . , ∥F ∥, (5)
Transceiver Transceiver UE and that there exists a mapping function that can be leveraged
Base-station to obtain the optimal mapping between FR1 and FR2 channels.
Let’s define the optimal mapping function as ηt⋆ at timestep t
as
Fig. 1: Depicting the system setup where the FR1 and FR2
η ⋆ = argmaxn∈{n=1,2,...,∥F ∥} ΦnF R1 (hF R1 ) (6)
antenna arrays are colocated on the base-station and UE.
The works in [7]–[10] showed that temporal FR2 channel
information can be utilized to predict the optimal FR2 beam-
CMF R1 ×1 at the kth subcarrier with k = 1, . . . , K (K is the
forming vector at next time-step, thereby giving
total number of subcarriers) and the signal received at the gNB
Ψt : {ht−m t−m+1
F R2 , hF R2 , . . . , ht−1 t ⋆
F R2 } → R(hF R2 , f ) (7)
sub-6 GHz or FR1 array as:
Combining with Equation 7 with Equation 6, gives us
yF R1 [k] = hF R1 [k]sp [k] + nF R1 [k], (1) Ψt : {η ⋆ (ht−m ⋆ t−m+1
F R1 ), η (hF R1 ), . . . , η ⋆ (ht−1
F R1 )} (8)
t ⋆
→ R(hF R2 , f )
where sp [k] is the uplink pilot signal satisfying E∥sp [k]∥2 =
PF R1 We term this composition Ω mapping from the FR1 chan-
K , with PF R1 as the uplink transmit power, and nF R1 [k] ∼ nels for m time-steps to the optimal beamforming vector at
NC (0, σ 2 , I) is the noise at the gNB FR1 array. Similarly,
the latest time-step.
defining the downlink beamforming vector as f ∈ CMF R2 ×1
Ωt : {ht−m t−m+1
F R1 , hF R1 , . . . , ht−1 t
F R1 } → R(hF R2 , f )
⋆
(9)
leads to the signal received by the UE as:
Prior works have shown that high beam training time
raises significant challenges with solving these functions using
yF R2 [k̄] = hTF R2 [k̄]f sd + nF R2 [k̄] (2)
traditional (non-ML) approaches. Hence, we decided to look
where hF R2 [k̄] ∈ CMF R2 ×1 represents the downlink channel at machine learning techniques that can help to solve this
from the mobile UE to the gNB FR2 array at the k̄ th problem. We observe that this is quite similar to the mapping
subcarrier, k̄ = 1, 2, . . . , K̄. Owing to the hardware constraints used to solve time-series problems [18] in AI/ML. Different
in the FR2 analog beamforming vectors, these vectors are ML approaches, like RNNs, LSTMs, and transformers have
generally selected from quantized codebooks. Thus, it can been used to solve these types of time-series problems. We
be assumed that the beamforming vector f takes one of the next look at which of these approaches will be suitable for
candidate values collected in the codebook F , i.e., f ∈ F , this problem.
with cardinality ∥F ∥= NCB . We adopt a geometric (physical)
IV. C ROSS F REQUENCY BM USING T IME S ERIES DATA
model for the FR1 and FR2 channels [19]. Using this allows
us to model FR2 channel as As we discussed earlier, ML can help to drastically reduce
DXC −1 X L
2πk the time associated in BM by incorporating information about
hF R2 [k] = αl e−j K p(dTs − τl )a(θl , ϕl ), (3) the channel and predicting the best k beams. Prior work [11],
d=0 l=1
where L is the total number of channel paths, α, τ, θ, ϕ are [12] focused on using the FR1 measurements at the current
the path gains (including the path-loss), the delay, the azimuth timestep to exploit the correlation between the FR1 and
angle of arrival (AoA), and elevation AoA, respectively, and FR2 measurements. We observed that this correlation can
a is the steering vector for the lth channel path. Ts represents be composited with the correlation between measurements
the sampling time while DC denotes the cyclic prefix length at consecutive time steps i.e., a time series data of FR1
(assuming that the maximum delay is less than DC Ts ). Using measurements helps with better prediction of FR2 beams.
this channel model and the system setup mentioned above, The most common techniques to deal with time series
the achievable downlink rate for an FR2 channel hF R2 and data include RNNs, LSTMs and transformers. We ruled out
beamforming vector f can be written as: RNNs as a potential solution since they suffer from vanish-
ing/exploding gradients [6] and hence won’t be able to capture
K̄
X the time series nature of this data. We observe comparable
R({hF R2 [k̄], f}) = log2 (1 + SN R∥hTF R2 f∥2 ) (4) accuracy results for transformers and LSTMs (Section V) with
k̄=1 the transformer doing marginally better. However, transformers
PF R2
where SNR is defined as SN R = Kσ 2 . The optimal beam- have a faster inference speed due to their ability to replace
F R2
forming vector is the one that maximizes the R({hF R2 [k̄], f}) recurrence to entirely focus on attention making them faster
which we denote by f ⋆ . Predicting f ⋆ requires estimating the while also increasing the parallelism of the model. On the
channel hF R2 or an online exhaustive beam training, which other hand, LSTMs incur less number of Floating Point
incurs large training overhead, especially for high-frequency Operations (FLOPs) [20], [21] that will result in lower energy
FR2 or mmWave channels. utilization to run the model. Our contribution in this paper is
Layer Output Shape Parameters Dropout Layer
Layer Output Shape Parameters
Input (10,504) 0
Input (10,504) 0
Transformer Block 1 (10,504) 256159 Normalization Layer
LSTM 1 (10,256) 779264
Transformer Block 2 (10,504) 256159
LSTM 2 (10,256) 525312 Feed-Forward Layer
LSTM 3 (10,128) 197120
LSTM 4 (10,64) 49408 Feed-Forward Layer
Average Pooling (504,) 0
LSTM 5 (10,32) 12416
Dense (128,) 64640
Dense (128,) 4224 Multi-Head Attention
Dropout (128,) 0
Dense (504,) 65016
Dense (504,) 65016 Input
Fig. 2: The structure of every

TABLE I: The different layers used in TABLE II: The different layers used individual transformer block
the design of the LSTM model in the design of transformer
to provide a framework to use time-series data, so we provide

the various options through which this data can be utilized
for beam management, while also outlining the tradeoffs
associated with the same.
• Model Structure: Table I and Table II show the structure
of our LSTM and transformer models. We use standard
LSTM blocks of the Keras library. The LSTM model
consists of 5 LSTM cells and the transformer model
consists of 4 transformer blocks (Figure 2) to process
sequential data. These layers are followed by dense layers
that help transform the obtain the FR2 data values for the
requisite channels (504 in our current implementation)
that help us identify the k best beams for BM.
• Training Procedure. Both these models are trained using
the data generated by our system-level simulation (more
details in Section V). We use sparse categorical cross- Fig. 3: The layout of different cells used in our simulation
entropy as the loss function and the Adam optimizer with environment
a learning rate of 0.0001 for training. The training process
for both the models required 40-60 epochs for our dataset.
The design of our algorithm provides flexibility to choose
Algorithm 1 Beam Management function for cross-frequency the value of k which provides a tradeoff between the mea-
prediction surement overhead and accuracy. As we show in Section V,
the top-5 beams can provide over 99% accuracy using LSTMs
1: function C ROSS F REQUENCY BM(FR1Data) and transformers.
2: P rocessedData ← DataP rocessing(F R1Data) V. S IMULATION E XPERIMENTS AND R ESULTS
3: topKBeams ← T imeSeriesM odel(P rocessedData)
4: topKM easurements ← M easure(topKBeams) In this section, we first briefly present our simulation
5: return ArgM ax(topKM easurements) setup and go through the overall simulation methodology.
6: end function Subsequently, we discuss the simulation results obtained by
carrying out the simulation experiments.
We describe the steps involved in our cross-frequency beam A. Simulation Setup.
management algorithm in Algorithm 1. At every step, the FR1 As ML-enabled 6G modem chipsets are not yet widely
data for the previous 10 timesteps is collated to act as the input available for experimental design and verification, we resort
to the algorithm. to MediaTek’s custom-built System Level Simulator (SLS)
1) First, the FR1 data is normalized which helps to maintain to generate data using 3D ray tracing and subsequently use
the numerical stability of the data. 3GPP-specified simulation parameters [22]. We use the two
2) Next, we run this data through a trained time-series frequencies 4 GHz (for sub-6GHz/FR1) and 30 GHz (for
model (transformer / LSTM) to provide the top-k beams. mmWave/FR2). Our SLS simulates an outdoor environment
3) These k beams are then measured, ensuring that a tiny consisting of 21 cells (as shown in Figure 3) for 1000
subset of all the beams needs to be measured thus, timesteps, with each timestep 5 ms apart. At every cell, we
speeding up the beam management process. consider a base station equipped with two co-located uniform
4) Finally, the optimal beam from these k beams is chosen. linear arrays for both frequencies. We run our simulations for
Top 1 Top 3 Top 5
FLOPs (millions)
100 99.1 97.96 99.34 99.54 200
95.42 96.5 96.75
92.76 150
90 100
Accuracy
80
50
0
72.42 FCNN TF#1 TF#5 TF#10 TF#20 TF#40
70 66.52
69.69
Model
63.3
60 Fig. 5: Comparing the number of Floating Point Operations
FCNN RNN LSTM Transformer
Model required for all the different models.
Fig. 4: Top-1, Top-3, and Top-5 accuracy for different models.

Top 1 Top 3 Top 5
100 99.22 99.41 99.44 99.54 99.56
96.26 96.79 96.69 96.75 97.07
a total of 100 UEs. We consider both stationary and mobile
UEs (upto 30km/h). 90
Accuracy
B. Simulation Methodology.
80
We compare our methodology of using time-series data with 71.91 72.42 71.44
69.97 71.28
prior work [11], [12] that suggested the use of fully connected 70
neural networks (FCNN) to determine the best beam in FR2.
The FCNN model consists of an input layer and four hidden 60
layers all having the same number of neurons. All the layers TF 1 TF 5 TF 10 TF 20 TF 40
are activated with the ReLu function [23] followed by the Model
dropout function [24], except the last layer which is activated
Fig. 6: Top-1, Top-3, and Top-5 accuracy for transformer
with the softmax function [25]. In addition to using the FCNN
variants. (TF n refers to TF with window size n)
model, we also show the results for a RNN model as a
baseline comparing with our proposed LSTM and transformer
models. All the models used in our simulation are developed Top 1 Top 3 Top 5
using Keras [26]. While we use the entire models for our 100 98.17 98.32 99.34 99.23 98.88
simulation study, using Keras provides us the benefit of using 96.5 95.58 94.89
93.91 94.14
the TensorFlow Lite Converter which can easily convert our
90
models to run efficiently on UEs.
Accuracy
We use a 90-10 train-test split for our simulations. During

the training phase, we use a 80-20 train-validation split. We
80
compare the models based on the Top-1 accuracy, Top-3 69.69
accuracy, Top-5 accuracy, and the number of floating point 70 65.81
68.48 67.06
65.6
operations (FLOPs) required for every instance of inference.
We obtain the FLOPs for the FCNN and transformer models 60
LSTM 1 LSTM 5 LSTM 10 LSTM 20 LSTM 40
using the keras-flops library [27]. However, as there is no way Model
to find the FLOPs for RNN and LSTM, we do not share the
FLOPs for those models. Fig. 7: Top-1, Top-3, and Top-5 accuracy for LSTM variants.
(LSTM n refers to LSTM with window size n)
C. Results
We measure the accuracy for the scenario using all 21 cells
from our simulation environment in Figure 4. We observe transformer is due to the inherent design of the two models.
that the transformer model outperforms all the other models In the case of an FCNN model, the number of multiply-
(FCNN, RNN and LSTM). While RNN performs poorer com- accumulate operations increases as the number of FLOPs
pared to the FCNN model due to its difficulties in capturing increases quadratically with the increase in the number of
patterns in longer sequences, LSTM does indeed perform neurons [28] whereas, for a transformer, the number of FLOPs
better than the FCNN, but slightly worse than the transformer. increases linearly. Thus, this has greater implications while
Further, as we see in Figure 5, a transformer requires far fewer modifying the size of the model, which is why we advocate
number of FLOPs compared to the FCNN. However, as shown the use of a transformer.
in [20], [21], LSTMs will require even fewer FLOPs. Further, we also show a micro-benchmark for the trans-
This difference in trends for the FLOPs for the FCNN and former varying the window sizes trying to converge to an
optimal window size. We observe that the accuracy of the [11] M. Alrabeiah and A. Alkhateeb, “Deep Learning for mmWave Beam and
model increases rapidly as we increase the window size from 1 Blockage Prediction Using Sub-6GHz Channels,” IEEE Transactions on
Communications, vol. 68, no. 9, pp. 5504–5518, 2020.
to 10, with very little increase from 10 to 20, but then observes [12] M. S. Sim, Y.-G. Lim, S. H. Park, L. Dai, and C.-B. Chae, “Deep
a drop at window size 40, as shown in Figure 6. This tradeoff Learning-Based mmWave Beam Selection for 5G NR/6G With Sub-6
is important especially since the number of FLOPs required GHz Channel Information: Algorithms and Prototype Validation,” IEEE
Access, vol. 8, pp. 51 634–51 646, 2020.
by the model increases as the window size increases as we [13] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez,
observed in Figure 5. L. Kaiser, and I. Polosukhin, “Attention is all you need,” Proceedings
Similarly, we also show a micro-benchmark for LSTMs of the 31st International Conference on Neural Information Processing
Systems (NIPS’17, pp. 6000–6010, 2017.
while varying the window size. Similar to the transformer, the [14] Y. Niu, Y. Li, D. Jin, L. Su, and A. V. Vasilakos, “A Survey of
accuracy of the model increases while increasing the window Millimeter Wave (mmWave) Communications for 5G: Opportunities and
size from 1 to 10, but then drops at 20 and 40 (Figure 7). In Challenges,” Wireless networks, vol. 21, no. 8, pp. 2657–2676, 2015.
[15] S. Hur, T. Kim, D. J. Love, J. V. Krogmeier, T. A. Thomas, and
a way, LSTMs are constrained to utilize less amount of data A. Ghosh, “Millimeter Wave Beamforming for Wireless Backhaul and
compared to the trnasformer. Access in Small Cell Networks,” IEEE transactions on communications,
vol. 61, no. 10, pp. 4391–4403, 2013.
VI. C ONCLUSION [16] M. Q. Khan, A. Gaber, P. Schulz, and G. Fettweis, “Machine Learning
for Millimeter Wave and Terahertz Beam Management: A Survey and
In this paper, we introduce a new ML framework which Open Challenges,” IEEE Access, vol. 11, pp. 11 880–11 902, 2023.
explores using time series based transformers and LSTMs to [17] K. Ma, Z. Wang, W. Tian, S. Chen, and L. Hanzo, “Deep Learning
for Beam-Management: State-of-the-Art, Opportunities and Challenges,”
efficiently learn the historical channel measurement of sub-6 arXiv preprint arXiv:2111.11177, 2021.
GHz wireless channels. As mmWave channel characteristics [18] S. Ahmed, I. E. Nielsen, A. Tripathi, S. Siddiqui, G. Rasool, and R. P.
are more complex and expensive to learn, we subsequently Ramachandran, “Transformers in Time-series Analysis: A Tutorial,”
arXiv preprint arXiv:2205.01138, 2022.
use this sub-6 GHz wireless channel characterisitics, learnt [19] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and A. M.
over time, to efficiently predict the channel information and Sayeed, “An Overview of Signal Processing Techniques for Millimeter
BM in high frequency mmWave channels for 6G wireless. Wave MIMO Systems,” IEEE journal of selected topics in signal
processing, vol. 10, no. 3, pp. 436–453, 2016.
System Level simulation results demonstrate that our time [20] W. Li, J. Qin, C.-C. Chiu, R. Pang, and Y. He, “Parallel Rescoring
series based learning of beam patterns in sub-6 GHz channels with Transformer for Streaming On-Device Speech Recognition,” arXiv
results in up to 99.5% top-5 accuracy for beam prediction and preprint arXiv:2008.13093, 2020.
[21] Y. Wang, Q. Wang, S. Shi, X. He, Z. Tang, K. Zhao, and X. Chu,
reduces the BM overhead by over 50% in mmWave channels “Benchmarking the Performance and Energy Efficiency of AI Accelera-
compared to existing approaches. We hope that our work tors for AI Training,” in 2020 20th IEEE/ACM International Symposium
inspires the community to incorporate such learning based on on Cluster, Cloud and Internet Computing (CCGRID). IEEE, 2020, pp.
744–751.
time series data for other aspects of beam management and [22] 3GPP Release 14, “Study on International Mobile Telecommunications
channel estimations. (IMT) parameters for 6.425 - 7.025 GHz, 7.025 - 7.125 GHz and 10.0
- 10.5 GHz,” 3GPP Technical Report, 2018.
R EFERENCES [23] A. F. Agarap, “Deep Learning using Rectified Linear Units (ReLU),”
arXiv preprint arXiv:1803.08375, 2018.
[1] Y. Heng, J. G. Andrews, J. Mo, V. Va, A. Ali, B. L. Ng, and J. C. Zhang, [24] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and
“Six Key Challenges for Beam Management in 5.5G and 6G Systems,” R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural
IEEE Communications Magazine, vol. 59, no. 7, pp. 74–79, 2021. Networks from Overfitting,” Journal of Machine Learning Research,
[2] Q. Xue, C. Ji, S. Ma, J. Guo, Y. Xu, Q. Chen, and W. Zhang, “A Survey vol. 15, no. 56, pp. 1929–1958, 2014. [Online]. Available:
of Beam Management for mmWave and THz Communications Towards http://jmlr.org/papers/v15/srivastava14a.html
6G,” arXiv preprint arXiv:2308.02135, 2023. [25] J. S. Bridle, “Probabilistic Interpretation of Feedforward Classification
[3] 3GPP Release 18, “Summary for RAN rel-18 package, 3GPP RAN Network Outputs, with Relationships to Statistical Pattern Recognition,”
Plenary 94-e,” Technical Report RAN Chair, RP-213469, 2022. in Neurocomputing. Springer, 1990, pp. 227–236.
[4] I. Aykin, B. Akgun, M. Feng, and M. Krunz, “MAMBA: A Multi-armed [26] F. Chollet et al. (2015) Keras. [Online]. Available:
Bandit Framework for Beam Tracking in Millimeter-wave Systems,” in https://github.com/fchollet/keras
IEEE INFOCOM 2020-IEEE Conference on Computer Communications. [27] Keras Flops. [Online]. Available: https://pypi.org/project/keras-flops/
IEEE, 2020, pp. 1469–1478. [28] M. Hollemans, “How fast is my model?”
[5] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning Internal https://machinethink.net/blog/how-fast-is-my-model/.
Representations by Error Propagation,” California Univ San Diego La
Jolla Inst for Cognitive Science, Tech. Rep., 1985.
[6] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural
computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[7] K. N. Nguyen, A. Ali, J. Mo, B. L. Ng, V. Va, and J. C. Zhang,
“Beam Management with Orientation and RSRP using Deep Learning
for Beyond 5G Systems,” arXiv preprint arXiv:2202.02247, 2022.
[8] Y. Liu, X. Wang, G. Boudreau, A. B. Sediq, and H. Abou-zeid, “Deep
Learning Based Hotspot Prediction and Beam Management for Adaptive
Virtual Small Cell in 5G Networks,” IEEE Transactions on Emerging
Topics in Computational Intelligence, vol. 4, no. 1, pp. 83–94, 2020.
[9] Y. Guo, Z. Wang, M. Li, and Q. Liu, “Machine Learning Based mmWave
Channel Tracking in Vehicular Scenario,” in 2019 IEEE International
conference on communications workshops (ICC Workshops). IEEE,
2019, pp. 1–6.
[10] A. Ö. Kaya and H. Viswanathan, “Deep Learning-based Predictive
Beam Management for 5G mmWave Systems,” in 2021 IEEE Wireless
Communications and Networking Conference (WCNC). IEEE, 2021,
pp. 1–7.

Freq Domain BM Time Series

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Freq Domain BM Time Series

Uploaded by

Copyright:

Available Formats

Efficient Cross-Frequency Beam Prediction in 6G

Wireless Using Time Series Data

4th Ada Gavrilovska 5th Abhishek Roy

Fig. 2: The structure of every

to provide a framework to use time-series data, so we provide

Fig. 4: Top-1, Top-3, and Top-5 accuracy for different models.

We use a 90-10 train-test split for our simulations. During

You might also like