CNN-Based Hybrid Precoding Design With Geometric Mean Decomposition
Mahmoud A. Abugubba 1 , Nagia M. Gaboua 1 , Taissir Elganimi 2 , and Khaled Rabie 1
1
Affiliation not available
2
University of Tripoli
Abstract
Communication over millimeter-wave (mmWave) frequencies is a key technology for fifth generation (5G) cellular networks
due to the large bandwidth available at mmWave bands. The short wavelength of mmWave bands enables large antenna
arrays to be placed on the transceivers, which forms massive multiple-input multiple-output (MIMO) systems. Massive MIMO
with conventional fully-digital (FD) beamforming is difficult to implement due to its high power consumption and hardware cost.
One of the most effective solutions to this problem is hybrid beamforming, which can be used to balance the beamforming gain,
hardware implementation cost, and power consumption. However, due to the non-convex constraints imposed by phase
shifters, finding the global optimum for the hybrid beamforming system is very challenging and computationally expensive.
To address this issue, this paper proposes deep learning (DL)-based hybrid precoding with a geometric mean decomposition
(GMD) algorithm for narrowband mmWave massive MIMO systems, which directly estimates the hybrid analog and
digital precoders (combiners) from a given optimal FD precoder (combiner). Simulation results demonstrate that the proposed
hybrid precoding model can closely approximate the FD precoding performance.
Mahmoud A. Abugubba¹, Nagia M. Gaboua¹, Taissir Y. Elganimi¹, and Khaled M. Rabie²
¹ Department of Electrical and Electronic Engineering, University of Tripoli, Libya
² Department of Engineering, Manchester Metropolitan University, United Kingdom
Index Terms: Massive MIMO, millimeter-wave, fully digital precoding, hybrid precoding, deep learning, CNN.

I. INTRODUCTION

A key technology for the fifth generation (5G) of wireless communications and beyond is millimeter-wave (mmWave)
communication, which offers greater data rates on the order of gigabits per second (Gbps), wider bandwidth, and higher
spectral efficiency than conventional cellular communications [1]. However, communication over the mmWave band
is very challenging due to signal propagation and channel properties. To address these issues, massive multiple-input
multiple-output (MIMO) with beamforming techniques is required, where large antenna arrays are used to concentrate
the radiated energy and steer it toward the receiver direction [2]. Higher diversity and multiplexing gains can
be attained through massive MIMO with beamforming gain, which leads to higher spectral efficiency and higher radiated
energy efficiency [3]. However, massive MIMO is costly and very difficult to combine with mmWave. One of the
promising approaches to address these problems is hybrid beamforming, which utilizes significantly fewer power-hungry
radio frequency (RF) chains to achieve the performance of fully digital (FD) beamforming, hence reducing the power
consumption and the system implementation complexity [4].

The hybrid beamforming design problem is a non-convex optimization problem due to the constant modulus constraints
imposed by phase shifters. Most existing methods reduce the complexity by decoupling the optimization problem into
two sub-problems, where the objective of each sub-problem is to approximate the FD precoding using matrix decomposition
[5]. In [6], phase-extraction (PE) and manifold optimization (MO)-based alternating minimization algorithms were
proposed. Although the MO algorithm can achieve near-optimal performance by iteratively reducing the Euclidean distance
between the hybrid beamformer and the FD beamformer, its computational complexity prevents it from being used
in practical implementations. In the PE algorithm, the computational complexity is reduced with a slight performance loss, but
it still provides a better precoding algorithm than most existing algorithms. These algorithms are based on the singular value
decomposition (SVD), which requires complicated bit allocation schemes to achieve the channel capacity. To avoid this
issue, geometric mean decomposition (GMD) was proposed in [7] to decompose the channel matrix into several parallel
sub-channels with equal signal-to-noise ratios (SNRs), so that the same simple bit allocation can be used for all
sub-channels. However, high computational complexity is still a great challenge in hybrid beamforming design.

Recently, embedding deep learning (DL) into wireless communications has had a great impact on solving complex
problems and high computation issues. For example, the authors in [8] used a deep recurrent neural network to
optimize the resource allocation problem with low computational complexity for the non-orthogonal multiple access
(NOMA) heterogeneous internet of things (IoT) network. In [9], two convolutional neural networks (CNNs) were proposed
to address the problem of modulation classification, where they can accurately recognize various modulation types.
In another work [10], a deep neural network (DNN) was employed to estimate the channel and direction-of-arrival for
massive MIMO systems with better performance than the conventional methods.

DL-based hybrid precoding schemes for mmWave massive MIMO systems have also gained the attention of many
researchers. For example, in [11], a DNN is trained to optimize the hybrid precoding using GMD. In another work [12], a
CNN-based hybrid precoding is proposed with two CNNs, where each CNN is trained to estimate the hybrid precoders.
Fig. 1. Block diagram of mmWave massive MIMO system with hybrid precoding.
Later, a joint hybrid precoding framework based on DNN was proposed with end-to-end optimization [13]. In order to reduce
the computation time, deep reinforcement learning has been applied to hybrid beamforming designs in [14], while an
auto-encoder based on DNN is proposed in [15] for multi-user scenarios. In a recent work [16], the authors designed a hybrid
precoding algorithm based on an attention layer and a CNN, which is trained via unsupervised learning to maximize the
spectral efficiency directly.

Motivated by the foregoing introduction and recent work, a CNN-based hybrid precoding with GMD-based algorithm
is proposed in this paper. More specifically, the main contribution of this paper is to design hybrid precoding using DL
approaches. The proposed hybrid precoding model simulates the working operation of phase shifters and satisfies the power
constraint on the digital precoder. In addition, the model can directly estimate the hybrid analog and digital precoders and
combiners from a given FD precoder and combiner, and it is trained with a GMD precoding dataset to take advantage
of GMD over SVD precoding. Simulation results verify the ability of the proposed hybrid precoding model to approximate
the FD precoder performance.

The rest of this paper is organized as follows. In Section II, the system and channel models are presented for mmWave
massive MIMO system with hybrid precoding. The system model of the proposed CNN-based hybrid precoding with
GMD-based algorithm is presented in Section III. The simulation results are discussed in Section IV, and the conclusions
and possible future work are given in Section V.

II. SYSTEM MODEL

In this section, the system and channel models for mmWave massive MIMO system with hybrid precoding are described,
and the problem formulation for maximizing the achievable rate is also discussed.

A. System Model

$N_t^{\mathrm{RF}}$ and $N_r^{\mathrm{RF}}$ RF chains, respectively, where $N_s \le N_t^{\mathrm{RF}} \le N_t$
and $N_s \le N_r^{\mathrm{RF}} \le N_r$ in order to enable data stream multiplexing between the transmitter and receiver [5].
In Fig. 1, the symbol vector $\mathbf{s} \in \mathbb{C}^{N_s \times 1}$, which satisfies
$\mathbb{E}[\mathbf{s}\mathbf{s}^{*}] = \frac{1}{N_s}\mathbf{I}_{N_s}$, is first precoded by the low-dimensional digital
precoder $\mathbf{F}_{\mathrm{BB}} \in \mathbb{C}^{N_t^{\mathrm{RF}} \times N_s}$, and then the high-dimensional analog
precoder $\mathbf{F}_{\mathrm{RF}} \in \mathbb{C}^{N_t \times N_t^{\mathrm{RF}}}$ is applied using phase shifters, which
shift the input phase and keep the amplitude constant. Therefore, all elements of the $\mathbf{F}_{\mathrm{RF}}$ matrix are
constrained to satisfy $(\mathbf{F}_{\mathrm{RF}}^{(i)} \mathbf{F}_{\mathrm{RF}}^{(i)H})_{i,j} = \frac{1}{N_t}$, and
$\mathbf{F}_{\mathrm{RF}}$ can be represented by

$$
\mathbf{F}_{\mathrm{RF}} = \frac{1}{\sqrt{N_t}}
\begin{bmatrix}
e^{j\theta^{F}_{11}} & e^{j\theta^{F}_{12}} & \cdots & e^{j\theta^{F}_{1N_t^{\mathrm{RF}}}} \\
e^{j\theta^{F}_{21}} & e^{j\theta^{F}_{22}} & \cdots & e^{j\theta^{F}_{2N_t^{\mathrm{RF}}}} \\
\vdots & \vdots & \ddots & \vdots \\
e^{j\theta^{F}_{N_t 1}} & e^{j\theta^{F}_{N_t 2}} & \cdots & e^{j\theta^{F}_{N_t N_t^{\mathrm{RF}}}}
\end{bmatrix}, \tag{1}
$$

where $\theta^{F}_{ij} \in [0, 2\pi]$ for phase shifters of the analog precoder. The transmitted signal at the transmit
antenna array can be written as $\mathbf{x} = \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}} \mathbf{s}$, where
$\mathbf{F}_{\mathrm{BB}}$ must be normalized to satisfy the total power constraint
$\|\mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}}\|_F^2 = N_s$. Therefore, the received signal
$\tilde{\mathbf{y}} \in \mathbb{C}^{N_r \times 1}$ can be written as

$$
\tilde{\mathbf{y}} = \sqrt{\rho}\, \mathbf{H} \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}} \mathbf{s} + \mathbf{n}, \tag{2}
$$

where $\rho$ denotes the average received power, $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix that
satisfies $\mathbb{E}[\|\mathbf{H}\|_F^2] = N_t N_r$, and $\mathbf{n}$ is the additive white noise vector whose entries are
independent and identically distributed, $\mathbf{n} \sim \mathcal{CN}(0, \sigma_n^2)$.

The receiver has a similar structure to the transmitter, where the received symbol vector $\tilde{\mathbf{s}}$ after the
combining process can be represented by

$$
\begin{aligned}
\tilde{\mathbf{s}} &= \mathbf{W}_{\mathrm{BB}}^{H} \mathbf{W}_{\mathrm{RF}}^{H} \tilde{\mathbf{y}} \\
&= \sqrt{\rho}\, \mathbf{W}_{\mathrm{BB}}^{H} \mathbf{W}_{\mathrm{RF}}^{H} \mathbf{H} \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}} \mathbf{s}
 + \mathbf{W}_{\mathrm{BB}}^{H} \mathbf{W}_{\mathrm{RF}}^{H} \mathbf{n},
\end{aligned} \tag{3}
$$
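As a minimal sketch of the transmitter side of this model, the following pure-Python snippet builds a constant-modulus analog precoder with the structure of (1) and scales a digital precoder so that the total power constraint holds. The matrix sizes, the random phases, and the helper names (`analog_precoder`, `normalize_digital`, and so on) are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math
import random


def analog_precoder(phases):
    """Build F_RF from an Nt x N_rf grid of phase angles, as in (1)."""
    scale = 1.0 / math.sqrt(len(phases))  # 1/sqrt(Nt)
    return [[scale * cmath.exp(1j * theta) for theta in row] for row in phases]


def matmul(a, b):
    """Product of two complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def frobenius_norm(a):
    """Frobenius norm of a matrix stored as nested lists."""
    return math.sqrt(sum(abs(v) ** 2 for row in a for v in row))


def normalize_digital(f_rf, f_bb, n_s):
    """Scale F_BB so that ||F_RF F_BB||_F^2 equals Ns."""
    gain = math.sqrt(n_s) / frobenius_norm(matmul(f_rf, f_bb))
    return [[gain * v for v in row] for row in f_bb]


# Hypothetical small sizes, chosen only for illustration (Ns <= N_rf <= Nt).
N_T, N_RF, N_S = 8, 2, 2
rng = random.Random(0)

# Random phase-shifter settings in [0, 2*pi].
phases = [[rng.uniform(0.0, 2.0 * math.pi) for _ in range(N_RF)]
          for _ in range(N_T)]
f_rf = analog_precoder(phases)

# Arbitrary digital precoder, then power normalization.
f_bb = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N_S)]
        for _ in range(N_RF)]
f_bb = normalize_digital(f_rf, f_bb, N_S)

total_power = frobenius_norm(matmul(f_rf, f_bb)) ** 2
```

In this sketch every entry of `f_rf` has magnitude $1/\sqrt{N_t}$ regardless of the phases, so only the digital precoder needs rescaling to meet the power constraint.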
[Figure: zero padding (4x4), four convolutional layers (128@2x2, 64@2x2, 64@2x2, 32@2x2), pooling layer (2x2), flatten layer, and two branches, each with a dense layer and an output layer.]
Fig. 3. The proposed hybrid precoder architecture.
shown in Fig. 2, the channel matrix $\mathbf{H}$ is first decomposed using GMD, and then $\mathbf{F}_{\mathrm{opt}}$ and
$\mathbf{W}_{\mathrm{opt}}$ are fed to the feature generators of the hybrid precoder and the hybrid combiner, respectively.
The feature generator is a pre-processing block that allows the model to extract more features of the input, which provides
better training performance. The output of the feature generator is regarded as the raw input data of the model.

The hybrid precoder (combiner) model has two output layers, where the first layer produces the phase angles of the analog
precoder (combiner) to simulate the working operation of phase shifters, while the second layer produces the normalized
digital precoder (combiner). Each model is trained to minimize the decoupled joint optimization problem.

A. Feature Generator

The feature generator receives a complex-valued matrix, such as the optimal FD precoder $\mathbf{F}_{\mathrm{opt}}$, and
converts it to real-valued raw data by using the real part, the imaginary part, and the phase of the input matrix. For example,
if the input matrix is $\mathbf{F}_{\mathrm{opt}}$, the output vector $\mathbf{x}_F \in \mathbb{R}^{3 N_t N_s \times 1}$ of
the feature generator can be expressed as

$$
\begin{aligned}
\mathbf{x}_F = \big[\, &\Re(\mathbf{F}_{\mathrm{opt}})_{1,1}, \Re(\mathbf{F}_{\mathrm{opt}})_{1,2}, \ldots, \Re(\mathbf{F}_{\mathrm{opt}})_{N_t,N_s}, \\
&\Im(\mathbf{F}_{\mathrm{opt}})_{1,1}, \Im(\mathbf{F}_{\mathrm{opt}})_{1,2}, \ldots, \Im(\mathbf{F}_{\mathrm{opt}})_{N_t,N_s}, \\
&\angle(\mathbf{F}_{\mathrm{opt}})_{1,1}, \angle(\mathbf{F}_{\mathrm{opt}})_{1,2}, \ldots, \angle(\mathbf{F}_{\mathrm{opt}})_{N_t,N_s} \,\big]^{T}.
\end{aligned} \tag{11}
$$

Similarly, the output vector is $\mathbf{x}_W \in \mathbb{R}^{3 N_r N_s \times 1}$ when the input matrix is
$\mathbf{W}_{\mathrm{opt}}$.

B. Hybrid Precoder Model

The objective of the proposed CNN-based hybrid precoder model is to solve the hybrid precoder design problem
stated in (9). Therefore, the proposed hybrid precoder model must be trained to minimize the Euclidean distance between
the optimal FD precoder $\mathbf{F}_{\mathrm{opt}}$ and the hybrid precoder formed by
($\mathbf{F}_{\mathrm{RF}}$, $\mathbf{F}_{\mathrm{BB}}$). Fig. 3 shows the hybrid precoder model architecture.
As shown, the hybrid precoder model receives a feature vector $\mathbf{x}_F$, which contains a vectorized version of the real,
imaginary, and angular values of $\mathbf{F}_{\mathrm{opt}}$. The first layer is a zero-padding layer with $4 \times 4$
padding, which adds zeros of size 4 around the input. The zero-padding layer is used so that features at the corners of the
input are extracted as well as those at the center. Four convolutional layers with different numbers of filters and
Leaky-ReLU activation functions are utilized to extract complex features of the input: the first layer has 128 filters of size
$2 \times 2$, the second and third layers have 64 filters of size $2 \times 2$, and the last convolutional layer has
32 filters of size $2 \times 2$. In addition, a pooling layer is applied to reduce the output size and to speed up the
computation. A flatten layer is also required to connect all pooling layer activation outputs to the neurons of the next dense
layers.

The output of the flatten layer is shared between two neural networks. The first neural network, for the analog precoder,
has a single dense layer with $2 N_t N_t^{\mathrm{RF}}$ neurons and a Leaky-ReLU activation function, and an output layer
with $2 N_t^{\mathrm{RF}} N_s$ neurons, which has been designed to meet the analog precoder constraint
$(\mathbf{F}_{\mathrm{RF}}^{(i)} \mathbf{F}_{\mathrm{RF}}^{(i)H})_{i,j} = \frac{1}{N_t}$. Consequently, the estimated
analog precoder with a constant amplitude, $\hat{\mathbf{F}}_{\mathrm{RF}}$, can be expressed as

$$
\hat{\mathbf{F}}_{\mathrm{RF}} = \frac{1}{\sqrt{N_t}}\, e^{j \hat{\boldsymbol{\theta}}_{\mathrm{RF}}^{F}}, \tag{12}
$$

where $\hat{\boldsymbol{\theta}}_{\mathrm{RF}}^{F} \in \mathbb{R}^{N_t \times N_t^{\mathrm{RF}}}$ is the phase angle of the
estimated analog precoder, and each element must be between zero and $2\pi$, i.e.,
$(\hat{\boldsymbol{\theta}}_{\mathrm{RF}}^{F})_{i,j} \in [0, 2\pi]$. Therefore, a sigmoid function with amplitude $2\pi$ is
used to limit the output to the range $[0, 2\pi]$. The second neural network, for the digital precoder, has a
single dense layer with $8 N_t^{\mathrm{RF}} N_s$ neurons and a Leaky-ReLU activation function, and its output layer has
$2 N_t^{\mathrm{RF}} N_s$ neurons that produce the vectorized real and imaginary components of the estimated digital precoder
$\hat{\mathbf{F}}_{\mathrm{BB}}$. The hybrid precoder must satisfy the total power constraint
$\|\hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}}\|_F^2 = N_s$. To meet this constraint,
$\hat{\mathbf{F}}_{\mathrm{BB}}$ should be normalized as

$$
\hat{\mathbf{F}}_{\mathrm{BB}} = \frac{\sqrt{N_s}}{\|\hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}}\|_F}\, \hat{\mathbf{F}}_{\mathrm{BB}}. \tag{13}
$$

The model is trained with a loss function that expresses the hybrid precoder design problem as

$$
L(\theta) = \|\mathbf{F}_{\mathrm{opt}} - \hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}}\|_F, \tag{14}
$$

where $\theta$ denotes the parameters of the hybrid precoder model, and the hybrid precoder matrix
$\hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}}$ should be a semi-unitary matrix, where all columns are
orthonormal vectors, such that

$$
\hat{\mathbf{F}}_{\mathrm{BB}}^{H} \hat{\mathbf{F}}_{\mathrm{RF}}^{H} \hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}} = \mathbf{I}_{N_s}. \tag{15}
$$
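To make the feature generator in (11), the phase mapping in (12), and the quantities in (14) and (15) concrete, the following pure-Python sketch evaluates them on a toy example. It is not the authors' Keras implementation: the helper names are hypothetical, the phase is computed with `cmath.phase` (which returns values in $[-\pi, \pi]$, an assumption about the convention for $\angle$), and the toy matrices with $N_t = 2$ and $N_t^{\mathrm{RF}} = N_s = 1$ are chosen only so that the hybrid precoder reproduces the target exactly.

```python
import cmath
import math


def feature_vector(f_opt):
    """Real parts, then imaginary parts, then phases, each row by row, as in (11)."""
    flat = [v for row in f_opt for v in row]
    return ([v.real for v in flat]
            + [v.imag for v in flat]
            + [cmath.phase(v) for v in flat])


def scaled_sigmoid(z):
    """Sigmoid with amplitude 2*pi, written to avoid overflow for large |z|."""
    if z >= 0:
        return 2.0 * math.pi / (1.0 + math.exp(-z))
    e = math.exp(z)
    return 2.0 * math.pi * e / (1.0 + e)


def analog_from_raw(raw):
    """Map raw network outputs to a constant-modulus matrix, as in (12)."""
    scale = 1.0 / math.sqrt(len(raw))  # 1/sqrt(Nt)
    return [[scale * cmath.exp(1j * scaled_sigmoid(z)) for z in row] for row in raw]


def matmul(a, b):
    """Product of two complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def hermitian(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def frobenius_norm(a):
    return math.sqrt(sum(abs(v) ** 2 for row in a for v in row))


def loss_terms(f_opt, f_rf, f_bb):
    """Return the distance in (14) and the deviation from the condition in (15)."""
    f_hyb = matmul(f_rf, f_bb)
    diff = [[f_opt[i][j] - f_hyb[i][j] for j in range(len(f_hyb[0]))]
            for i in range(len(f_hyb))]
    gram = matmul(hermitian(f_hyb), f_hyb)
    dev = [[gram[i][j] - (1.0 if i == j else 0.0) for j in range(len(gram))]
           for i in range(len(gram))]
    return frobenius_norm(diff), frobenius_norm(dev)


# Toy target: a single unit-norm column (Nt = 2, Ns = 1).
f_opt = [[1 / math.sqrt(2)], [1j / math.sqrt(2)]]
x_f = feature_vector(f_opt)  # length 3 * Nt * Ns

# Raw outputs 0 and ln(3) map to phases pi and 3*pi/2 through the scaled sigmoid.
f_rf = analog_from_raw([[0.0], [math.log(3.0)]])
f_bb = [[-1.0 + 0.0j]]  # already satisfies the power constraint for this toy case
distance, deviation = loss_terms(f_opt, f_rf, f_bb)
```

For this toy case both `distance` and `deviation` are zero up to floating-point error, since the product of `f_rf` and `f_bb` equals `f_opt` and its single column has unit norm.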
Adding the constraint in (15) to the loss function as a penalty term can improve the model performance. Thus, the loss
function can be rewritten as

$$
L(\theta) = \|\mathbf{F}_{\mathrm{opt}} - \hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}}\|_F
+ \lambda_F \|\hat{\mathbf{F}}_{\mathrm{BB}}^{H} \hat{\mathbf{F}}_{\mathrm{RF}}^{H} \hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}} - \mathbf{I}_{N_s}\|_F, \tag{16}
$$

where $\lambda_F$ is a non-negative constant that weights the penalty term, which is used to push
$\hat{\mathbf{F}}_{\mathrm{RF}} \hat{\mathbf{F}}_{\mathrm{BB}}$ toward a semi-unitary matrix with orthonormal columns.

In order to minimize the loss function $L(\theta)$, the Adam optimizer is used during the training process to update the model
parameters in the direction of a local minimum of $L(\theta)$. The layer parameters at the first iteration should be initialized
at random points. To keep the model as general as possible, the parameters of each layer are initialized from a random
distribution chosen according to that layer's activation function.

C. Hybrid Combiner Model

The hybrid combiner model is designed with design objectives similar to those of the hybrid precoder model, and it has the
same building blocks: a padding layer, four convolutional layers with a single pooling layer, a flatten layer, dense layers, and
output layers. The first output layer is designed to meet the analog combiner constraint
$(\mathbf{W}_{\mathrm{RF}}^{(i)} \mathbf{W}_{\mathrm{RF}}^{(i)H})_{i,j} = \frac{1}{N_r}$. Consequently, the
estimated analog combiner with a constant amplitude, $\hat{\mathbf{W}}_{\mathrm{RF}}$, can be expressed as

$$
\hat{\mathbf{W}}_{\mathrm{RF}} = \frac{1}{\sqrt{N_r}}\, e^{j \hat{\boldsymbol{\theta}}_{\mathrm{RF}}^{W}}, \tag{17}
$$

where $\hat{\boldsymbol{\theta}}_{\mathrm{RF}}^{W} \in \mathbb{R}^{N_r \times N_r^{\mathrm{RF}}}$ is the phase angle of the
estimated analog combiner, and each element in $\hat{\boldsymbol{\theta}}_{\mathrm{RF}}^{W}$ is limited to the range
$[0, 2\pi]$ using a sigmoid function with amplitude $2\pi$. The second output layer produces the estimated digital combiner
$\hat{\mathbf{W}}_{\mathrm{BB}}$.

The model is trained with a loss function that expresses the hybrid combiner design problem as

$$
L(\theta) = \|\mathbf{W}_{\mathrm{opt}} - \hat{\mathbf{W}}_{\mathrm{RF}} \hat{\mathbf{W}}_{\mathrm{BB}}\|_F
+ \lambda_W \|\hat{\mathbf{W}}_{\mathrm{BB}}^{H} \hat{\mathbf{W}}_{\mathrm{RF}}^{H} \hat{\mathbf{W}}_{\mathrm{RF}} \hat{\mathbf{W}}_{\mathrm{BB}} - \mathbf{I}_{N_s}\|_F, \tag{18}
$$

where $\lambda_W$ is a non-negative constant that weights the penalty term, which is used to push
$\hat{\mathbf{W}}_{\mathrm{RF}} \hat{\mathbf{W}}_{\mathrm{BB}}$ toward a semi-unitary matrix with orthonormal columns. The Adam
optimizer is also used during the training process to update the model parameters in order to find a local minimum of
$L(\theta)$.

IV. SIMULATION RESULTS

In this section, simulation results are presented to illustrate the spectral efficiency of the proposed CNN-based hybrid
precoding model compared to FD precoding based on GMD and to the PE and MO algorithms. Throughout the simulations, a
massive MIMO system with $N_t = 8 \times 8$ and $N_r = 4 \times 4$ UPA antennas with half-wavelength element spacing is
considered, and the channel is modeled as a clustered environment with $N_c = 5$ clusters, each cluster with
$N_{\mathrm{ray}} = 10$ rays and a $10^\circ$ angular spread.

The proposed model is constructed and processed using the Keras Python package, and the training dataset is generated
using the channel model expressed in (6) with 200,000 realizations. The Adam optimizer is selected for the back-propagation
process with a learning rate of 0.00005. At each iteration, the model is trained with a mini-batch size of 64. Table I
summarizes the training parameters.

TABLE I
A SUMMARY TABLE OF TRAINING PARAMETERS.

Parameter                           Value
Optimizer                           Adam (with $\beta_1 = 0.9$, $\beta_2 = 0.999$)
Learning rate ($\eta$)              0.00005
Leaky-ReLU constant ($\alpha$)      0.2
$\lambda_F$, $\lambda_W$            0.15
Mini-batch size                     64
Size of dataset                     200,000
Epochs                              200

Fig. 4 shows the spectral efficiency performance comparison with $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = 2$ chains. It can
be seen from Fig. 4 that, in the case of $N_s = 1$, the proposed CNN-based hybrid precoding model and the PE and MO algorithms
achieve almost the same spectral efficiency as FD precoding. In the case of $N_s = 2$ data streams, the proposed model
performs almost the same as the PE and MO algorithms and remains within a small gap of FD precoding. This
shows that the proposed hybrid precoder (combiner) model can closely approximate the optimal FD precoder (combiner).

To explore the performance of the proposed hybrid precoding model at larger input and output dimensions, Fig. 5
shows the spectral efficiency performance comparison with $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = 4$ chains. It can be
seen from Fig. 5 that the spectral efficiency of the proposed hybrid precoding model is almost the same as that of the PE
algorithm and within a small gap of the MO algorithm and FD precoding in the case of $N_s = 2$, and within a small gap of
FD precoding and the PE and MO algorithms in the case of $N_s = 4$ data streams.

In addition, Fig. 6 compares the estimation time of the MO and PE algorithms with that of the proposed DL-based hybrid
precoding model over 1,000 realizations, with $N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = N_s = 2$ and 4. It can be seen from
Fig. 6 that the MO algorithm takes a long time to estimate the hybrid precoders and combiners compared with the PE algorithm
and the proposed hybrid precoding model, and the latter is almost 19 times faster than the PE algorithm.

V. CONCLUSIONS

A novel CNN-based hybrid precoding model with GMD algorithm is proposed in this paper. The main motivation for
implementing the proposed model in mmWave massive MIMO is that conventional precoding methods are computed
numerically with high computational complexity. DL methods have been introduced to overcome this issue by generating
the optimal hybrid precoders directly from a given optimal
[Plot: spectral efficiency (bits/s/Hz) versus SNR (dB) for FD, MO, PE, and the proposed model, with curves for Ns = 1 and Ns = 2.]
Fig. 4. Spectral efficiency performance versus SNR of different precoding algorithms with NtRF = NrRF = 2.

[Plot: time (sec) versus number of RF chains for MO, PE, and the proposed model.]
Fig. 6. Estimation time of different precoding algorithms with 1,000 realizations and NtRF = NrRF = Ns = 2 and 4.