7, JULY 2019
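the channel can be represented by the Saleh-Valenzuela (SV) model [19], a clustered channel model in which the channel is the sum of the contributions of $N_c$ clusters of $N_{ray}$ paths each:

$$\mathbf{H} = \gamma \sum_{i=1}^{N_c} \sum_{j=1}^{N_{ray}} \alpha_{ij}\, g_R(\Theta_R^{(ij)})\, g_T(\Theta_T^{(ij)})\, \mathbf{a}_R(\Theta_R^{(ij)})\, \mathbf{a}_T^H(\Theta_T^{(ij)}),$$

where $\Theta_R^{(ij)} = (\phi_R^{(ij)}, \theta_R^{(ij)})$ and $\Theta_T^{(ij)} = (\phi_T^{(ij)}, \theta_T^{(ij)})$ denote the angles of arrival and the angles of departure, respectively. The angular parameters $\phi$ and $\theta$ are the azimuth and elevation angles, respectively. $\gamma = \sqrt{N_T N_R/(N_c N_{ray})}$ is the normalization factor and $\alpha_{ij}$ is the complex channel gain associated with the $i$th scattering cluster and $j$th path for $i = 1, \dots, N_c$ and $j = 1, \dots, N_{ray}$. $g_R(\Theta_R^{(ij)})$ and $g_T(\Theta_T^{(ij)})$ are the antenna element gains for the receive and transmit antennas, respectively. $\mathbf{a}_R(\Theta_R^{(ij)})$ and $\mathbf{a}_T(\Theta_T^{(ij)})$ are the $N_R \times 1$ and $N_T \times 1$ steering vectors representing the array responses at the receiver and the transmitter, respectively. The $n$th element of the steering vector $\mathbf{a}_R(\Theta_R^{(ij)})$ is given as

$$[\mathbf{a}_R(\Theta_R^{(ij)})]_n = \exp\left\{-\mathrm{j}\,\frac{2\pi}{\lambda}\, \mathbf{p}_n^T \mathbf{r}(\Theta_R^{(ij)})\right\},$$

where $\mathbf{p}_n = [x_n, y_n, z_n]^T$ is the position of the $n$th receive antenna in the Cartesian coordinate system and $\mathbf{r}(\Theta_R^{(ij)}) = [\sin(\phi_R^{(ij)})\cos(\theta_R^{(ij)}),\ \sin(\phi_R^{(ij)})\sin(\theta_R^{(ij)}),\ \cos(\theta_R^{(ij)})]^T$. The transmit-side steering vector $\mathbf{a}_T(\Theta_T^{(ij)})$ can be defined in the same way as $\mathbf{a}_R(\Theta_R^{(ij)})$. In order to generate the labels

where $\mathcal{F}_{RF}$ and $\mathcal{W}_{RF}$ denote the feasible sets of analog beamformers which obey the constraints defined for $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$. Obtaining a real-time solution to the problem in (4) is impractical due to the complexity of handling several matrix variables. To cast the problem in (4) more effectively, we first define the sets $\mathcal{F}_{RF}$ and $\mathcal{W}_{RF}$. Note that the analog beamformers $\mathbf{F}_{RF}$, $\mathbf{W}_{RF}$ are related to the array responses $\mathbf{a}_T(\Theta_T^{(ij)})$, $\mathbf{a}_R(\Theta_R^{(ij)})$ through a linear transformation [6]. Hence the feasible RF beamformer sets can be formed as $\mathcal{F}_{RF} = \{\mathbf{F}_{RF}^{(1)}, \dots, \mathbf{F}_{RF}^{(Q_F)}\}$, where each candidate $\mathbf{F}_{RF}^{(q_F)}$ is formed from the array responses $\mathbf{a}_T(\Theta_T^{(ij)})$, $i = 1, \dots, N_c$, $j = 1, \dots, N_{ray}$, for $q_F = 1, \dots, Q_F$. $Q_F = \binom{N_{path}}{N_T^{RF}}$ is the number of RF precoder candidates and $N_{path} = N_c N_{ray}$. The feasible set for the RF combiner is similarly defined as $\mathcal{W}_{RF} = \{\mathbf{W}_{RF}^{(1)}, \dots, \mathbf{W}_{RF}^{(Q_W)}\}$, where each candidate $\mathbf{W}_{RF}^{(q_W)}$ is formed from $\mathbf{a}_R(\Theta_R^{(ij)})$, $i = 1, \dots, N_c$, $j = 1, \dots, N_{ray}$, and $Q_W = \binom{N_{path}}{N_R^{RF}}$. Now we can present the joint precoder and combiner design problem as follows:

$$\begin{aligned}
\bar{q}_F, \bar{q}_W = \arg\max_{q_F, q_W}\ & \log_2 \left| \mathbf{I}_{N_S} + \frac{\rho}{N_S} \boldsymbol{\Lambda}_n^{-1} \mathbf{W}_{BB}^H \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{BB} \mathbf{F}_{BB}^H \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \mathbf{W}_{BB} \right|, \\
\text{s.t.:}\ & \mathbf{F}_{RF} = \mathbf{F}_{RF}^{(q_F)},\quad \mathbf{W}_{RF} = \mathbf{W}_{RF}^{(q_W)}, \\
& \mathbf{F}_{BB} = (\mathbf{F}_{RF}^H \mathbf{F}_{RF})^{-1} \mathbf{F}_{RF}^H \mathbf{F}_{opt}, \\
& \mathbf{W}_{BB} = (\mathbf{W}_{RF}^H \boldsymbol{\Lambda} \mathbf{W}_{RF})^{-1} (\mathbf{W}_{RF}^H \boldsymbol{\Lambda} \mathbf{W}_{opt}),
\end{aligned} \tag{5}$$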
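The channel model and the candidate-set construction above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the text: unit element gains ($g_R = g_T = 1$), antenna positions expressed in wavelengths (so $\lambda = 1$), one independently drawn angle pair per path with no cluster angle spread, and hypothetical function names. The search scores each precoder candidate by the Frobenius residual $\|\mathbf{F}_{opt} - \mathbf{F}_{RF}\mathbf{F}_{BB}\|_F$, a simpler stand-in for the joint spectral-efficiency objective in (5), with $\mathbf{F}_{opt}$ taken from the SVD of the channel.

```python
import itertools
import math

import numpy as np


def square_array_positions(n_side, spacing=0.5):
    """Positions p_n (in wavelengths) of an n_side x n_side array in the x-y plane."""
    grid = np.arange(n_side) * spacing
    x, y = np.meshgrid(grid, grid, indexing="ij")
    return np.stack([x.ravel(), y.ravel(), np.zeros(x.size)], axis=1)


def steering_vector(positions, phi, theta):
    """[a(Theta)]_n = exp(-j 2*pi/lambda p_n^T r(Theta)), with lambda = 1."""
    r = np.array([np.sin(phi) * np.cos(theta),
                  np.sin(phi) * np.sin(theta),
                  np.cos(theta)])
    return np.exp(-2j * np.pi * (positions @ r))


def sv_channel(pos_r, pos_t, n_path, rng):
    """Sum of n_path = N_c * N_ray rank-one path terms (unit element gains assumed)."""
    n_r, n_t = len(pos_r), len(pos_t)
    gamma = np.sqrt(n_t * n_r / n_path)  # normalization factor
    A_r = np.empty((n_r, n_path), dtype=complex)
    A_t = np.empty((n_t, n_path), dtype=complex)
    H = np.zeros((n_r, n_t), dtype=complex)
    for k in range(n_path):
        # Complex Gaussian path gain with unit variance.
        alpha = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        # Azimuth in [-60, 60] degrees, elevation in [-20, 20] degrees.
        A_r[:, k] = steering_vector(pos_r, rng.uniform(-np.pi / 3, np.pi / 3),
                                    rng.uniform(-np.pi / 9, np.pi / 9))
        A_t[:, k] = steering_vector(pos_t, rng.uniform(-np.pi / 3, np.pi / 3),
                                    rng.uniform(-np.pi / 9, np.pi / 9))
        H += gamma * alpha * np.outer(A_r[:, k], A_t[:, k].conj())
    return H, A_r, A_t


def best_precoder_candidate(H, A_t, n_rf, n_s):
    """Exhaustive search over all C(N_path, N_RF) column subsets of A_t."""
    # Unconstrained precoder: the n_s dominant right singular vectors of H.
    _, _, Vh = np.linalg.svd(H)
    F_opt = Vh.conj().T[:, :n_s]
    best_err, best_cols, best_F_bb = np.inf, None, None
    for cols in itertools.combinations(range(A_t.shape[1]), n_rf):
        F_rf = A_t[:, list(cols)]
        # Least-squares baseband precoder, i.e. (F_rf^H F_rf)^{-1} F_rf^H F_opt.
        F_bb = np.linalg.lstsq(F_rf, F_opt, rcond=None)[0]
        err = np.linalg.norm(F_opt - F_rf @ F_bb)
        if err < best_err:
            best_err, best_cols, best_F_bb = err, cols, F_bb
    return best_err, best_cols, best_F_bb


rng = np.random.default_rng(0)
pos = square_array_positions(3)            # small 9-element array for a quick run
H, A_r, A_t = sv_channel(pos, pos, n_path=6, rng=rng)
err, cols, F_bb = best_precoder_candidate(H, A_t, n_rf=2, n_s=1)
print("candidates searched:", math.comb(6, 2))
print("selected paths:", cols, "residual:", err)
```

The candidate count grows quickly: with $N_c = 4$, $N_{ray} = 5$ and four RF chains, `math.comb(20, 4)` gives 4845 candidates per side, which is why an exhaustive search of this kind is only practical offline, for example when generating labels.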
$$\begin{aligned}
\text{s.t.:}\ & \mathbf{W}_{RF} = \mathbf{W}_{RF}^{(q_W)}, \\
& \mathbf{W}_{BB} = (\mathbf{W}_{RF}^H \boldsymbol{\Lambda} \mathbf{W}_{RF})^{-1} (\mathbf{W}_{RF}^H \boldsymbol{\Lambda} \mathbf{W}_{opt}), \\
& \boldsymbol{\Lambda} = \frac{\rho}{N_S} \mathbf{H} \mathbf{F}_{opt} \mathbf{F}_{opt}^H \mathbf{H}^H + \sigma_n^2 \mathbf{I}_{N_R}.
\end{aligned} \tag{7}$$

Once (6) and (7) are solved, the analog beamformers are constructed as $\hat{\mathbf{F}}_{RF} = \mathbf{F}_{RF}^{(\bar{q}_F)}$ and $\hat{\mathbf{W}}_{RF} = \mathbf{W}_{RF}^{(\bar{q}_W)}$. The baseband beamformers can also be obtained accordingly.

IV. CNN-BASED APPROACH

In this section, we present our CNN framework for joint precoder and combiner design, which is shown in Fig. 1. The proposed network is composed of two CNNs with 8 layers, which have identical structures except for the last layer. The first layer is the input layer of size $N_R \times N_T \times 3$ with $c = 3$ channels. The first channel of the input is the element-wise absolute value of the channel matrix, $[[\mathbf{X}]_{:,:,1}]_{i,j} = |[\mathbf{H}]_{i,j}|$. The second and third channels are the real and imaginary parts of the channel matrix, $[[\mathbf{X}]_{:,:,2}]_{i,j} = \mathrm{Re}\{[\mathbf{H}]_{i,j}\}$ and $[[\mathbf{X}]_{:,:,3}]_{i,j} = \mathrm{Im}\{[\mathbf{H}]_{i,j}\}$. The second and third layers are convolutional layers with 32 filters of size $2 \times 2$. The fourth and sixth layers are fully connected layers with 1024 units. There is a dropout layer after each fully connected layer (the fifth and seventh layers) with 50% probability. The output layer of $\mathrm{CNN}_F$ is of size $N_T N_T^{RF} \times 1$, which is the vectorized version of the phases of $\mathbf{F}_{RF}$. Similarly, the size of the output layer of $\mathrm{CNN}_W$ is $N_R N_R^{RF} \times 1$. The complexity of a CNN is directly proportional to the number of parameters which, in our case, is calculated as $C^2\left(2N_{cv}(wh + 1) + 2(N_{fc} + 1) \cdot \frac{50}{100}\right)$ [21]. Here $C = 3$ is the number of channels, $w = h = 2$ is the filter size, $N_{cv} = 32$ is the number of filters, and $N_{fc} = 1024$ is the number of units in the fully connected layer with 50% dropout probability. Substituting these values gives $9 \cdot (320 + 1025) = 12105$; hence the CNN structure in Fig. 1 has 12105 parameters.

In data generation, $N$ different realizations of channel matrices $\mathbf{H}^{(n)}$ for different user locations are generated together with the corresponding sets $\mathcal{F}_{RF}^{(n)}$ and $\mathcal{W}_{RF}^{(n)}$. Then, for each realization, $L$ noisy channel matrices are obtained, where the added element-wise synthetic noise is defined by $\mathrm{SNR}_{\mathrm{TRAIN}} = 20\log_{10}\left(\frac{|[\mathbf{H}]_{i,j}|^2}{\sigma_{\mathrm{TRAIN}}^2}\right)$. To account for the changes in the wireless environment, we use three different $\mathrm{SNR}_{\mathrm{TRAIN}}$ levels. Hence the total size of the training input data becomes $N_R \times N_T \times 3 \times 3NL$. In order to obtain the output data, the problems in (6) and (7) are solved $\forall n, l$. Then the output data of each network is obtained. We summarize the algorithmic steps of the training data generation in Algorithm 1.

12: end for n, l
13: Training data for $\mathrm{CNN}_F$ and $\mathrm{CNN}_W$ is obtained as
$\mathcal{D}_F = ((\mathbf{X}^{(1,1)}, \mathbf{z}_F^{(1,1)}), \dots, (\mathbf{X}^{(L,N)}, \mathbf{z}_F^{(L,N)}))$,
$\mathcal{D}_W = ((\mathbf{X}^{(1,1)}, \mathbf{z}_W^{(1,1)}), \dots, (\mathbf{X}^{(L,N)}, \mathbf{z}_W^{(L,N)}))$.

V. NUMERICAL SIMULATIONS

In this section, we evaluate the performance of our CNN framework (referred to as HBDL, Hybrid Beamforming via Deep Learning) and compare it with state-of-the-art techniques such as SOMP [6] and PE-Alt-Min [7]. Uniform square arrays with half-wavelength spacing and $N_R = N_T = 36$ antennas are considered. The number of analog beamformers is $N_R^{RF} = N_T^{RF} = 4$. The feasible sets $\mathcal{F}_{RF}$, $\mathcal{W}_{RF}$ are used for training only, and the output from the CNN can be used directly for analog beamforming since the analog beamformer does not have to lie in the set of array response vectors. The CNNs are fed with the training data generated for $N = L = 100$. For each channel matrix realization, the propagation environment is modeled with $N_c = 4$ clusters and $N_{ray} = 5$ paths per cluster, with $\sigma_\Theta^2 = 5^\circ$ for all transmit and receive azimuth and elevation angles, which are selected uniformly at random from the intervals $[-60^\circ, 60^\circ]$ and $[-20^\circ, 20^\circ]$, respectively.

The proposed network is realized in MATLAB on a PC with a 768-core GPU. The stochastic gradient descent algorithm is used to update the network parameters with a learning rate of 0.005 and a mini-batch size of 500 for 200 epochs. As a loss function, we use the negative log-likelihood or cross-entropy loss [9]. In the training process, 70% and 30% of all generated data are selected as the training and validation datasets, respectively. Validation aids in hyperparameter tuning during the training phase, to avoid the network simply memorizing the training data rather than learning general features for accurate prediction with new data. The validation data is used to test the performance of the network in the simulations for $J_T = 100$ Monte Carlo trials. In order to prevent similarity between the test data and the training data, we also add synthetic
ELBIR: CNN-BASED PRECODER AND COMBINER DESIGN IN mmWAVE MIMO SYSTEMS 1243
Fig. 2. Spectral efficiency versus SNR for (a) NR = NT = 25, NS = 1; (b) NR = NT = 36, NS = 2; (c) NR = NT = 36, NS = 3.
noise to the test data, where the SNR in testing is defined similarly to $\mathrm{SNR}_{\mathrm{TRAIN}}$ as $\mathrm{SNR}_{\mathrm{TEST}} = 20\log_{10}\left(\frac{|[\mathbf{H}]_{i,j}|^2}{\sigma_{\mathrm{TEST}}^2}\right)$, and $\mathrm{SNR}_{\mathrm{TRAIN}} \in \{10, 15, 20\}$ dB is selected.

In Fig. 2, the spectral efficiency of the different algorithms is presented for $N_S = \{1, 2, 3\}$ and $\mathrm{SNR}_{\mathrm{TEST}} = 10$ dB. As can be seen, HBDL provides better performance than the optimization-based method PE-Alt-Min and the greedy algorithm SOMP. The curve labeled “Best” denotes the performance of the test data without prediction. We observe that HBDL is very close to the best performance as well as to the fully-digital beamformer. HBDL effectively selects, from the feasible sets, the analog beamformers that maximize the spectral efficiency. The effectiveness of HBDL is attributed to the best selection of analog beamformers, which are the optimum solution of (4) through the SVD of the channel matrix [6]. SOMP has poor performance due to the fact that it cannot select the “best” set of array responses from the dictionary. While PE-Alt-Min has sufficiently good performance, HBDL performs better even when the output of PE-Alt-Min is inserted into the feasible sets used for HBDL.

To compare the computation time of the algorithms, we consider the same settings and observe that HBDL spends about 0.020 s to compute both the precoder and the combiner, whereas SOMP and PE-Alt-Min take about 0.450 s and 1.200 s, respectively.

VI. CONCLUSIONS

In this work, a CNN framework is proposed for the joint estimation of the precoder and combiner in the hybrid beamforming problem. We show that the proposed network architecture provides better spectral efficiency than the optimization-based and greedy algorithms. We reserve for future work the case in which the training data is small, where transfer learning-like approaches can be developed.

REFERENCES

[1] J. G. Andrews et al., “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[2] F. Rusek et al., “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60, Jan. 2013.
[3] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, Jr., “Channel estimation and hybrid precoding for millimeter wave cellular systems,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 831–846, Oct. 2014.
[4] A. Alkhateeb, G. Leus, and R. W. Heath, Jr., “Limited feedback hybrid precoding for multi-user millimeter wave systems,” IEEE Trans. Wireless Commun., vol. 14, no. 11, pp. 6481–6494, Nov. 2015.
[5] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, Jr., “Hybrid precoding for millimeter wave cellular systems with partial channel knowledge,” in Proc. Inf. Theory Appl. Workshop (ITA), Feb. 2013, pp. 1–5.
[6] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, Jr., “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.
[7] X. Yu, J.-C. Shen, J. Zhang, and K. B. Letaief, “Alternating minimization algorithms for hybrid precoding in millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 485–500, Apr. 2016.
[8] D. Yu and L. Deng, “Deep learning and its applications to signal and information processing [exploratory DSP],” IEEE Signal Process. Mag., vol. 28, no. 1, pp. 145–154, Jan. 2011.
[9] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, May 2015.
[10] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, “Deep learning coordinated beamforming for highly-mobile millimeter wave systems,” Apr. 2018, arXiv:1804.10334. [Online]. Available: https://arxiv.org/abs/1804.10334
[11] H. Ye, G. Y. Li, and B.-H. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
[12] H. Huang, J. Yang, H. Huang, Y. Song, and G. Gui, “Deep learning for super-resolution channel estimation and DOA estimation based massive MIMO system,” IEEE Trans. Veh. Technol., vol. 67, no. 9, pp. 8549–8560, Sep. 2018.
[13] A. M. Elbir, K. V. Mishra, and Y. C. Eldar, “Cognitive radar antenna selection via deep learning,” IET Radar, Sonar Navigat., to be published.
[14] Y. Long, Z. Chen, J. Fang, and C. Tellambura, “Data-driven-based analog beam selection for hybrid beamforming under mm-wave channels,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 2, pp. 340–352, May 2018.
[15] S. Dörner, S. Cammerer, J. Hoydis, and S. T. Brink, “Deep learning based communication over the air,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 132–143, Feb. 2018.
[16] V. Raj and S. Kalyani, “Backpropagating through the air: Deep learning at physical layer without channel models,” IEEE Commun. Lett., vol. 22, no. 11, pp. 2278–2281, Nov. 2018.
[17] C.-K. Wen, W.-T. Shih, and S. Jin, “Deep learning for massive MIMO CSI feedback,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, Oct. 2018.
[18] H. Huang, Y. Song, J. Yang, G. Gui, and F. Adachi, “Deep-learning-based millimeter-wave massive MIMO for hybrid precoding,” IEEE Trans. Veh. Technol., vol. 68, no. 3, pp. 3027–3032, Mar. 2019.
[19] R. Méndez-Rial, C. Rusu, A. Alkhateeb, N. Gonzalez-Prelcic, and R. W. Heath, Jr., “Channel estimation and hybrid combining for mmWave: Phase shifters or switches?” in Proc. IEEE Inf. Theory Appl. Workshop, Feb. 2015, pp. 90–97.
[20] T. Kailath, B. Hassibi, and A. H. Sayed, Linear Estimation. Upper Saddle River, NJ, USA: Prentice-Hall, 2000.
[21] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” Sep. 2014, arXiv:1409.1556. [Online]. Available: https://arxiv.org/abs/1409.1556