
DeepMUSIC: Multiple Signal Classification via Deep Learning

Ahmet M. Elbir, Senior Member, IEEE

arXiv:1912.04357v3 [eess.SP] 13 Mar 2020

Abstract—This letter introduces a deep learning (DL) framework for the classification of multiple signals in a direction finding (DF) scenario via sensor arrays. Previous works in the DL context mostly consider a single- or two-target scenario, which is a strong limitation in practice. Hence, in this work, we propose a DL framework called DeepMUSIC for multiple signal classification. We design multiple deep convolutional neural networks (CNNs), each of which is dedicated to a subregion of the angular spectrum. Each CNN learns the MUSIC spectra of the corresponding angular subregion. Hence, it constructs a non-linear relationship between the received sensor data and the angular spectrum. We show, through simulations, that the proposed DeepMUSIC framework has superior estimation accuracy and exhibits less computational complexity in comparison with both DL and non-DL based techniques.

Index Terms—Deep learning, Direction finding, DOA estimation, CNN, MUSIC, Deep MUSIC.

A. M. E. is with the Department of Electrical and Electronics Engineering, Duzce University, Duzce, Turkey. e-mail: ahmetelbir@duzce.edu.tr, ahmetmelbir@gmail.com.

I. INTRODUCTION

Direction finding (DF) is a crucial task for direction-of-arrival (DOA) estimation in a variety of fields including radar, sonar, acoustics and communications [1]. While there are several different approaches in the literature, the MUSIC (MUltiple SIgnal Classification) algorithm [2] is the most popular method for this purpose. Most of the algorithms in the literature are model-based, so the performance of DOA estimation strongly relies on the perfectness of the input data [3]. In order to mitigate this drawback, learning-based and data-driven architectures have been proposed so that the non-linear relationship between the input and output data can be learned by neural networks [4]–[6]. Hence, as a class of machine learning, deep learning (DL) has gained much interest recently. DL is capable of uncovering complex relationships in data/signals and, thus, can achieve better performance. While several papers demonstrate the performance of DL in wireless communications [7]–[9], only a limited number of works consider DL in the context of DOA estimation and array signal processing [6], [10].

DOA estimation via DL is considered in [4], where a multilayer perceptron (MLP) architecture is proposed to resolve two target signals. In [5], the authors study the same problem, also for the two-signal case, by exploiting the sparsity of the received signal in the angular domain and designing a deep convolutional neural network (CNN). A single sound target is assumed in [11], and an MLP architecture is proposed to estimate the target DOA angle for the wideband case. The acoustic scenario is also studied in [12] by incorporating long short-term memory (LSTM) with a CNN for online DOA estimation, which is also performed for a single target. In a recent work [13], a cognitive radar scenario is considered where DL is applied for sparse array selection and DOA estimation for a single target. This approach is extended to two targets in [14] for sparse array design.

A common assumption in the above works is that the number of targets is small. This is because generating the training data and training the network become more demanding as the number of targets increases. Specifically, the data set length increases on the order of $N^K$, where $K$ is the number of targets and $N$ is the number of grid points in the angular spectrum. In order to reduce the complexity, in this work, we propose a multiple deep network approach for multiple target estimation. In particular, we design multiple CNNs, each of which is dedicated to a subregion of the angular spectrum. Hence, we partition the angular spectrum into non-overlapping subregions, and assume that there is a single target in each subregion. This assumption is relevant [4], [6] and it can be generalized to a higher number of targets with close separation by simply increasing the number of deep networks. In order to feed the deep networks, the covariance of the sensor outputs is used as a common input. Then, the output label of each network is the MUSIC spectra of the corresponding angular subregion. Hence, we call the proposed approach DeepMUSIC, which yields the MUSIC spectra at the output. The main contributions of the proposed DL approach are as follows. 1) We introduce a DL approach to estimate multiple target DOAs, whereas the previous works can only handle a limited number of targets. 2) DeepMUSIC requires less computation time as compared to both DL- and non-DL-based approaches. 3) DeepMUSIC has higher DOA estimation accuracy than the conventional DL-based techniques. It provides asymptotic performance for moderate SNR levels, and its performance maxes out in high SNR due to the loss of precision caused by the biased nature of DL-based approaches.

II. ARRAY SIGNAL MODEL

Consider $K$ far-field signals impinging on an $M$-element uniform linear array (ULA). Then, the output of the antenna array in the baseband can be given by

$$\mathbf{y}(t_i) = \sum_{k=1}^{K} \mathbf{a}(\theta_k) s_k(t_i) + \mathbf{n}(t_i), \quad i = 1, \dots, T, \tag{1}$$

where $T$ is the number of data snapshots and $s_k(t_i) \in \mathbb{C}$ is the signal emitted from the $k$-th target, which is located at the DOA angle $\theta_k$ with respect to the antenna array. $\mathbf{a}(\theta_k) \in \mathbb{C}^{M}$ denotes the array steering vector whose $m$-th element is given by

$$a_m(\theta_k) = \exp\left\{ -j \frac{2\pi \bar{d}(m-1)}{\lambda} \sin(\theta_k) \right\}, \tag{2}$$

where $\lambda = c_0 / f_c$ is the wavelength, $f_c$ is the carrier frequency and $c_0$ is the speed of light. $\mathbf{n}(t_i) \sim \mathcal{N}(\mathbf{0}_M, \sigma_n^2 \mathbf{I}_M)$ is a zero-mean, spatially and temporally white additive Gaussian noise vector with variance $\sigma_n^2$ which corrupts the emitted signal. Using the array output in (1), the covariance matrix of the received signal can be written as

$$\mathbf{R}_y = E\{\mathbf{y}\mathbf{y}^H\} = \mathbf{A}\boldsymbol{\Gamma}\mathbf{A}^H + \sigma_n^2 \mathbf{I}_M, \tag{3}$$

where $E\{\cdot\}$ denotes the expectation operation, $\boldsymbol{\Gamma} = \mathrm{diag}\{\sigma_1^2, \sigma_2^2, \dots, \sigma_K^2\}$ is a $K \times K$ matrix whose diagonal entries are the signal variances, and $\mathbf{A}$ is the array steering matrix.
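The signal model in (1)–(3) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: half-wavelength element spacing, unit-variance sources and circularly symmetric complex Gaussian samples are choices made here for illustration, and the function names (`steering_matrix`, `complex_gaussian`, `simulate_snapshots`, `model_covariance`) are hypothetical helpers rather than anything defined in this letter.

```python
import numpy as np


def steering_matrix(thetas_deg, M, spacing_over_lambda=0.5):
    """Return A = [a(theta_1), ..., a(theta_K)] with entries from (2).

    spacing_over_lambda is the element spacing divided by the wavelength
    (half-wavelength spacing is an assumption of this sketch).
    """
    m = np.arange(M)[:, None]  # m - 1 = 0, ..., M - 1
    sin_theta = np.sin(np.deg2rad(np.asarray(thetas_deg, dtype=float)))[None, :]
    return np.exp(-2j * np.pi * spacing_over_lambda * m * sin_theta)


def complex_gaussian(shape, variance, rng):
    """Circularly symmetric complex Gaussian samples with the given variance."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def simulate_snapshots(thetas_deg, M, T, noise_var, rng):
    """Draw T snapshots y(t_i) = A s(t_i) + n(t_i) as in (1), unit-variance sources."""
    A = steering_matrix(thetas_deg, M)
    s = complex_gaussian((A.shape[1], T), 1.0, rng)
    n = complex_gaussian((M, T), noise_var, rng)
    return A @ s + n, A


def model_covariance(A, signal_vars, noise_var):
    """R_y = A Gamma A^H + sigma_n^2 I as in (3)."""
    Gamma = np.diag(np.asarray(signal_vars, dtype=float))
    return A @ Gamma @ A.conj().T + noise_var * np.eye(A.shape[0])


rng = np.random.default_rng(0)
Y, A = simulate_snapshots([-20.0, 35.0], M=16, T=500, noise_var=0.1, rng=rng)
R_model = model_covariance(A, [1.0, 1.0], 0.1)
R_sample = Y @ Y.conj().T / Y.shape[1]  # sample estimate of R_y
rel_err = np.linalg.norm(R_sample - R_model) / np.linalg.norm(R_model)
print(f"relative Frobenius error of the sample covariance: {rel_err:.3f}")
```

With a few hundred snapshots the sample covariance should already be close to the model covariance in (3), which is why a sample estimate can stand in for $\mathbf{R}_y$ when the latter is unavailable.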
The array steering matrix is defined as $\mathbf{A} = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2), \dots, \mathbf{a}(\theta_K)] \in \mathbb{C}^{M \times K}$.¹ Through eigendecomposition, we can rewrite (3) as $\mathbf{R}_y = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H$, where $\boldsymbol{\Lambda} = \mathrm{diag}\{\lambda_1, \lambda_2, \dots, \lambda_M\}$ is a diagonal matrix composed of the eigenvalues of $\mathbf{R}_y$ in descending order and $\mathbf{U} = [\mathbf{U}_S \; \mathbf{U}_N]$ is an $M \times M$ eigenvector matrix. Its first $K$ column vectors, $\mathbf{U}_S$, span the signal subspace, and the remaining $M - K$ column vectors are the noise subspace eigenvectors $\mathbf{U}_N \in \mathbb{C}^{M \times (M-K)}$. Using the orthogonality of the signal and noise subspaces (i.e., $\mathbf{U}_S \perp \mathbf{U}_N$), and the fact that the columns of $\mathbf{U}_S$ and $\mathbf{A}$ span the same space, we have $\|\mathbf{U}_N^H \mathbf{A}\|_F = 0$, where $\|\cdot\|_F$ denotes the Frobenius norm [2]. Now, we can write the MUSIC spectra $P(\theta)$ as

$$P(\theta) = \frac{1}{\mathbf{a}^H(\theta)\mathbf{U}_N \mathbf{U}_N^H \mathbf{a}(\theta)}, \tag{4}$$

whose largest $K$ peaks correspond to the target DOA angles $\{\theta_k\}_{k=1}^{K}$. To obtain (4) in practice, we use the sample covariance matrix $\hat{\mathbf{R}}_y$ since $\mathbf{R}_y$ is not available. As a result, $\hat{\mathbf{R}}_y = \frac{1}{T}\sum_{i=1}^{T} \mathbf{y}(t_i)\mathbf{y}^H(t_i)$ is the input to the deep network.

¹ While (3) requires the uncorrelated signal assumption, the proposed DL approach can also work well for correlated/coherent signals since the non-linear mapping provided by DeepMUSIC does not rely on the statistical properties of the signal [13]. To generate the output label for a coherent scenario, the MUSIC spectra can be obtained by employing a spatial smoothing algorithm.

In this work, we formulate the problem as estimating the target DOA angles $\{\theta_k\}_{k=1}^{K}$ when the array output $\{\mathbf{y}(t_i)\}_{i=1}^{T}$ is given. For this purpose, we introduce a DL framework, shown in Fig. 1, which is fed by the array covariance matrix $\mathbf{R}_y$ and gives the MUSIC spectra $P(\theta)$ at the output.

Fig. 1. The overall DeepMUSIC framework for DOA estimation.

III. DOA ESTIMATION VIA DEEP LEARNING

The proposed DeepMUSIC framework accepts the array covariance matrix as input and yields the MUSIC spectra at the output. In the following, we first design the labels and the input of the proposed deep networks, then discuss the network architectures and the training.

In the proposed DL framework shown in Fig. 1, we design $Q$ ($\geq K$) deep networks, each of which is dedicated to a subregion of the angular spectrum. Partitioning the angular spectrum allows us to estimate the multiple target locations more effectively. The use of a single deep network is computationally prohibitive because the training data must contain all candidate multiple target locations, whose number increases on the order of $N^K$ for $N$ DOA grid points. DOA estimation via a partitioned angular spectrum is more efficient under the reasonable assumption that a single target is present in each angular subregion [4].

Let $\Theta = \{\Theta^{\mathrm{start}}, \dots, \Theta^{\mathrm{final}}\}$ be the set of DOA angles on which the MUSIC cost function in (4) is evaluated, with starting and final DOA angles $\Theta^{\mathrm{start}}$ and $\Theta^{\mathrm{final}}$, respectively. Then, we denote the MUSIC spectra in (4) by $\mathbf{p} \in \mathbb{R}^{N}$ as $\mathbf{p} = [P(\Theta^{\mathrm{start}}), \dots, P(\Theta^{\mathrm{final}})]^T$. To obtain the MUSIC spectra for each subregion, we partition $\mathbf{p}$ and $\Theta$ into $Q$ non-overlapping subregions such that $\Theta = \cup_{q=1}^{Q} \Theta_q$, where each angular set is defined by

$$\Theta_q = \{\Theta_q^{\mathrm{start}}, \Theta_q^{\mathrm{start}} + \gamma, \Theta_q^{\mathrm{start}} + 2\gamma, \dots, \Theta_q^{\mathrm{final}} - \gamma\}, \tag{5}$$

where $\gamma = |\Theta^{\mathrm{start}} - \Theta^{\mathrm{final}}| / N$ is the angular resolution and $\Theta_q^{\mathrm{final}} = \Theta_{q+1}^{\mathrm{start}}$. Hence, the number of elements in $\Theta_q$ is $|\Theta_q| = L$, where $L = N/Q$ is assumed to be an integer without loss of generality. We can also rewrite $\mathbf{p}$ as

$$\mathbf{p} = [\mathbf{p}_1^T, \mathbf{p}_2^T, \dots, \mathbf{p}_Q^T]^T. \tag{6}$$

In particular, the MUSIC spectra for the $q$-th subregion are represented by an $L \times 1$ real-valued vector as

$$\mathbf{p}_q = [P(\Theta_q^{\mathrm{start}}), P(\Theta_q^{\mathrm{start}} + \gamma), P(\Theta_q^{\mathrm{start}} + 2\gamma), \dots, P(\Theta_q^{\mathrm{final}} - \gamma)]^T. \tag{7}$$

During training, once $\mathbf{p}$ is obtained, we assign the MUSIC spectra of each subregion $\{\mathbf{p}_q\}_{q=1}^{Q}$ as the labels of each deep network.

To construct the input data, we use the real, imaginary and angular values of the covariance matrix $\mathbf{R}_y$. Let $\mathbf{X}$ be an $M \times M \times 3$ real-valued tensor. Then the $(i, j)$-th entries of the first and the second "channels" of $\mathbf{X}$ are given by $[[\mathbf{X}]_{:,:,1}]_{i,j} = \mathrm{Re}\{[\mathbf{R}_y]_{i,j}\}$ and $[[\mathbf{X}]_{:,:,2}]_{i,j} = \mathrm{Im}\{[\mathbf{R}_y]_{i,j}\}$, respectively. Similarly, the $(i, j)$-th entry of the third "channel" of $\mathbf{X}$ is defined as $[[\mathbf{X}]_{:,:,3}]_{i,j} = \angle\{[\mathbf{R}_y]_{i,j}\}$, where $\angle\{\cdot\}$ returns the angle of a complex quantity. While other input structures are possible, such as the real and imaginary parts of the upper triangle of the covariance matrix [4], we observed that the above approach provides better extraction of the features inherent in the input as well as satisfactory mapping accuracy [9], [13].

We design $Q$ identical CNN structures to estimate the target DOA angles. The network architecture of the proposed CNN structure is shown in Fig. 2. Each CNN is composed of 17 layers including input and output layers. The overall deep network structure for the $q$-th subregion can be represented by a non-linear mapping function $\Sigma_q(\mathbf{X}) : \mathbb{R}^{M \times M \times 3} \rightarrow \mathbb{R}^{L}$. In particular, we have

$$\Sigma_q(\mathbf{X}) = f^{(17)}\big(f^{(16)}\big(\cdots f^{(1)}(\mathbf{X}) \cdots\big)\big) = \mathbf{p}_q, \tag{8}$$

where $f^{(14)}(\cdot)$ denotes a fully connected layer which maps an arbitrary input $\bar{\mathbf{x}} \in \mathbb{R}^{C_x}$ to the output $\bar{\mathbf{y}} \in \mathbb{R}^{C_y}$ by using the weights $\bar{\mathbf{W}} \in \mathbb{R}^{C_x \times C_y}$. Then the $c_y$-th element of the output of the layer can be given by the inner product

$$\bar{y}_{c_y} = \langle \bar{\mathbf{W}}_{c_y}, \bar{\mathbf{x}} \rangle = \sum_{i} [\bar{\mathbf{W}}]^T_{c_y, i} \bar{x}_i, \tag{9}$$

for $c_y = 1, \dots, C_y$, where $\bar{\mathbf{W}}_{c_y}$ is the $c_y$-th column vector of $\bar{\mathbf{W}}$, and $C_x = C_y = 1024$ is selected for $f^{(14)}(\cdot)$.

In (8), $\{f^{(i)}(\cdot)\}_{i \in \{2,5,8,11\}}$ represent the convolutional layers. The arithmetic operation of a single filter of a convolutional layer can be defined for an arbitrary input $\bar{\mathbf{X}} \in \mathbb{R}^{d_x \times d_x \times V_x}$ and output $\bar{\mathbf{Y}} \in \mathbb{R}^{d_y \times d_y \times V_y}$ as

$$\bar{Y}_{p_y, v_y} = \sum_{p_k, p_x} \langle \bar{\mathbf{W}}_{v_y, p_k}, \bar{\mathbf{X}}_{p_x} \rangle, \tag{10}$$

where $d_x \times d_y$ is the size of the convolutional kernel, and $V_x \times V_y$ is the size of the response of a convolutional layer. $\bar{\mathbf{W}}_{v_y, p_k} \in \mathbb{R}^{V_x}$ denotes the weights of the $v_y$-th convolutional kernel, and $\bar{\mathbf{X}}_{p_x} \in \mathbb{R}^{V_x}$ is the input feature map at spatial position $p_x$.
In (10), $p_x$ and $p_k$ are the 2-D spatial positions in the input feature maps $\bar{\mathbf{X}}_{p_x}$ and the convolutional kernels, respectively [10], [15]. In the proposed architecture, each convolutional layer uses 256 filters; the filters in the first two layers are of size $5 \times 5$ and those in the remaining two layers are of size $3 \times 3$.

In (8), $\{f^{(i)}(\cdot)\}_{i \in \{3,6,9,12\}}$ are the normalization layers and $\{f^{(i)}(\cdot)\}_{i \in \{4,7,10,13\}}$ are the rectified linear unit (ReLU) layers, which are defined as $\mathrm{ReLU}(x) = \max(0, x)$. $f^{(15)}(\cdot)$ is a dropout layer and $f^{(16)}(\cdot)$ is a softmax layer defined for an arbitrary input $\bar{\mathbf{x}} \in \mathbb{R}^{D}$ as $\mathrm{softmax}(\bar{x}_i) = \exp\{\bar{x}_i\} / \sum_{j=1}^{D} \exp\{\bar{x}_j\}$. Finally, the output layer $f^{(17)}(\cdot)$ is a regression layer of size $L \times 1$. The current network parameters are obtained from a hyperparameter tuning process providing the best performance for the considered scenario [9], [13].

Fig. 2. Deep network architecture for the proposed algorithm. Each convolutional layer block also includes normalization and ReLU layers.

The proposed deep networks are realized and trained in MATLAB on a PC with a single GPU and a 768-core processor. We used the stochastic gradient descent algorithm with momentum 0.9 and updated the network parameters with learning rate 0.01 and a mini-batch size of 128 samples. Then, we reduced the learning rate by a factor of 0.5 after every 10 epochs. We also applied a stopping criterion in training so that the training ceases when the validation accuracy does not improve in three consecutive epochs. Algorithm 1 summarizes the training data generation.

Algorithm 1 Training data generation for DeepMUSIC.
Input: $J_\alpha$, $J_\beta$, $T$, $M$, $Q$, $K$, $\mathrm{SNR}_{\mathrm{TRAIN}}$.
Output: Training data sets $\{\mathcal{D}_q\}_{q=1}^{Q}$.
1: Generate $J_\alpha$ DOA angle sets $\{\theta_k^{(\alpha)}\}_{k=1}^{K}$ such that $\theta_k^{(\alpha)} \in [\Theta_k^{\mathrm{start}}, \Theta_k^{\mathrm{final}}]$ for $\alpha = 1, \dots, J_\alpha$.
2: Initialize with $\mu = 1$ while the dataset length is $J = J_\alpha J_\beta$.
3: for $1 \leq \alpha \leq J_\alpha$ do
4:   Construct $\mathbf{A}^{(\alpha)} = [\mathbf{a}(\theta_1^{(\alpha)}), \dots, \mathbf{a}(\theta_K^{(\alpha)})]$.
5:   Construct $\tilde{\mathbf{R}}_y^{(\alpha)} = \mathbf{A}^{(\alpha)} \boldsymbol{\Gamma}^{(\alpha)} \mathbf{A}^{(\alpha)H}$.
6:   Using $\tilde{\mathbf{R}}_y^{(\alpha)}$, obtain the noise subspace $\mathbf{U}_N^{(\alpha)}$.
7:   Compute $P^{(\alpha)}(\theta)$ in (4) for $\theta \in [\Theta^{\mathrm{start}}, \Theta^{\mathrm{final}}]$ and construct $\mathbf{p}^{(\alpha)}$ and the partitioned spectra $\{\mathbf{p}_q^{(\alpha)}\}_{q=1}^{Q}$.
8:   for $1 \leq \beta \leq J_\beta$ do
9:     Generate $s_k^{(\alpha,\beta)}(t_i) \sim \mathcal{CN}(\mathbf{0}_K, \mathbf{I}_K)$ for $T$ snapshots.
10:    Generate the noisy array output $\mathbf{y}^{(\alpha,\beta)}(t_i) = \sum_{k=1}^{K} \mathbf{a}(\theta_k^{(\alpha)}) s_k^{(\alpha,\beta)}(t_i) + \mathbf{n}^{(\alpha,\beta)}(t_i)$, where $\mathbf{n}^{(\alpha,\beta)} \sim \mathcal{CN}(\mathbf{0}_M, \sigma_{\mathrm{TRAIN}}^2 \mathbf{I}_M)$.
11:    Construct the sample covariance matrix $\mathbf{R}_y^{(\alpha,\beta)} = \frac{1}{T}\sum_{i=1}^{T} \mathbf{y}^{(\alpha,\beta)}(t_i)\mathbf{y}^{(\alpha,\beta)H}(t_i)$.
12:    Form the input data $\mathbf{X}^{(\mu)}$ as
       $[[\mathbf{X}^{(\mu)}]_{:,:,1}]_{i,j} = \mathrm{Re}\{[\mathbf{R}_y^{(\alpha,\beta)}]_{i,j}\}$,
       $[[\mathbf{X}^{(\mu)}]_{:,:,2}]_{i,j} = \mathrm{Im}\{[\mathbf{R}_y^{(\alpha,\beta)}]_{i,j}\}$,
       $[[\mathbf{X}^{(\mu)}]_{:,:,3}]_{i,j} = \angle\{[\mathbf{R}_y^{(\alpha,\beta)}]_{i,j}\}$.
13:    Form the output of the $q$-th CNN as $\mathbf{z}_q^{(\mu)} = \mathbf{p}_q^{(\alpha)}$.
14:    Design the input-output pair for the $q$-th CNN as $(\mathbf{X}^{(\mu)}, \mathbf{z}_q^{(\mu)})$.
15:    $\mu = \mu + 1$.
16:  end for $\beta$,
17: end for $\alpha$,
18: Training data for the $q$-th CNN is obtained from the collection of the input-output pairs as $\mathcal{D}_q = \big((\mathbf{X}^{(1)}, \mathbf{z}_q^{(1)}), (\mathbf{X}^{(2)}, \mathbf{z}_q^{(2)}), \dots, (\mathbf{X}^{(J)}, \mathbf{z}_q^{(J)})\big)$.

IV. NUMERICAL SIMULATIONS

In this section, we present the performance of the proposed DeepMUSIC algorithm in comparison with the MLP structure in [4], the MUSIC algorithm [2] and the Cramér-Rao lower bound (CRB) [16]. In the training stage, we use the angular spectrum $[\Theta^{\mathrm{start}}, \Theta^{\mathrm{final}}] = [-60^\circ, 60^\circ]$ [4] with $N = 2^{12}$ grid points. We select $K = 5$ and the target DOAs are randomly located in $Q = 8$ subregions. In particular, $J_\alpha$ DOA angle sets $\{\theta_k^{(\alpha)}\}_{k=1}^{K}$ are realized for $\alpha = 1, \dots, J_\alpha$, and the DOA angles are drawn uniformly at random from $\Theta$. Then, for each realization, noisy array outputs $\mathbf{y}^{(\alpha,\beta)}(t_i)$ are generated for $\beta = 1, \dots, J_\beta$. We select $J_\alpha = J_\beta = 100$ and $T = 500$ in our simulations. When generating the data, we use four different SNR levels, i.e., $\mathrm{SNR}_{\mathrm{TRAIN}} = \{15, 20, 25, 30\}$ dB, where $\mathrm{SNR}_{\mathrm{TRAIN}} = 10 \log_{10}(\sigma_S^2 / \sigma_{\mathrm{TRAIN}}^2)$ and $\sigma_S^2 = 1$. Hence, the total data set length is $J = 4 J_\alpha J_\beta = 40000$. Further, 80% and 20% of all generated data are chosen for the training and validation datasets, respectively. For the prediction process, we select the DOA angles as floating-point angles generated uniformly at random in the subregions defined above. Then, $J_T = 100$ Monte Carlo experiments are conducted to obtain the statistical performance of the proposed DeepMUSIC framework. Throughout the simulations, we use a ULA with $M = 16$ antennas with $\lambda = \bar{d}/2$, $T = 100$.

Fig. 3. DOA estimation performance versus SNR for K = 2. [Plot: RMSE (degrees) versus SNR (dB); curves: Spectral MUSIC, Root MUSIC, DeepMUSIC, MLP, CRB.]

In Fig. 3 and Fig. 4 we present the DOA estimation performance of the algorithms for $K = 2$ and $K = 6$, respectively. When $K = 2$, we see that DeepMUSIC outperforms MLP and performs very close to the spectral and Root-MUSIC algorithms. When $K = 6$, it can be seen that DeepMUSIC performs better than the MUSIC algorithms in low SNR regimes and closely follows the performance of the MUSIC algorithms as SNR increases. The performance of DeepMUSIC can be attributed to the use of convolutional layers, which extract the hidden features in the input data, whereas the MLP architecture consists of only fully connected layers, which do not provide effective feature extraction.
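Algorithm 1, combined with the training settings of this section, can be sketched in code as below. This is a scaled-down illustration and not the MATLAB implementation used for the reported results: the function names are hypothetical, half-wavelength spacing and $\boldsymbol{\Gamma} = \mathbf{I}$ are assumed, each DOA set is assumed to place one target in each of $K$ randomly chosen distinct subregions, and the loop over several $\mathrm{SNR}_{\mathrm{TRAIN}}$ levels is folded into the generator.

```python
import numpy as np


def steering_matrix(thetas_deg, M, spacing_over_lambda=0.5):
    """Steering matrix of a ULA; half-wavelength spacing is an assumption."""
    m = np.arange(M)[:, None]
    sin_theta = np.sin(np.deg2rad(np.asarray(thetas_deg, dtype=float)))[None, :]
    return np.exp(-2j * np.pi * spacing_over_lambda * m * sin_theta)


def music_spectrum(R, K, grid_deg):
    """MUSIC spectrum (4) evaluated on grid_deg from the covariance R."""
    M = R.shape[0]
    _, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
    U_N = eigvecs[:, : M - K]        # noise subspace
    proj = U_N.conj().T @ steering_matrix(grid_deg, M)
    return 1.0 / np.maximum(np.sum(np.abs(proj) ** 2, axis=0), 1e-12)


def complex_gaussian(shape, variance, rng):
    """Circularly symmetric complex Gaussian samples with the given variance."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def generate_training_data(J_alpha, J_beta, T, M, Q, K, snr_levels_db, N,
                           theta_start=-60.0, theta_final=60.0, seed=0):
    """Return a list of Q pairs (X, Z_q): shared inputs and per-subregion labels."""
    if N % Q != 0 or K > Q:
        raise ValueError("need N divisible by Q and K <= Q")
    rng = np.random.default_rng(seed)
    grid = theta_start + (theta_final - theta_start) / N * np.arange(N)
    width = (theta_final - theta_start) / Q
    inputs, labels = [], [[] for _ in range(Q)]
    for _ in range(J_alpha):
        # Step 1 (assumption): pick K distinct subregions, one DOA in each.
        regions = rng.choice(Q, size=K, replace=False)
        thetas = theta_start + width * (regions + rng.uniform(0.0, 1.0, size=K))
        A = steering_matrix(thetas, M)                       # step 4
        R_clean = A @ A.conj().T                             # step 5 with Gamma = I
        p_blocks = music_spectrum(R_clean, K, grid).reshape(Q, N // Q)  # steps 6-7
        for snr_db in snr_levels_db:
            noise_var = 10.0 ** (-snr_db / 10.0)             # sigma_S^2 = 1
            for _ in range(J_beta):
                s = complex_gaussian((K, T), 1.0, rng)       # step 9
                Y = A @ s + complex_gaussian((M, T), noise_var, rng)  # step 10
                R = Y @ Y.conj().T / T                       # step 11
                inputs.append(np.stack([R.real, R.imag, np.angle(R)], axis=-1))  # step 12
                for q in range(Q):
                    labels[q].append(p_blocks[q])            # steps 13-14
    X = np.stack(inputs)
    return [(X, np.stack(labels[q])) for q in range(Q)]


# Small demo sizes; the simulations in this section use J_alpha = J_beta = 100, N = 2**12.
datasets = generate_training_data(J_alpha=3, J_beta=2, T=100, M=16, Q=8, K=5,
                                  snr_levels_db=(15, 20, 25, 30), N=256)
X, Z0 = datasets[0]
print(X.shape, Z0.shape)
```

Each of the $Q$ returned pairs shares the same input tensor and differs only in its label block, so the $Q$ networks can be trained independently, for example in parallel.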
The MLP also only works for two signals, which is a strong limitation in practice. In contrast, the proposed DeepMUSIC framework can handle multiple-target scenarios and exhibits reasonable performance.

Fig. 4. DOA estimation performance versus SNR for K = 6. [Plot: RMSE (degrees) versus SNR (dB); curves: Spectral MUSIC, Root-MUSIC, DeepMUSIC, CRB.]

We can also see from Figs. 3 and 4 that, for high SNR regimes, i.e., SNR $\geq 20$ dB, the performance of the DL-based methods and the spectral MUSIC algorithm maxes out and does not improve. One of the main reasons is reaching the angular resolution limit² $\gamma = |-60 - 60| / 2^{12} \approx 0.03$, which limits the performance of spectral approaches except Root-MUSIC. Moreover, the performance loss for DeepMUSIC and MLP is due to the loss of precision in the deep networks. This is because, being biased estimators, deep networks do not provide unlimited accuracy. This problem can be mitigated by increasing the number of units in various network layers. Unfortunately, this may lead to the network memorizing the training data and performing poorly when the test data differ from the training data. To balance this trade-off, we used noisy data sets with several $\mathrm{SNR}_{\mathrm{TRAIN}}$ levels during training so that the network attains reasonable tolerance to corrupted/imperfect inputs. While similar performance degradation is also observed in [5], [6], no justification is provided there for this issue.

² The resolution limit $\gamma$ can also be viewed as the angular search step size of the MUSIC algorithm [2].

In Fig. 5, the DOA estimation performance is presented with the same simulation settings when there are $K = 2$ correlated target signals with $\boldsymbol{\Gamma} = \begin{bmatrix} \sigma_1^2 & \rho \\ \rho & \sigma_2^2 \end{bmatrix}$, where $\rho$ is the correlation coefficient. We can see that DeepMUSIC closely follows the MUSIC algorithm and provides lower RMSE in the fully correlated case (i.e., $\rho = 1$).

Fig. 5. DOA estimation performance versus correlation coefficient. [Plot: RMSE (degrees) versus correlation coefficient from 0 to 1; curves: Spectral MUSIC, Root MUSIC, DeepMUSIC, MLP.]

We also compare the computational complexity of the DOA estimation algorithms for the same settings and $K = 2$. We observe that DeepMUSIC and MLP take about 0.0020 s and 0.0110 s, whereas spectral MUSIC and Root-MUSIC need 0.0300 s and 0.0040 s to obtain the MUSIC spectra. These results show that DeepMUSIC has the lowest computation time among the competing algorithms. While DeepMUSIC has multiple networks as compared to MLP, it requires less computation time because 1) DeepMUSIC mostly consists of convolutional layers rather than fully connected layers, which involve higher complexity [17], and 2) the multiple networks in DeepMUSIC can be trained and run with parallel processing so that the computational complexity is reduced. Similar observations are also reported in [5].

V. SUMMARY

In this letter, we introduced a DL framework called DeepMUSIC for DOA estimation. The major advantage of the proposed approach is that it can work for multiple targets, in contrast to the previous works. Furthermore, DeepMUSIC has lower computational complexity than the conventional techniques.

REFERENCES

[1] H. Krim and M. Viberg, “Two decades of array signal processing research: the parametric approach,” Signal Processing Magazine, IEEE, vol. 13, no. 4, pp. 67–94, Jul 1996.
[2] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Trans. Antennas Propag., vol. 34, no. 3, pp. 276–280, 1986.
[3] B. Friedlander and A. Weiss, “Direction finding in the presence of mutual coupling,” IEEE Trans. Antennas Propag., vol. 39, no. 3, pp. 273–284, 1991.
[4] Z. Liu, C. Zhang, and P. S. Yu, “Direction-of-arrival estimation based on deep neural networks with robustness to array imperfections,” IEEE Trans. Antennas Propag., vol. 66, no. 12, pp. 7315–7327, Dec 2018.
[5] L. Wu, Z. Liu, and Z. Huang, “Deep Convolution Network for Direction of Arrival Estimation With Sparse Prior,” IEEE Signal Process. Lett., vol. 26, no. 11, pp. 1688–1692, Nov 2019.
[6] D. Hu, Y. Zhang, L. He, and J. Wu, “Low-Complexity Deep-Learning-Based DOA Estimation for Hybrid Massive MIMO Systems with Uniform Circular Arrays,” IEEE Wireless Commun. Lett., pp. 1–1, 2019.
[7] H. Huang, Y. Peng, J. Yang, W. Xia, and G. Gui, “Fast beamforming design via deep learning,” IEEE Trans. Veh. Technol., vol. 69, no. 1, pp. 1065–1069, Jan 2020.
[8] Y. Wang, M. Liu, J. Yang, and G. Gui, “Data-Driven Deep Learning for Automatic Modulation Recognition in Cognitive Radios,” IEEE Trans. Veh. Technol., vol. 68, no. 4, pp. 4074–4077, April 2019.
[9] A. M. Elbir and K. V. Mishra, “Joint Antenna Selection and Hybrid Beamformer Design using Unquantized and Quantized Deep Learning Networks,” IEEE Trans. Wireless Commun., pp. 1–1, 2019.
[10] A. Massa, D. Marcantonio, X. Chen, M. Li, and M. Salucci, “DNNs as Applied to Electromagnetics, Antennas, and Propagation-A Review,” IEEE Antennas Wireless Propag. Lett., vol. 18, no. 11, pp. 2225–2229, Nov 2019.
[11] L. Wu and Z. Huang, “Coherent svr learning for wideband direction-of-arrival estimation,” IEEE Signal Process. Lett., vol. 26, no. 4, pp. 642–646, April 2019.
[12] Q. Li, X. Zhang, and H. Li, “Online direction of arrival estimation based on deep learning,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2018, pp. 2616–2620.
[13] A. M. Elbir, K. V. Mishra, and Y. C. Eldar, “Cognitive radar antenna selection via deep learning,” IET Radar, Sonar & Navigation, vol. 13, pp. 871–880, 2019.
[14] A. M. Elbir, S. Mulleti, R. Cohen, R. Fu, and Y. C. Eldar, “Deep-sparse array cognitive radar,” in IEEE International Conference on Sampling Theory and Applications, 2019, pp. 1–5.
[15] J. Cheng, J. Wu, C. Leng, Y. Wang, and Q. Hu, “Quantized CNN: A unified approach to accelerate and compress convolutional networks,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 10, pp. 4730–4743, 2018.
[16] P. Stoica and A. Nehorai, “MUSIC, maximum likelihood, and Cramér-Rao bound: Further results and comparisons,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 12, pp. 2140–2150, 1990.
[17] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
