You are on page 1of 13

I

E
E
E
P
r
o
o
f
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY 1
Phase Noise in Asynchronous SC-FDMA Systems:
Performance Analysis and Data-Aided Compensation
1
2
Ahmad Gomaa, Student Member, IEEE, and Naofal Al-Dhahir, Fellow, IEEE 3
AbstractCarrier frequency offset (CFO) and phase noise 4
(PN) are major oscillator impairments in direct-conversion trans- 5
ceivers, and single-carrier frequency-division multiple access 6
(SC-FDMA) is the uplink transmission scheme in the Long-Term 7
Evolution (LTE) standard. We derive a new analytical expression 8
for the normalized mean square error (NMSE) in asynchronous 9
SC-FDMA systems under CFO and joint transmitreceive PN. 10
The derived NMSE expression reveals an interesting cross-layer 11
relationship between the subcarrier mapping scheme at the 12
medium-access-control layer and the immunity to CFO and PN at 13
the physical layer. Furthermore, we propose an iterative reduced- 14
complexity joint decoding and PN compensation scheme that does 15
not require any pilots in PN tracking and exploits the low-pass 16
nature of the PN process without assuming a specic PN model. 17
Simulation results show the effectiveness of our proposed digital 18
baseband compensation scheme in PN mitigation. 19
Index TermsPhase noise (PN), single-carrier frequency- 20
division multiple access (SC-FDMA). 21
I. INTRODUCTION 22
P
HASE NOISE (PN) is one of the major radio-frequency 23
(RF) front-end impairments [1], [2] resulting from the 24
imperfections and inaccuracies of the crystal oscillators fab- 25
rication process. Several techniques [3][7] have been proposed 26
for estimatingandcompensatingfor PNinorthogonal frequency- 27
division multiplexing (OFDM) systems. The impact of carrier 28
frequency offset (CFO) and PNon channel estimation is investi- 29
gated in [8]. In [5][7], iterative detection and PNcompensation 30
techniques are proposed with pilot subcarriers used to estimate 31
the PN-induced common phase error (CPE) as an initialization 32
for the PN compensation process. In [6], each OFDM symbol is 33
divided into subblocks where the PN process is assumed quasi- 34
static over each subblock. A Kalman lter for PN tracking is 35
proposed in [7] where knowledge of the PN statistical model is 36
assumed. 37
A common assumption in all of the given approaches is 38
the availability of pilot subcarriers multiplexed in each data 39
Manuscript received June 7, 2013; revised October 1, 2013 and November 23,
2013; accepted November 29, 2013. This work was supported by the Qatar
National Research Fund through the National Priorities Research Program
under Grant NPRP 09-062-2-035. This paper was presented in part at the
IEEE Global Communications Conference, Houston, TX, USA, December 59,
2011, and at the IEEE International Conference on Communications, Ottawa,
ON, Canada, June 1015, 2012. The review of this paper was coordinated by
Dr. J. P. Coon.
A. Gomaa is with Broadcom Corporation, Sunnyvale, CA 94085 USA
(e-mail: agomaa@broadcom.com).
N. Al-Dhahir is with the Department of Electrical Engineering, University of
Texas at Dallas, Richardson, TX 75083 USA (e-mail: aldhahir@utdallas.edu).
Digital Object Identier 10.1109/TVT.2013.2294014
OFDM symbol. Unfortunately, this assumption is not valid for 40
single-carrier frequency-division multiple-access (SC-FDMA) 41
systems adopted in Long-Term Evolution (LTE) uplink [9] 42
to maintain low peak-to-average power ratio (PAPR). Instead, 43
at the beginning of each uplink frame, the pilots are trans- 44
mitted in a dedicated training symbol followed by six data 45
SC-FDMA symbols void of pilots in the LTE uplink [9]. These 46
training symbols are shown to be sufcient for channel tracking 47
even under high-mobility scenarios. However, they may not 48
be sufcient for tracking other time-varying impairments such 49
as PN whose samples vary from one time instant to another 50
within a single SC-FDMA symbol. Hence, PN tracking is a 51
challenging task for the LTE uplink where consecutive training 52
symbols are six symbols apart with no pilots inserted in the 53
data symbols in between. In OFDM systems, the PN tracking 54
task is easier due to the pilot subcarriers multiplexed with data 55
subcarriers. In [10], maximum-likelihood estimation of CFO 56
in the LTE uplink is derived. The PN maximum a posteriori 57
(MAP) estimate is derived in [11] for single-carrier single-user 58
systems employing turbo equalization without pilots. The MAP 59
solution was shown in [11] to be a second-order phase-locked 60
loop for constant amplitude modulation. Furthermore, the PN 61
statistical model was assumed available, and the extension to 62
higher order modulation is not straightforward. In this paper, 63
we tackle the problem of nonpilot-aided PN compensation in 64
SC-FDMA systems without knowledge of the PN model. We 65
propose an iterative joint data decoding and PN compensation 66
scheme that exploits the low-pass nature of the PN process 67
without assuming a priori knowledge about the exact PN model 68
to avoid model mismatch problems. In addition, no restrictions 69
on the constellation size or shape are imposed. Unlike existing 70
approaches, by writing the system model in an alternative form, 71
we were able to move the channel equalizer out of the iterative 72
loop to reduce the complexity. We also show that our proposed 73
PN compensation technique is robust to channel estimation 74
errors. Comparison with [11] shows that the performance of 75
our approach is close to that of the MAP solution with signif- 76
icantly reduced complexity and wider applicability to different 77
constellation sizes. The main motivation behind our approach 78
is to relax the design requirements of the user-equipment (UE) 79
oscillator allowing for inexpensive oscillators where PN is 80
compensated at the baseband level using our proposed PN 81
compensation method. 82
We derive a new analytical expression for the NMSE under 83
joint transmitreceive PN and CFO. Furthermore, based on the 84
derived NMSE expression, we compare the robustness of both 85
localized and distributed subcarrier mapping schemes under 86
CFO and PN. The impact of PN was investigated in [12] for 87
0018-9545 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
I
E
E
E
P
r
o
o
f
2 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
single-carrier single-user systems. However, the derived ex- 88
pression in [12] assumed no channel equalization and had 89
functional dependence on the instantaneous channel realization, 90
which varies randomly over time. However, in this paper, we 91
include channel equalization effects and take the statistical 92
expectation over the channel realization. The exact expectation 93
of the ratio rather than the approximate ratio of expectations is 94
derived. Furthermore, the SC-FDMA performance has been re- 95
cently analyzed [13] under receive-only PN. However, transmit 96
PN is more serious and requires more attention because the UE 97
oscillator is expected to be less accurate than that of the base 98
station (BST). 99
The remainder of this paper is organized as follows. The sys- 100
tem model is described in Section II, and the NMSE expression 101
is given in Section III with the detailed derivation relegated to 102
the Appendix. In Section IV, we simplify the derived NMSE AQ1 103
expression and analytically compare the performances of local- 104
ized and distributed subcarrier mapping schemes under CFO 105
and PN. Our proposed PN compensation scheme is described 106
in Section V. Simulation results are presented in Section VI, 107
and this paper is concluded in Section VII. 108
Notations: Lowercase and uppercase bold letters denote 109
vectors and matrices, respectively, and A(m, n) denotes the 110
element in the mth row and the nth column of A. The jth 111
element of a is denoted by a(j), and a(j : k) denotes the 112
portion of a starting at index j and ending at index k j. I and 113
F denote, respectively, the identity and fast Fourier transform 114
(FFT) matrices, and their subscripts denote theirs sizes. Ois the 115
all-zero matrix. For matrices,

A FAF
H
. In addition, ()
H
, 116
()

, and ()
T
denote the matrix Hermitian, complex conjugate, 117
and transpose operations, respectively, whereas ()
1
denotes 118
both scalar reciprocal and matrix inverse. E[], [ [, ,|, and | 119
denote the statistical expectation, absolute, ceiling, and oor 120
functions, respectively. Moreover, diag(. . .) denotes a diago- 121
nal matrix whose diagonal is the argument; whereas diag(A) 122
denotes a column vector containing the diagonal entries of A. 123
II. SYSTEM MODEL 124
A. Signal Model 125
We consider the uplink SC-FDMA transmission scenario 126
with one receive antenna at BST and N
u
single-antenna users. 127
The total number of subcarriers is N where each user is exclu- 128
sively assigned M < N subcarriers such that each subcarrier is 129
not assigned to more than one user in the same time slot. The 130
subcarrier mapping of the lth user is dened by the N M 131
matrix S
l
whose elements are zeros, except for a single 1 in 132
each column. The indexes of the rows containing these 1s are 133
denoted by the set J
l
and correspond to the locations of the 134
assigned subcarriers, and the cardinality of J
l
is M, l. The tth 135
SC-FDMA time-domain (TD) symbol generated by the lth user 136
is given by 137
g
t
l
= F
H
N
S
l
F
M
x
t
l
(1)
where x
t
l
is the M 1 data vector generated by the lth user in 138
the tth SC-FDMA symbol with E[x
l
x
H
l
] =
l
I
M
, and
l
is the 139
received power from the lth user. Then, a cyclic-prex of length 140
L
p
greater than or equal to the channel memory is appended to 141
the vector g
l
before transmission. 142
B. PN and CFO Models 143
CFO results from inaccuracies of the crystal oscillators used 144
at the transmitter and the receiver, causing the carrier frequency 145
to deviate from its nominal value. Furthermore, the power 146
spectrum density (PSD) of the generated carrier signal typically 147
follows the Lorentzian shape [14], causing energy leakage 148
from each subcarrier into its neighbors, resulting in intercarrier 149
interference (ICI). This impairment causes the oscillator phase 150
to be noisy where this noise follows the Wiener randomprocess. 151
In the presence of CFO and joint transmitreceive PN, the 152
received TD signal is given by 153
y
t
= P
t
r
N
u

l=1
Q
t
l
H
t
l
P
t
l
g
t
l
+z
t
(2)
where H
t
l
is the N N circulant channel matrix whose rst 154
column contains the zero-padded channel impulse response 155
(CIR) between the lth user and the BST, and z
t
is the complex 156
additive white Gaussian noise (AWGN) vector with a single- 157
sided PSD of N
o
(in watts per hertz). The CFO between the lth 158
user and the BST is modeled by the N N diagonal matrix as 159
follows: 160
Q
t
l

= diag
_
_
exp
_
j2m
l
N
__
t(N+L
p
)+N1
m=t(N+L
p
)
_
(3)
with
l

= (f
l
/f
sub
), where f
l
is the difference in hertz 161
between the carrier frequencies of the lth user and the BST, 162
and f
sub
denotes the subcarrier frequency spacing in hertz. The 163
transmit and receive PN processes are modeled [14] by the 164
multiplication of the TD signals by the diagonal matrices P
t
l
165
and P
t
r
, respectively, given by 166
P
t
l

=diag
_
exp
_
j
l,t
0
_
, exp
_
j
l,t
1
_
, . . . , exp
_
j
l,t
N1
__
P
t
r

=diag
_
exp
_
j
r,t
0
_
, exp
_
j
r,t
1
_
, . . . , exp
_
j
r,t
N1
__
(4)
where
l,t
n
and
r,t
n
are the PN samples perturbing the transmit- 167
ted and received signals, respectively, at the nth sample of the 168
tth SC-FDMA symbol. The PN is modeled by the following 169
rst-order autoregressive [15] processes
1
: 170

l,t
n
=
l,t
n1
+
l,t
n

r,t
n
=
r,t
n1
+
r,t
n
(5)
where
l,t
n
A(0,
2
l

=(2
l
)/(Nf
sub
)) and
r,t
n
A(0,
2
r

= 171
(2
r
)/(Nf
sub
)) are independent Gaussian-distributed ran- 172
dom variables. Without loss of generality, we assume that 173

l,0
0
=
r,0
0
= 0, l. The parameters
l
and
r
denote the two- 174
sided 3-dB linewidths of the Lorentzian-shaped PSD of the 175
oscillators feeding the lth user and the BST, respectively. For 176
1
We assume this model to simplify the NMSE derivation. However, we do
not assume it for PN compensation in Section V.
I
E
E
E
P
r
o
o
f
GOMAA AND AL-DHAHIR: PN IN ASYNCHRONOUS SC-FDMA SYSTEMS 3
simplicity, we drop the SC-FDMAsymbol index t in Sections III 177
and IV-Bwhere we analyze the performance under CFOand PN. 178
III. NORMALIZED MEAN SQUARE ERROR DERIVATION 179
Denoting by k the index of the user of interest, we compute 180
its decision statistic in four steps. First, we apply the N-point 181
FFT to y
t
. Second, we select its assigned M subcarriers by 182
applying the selection matrix S
H
k
. Third, we apply frequency- 183
domain (FD) equalization to mitigate the channel frequency 184
selectivity effects. Fourth, we apply the M-point inverse FFT 185
to go back to the TD for data detection. We do not use joint- 186
subcarrier processing at the third step, i.e., the equalization 187
matrix is diagonal; therefore, the second and third steps can be 188
interchanged. Following this procedure, we write the decision 189
statistic of the kth user as follows: 190
x
k
= U
kk
x
k
+
N
u

l=1
l=k
U
kl
x
l
+ z (6)
with the three terms in (6) given by 191
U
kk
=F
H
M
S
H
k
D
k

P
r

Q
k

H
k

P
k
S
k
F
M
U
kl
=F
H
M
S
H
k
D
k

P
r

Q
l

H
l

P
l
S
l
F
M
z =F
H
M
S
H
k
D
k
F
N
z
where D
k
diag(

H
H
k
(n, n)/[

H
k
(n, n)[
2
+
1
k

N1
n=0
) is the 192
diagonal MMSE FD equalization matrix of the kth user with 193

= (
k
/N
o
) being the SNR of the kth user. As dened 194
in Section I,

P
r
F
N
P
r
F
H
N
,

Q
l
F
N
Q
l
F
H
N
, and

Q
k
195
F
N
Q
k
F
H
N
. Moreover,

H
k
F
N
H
k
F
H
N
is the N N diago- 196
nal FD channel matrix. Note that the diagonal and off-diagonal 197
entries of the matrix U
kk
correspond to the desired and ICI 198
terms, respectively. The matrix U
kl
corresponds to the interuser 199
interference (IUI) fromthe lth user into the kth user. The NMSE 200
of the kth user is dened as follows: 201
NMSE
k

E[e
H
e]
M
k
=

E[Tr(ee
H
)]
M
k
100 (%) (7)
where the error vector e

= x
k
x
k
and Tr() denotes the 202
matrix trace. Assuming that the data vectors of different users 203
are independent with zero mean, that the users data symbols 204
are independent, and that the data and noise vectors are inde- 205
pendent, we write 206
E
_
Tr(ee
H
)

=
k
E
_
Tr
_
U
/
kk
(U
/
kk
)
H
__
. .

=t
1
+
N
u

l=1
l=k

l
E
_
Tr
_
U
kl
U
H
kl
_
. .

=t
2
+E
_
Tr( z z
H
)

. .

=t
3
(8)
where U
/
kk

= U
kk
I
M
. The rst term, which is denoted by 207
t
1
, in (8) is evaluated as follows: 208
t
1
=E
_
Tr
_
U
/
kk
(U
/
kk
)
H
__
=M 2Re(E[Tr (U
kk
)]
. .

=t
1A
) +E
_
Tr
_
U
kk
U
H
kk
_
. .

=t
1B
(9)
where Re() denotes the real part of its argument. In the sequel, 209
we derive the quantities t
1A
, t
1B
, t
2
, and t
3
in (8) and (9). Using 210
the identity Tr(AB) = Tr(BA) for any of the two matrices A 211
and B, we start with t
1A
in (9) and write 212
t
1A

= E[Tr (U
kk
)] = E
_
Tr
_

P
r

Q
k

H
k

P
k
S
k
D
k
_
(10)
where S
k

= S
k
S
H
k
is a diagonal matrix. Using the fact that 213
diagonal matrices are commutative under multiplication and 214
assuming that the channel and PN processes are statistically 215
independent, we write 216
E[Tr (U
kk
)] = E
_
Tr
_
E[

P
r
]

Q
k

H
k
E[

P
k
]D
k
S
k
_
. (11)
Since both

H
k
and D
k
are diagonal matrices, we rewrite (11) 217
as follows: 218
E[Tr (U
kk
)] = Tr
_
E[

P
r
]

Q
k

_
E
_
diag(

H
k
) (diag(D
k
))
T
_
E[

P
k
]
_
S
k
_
(12)
where E[] and Tr() are linear operators and hence can be 219
exchanged. Furthermore, the notation denotes the element- 220
wise Hadamard product. We assume that the CIR follows the 221
well-known zero-mean circular symmetric complex Gaussian 222
distribution and so does

H
k
(n, n) which is a weighted summa- 223
tion of the CIR taps. 224
Lemma 1: Let f(x) and g(y) denote two functions of the 225
jointly Gaussian random variables x and y, respectively, with 226
the operator [ denoting the conditional expectation. Then 227
E
x,y
[f(x)g(y)] =E
x
E
y[x
[f(x)g(y)]
=E
x
_
f(x)E
y[x
[g(y)]

=
_
x
f(x)
_
y
g(y)p(y[x)dydx (13)
where p(y[x) is the conditional probability density function of 228
y given x. Since x and y are jointly Gaussian, p(y[x) is also 229
Gaussian whose conditional mean and variance [16] are 230
E[y[x] =E[y] +
cov[y, x]
var[x]
(x E[x])
var[y[x] =var[y]
[cov[y, x][
2
var[x]
(14)
where var[x]

=E[[xE[x][
2
] and cov[y, x]

=E[(yE[y])(x 231
E[x])

]. 232
I
E
E
E
P
r
o
o
f
4 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Denoting

H
k
(m, m) and

H
k
(n, n) by y and x, respectively, 233
we exactly (unlike [13] and [17]) compute the entries of the 234
matrix T
k

= E[diag(

H
k
)(diag(D
k
))
T
] as follows: 235
T
k
(m, n) =E
x,y
_
yx

[x[
2
+
1
k
_
= E
x
E
y[x
_
yx

[x[
2
+
1
k
_
=E
x
_
x

E
y[x
[y]
[x[
2
+
1
k
_
m, n. (15)
Since

H
k
(m, m) and

H
k
(n, n) are jointly Gaussian, we use 236
Lemma 1 and write 237
E
y[x
[y] =E[y] +
E[yx

]
E[[x[
2
]
(x E[x]) =
R
k
(m, n)
R
k
(n, n)
x
(16)
T
k
(m, n) =
R
k
(m, n)
R
k
(n, n)
E
x
_
[x[
2
[x[
2
+
1
k
_
=
R
k
(m, n)
R
k
(n, n)

_
z=0
z
z +
1
k
exp (z/E[z])
E[z]
dz
=
R
k
(m, n)
R
k
(n, n)
(1 + p
n
k
exp (p
n
k
) Ei (p
n
k
))
m, n (17)
where z

=[x[
2
is exponentially distributed and E[z] =E[[

H
k
(n, 238
n)[
2
] =R
k
(n, n). Furthermore, p
n
k

=(
k
R
k
(n, n))
1
, R
k
(m, 239
n)

= E[

H
k
(m, m)

H

k
(n, n)] is the FD channel autocorrelation 240
matrix, and Ei(p
n
k
)

=
_
p
n
k

(exp(x)/x)dx is the exponential 241


integral function [18]. Then, the nal expression of t
1A
is 242
given by 243
t
1A
=Tr
_
E[

P
r
]

Q
k
_
T
k
E[

P
k
]
_
S
k
_
(18)
E
_

P
r/k

=F
N
diag
_
_
_
exp
_
m
2
r/k
2
__
N1
m=0
_
_
F
H
N
(19)
where we used the fact that E[e
jy
] = e

2
/2
[19] for y 244
N(0,
2
). Next, we evaluate the quantity t
1B
in (9) as follows: 245
t
1B

=E
_
Tr
_
U
kk
U
H
kk
_
=E
_
Tr
_

P
r

Q
k

H
k
W
k

H
H
k

Q
H
k

P
H
r
D
k
D
H
k
S
k
_
(20)
W
k

=E
_

P
k
S
k

P
H
k

= F
N
E
_
P
k
_
F
H
N
S
k
F
N
_
P
H
k

F
H
N
=F
N
__
F
H
N
S
k
F
N
_
V
k
_
F
H
N
(21)
and V
k

= E[diag(P
k
)(diag(P
k
))
H
] whose entries are given 246
in (30). Inspecting the structure of W
k
in (21), we nd that 247
F
H
N
W
k
F
N
is a Toeplitz matrix [20] because V
k
is Toeplitz 248
and F
H
N
S
k
F
N
is circulant. Furthermore, Toeplitz matrices with 249
large sizes can be accurately approximated as circulant matrices 250
[20]; hence, for large FFT sizes, the matrix W
k
can be ap- 251
proximated by its main diagonal while setting the off-diagonal 252
entries to zero. Using this approximation, we interchange the 253
approximately diagonal matrix W
k
and the diagonal matrix

H
k
254
and proceed as follows: 255
t
1B
E
_
Tr
_

P
r

Q
k
W
k

H
k

H
H
k

Q
H
k

P
H
r
D
k
D
H
k
S
k
_
=E
_
Tr
_

P
r

Q
k
W
k
__

Q
H
k

P
H
r
_
Y
k
_
S
k
_
(22)
where Y
k

=E[diag(

H
k

H
H
k
)(diag(D
k
D
H
k
))
T
] whose entries 256
Y
k
(m, n)=E[[

H
k
(m, m)[
2
[

H
k
(n, n)[
2
/([

H
k
(n, n)[
2
+
1
k
)
2
] 257
are exactly (unlike [17]) computed in (31) using the relation 258
E[[y[
2
[x] =var[y[x]+[E[y[x][
2
in addition to Lemma 1, which 259
yields 260
E
_


H
k
(m, m)

2
[

H
k
(n, n)
_
=
[R
k
(m, n)[
2
[R
k
(n, n)[
2


H
k
(n, n)

2
+
R
k
(m, m)R
k
(n, n) [R
k
(m, n)[
2
R
k
(n, n)
. (23)
Since W
k
is approximated by a diagonal matrix, it can be 261
moved and multiplied by Y
k
as follows: 262
t
1B
E
_
Tr
_

P
r

Q
k
__

Q
H
k

P
H
r
_
(W
k
Y
k
)
_
S
k
_
=E[Tr ((C
k
C

k
) L
k
S
k
)]
=

nJ
k
N1

m=0
E
_

C
k
(1, ((mn))
N
[
2
_
L
k
(m, n) (24)
where C
k

=

P
r

Q
k
, L
k

= W
k
Y
k
, and((mn))
N

= (m 263
n) mod N is the module-N operation that originates from the 264
circulant structure of the matrix C
k
. Furthermore 265
C
k
(1, ((mn))
N
)
=
1

N
f
H
((mn))
N
diag (P
r
Q
k
) (25)
E
_
[C
k
(1, ((mn))
N
)[
2
_
=
1
N
f
H
((mn))
N
E
_
diag (P
r
Q
k
) (diag(P
r
Q
k
))
H
_
. .

=V
r
f
((mn))
N
. (26)
Then, it can be easily shown that the entries of V
r
are those 266
given in (28). Substituting (26) back into (24), we get the nal 267
expression of t
1B
as follows: 268
t
1B

1
N

nJ
k
N1

m=0
f
H
((mn))
N
V
r
f
((mn))
N
L
k
(m, n) (27)
where f
((mn))
N
is the ((mn) mod N) column of F
N
and 269
V
r
(m, n) = exp
_
[mn[
2
r
2
+
j2(mn)
k
N
_
(28)
L
k
=F
N
__
F
H
N
S
l
F
N
_
V
l
_
F
H
N
Y
k
(29)
V
l
(m, n) = exp
_
[mn[
2
l
2
_
(30)
I
E
E
E
P
r
o
o
f
GOMAA AND AL-DHAHIR: PN IN ASYNCHRONOUS SC-FDMA SYSTEMS 5
Y
k
(m, n) =
[R
k
(m, n)[
2
[R
k
(n, n)[
2
(1 + p
n
k
+ p
n
k
(p
n
k
+ 2) exp (p
n
k
) Ei (p
n
k
))

R
k
(m, m)R
k
(n, n) [R
k
(m, n)[
2
[R
k
(n, n)[
2
((1 + (1 + p
n
k
) exp (p
n
k
) Ei (p
n
k
))) (31)
m, n = 0, 1, . . . , N 1. Next, we derive the quantity t
2
in (8) 270
as follows, assuming that the channels experienced by different 271
users are independent: 272
t
2

=E
_
Tr
_
U
kl
U
H
kl
_
=E
_
Tr
_

P
r

Q
l

H
l
W
l

H
H
l

Q
H
l

P
H
r
D
k
S
k
_
(32)
where W
l

= E[

P
l
S
l

P
H
l
] is given in (21), and D
k

= E[D
k
D
H
k
] 273
is a diagonal matrix with the diagonal entries 274
D
k
(n, n) =E
_


H
k
(n, n)

2
_


H
k
(n, n)

2
+
1
k
_
2
_

_
=
1
R
k
(n, n)
(1 + (1 + p
n
k
) exp (p
n
k
) Ei (p
n
k
)) . (33)
Exploiting that

H
l
is diagonal, we rewrite (32) as follows: 275
t
2
= Tr
_
E
_

P
r

Q
l
_
W
l
E
_
diag(

H
l
)
_
diag(

H
l
)
_
H
__


Q
H
l

P
H
r
D
k
S
k
_
_
. (34)
Observing that R
l
= E[diag(

H
l
)(diag(

H
l
))
H
] and dening 276
G
l

= W
l
R
l
, we write 277
t
2
=Tr
_
E
_

P
r

Q
l
G
l

Q
H
l

P
H
r
D
k
S
k
_
=Tr
_
E
_
F
N
P
r
Q
l
F
H
N
G
l
F
N
Q
H
l
P
H
r
F
H
N
D
k
S
k
_
=Tr
_
F
N
_
F
H
N
G
l
F
N
E
_
diag(P
r
Q
l
) (diag(P
r
Q
l
))
H
__
F
H
N
D
k
S
k
_
(35)
where the last equality holds because matrix P
r
Q
l
is diagonal. 278
Inspecting the denition of V
r
in (26), we write 279
t
2
=Tr
_
F
N
_
F
H
N
G
l
F
N
V
r
_
F
H
N
D
k
S
k
_

= Tr(W
r
D
k
S
k
)
=

nJ
k
W
r
(n, n)D
k
(n, n). (36)
Substituting the entries of D
k
from (33) into (36), we get the 280
nal expression of t
2
as follows: 281
t
2
=

nJ
k
W
r
(n, n)
R
k
(n, n)
(1 + (1 + p
n
k
) exp (p
n
k
) Ei (p
n
k
)) (37)
where the diagonal matrices W
r
and W
l
are given by 282
W
r
=F
N
__
F
H
N
(W
l
R
l
)F
N
_
V
r
_
F
H
N
W
l
=F
N
__
F
H
N
S
l
F
N
_
V
l
_
F
H
N
.
Finally, we evaluate t
3
[which is the noise contribution to the 283
NMSE in (8)] as follows: 284
t
3

= E
_
Tr( z z
H
)

= N
o
Tr(D
k
S
k
) = N
o

nJ
k
D
k
(n, n).
(38)
Substituting the entries of D
k
from (33) into (38), we get the 285
nal expression of t
3
as follows: 286
t
3
=

nJ
k
N
o
R
k
(n, n)
(1 + (1 + p
n
k
) exp (p
n
k
) Ei (p
n
k
)) . (39)
It is worth mentioning that, unlike [17], the entries of T
k
and 287
Y
k
are computed exactly using Lemma 1. 288
IV. NORMALIZED MEAN SQUARE ERROR EXPRESSION 289
SIMPLIFICATION AND SUBCARRIER MAPPING IMPACT 290
A. NMSE Expression Simplication 291
To gain insight into the NMSE expression given in 292
Section III, we simplify it by considering the high-SNR sce- 293
nario with only transmit PN,
2
i.e.,

P
r
=

Q
k
= I
N
, k, and 294
D
k

H
k
I
N
. In uplink transmissions, receive PN is signi- 295
cantly smaller than transmit PN since the BST oscillator quality 296
is typically better than that of the user terminal. Under the 297
given assumptions, we rewrite the NMSE expression terms as 298
follows: 299
t
1A
=E
_
Tr(

H
k

P
k
S
k
D
k
)

= E
_
Tr(

P
k
S
k
D
k

H
k
)

Tr
_
E[

P
k
]S
k
_
=
M
N
N1

m=0
exp
_
m
2
k
2
_
M (40)
where the second equality follows from Tr(AB) = Tr(BA) 300
and the approximation 1/N

N1
m=0
exp(m
2
k
/2) 1, which 301
holds under practical values of
2
k
ranging from 10
6
to 10
3
. 302
Next, we rewrite t
1B
as follows: 303
t
1B
=E
_
Tr
_

H
k
W
k

H
H
k
D
k
D
H
k
S
k
_
=E
_
Tr
_
W
k

H
k

H
H
k
D
k
D
H
k
S
k
_
Tr(W
k
S
k
) =

nJ
k
W
k
(n, n) (41)
where W
k
is dened in (21), and the commutative property 304
of diagonal matrices is used. In Section III, we showed that 305
W
k
can be approximated by a diagonal matrix. Inspecting 306
(21) and (30), we observe that the diagonal of W
k
is the 307
circular convolution between the diagonal of S
k
and the FFT 308
of the sequence exp([m[
2
k
/2)
,N/2|1
m=N/2|
, which is nothing 309
but the Lorentzian spectrum. Furthermore, the diagonal of S
k
310
2
Receive PN effects can be modeled as transmit PN [5] by lumping their
effects and summing their effective 3-dB linewidths.
I
E
E
E
P
r
o
o
f
6 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
is nonzero only at the subcarrier indexes of the kth user. 311
We use the FFT property that element-wise multiplication in 312
TD corresponds to circular convolution in FD. Recalling from 313
Section II-B that the PN PSD is Lorentzian shaped, we con- 314
clude that the sequence W
k
(n, n) in (41) is the circular 315
convolution between the users spectrum and the PN spectrum. 316
This circular convolution quanties the PN-induced intrauser 317
interference. Similarly, we rewrite t
2
as follows: 318
t
2
=E
_
Tr
_

H
l
W
l

H
H
l
D
k
S
k
_
= E
_
Tr
_
W
l

H
H
l

H
l
D
k
S
k
_

nJ
k
W
l
(n, n)D
k
(n, n)E
_


H
l
(n, n)

2
_
=

nJ
k
W
l
(n, n)
R
l
(n, n)
R
k
(n, n)
(1 + (1 + p
n
k
) exp (p
n
k
) Ei (p
n
k
))

nJ
k
W
l
(n, n)
R
l
(n, n)
R
k
(n, n)
Ei (p
n
k
) (42)
where D
k
(n, n) is given in (33), and the last approximation 319
follows from the high-SNR assumption where p
n
k
0
+
and 320
Ei(p
n
k
) . If the channels experienced by different users 321
are identically distributed, i.e., R
l
= R
k
, then 322
t
2

nJ
k
W
l
(n, n)Ei (p
n
k
) . (43)
Similar to W
k
, the sequence W
l
(n, n) represents the cir- 323
cular convolution between the PN spectrum and the lth user 324
spectrum. Since the PN spectrum is not an impulse, the spec- 325
trum energy leakage of the lth user causes IUI. The quantity 326
t
2
quanties the lth users energy leakage at the kth users 327
subcarriers. Applying the high-SNR approximation to t
3
, we 328
write the simplied NMSE expression as follows: 329
NMSE
2
k

1
M
_

nJ
k
W
k
(n, n)
_
1
. .
IntrauserInterference
+
1
M

l,=k

nJ
k
w
l
W
l
(n, n)
. .
Interuserinterference
+
1
M

nJ
k
p
n
k
Ei (p
n
k
)
. .
Noisecontribution
(44)
where w
l
= (
l
R
l
(n, n)/
k
R
k
(n, n))Ei(p
n
k
). Furthermore, 330
Ei(p
n
k
) can be approximated by Ei(p
n
k
)0.577+log(p
n
k
) 331
p
n
k
for p
n
k
0.3 or, equivalently, for
k
5 dB where log() is 332
the natural logarithm function. 333
B. Localized Versus Distributed Subcarrier Mapping 334
In light of the simplied NMSE expression in (44), we 335
investigate the effect of the users subcarrier mapping at the 336
medium-access-control layer on its immunity to PN at the phys- 337
ical layer. We compare the localized and distributed subcarrier 338
mapping schemes. We assume that only the subcarrier n
k
is 339
assigned to the kth user. Then, we consider two scenarios 340
where the neighboring subcarrier, i.e., n
k
+ 1, is assigned to 341
the same user in one scenario and to the lth user (where 342
l ,= k) in the second scenario. Clearly, the rst and second 343
scenarios represent, respectively, the localized and distributed 344
subcarrier mapping schemes [21]. In the presence of PN, the 345
energy of the (n
k
+ 1)th subcarrier leaks into its neighbor, 346
the (n
k
)th subcarrier, causing intrauser interference and IUI in 347
the rst (localized) and second (distributed) scenarios, respec- 348
tively. The quantities t
1B
and t
2
were shown in Section IV-A to 349
quantify the contributions of the intrauser interference and IUI, 350
respectively, to the NMSE expression. To compare the localized 351
and distributed schemes, we compare the intrauser interference 352
and IUI on the kth user and rewrite the quantities t
1B
and t
2
353
in (41) and (43), respectively, with J
k
containing only n
k
as 354
follows: 355
t
1B
=W
k
(n
k
, n
k
)
t
2
= W
k
(n
k
, n
k
)Ei (p
n
k
k
) (45)
where we assume that both users have the same channel au- 356
tocorrelation function and the same PN parameters, i.e., R
l
= 357
R
k
and W
l
= W
k
. Inspecting (9), we observe that the diago- 358
nal entries of U
kk
U
H
kk
are represented in t
1B
in addition to the 359
off-diagonal entries accounting for the intrauser interference. 360
For t
1B
to account only for the intrauser interference, we set the 361
n
k
th diagonal entry in S
k
to zero yielding S
k
= S
l
where only 362
the (n
k
+ 1)th diagonal entry is nonzero; hence, W
l
= W
k
. 363
Furthermore, the high-SNR assumption implies that p
n
k
k
0
+
, 364
Ei(p
n
k
k
) 1, and hence, t
2
t
1B
. Based on the earlier 365
discussion and assuming that the received power levels from 366
all users are the same, i.e., no nearfar effects, we conclude 367
that, for the NMSE performance metric, IUI is more harmful 368
than intrauser interference. Therefore, the localized subcarrier 369
mapping is more immune to PN than the distributed subcarrier 370
mapping. In Section VI, we illustrate this result numerically. 371
V. ITERATIVE REDUCED-COMPLEXITY JOINT CHANNEL 372
DECODING AND PHASE NOISE COMPENSATION 373
Here, we assume no CFO and focus on PN compensation. 374
Since no pilots are inserted in the data SC-FDMA symbols 375
to keep the PAPR low, we use the decoded symbols as pilots 376
for PN estimation and compensation. Signal detection and 377
channel equalization are rst described in Section V-A. Next, 378
our proposed algorithm for iterative joint decoding and PN 379
compensation is presented in Section V-B, and its initialization 380
is investigated in Section V-C. 381
A. Reduced-Complexity Signal Detection 382
The PN process has a low-pass nature with practical spec- 383
tral widths smaller than the coherence bandwidth of practical 384
frequency-selective channels. In other words, the signicant 385
portion of the PN PSD lies within the channel coherence 386
bandwidth. This practical condition validates the approximation 387
that the PN and channel effects can be interchanged; hence, 388
I
E
E
E
P
r
o
o
f
GOMAA AND AL-DHAHIR: PN IN ASYNCHRONOUS SC-FDMA SYSTEMS 7
the PN and channel matrices can be exchanged [2], [5] and (2) 389
without CFO is rewritten as follows: 390
y
t
=P
t
r
N
u

l=1
H
t
l
P
t
l
g
t
l
+z
t

N
u

l=1
H
t
l
P
t
r
P
t
l
g
t
l
+z
t

=
N
u

l=1
H
t
l
P
t
C, l
g
t
l
+z
t
(46)
where P
t
C,l

= P
t
r
P
t
l
is the composite transmitreceive PN 391
matrix given by 392
P
t
C,l
=diag
_
exp
_
j
C,t
0
_
, exp
_
j
C,t
1
_
, . . . , exp
_
j
C,t
N1
__
(47)

C,t
n
=
l,t
n
+
r,t
n
=
l,t
n1
+
r,t
n1
. .

=
C,t
n1
+
l,t
n
+
r,t
n
. .

=
C,t
n
. (48)
Since
l,t
n
A(0, 2
l
/Nf
sub
) and
r,t
n
A(0, 2
r
/Nf
sub
), 393
then
C,t
n
A(0, 2(
l
+
r
)/Nf
sub
). With the approxima- 394
tion in (46), the transmit and receive PN effects are lumped 395
together and treated as transmit PN only whose effective 3-dB 396
linewidth is
C,l
=
l
+
r
. We exchanged the channel matrix 397
with the receive rather than transmit PN matrix for two reasons. 398
First, the former yields a more accurate approximation because 399
the receive PN process has a narrower spectral width and 400
hence better satises the given condition. Second, lumping both 401
transmit and receive PN at the transmitter enables us to move 402
the channel equalizer out of the iterative channel decoding 403
and PN compensation loop, as shown in Section V-B, which 404
signicantly reduces the hardware complexity of our proposed 405
approach. Denoting by k the index of the user of interest, we 406
obtain its decision statistic by applying the N-point FFT to y
t
407
and selecting its assigned M subcarriers by applying the matrix 408
S
H
k
. Hence, we write the decision statistic of the kth user as 409
follows: 410
Y
t
k
= S
H
k
F
N
y
t
= S
H
k

H
t
k

P
t
C,k
S
k
F
M
x
t
k
+Z
t
k
(49)
where Z
t
k
= S
H
k
F
N
z
t
+S
H
k
F
N

l,=k
H
t
l
P
t
C,l
g
t
l
represents 411
the noise plus the IUI caused by the PN-induced leakage be- 412
tween users subcarriers. Furthermore,

H
t
k
= F
N
H
t
k
F
H
N
is the 413
N N diagonal FD channel matrix, and

P
t
C,k
= F
N
P
t
C,k
F
H
N
414
is the N N circulant PN matrix. The ICI is induced by 415
the matrix

P
t
C,k
due to its nondiagonal (circulant) structure. 416
However, for practical PN levels,

P
t
C,k
is a banded matrix 417
with few signicant diagonals resulting in ICI only from few 418
neighboring subcarriers. Since S
H
k
S
k
= I
M
, we write 419
Y
t
k
= S
H
k
S
k
S
H
k

H
t
k

P
t
C,k
S
k
F
M
x
t
k
+Z
t
k
(50)
where S
k
S
H
k
is a diagonal matrix whose diagonal entries are 420
zeros, except for those whose indexes lie in J
k
. Since both

H
t
k
421
and S
k
S
H
k
are diagonal, we can interchange them as follows: 422
Y
t
k
=S
H
k

H
t
k
S
k
. .

H
t
k
S
H
k

P
t
Ck
S
k
. .

P
t
C,k
F
M
x
t
k
+Z
t
k


H
t
k

P
t
C,k
F
M
x
t
k
+Z
t
k
(51)
where

H
t
k
and

P
t
C,k
are submatrices of

H
t
k
and

P
t
C,k
, respec- 423
tively, whose row and column indexes lie in J
k
. Note that

H
t
k
424
is an M M diagonal matrix, whereas

P
t
C,k
is an M M 425
Toeplitz [20] matrix. The next detection step is to apply FD 426
equalization to equalize the channel effects in the matrix

H
t
k
. 427
Applying the MMSE equalizer, we get 428
Y
t
k,eq
= E
t
k
Y
t
k
(52)
where E
t
k
= ((

H
t
k
)
H

H
t
k
+ 1/
k
)
1
(

H
t
k
)
H
S
H
k
D
k
S
k
. Note 429
that

H
t
k
is a diagonal matrix and so is E
t
k
. Next, we apply the 430
M-point inverse FFT to go back to TD to detect the kth users 431
data. Meanwhile, we remove the bias factor
t
k
induced by the 432
MMSE equalization to get 433
Y
t
k,TD
=
1

t
k
F
H
M
Y
t
k,eq
= F
H
M

P
t
C,k
F
M
x
t
k
+Z
t
k,TD
P
t
C, k
x
t
k
+Z
t
k,TD
(53)
where P
t
C,k

= F
H
M

P
t
C,k
F
M
. The vector Z
t
k,TD
contains the 434
noise and residual ICI due to MMSE equalization, and
t
k
435
is the average of the diagonal entries of the diagonal matrix 436
E
t
k

H
t
k
. Exploiting the fact that Toeplitz matrices of large sizes 437
can be approximated as circulant matrices [20], we approxi- 438
mate the Toeplitz matrix

P
t
C,k
by a circulant one. Hence, we 439
approximate matrix P
t
C,k
by a diagonal matrix since circulant 440
matrices are diagonalized by the FFT matrix. Note that

P
t
C,k
is 441
a submatrix of

P
t
C,k
= F
N
P
t
C,k
F
H
N
, where P
t
C,k
is a diagonal 442
matrix whose diagonal entries are complex exponentials of unit 443
magnitude as in (4). Hence, the magnitudes of the diagonal 444
entries of P
t
C,k
are also approximately unity. Intuitively, the 445
matrix P
t
C,k
can be viewed as a PN matrix directly multiplying 446
the complex data in x
t
k
as in (53). In SC-FDMA, the complex 447
data in x
t
k
are in TD; hence, adding the PN samples directly to 448
their phases is intuitively appealing. Finally, Y
t
k,TD
in (53) is 449
equivalently rewritten as follows: 450
Y
t
k,TD
= X
t
k
p
t
k
+Z
t
k,TD
(54)
where X
t
k
is an M M diagonal matrix whose diagonal 451
entries are given by vector x
t
k
, whereas p
t
k
is an M 1 vector 452
containing the diagonal entries of the matrix P
t
C,k
. 453
B. Iterative Processing 454
We describe our proposed iterative channel decoding and PN 455
compensation approach shown in Fig. 1. For the tth SC-FDMA 456
symbol, the following procedure is executed. 457
Initialization: Initialize the estimate of the diagonal PN ma- 458
trix P
t
C,k
with the M M diagonal matrix

P
t,0
C,k
, and initialize 459
the iteration counter with i = 0. 460
I
E
E
E
P
r
o
o
f
8 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Fig. 1. Iterative channel decoding and PN compensation.
The ith iteration: 461
1) Using

P
t,i
C,k
and based on (53), we compensate for PN in 462
Y
t
k,TD
and obtain 463
Y
t,i
k,cmp
=
_

P
t,i
C,k
_
1
Y
t
k,TD
x
t
k
+
_

P
t,i
C,k
_
1
Z
t
k,TD

=x
t
k
+Z
t
k,cmp
(55)
where (

P
t,i
C,k
)
1
is easily calculated since

P
t,i
C,k
is an M 464
M diagonal matrix. 465
2) Compute the log-likelihood ratios (LLRs) of the code 466
bits corresponding to the mth entry of Y
t,i
k,cmp
, 0 m 467
M 1, as follows: 468
L
q,i
m
= log
_
_
_
_

S
1
exp
_
1

2
m

Y
t,i
k,cmp
(m)

2
_

S
0
exp
_
1

2
m

Y
t,i
k,cmp
(m)

2
_
_
_
_
_
(56)
where 0 q Q1, Q is the number of code bits 469
represented by any constellation symbol, and 2
Q
is the 470
constellation size. Furthermore,
2
m
= E[[Z
t
k,cmp
(m)[
2
], 471
and S
1
and S
0
are the sets of signal constellation symbols 472
where the qth bit is equal to 1 and 0, respectively. The 473
LLR in (56) can be approximated by 474
L
q,i
m

1

2
m
_
min
S
1

Y
t,i
k,cmp
(m)

2
min
S
0

Y
t,i
k,cmp
(m)

2
_
. (57)
3) Deinterleave the LLRs obtained in Step 2 and pass them 475
to the channel decoder that updates them using the chan- 476
nel code structure. Then, interleave the updated LLRs and 477
denote them by

L
q,i
m
. 478
4) Using

L
q,i
m
, compute soft estimates for the signal con- 479
stellation symbols in x
t
k
as follows: 480
x
t,i
k
(m) = E
_
x
t
k
(m)

=
2
Q

b=1

b
P
_
x
t
k
(m) =
b
_
(58)
where
b
is a summation variable that takes on all values 481
of the constellation points, and P(x
t
k
(m) =
b
) is the 482
probability that x
t
k
(m) equals
b
calculated as follows: 483
P
_
x
t
k
(m) =
b
_
=
Q1

q=0
P (c
q
m
= c
q
b
)
=
Q1

q=0
1
2
_
1 + (2c
q
b
1) tanh
_

L
q,i
m
2
__
(59)
where tanh() is the hyperbolic tangent function and c
q
m
, 484
c
q
b
0, 1, represents the qth code bit corresponding to 485
the signal constellation symbols x
t
k
(m) and
b
, respec- 486
tively. The second equality in (59) is obtained by observ- 487
ing that

L
q,i
m
can be written as log(P(c
q
m
= 1)/P(c
q
m
= 488
0)). 489
5) An updated estimate of the PN vector p
t
k
can be obtained 490
by a simple pointwise division of the vector Y
t
k,TD
by 491
the vector x
t,i
k
obtained in (58). However, this approach 492
does not exploit the low-pass nature of the PN process 493
rendering the PN samples in p
t
k
correlated. Two possible 494
approaches can be followed to exploit the correlation 495
between the PN samples. We call the rst approach 496
FD processing where the PN vector p
t
k
is written in 497
terms of its FD transform p
t
k
as follows: p
t
k
= F
H
M
p
t
k
. 498
Furthermore, most of the entries of p
t
k
are negligible due 499
to the low-pass nature of the PN process. Hence, (54) can 500
be written as follows: 501
Y
t
k,TD
= X
t
k
F
H
M,LP
p
t
k,LP
+Z
t
k,TD
(60)
where p
t
k,LP
contains D M signicant entries of p
t
k
, 502
and F
H
M,LP
is an M D tall matrix containing the 503
columns of F
H
M
corresponding to the signicant entries 504
of p
t
k
. Next, the linear least squares estimate (LLSE) of 505
p
t
k,LP
is obtained as follows: 506

p
t
k,LP
=
_
A
H
A
_
1
A
H
Y
t
k,TD
(61)
where A =

X
t,i
k
F
H
M,LP
, and

X
t,i
k
is a diagonal matrix 507
whose diagonal entries are given by the estimate symbols 508
x
t,i
k
(m) obtained in (58). The size of the matrix A
H
A 509
is D D; hence, its inverse can be obtained easily since 510
D M (typically, D = 3). Then, an updated estimate of 511
the PN vector p
t
k
is obtained as follows: 512
p
t,i
k
= F
H
M,LP

p
t
k,LP
. (62)
The second approach is called subblock processing 513
where the TD vector Y
t
k,TD
is divided into C subblocks 514
each of length G = M/C. Due to the low-pass nature of 515
the PN process, the PN samples are assumed constant 516
over each subblock. Hence, we write the cth subblock 517
vector as follows: 518
Y
t
k,c

=Y
t
k,TD
(cG : (c+1)G1)=p
t
k
(c)x
t
k,c
+Z
t
k,c
(63)
where x
t
k,c
= x
t
k
(cG : (c + 1)G1), Z
t
k,c
= 519
Z
t
k,TD
(cG : (c + 1)G1), and the scalar p
t
k
(c) 520
I
E
E
E
P
r
o
o
f
GOMAA AND AL-DHAHIR: PN IN ASYNCHRONOUS SC-FDMA SYSTEMS 9
represents the PN sample perturbing the cth block. Next, 521
we compute the LLSE of p
t
k
(c) as follows: 522
p
t,i
k
(c) =
_
x
t,i
k,c
_
H
Y
t
k,c
| x
t,i
k,c
|
2
2
(64)
where | |
2
denotes the l
2
-norm, and x
t,i
k,c
is constructed 523
from the estimates obtained in (58) for the cth subblock. 524
Then, the estimate p
t,i
k
(c) is assigned to the middle of 525
the corresponding subblock, and the PN vector p
t
k
is 526
estimated by inter/extrapolating p
t,i
k
(c)
C1
c=0
over the M 527
samples of the tth SC-FDMA symbol. Note that the low- 528
pass property of the PN process is exploited in both the 529
FD-processing and subblock processing approaches 530
to reduce the number of unknowns to be estimated. 531
Hence, more averaging is done in (62) and (64), which 532
helps in reducing the error propagation effect due to the 533
inaccuracies of the symbols estimates in (58). 534
6) Construct the M M diagonal matrix

P
t,i+1
C,k
whose 535
diagonal entries are given by the PN vector estimate p
t,i
k
536
calculated in Step 5 using either the FD-processing 537
approach or the subblock processing approach. 538
7) Set i = i + 1 and check the stopping criterion. If met, exit 539
the algorithm, else go to Step 1. Several stopping criteria 540
can be used, e.g., the maximum number of iterations or 541
the cyclic-redundancy check. 542
We emphasize that modeling both transmit and receive PN 543
as transmit-only PN moved the equalization step outside the 544
iterations, which signicantly reduced the overall complexity. 545
C. Initialization 546
Here, we investigate the PN initialization needed for the 547
proposed iterative approach in Section V-B. Specically, we 548
show how to choose the diagonal matrix

P
t,0
C,k
. If pilots were 549
inserted in SC-FDMA symbols, they would be used to estimate 550
the CPE for the initialization process as done in [5][7] and [22] 551
for OFDM systems. The CPE is the average of the PN samples 552
perturbing the SC-FDMA symbol. However, the model adopted 553
in this paper assumes no pilots within the data symbols. To 554
overcome this problem, we exploit the correlation between PN 555
samples perturbing consecutive SC-FDMA symbols. Hence, 556
we choose the PN initialization of the tth SC-FDMA sym- 557
bol to be a function of the estimated PN in the (t 1)th 558
SC-FDMA symbol. Two choices are proposed in this paper. 559
First, we choose the average of the PN samples estimated in 560
the (t 1)th SC-FDMA symbol to initialize the PN of the tth 561
SC-FDMA symbol as follows: 562

P
t,0
C,k
=
_
1
M
1
T
M
p
t1
k
_
I
M
(65)
where 1
M
is the M 1 all-one vector and p
t1
k
is the PN 563
vector estimate obtained at the last iteration of the (t 1)th 564
SC-FDMA symbol detection. The second choice is the last 565
sample of the estimated PN vector as follows: 566

P
t,0
C,k
= p
t1
k
(M)I
M
. (66)
Fig. 2. Comparison between analytical (solid) and simulated (dashed) NMSE
with localized and distributed mapping schemes.
The PNof the rst SC-FDMAsymbol is initialized with

P
t,0
C,k
= 567
I
M
, t = 0. This is a reasonable choice since a training symbol 568
usually precedes the data symbols in all frames, as in the LTE 569
standard [9], for timing and phase synchronization. 570
VI. SIMULATION RESULTS 571
Here, we present numerical results for our derived NMSE 572
expression and our proposed PN compensation approach. Un- 573
less otherwise is stated, we model SC-FDMA systems with 574
N = 512, L
p
= 32, f
sub
= 15 KHz, M = 96, and the total 575
number of guard subcarriers on both sides is 212, as specied 576
in the LTE uplink standard [9] for 5-MHz bandwidth. We use 577
16-QAM modulation for all simulations, except for that in 578
Fig. 7, where we use QPSK. We also use rate-1/2 convolutional 579
coding with MAP decoding. 580
A. NMSE Expression Verication 581
Here, we numerically verify the derived NMSE expression 582
by Monte Carlo simulations. For all users, we use the ex- 583
ponentially decaying multipath channel model with six un- 584
correlated paths and 1-dB decaying factor per path. We use 585
the localized subcarrier mapping scheme. We assume that all 586
users have the same received SNR level and transmit PN and 587
CFO parameters, i.e.,
l
= ,
l
= , and f
l
= f, l. In 588
Fig. 2, we verify the observation made in Section IV-B about 589
the effect of subcarrier mapping on the NMSE. We compare 590
the analytical and simulated NMSE with both distributed and 591
localized subcarrier mapping schemes under various CFO and 592
PN levels. In addition to the accuracy of the analytical NMSE 593
expression, we notice that the localized subcarrier mapping 594
scheme is more immune to CFO and PN than the distributed 595
subcarrier mapping scheme, as discussed in Section IV-B. In 596
Fig. 3, we numerically verify the NMSE expression for small 597
N = 128 to validate our approximation of Toeplitz matrices 598
with large size as circulant matrices. The accuracy of the NMSE 599
expression derived in Section III is evident even for SNR levels 600
as low as 8 dB and for small FFT sizes as small as N = 128. 601
Note that the NMSEs of both mapping schemes are identical in 602
the case of no CFO/PN because there is neither IUI nor ICI. 603
I
E
E
E
P
r
o
o
f
10 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Fig. 3. Comparison between analytical (solid) and simulated (dashed) NMSE
with joint transmitreceive PN and CFO for N = 128.
Fig. 4. AWGN performance of subblock approach using no initialization
(dash dotted), initialization in (65) (dashed), and initialization in (66) (solid).
We also observe that the NMSE gap between both schemes 604
increases as the CFO/PN level increases. The derived NMSE 605
expression can be used to investigate the nearfar effect by 606
allowing different receive SNR levels
l
for different users. 607
B. Performance Evaluation of the Proposed PN 608
Compensation Algorithm 609
Here, we simulate the performance of our proposed iterative 610
compensation algorithm. The data frame consists of six data 611
SC-FDMA symbols over which the channel is quasi-static as 612
in the LTE uplink subframe [9]. The transmit and receive PN 613
levels are set to
l
= 110 Hz l and
r
= 12 Hz, which 614
correspond to oscillator PSDs of 47.6 and 57.2 dBc/Hz, 615
respectively, at 1-kHz frequency offset from the oscillator car- 616
rier frequency. We choose
l
to be greater than
r
= 12 since 617
users oscillators are less accurate than the BSTs oscillator. 618
We use the localized subcarrier mapping scheme and assume 619
enough guard subcarriers between users; hence, we ignore 620
the PN-induced leakage between users edge subcarriers. In 621
Fig. 4, we show the frame-error-rate (FER) performance of 622
ve iterations of our subblock processing-based approach 623
without initialization, i.e.,

P
t,0
C,k
= I
M
, t, and with the two 624
initializations proposed in (65) and (66). Clearly, both proposed 625
initializations show signicant performance improvement over 626
Fig. 5. Performance of FD processing (dashed) and subblock processing
(solid) with the initialization in (66) for the SCM macro suburban channel.
Fig. 6. Performance of various algorithms for 16-QAM using known (solid)
and estimated (dashed) channel for SCM macro urban channel.
the case without initialization [23]. Furthermore, the initializa- 627
tion in (66) outperforms that in (65) because the last PN sample 628
of the (t 1)th SC-FDMA symbol is the closest one to the PN 629
samples of the tth SC-FDMA symbol. Linear interpolation is 630
used and the number of subblocks C is varied over iterations 631
as follows. In the early iterations, the SC-FDMA symbol is 632
divided into few subblocks since the symbols estimates are still 633
inaccurate; hence, more averaging is needed. In later iterations 634
with improved data symbols estimates, the SC-FDMA symbol 635
is divided into more subblocks to improve the accuracy of the 636
PN interpolation. In the rst and second iterations, C is chosen 637
to be 1 and 2, respectively. Then, we increment C by 2 for 638
each subsequent iteration. Similarly, for the FD processing 639
approach, the parameter D is chosen to increase over iterations 640
where D = 1 in the rst iteration and then incremented by 2 641
for each subsequent iteration. In Figs. 5 and 6, we compare 642
the performances of our FD-processing-based and subblock 643
processing-based iterative approaches for the SCM [24] sub- 644
urban and urban macro channel models, respectively, using the 645
initialization in (66). The latter is shown to outperform the 646
former, which is intuitively appealing since FD processing 647
approximates the nonperiodic PN process by a periodic one 648
by keeping only few signicant FFT coefcients. This leads to 649
high estimation errors particularly at both edges of the PNblock 650
[7], [25]. The impact of channel estimation errors is shown 651
I
E
E
E
P
r
o
o
f
GOMAA AND AL-DHAHIR: PN IN ASYNCHRONOUS SC-FDMA SYSTEMS 11
Fig. 7. Comparison with MAP-based algorithm in [11] for QPSK using
known (solid) and estimated (dashed) channel for SCM macro urban channel.
also in Fig. 6 where the pilots preceding each data frame are 652
used to compute the standard LLSE of the channel frequency 653
response at each subcarrier. The pilots are generated using the 654
ZadoffChu sequences as in the LTE standard [9]. For channel 655
estimation, we assume that the training sequence is being used 656
to mitigate PN impacting the training sequence using any 657
of the existing techniques in the literature such as [4], [26], 658
and [27], followed by the PN-free channel estimation. Note 659
that the estimated PN samples during the training sequence 660
are different from PN samples during the consecutive data 661
symbols; hence, they cannot be used for data detection. Our 662
proposed PN compensation algorithms experience insignicant 663
SNR loss of 0.3 dB at FER = 10
2
due to channel estimation 664
errors; hence, they are robust under channel estimation errors. 665
In Fig. 7, we compare the performances of our proposed PN 666
compensation algorithms with the MAP-based one proposed in 667
[11] for QPSK with known and estimated channels. We observe 668
that our proposed algorithms and the MAP-based one almost 669
achieve the same performance with known and estimated chan- 670
nels. However, our algorithms have the advantage that they 671
are applicable to any modulation scheme and do not require 672
the optimization of any parameters unlike the MAP algorithm 673
in [11] where careful choices of some design parameters are 674
required to satisfy the stability conditions. Furthermore, our 675
subblock-based algorithm requires less complexity than the 676
MAP algorithm where the former and later require, respec- 677
tively, O(8M) and O(13M) real multiplications per iteration. 678
Moreover, our algorithm does not involve the evaluation of 679
several hyperbolic functions as in [11]. The performances 680
of our proposed algorithms versus the PN level are shown 681
in Fig. 8. 682
VII. CONCLUSION 683
We derived a new NMSE expression for SC-FDMA uplink 684
transmissions under CFO and joint transmitreceive PN taking 685
into account MMSE channel equalization effects. The resulting 686
expression is a function of the CFO and PN levels, the active 687
users received SNR levels and subcarrier assignments, and 688
the users channel delay proles. Based on the derived NMSE 689
expression, we compared the localized and distributed subcar- 690
Fig. 8. FER versus PN level for FD (dashed) and subblock (solid) algorithms
with initialization in (66) for SCM macro urban channel with SNR = 20 dB.
rier mapping schemes from the viewpoint of their immunity 691
to CFO and PN. We proposed an iterative reduced-complexity 692
algorithm for joint decoding and PN compensation in 693
SC-FDMA systems with the equalizer moved out of the 694
loop to reduce complexity. To keep the low-PAPR feature of 695
SC-FDMA systems, no pilots are multiplexed with data subcar- 696
riers. Hence, we do not use pilots for PN tracking. Instead, we 697
use the decoded data iteratively for PN estimation and com- 698
pensation. Furthermore, the PN low-pass nature is exploited 699
without assuming a priori knowledge about the exact PN model 700
to avoid model mismatch problems. Signicant performance 701
gains are demonstrated by the simulation results. 702
ACKNOWLEDGMENT 703
The statements made herein are solely the responsibility of 704
the authors. 705
REFERENCES 706
[1] F. Horlin and A. Bourdoux, Digital Compensation for Analog Front-Ends. 707
West Sussex, U.K.: Wiley, 2008. 708
[2] P. Mathecken, T. Riihonen, S. Werner, and R. Wichman, Performance 709
analysis of OFDM with wiener phase noise and frequency selective fading 710
channel, IEEE Trans. Commun., vol. 59, no. 5, pp. 13211331, 2011. 711
[3] Q. Zou, A. Tarighat, and A. H. Sayed, Compensation of phase noise in 712
OFDM wireless systems, IEEE Trans. Signal Process., vol. 55, no. 11, 713
pp. 54075424, Nov. 2007. 714
[4] P. Rabiei, W. Namgoong, and N. Al-Dhahir, A non-iterative technique for 715
phase noise ICI mitigation in packet-based OFDM systems, IEEE Trans. 716
Signal Process., vol. 58, no. 11, pp. 59455950, Nov. 2010. 717
[5] V. Syrjala and M. Valkama, Receiver DSP for OFDM systems 718
impaired by transmitter and receiver phase noise, in Proc. IEEE ICC, 719
2011, pp. 16. 720
[6] M. K. Lee, K. Yang, and K. Cheun, Iterative receivers based on subblock 721
processing for phase noise compensation in OFDMsystems, IEEE Trans. 722
Commun., vol. 59, no. 3, pp. 792802, Mar. 2011. 723
[7] S. Bittner, A. Frotzscher, G. Fettweis, and E. Deng, Oscillator phase 724
noise compensation using Kalman tracking, in Proc. IEEE ICASSP, 725
2009, pp. 25292532. 726
[8] Y. Chi, A. Gomaa, N. Al-Dhahir, and A. R. Calderbank, Training sig- 727
nal design and tradeoffs for spectrally-efcient multi-user MIMO-OFDM 728
systems, IEEE Trans. Wireless Commun., vol. 10, no. 7, pp. 22342245, 729
Jul. 2011. 730
[9] I. T. S. Sesia and M. Baker, LTE-The UMTS Long Term Evolution: From 731
Theory to Practice. Hoboken, NJ, USA: Wiley, 2009. 732
[10] P. Bertrand, Frequency offset estimation in 3G LTE, in Proc. IEEE Veh. 733
Technol. Conf. -Spring, 2010. AQ2 734
I
E
E
E
P
r
o
o
f
12 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
[11] M. Sabbaghian and D. Falconer, Joint turbo frequency domain equaliza- 735
tion and carrier synchronization, IEEE Trans. Wireless Commun., vol. 7, 736
no. 1, pp. 204212, Jan. 2008. 737
[12] A. Georgiadis, Gain, phase imbalance, and phase noise effects on error 738
vector magnitude, IEEE Trans. Veh. Technol., vol. 53, no. 2, pp. 443449, 739
Mar. 2004. 740
[13] G. Sridharan and T. J. Lim, Performance analysis of SC-FDMA in the 741
presence of receiver phase noise, IEEE Trans. Commun., vol. 60, no. 12, 742
pp. 38763885, Dec. 2012. 743
[14] T. Schenk, RF Imperfections in High-rate Wireless Systems: Impact and 744
Digital Compensation. Dordrecht, The Netherlands: Springer-Verlag, 745
2008. 746
[15] T. Pollet, M. V. Bladel, and M. Moeneclaey, BER sensitivity of OFDM 747
systems to carrier frequency offset and wiener phase noise, IEEE Trans. 748
Commun., vol. 43, no. 234, pp. 191193, Feb.Apr. 1995. 749
[16] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation 750
Theory. Englewood Cliffs, NJ, USA: Prentice Hall, 1993. 751
[17] A. Gomaa and N. Al-Dhahir, SC-FDMA performance in presence of 752
oscillator impairments: EVM and subcarrier mapping impact, in Proc. 753
IEEE GLOBECOM, 2011, pp. 15. 754
[18] M. Abramovitz and I. Stegun, Handbook of Mathematical Functions with 755
Formulas, Graphs, and Mathematical Tables. New York, NY, USA: 756
Dover, 1964. 757
[19] J. G. Proakis, Digital Communication., 4th ed. New York, NY, USA: 758
McGraw-Hill, 2001. 759
[20] R. Gray, Toeplitz and Circulant Matrices: A Review. Hanover, MA, 760
USA: Now, 2006. 761
[21] H. Myung and D. Goodman, Single Carrier FDMA: A New Air Interface 762
for Long Term Evolution. Hoboken, NJ, USA: Wiley, 2008. 763
[22] D. Petrovic, W. Rave, and G. Fettweis, Common phase error due to phase 764
noise in OFDM-estimation and suppression, in Proc. IEEE Int. Symp. 765
PIMRC, 2004. AQ3 766
[23] A. Gomaa and N. Al-Dhahir, Blind phase noise compensation for 767
SC-FDMA with application to LTE-uplink, in Proc. IEEE ICC, 2012, 768
pp. 43054309. 769
[24] Spatial channel model for mimo simulations, 3rd Generation Part- 770
nership Project (3GPP), Sophia-Antipolis Cedex, France, TR 25.996, 771
Sep. 2003. 772
[25] S. Bittner, E. Zimmermann, and G. Fettweis, Exploiting phase noise 773
properties in the design of MIMO-OFDM receivers, in Proc. IEEE 774
WCNC, 2008, pp. 940945. 775
[26] H. Mehrpouyan, A. Nasir, S. Blostein, T. Eriksson, G. Karagiannidis, 776
and T. Svensson, Joint estimation of channel and oscillator phase noise 777
in MIMO systems, IEEE Trans. Signal Process., vol. 60, no. 9, pp. 4790 778
4807, Sep. 2012. 779
[27] J. Tao, J. Wu, and C. Xiao, Estimation of channel transfer function 780
and carrier frequency offset for OFDM systems with phase noise, IEEE 781
Trans. Veh. Technol., vol. 58, no. 8, pp. 43804387, Oct. 2009. 782
Ahmad Gomaa (S09) received the B.S. and M.S. 783
degrees in electronic and communication engineer- 784
ing from Cairo University, Giza, Egypt, in 2005 and 785
2008, respectively, and the Ph.D. degree in electrical 786
engineering from the University of Texas at Dallas, 787
Richardson, TX, USA, in 2012. 788
He is currently with Broadcom Corporation, 789
Sunnyvale, CA, USA. His research interests include 790
compressive sensing applications in digital commu- 791
nications, sparse lter design, channel equalization, 792
and compensation of radio-frequency impairments at 793
the baseband. 794
Naofal Al-Dhahir (F08) received the Ph.D. degree 795
in electrical engineering from Stanford University, 796
Stanford, CA, USA, in 1994. 797
He is currently a Jonsson Distinguished Profes- 798
sor of Engineering with the University of Texas 799
at Dallas, Richardson, TX, USA. He is the au- 800
thor of over 220 papers and is the holder of over 801
30 issued U.S. patents. His current research inter- 802
ests include broadband wireless and wireline trans- 803
mission, multiple-inputmultiple-output orthogonal 804
frequency-division transceivers, baseband process- 805
ing to mitigate radio-frequency/analog impairments, and compressive sensing. 806
Dr. Al-Dhahir is the co-recipient of several Best Paper Awards. 807
I
E
E
E
P
r
o
o
f
AUTHOR QUERIES
AUTHOR PLEASE ANSWER ALL QUERIES
AQ1 = Please provide Appendix.
AQ2 = Please provide page range for Ref. [10].
AQ3 = Please provide page range for Ref. [22].
END OF ALL QUERIES

You might also like