Digital Predistortion for Power Amplifier Based on Sparse Bayesian Learning (Peng et al., 2016)

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCSII.2016.2534718, IEEE Transactions on Circuits and Systems II: Express Briefs.
Abstract—In this paper, a sparse Bayesian learning (SBL) algorithm is applied to estimate the coefficients of power amplifier (PA) behavioral models and inverse models from the view of probability. With this sparse learning method, the needed number of samplings can be reduced significantly. In addition, it also provides researchers with ideas for obtaining the needed subspace of a pre-selected model. The performance of the algorithm is validated experimentally on a gallium nitride (GaN) PA, and the signal used to test the proposed approach is an LTE signal. A comparison with the state-of-the-art estimation algorithm in an open-loop digital predistortion (DPD) system is also presented, and the vast majority of tests show that the number of model coefficients is reduced by at least 50 percent.

Index Terms—Digital predistortion, power amplifier, sparse Bayesian learning, memory polynomial, parameter pruning, LTE.

Manuscript received June 26, 2015; revised September 15, 2015; accepted January 5, 2016. This work was supported by the National Natural Science Foundation of China under Grants 61271036 and 61571080. This brief was recommended by Associate Editor C. Li.

The authors are with the School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: junpeng554@foxmail.com; sbhe@uestc.edu.cn; bingwen.wang@hotmail.com; daizj ok@126.com; jingzhou.pang@foxmail.com).

Color versions of one or more of the figures in this brief are available online at http://ieeexplore.ieee.org.

Copyright (c) 2016 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.

1549-7747 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

As a key component of the mobile communication system, power amplifiers are expected to achieve the highest possible power efficiency under stringent linearity requirements. To improve the PA's efficiency, highly efficient amplifier structures such as Doherty, linear amplification using nonlinear components (LINC), and envelope tracking (ET) are widely used. However, all of these technologies inevitably bring about strong nonlinearity. Among the existing linearization techniques, DPD has become the most popular for its good performance.

DPD has been proved to be one of the most effective linearization techniques for mildly nonlinear systems. To compensate for strong nonlinearity, higher polynomial order, longer memory depth, and more samplings are required. Thus, the condition number of the data matrix tends to be particularly bad. As the condition number grows, the computational complexity increases and the accuracy of the estimation result becomes more sensitive to the feedback noise. Thus, identification algorithms with higher performance need to be put forward.

The Volterra series (VS) model, and especially its simplified variants such as the memory polynomial (MP) model and the generalized memory polynomial (GMP) model, have shown high performance in previous research [1], [2]. However, one drawback of such models is that the number of model coefficients and the condition number of the data matrix grow rapidly with increasing polynomial order and memory depth. Orthogonal polynomials, used in some of the literature, are very effective at reducing the condition number [3]. However, they bring additional computation in the process of constructing basis functions, and they require the input signal to be smooth and ergodic. Consequently, researchers are motivated to reduce the model dimension without sacrificing performance. Recently, some methods have been proposed to prune the models [4]–[6]; however, their flexibility is poor because of the particular structures they require. In other works [7], [8], pruning methods based on compressed-sensing theory have been utilized and have shown good performance.

The feasibility of pruning PA behavioral models is analyzed in this paper, and an SBL algorithm that estimates the sparse model coefficients from the view of probability is presented. By employing this approach, the needed number of samplings can be reduced significantly. More importantly, on the basis of sparse assumptions, the needed subspace of the pre-selected model can be obtained without sacrificing performance. The performance of the proposed method is verified by simulations first, and then it is assessed on a GaN amplifier driven by an LTE signal.

This paper is organized as follows. First, details of SBL are introduced in Section II. In Section III, advantages of the Bayesian method and a comparison between SBL and the existing pruning algorithms are presented. The experimental results are shown in Section IV, and the conclusions are presented in Section V.

II. SPARSE BAYESIAN LEARNING FOR DIGITAL PREDISTORTION

A. PA/DPD Behavioral Modeling as Linear Regression with a Sparse Assumption

In current DPD systems, the vast majority of behavioral models are derived from approximations of an implicit nonlinear dynamic equation [9]. These models can be roughly divided into two categories: VS-based models and orthogonal polynomial models. In VS-based models, it may happen that some bases are linearly dependent, which means that the improvement of modeling performance offered by one basis can be substituted by some others. Hence, some terms in these models can be pruned. Again, in orthogonal polynomial models, each
term of the series makes independent contributions, but there is no further information provided to determine the accurate nonlinear order and memory depth. That is to say, some of the terms can be pruned because their contributions are small enough to be ignored. From another perspective, some experiments have shown that there are redundant terms in PA behavioral models [4]–[6]. To wrap up, the sparse assumption on the model parameters before estimation is reasonable.

In this research, the indirect learning architecture is used to design the digital predistorter. The output signal and input signal are used to generate the predistorter directly. Considering an MP model, the procedure can be expressed as follows:

x(n) = \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} w_{k,l}\, y(n-l)\, |y(n-l)|^k \quad (1)

Denote it in matrix form as:

x = Y \cdot w = (Y_r + Y_n) \cdot w \quad (2)

where x \in C^{M \times 1} and w \in C^{N \times 1} are the vectors of the input signal and of the parameters that need to be estimated, respectively. K and L refer to the nonlinear order and the memory depth, respectively. M is the length of the samplings, and N is the number of model parameters, equal to K \times L. Y \in C^{M \times N} is the data matrix of the output signal, which divides into two parts: the pure output signal Y_r and the feedback-loop noise Y_n. Then rewrite it in a more widely used form:

x = Y_r \cdot w + n_e \quad (3)

where n_e is random noise.

During the characterization step, it is assumed that the samplings are independent, and the noise generated in the sampling process is further assumed to be zero-mean Gaussian with variance \sigma^2. By now the parameter identification problem has been converted to a linear regression problem with a sparse assumption.

B. Sparse Bayesian Learning Algorithm

As the signals discussed before are complex-valued, (3) is transformed into the real domain here to simplify analysis and calculation:

t = \Phi\theta + e \quad (4)

where t, e \in R^{2M \times 1}, \theta \in R^{2N \times 1}, and \Phi \in R^{2M \times 2N}. The transformation relationship between (4) and (3) is shown below:

t = \begin{bmatrix} \mathrm{real}(x) \\ \mathrm{imag}(x) \end{bmatrix}, \quad e = \begin{bmatrix} \mathrm{real}(n_e) \\ \mathrm{imag}(n_e) \end{bmatrix}, \quad \theta = \begin{bmatrix} \mathrm{real}(w) \\ \mathrm{imag}(w) \end{bmatrix}, \quad \Phi = \begin{bmatrix} \mathrm{real}(Y_r) & -\mathrm{imag}(Y_r) \\ \mathrm{imag}(Y_r) & \mathrm{real}(Y_r) \end{bmatrix} \quad (5)

Then estimating \theta is tantamount to solving

\min \|\theta\|_0 \quad \text{s.t.} \quad \|\Phi\theta - t\|_2 \leq \varepsilon \quad (6)

where \varepsilon is the acceptable estimation error, \|\cdot\|_0 is commonly referred to as the l_0-norm, which counts the number of nonzero elements, and \|\cdot\|_2 is the l_2-norm, whose subscript '2' is often omitted. Because of the discontinuity at zero and the local-minima problem, it is a difficult optimization challenge to solve this equation directly. An alternative strategy for the sparse approximation problem is a Bayesian approach, which encourages sparsity by placing a sparseness-promoting prior on \theta.

1) Consider the PA/DPD Modeling Problem from a Bayesian Perspective: As mentioned above, the samplings and noises are all independent, hence the likelihood of the complete sampling set can be written as:

p(t|\theta, \sigma^2) = (2\pi\sigma^2)^{-M} \exp\left(-\frac{1}{2\sigma^2} \|t - \Phi\theta\|^2\right) \quad (7)

Estimating \theta and \sigma^2 by looking for the maximum of the likelihood function can lead to severe over-fitting. To avoid this, a useful way is the Bayesian approach, which seeks a full posterior density function for \theta and \sigma^2 by imposing some additional constraint on the parameters.

The Bayesian approach is typically divided into two categories: (i) maximum a posteriori (MAP) estimation using a fixed prior distribution; (ii) empirical Bayesian approaches that employ a flexible, parameterized prior that is learned from the samplings [10]. For the first method, how to choose a sparse prior distribution on \theta is very critical, and the prior is always determined by the degree of sparsity and the characteristics of the distribution. In PA linearization systems, the sparsity level of the PA/DPD model coefficients is closely related to the nonlinear behavior of the PA and the pre-selected model. Namely, the sparsity level is hard to determine before estimation. In view of this, a flexible, parameterized prior distribution is chosen in this work, particularly that of the relevance vector machine (RVM) [11]. A hierarchical prior is involved in the RVM, which is computationally efficient because it allows convenient conjugate-exponential analysis.

To begin, a simple prior probability distribution over \theta is set under the Bayesian framework. On the popular and practical side, a zero-mean Gaussian prior distribution over each element of \theta is set:

p(\theta|\alpha) = \prod_{i=1}^{2N} N(\theta_i | 0, \alpha_i^{-1}) \quad (8)

where \alpha_i^{-1} is the variance of a Gaussian density function. Further, to complete the specification of this hierarchical prior, Gamma priors are considered over \alpha and \sigma^{-2}:

p(\alpha) = \prod_{i=1}^{2N} \Gamma(\alpha_i | a, b) \quad (9)

p(\sigma^{-2}) = \Gamma(\sigma^{-2} | c, d) \quad (10)

In the following calculations, these parameters (a, b, c, d) are set to zero to obtain uniform hyperpriors (over a logarithmic scale) [11]. Then the posterior distribution over \theta is given by

p(\theta|t, \alpha, \sigma^2) = \frac{p(t|\theta, \sigma^2)\, p(\theta|\alpha)}{p(t|\alpha, \sigma^2)} \quad (11)
where p(t|\alpha, \sigma^2) can be calculated by a normalising integral:

p(t|\alpha, \sigma^2) = \int p(t|\theta, \sigma^2)\, p(\theta|\alpha)\, d\theta \quad (12)

Then the posterior can be expressed as a Gaussian distribution whose covariance and mean are, with A = \mathrm{diag}(\alpha_1, \ldots, \alpha_{2N}),

\Sigma = (\sigma^{-2}\Phi^T\Phi + A)^{-1}, \quad \mu = \sigma^{-2}\Sigma\Phi^T t \quad (13)

Algorithm 1 Sparse Bayesian Learning
input: t, \Phi, halting criterion
output: \theta
initialize: set \sigma^2 to a sensible value (\sigma^2 = 10^{-6} here); initialise with a single basis function \phi_m (the m-th column of \Phi), with
\alpha_m = \frac{\|\phi_m\|^2}{\|\phi_m^T t\|^2 / \|\phi_m\|^2 - \sigma^2}
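The model construction of (1)–(5) and the SBL estimation can be sketched in a few lines of numpy. This is an illustrative sketch, not the authors' code: the function names and the toy signal are my own, and `sbl_estimate` uses the standard (non-fast) RVM re-estimation updates rather than the fast variant the paper employs.

```python
import numpy as np

def mp_data_matrix(y, K, L):
    """M x (K*L) memory-polynomial data matrix of eq. (1):
    columns are y(n-l) * |y(n-l)|^k for k = 0..K-1, l = 0..L-1."""
    M = len(y)
    cols = []
    for k in range(K):
        for l in range(L):
            yd = np.concatenate([np.zeros(l, dtype=complex), y[:M - l]])  # delayed y(n-l)
            cols.append(yd * np.abs(yd) ** k)
    return np.column_stack(cols)

def to_real(Y, x):
    """Complex regression x = Y w  ->  real-valued form t = Phi theta of eqs. (4)-(5)."""
    Phi = np.block([[Y.real, -Y.imag],
                    [Y.imag,  Y.real]])
    return Phi, np.concatenate([x.real, x.imag])

def sbl_estimate(Phi, t, n_iter=200, prune_tol=1e6):
    """Minimal (non-fast) RVM/SBL loop: alternate the Gaussian posterior with
    the standard hyperparameter re-estimation, pruning basis functions whose
    alpha diverges. Sketch only; the paper uses a fast RVM variant."""
    M, N = Phi.shape
    alpha = np.ones(N)
    sigma2 = 0.1 * np.var(t)
    keep = np.arange(N)
    for _ in range(n_iter):
        P = Phi[:, keep]
        A = np.diag(alpha[keep])
        Sigma = np.linalg.inv(A + P.T @ P / sigma2)        # posterior covariance
        mu = Sigma @ P.T @ t / sigma2                      # posterior mean
        gamma = 1.0 - alpha[keep] * np.diag(Sigma)         # well-determined factors
        alpha[keep] = gamma / (mu ** 2 + 1e-12)            # alpha re-estimation
        sigma2 = np.sum((t - P @ mu) ** 2) / max(M - gamma.sum(), 1e-12)
        keep = keep[alpha[keep] < prune_tol]               # prune diverged terms
    theta = np.zeros(N)
    P = Phi[:, keep]
    Sigma = np.linalg.inv(np.diag(alpha[keep]) + P.T @ P / sigma2)
    theta[keep] = Sigma @ P.T @ t / sigma2
    return theta

# toy usage: a mild static nonlinearity, x = 0.9*y + 0.05*y*|y|^2, plus weak noise
rng = np.random.default_rng(0)
y = rng.standard_normal(256) + 1j * rng.standard_normal(256)
x = 0.9 * y + 0.05 * y * np.abs(y) ** 2
x = x + 1e-3 * (rng.standard_normal(256) + 1j * rng.standard_normal(256))
Phi, t = to_real(mp_data_matrix(y, K=3, L=2), x)
theta = sbl_estimate(Phi, t)   # only the (k,l)=(0,0) and (2,0) terms should survive
```

Note that, as in (5), the real-valued coefficient vector stacks the real parts of w on top of the imaginary parts, so a real-valued true w should leave the second half of theta pruned to zero.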
Fig. 2. Modeling performance comparison of SBL and CS-based algorithms. (Axes: NMSE (dB); legend: SBL, LS.)

TABLE I. RUNNING TIME COMPARISON OF SBL AND CS-BASED ALGORITHMS. [Table body not recoverable.]

Fig. 3. VSG-VSA test bed (R&S FSP7) and the PA working at 3.5 GHz.

processing procedure. However, this decomposition may be complicated and inaccurate, especially when the models have high nonlinear order and long memory depth. This weakness limits their scope of use.

In the literature [8], the authors have mentioned that the CS-based PA behavioral model estimation algorithm does not require the columns of the data matrix (here \Phi) to be orthogonal, but this requires a sufficient number of samplings as a guarantee. In the proposed method there are no similar restrictions on the data matrix. Another advantage of the proposed SBL is that it does not need the pre-selected sparsity level as an input; SBL can obtain the sparse solution automatically. At the same time, the sparsity level can also be adjusted by changing the value of the halting criterion. When the training set is large enough, the performance of SBL and of the CS-based algorithms is nearly the same. But when the training set is very small, as with the 1000 samplings here, the performance of SBL is still very good, while the results of the CS-based algorithms are not stable and are sensitive to the input sparsity level. In this condition, the results of SBL, OMP and CoSaMP are plotted in Fig. 2.

It is clear that choosing a suitable sparsity level is very critical in CS-based methods. An alternative solution is to estimate the coefficients multiple times with different input sparsity levels, and then choose the optimal sparsity level by comparing the results. But that means that in the process of the CS-based method the parameters of the DPD will be updated many times and the output signal will be sampled repeatedly. This is not convenient and requires a lot of processing time. In addition, when the input sparsity is fixed, the needed running times of these algorithms are presented in the second column of Table I. The running time of the SBL algorithm is dominated by the estimation of the sparsity of the model coefficients by equation (13), and it is solved by a fast RVM algorithm in this work. Detailed analysis of this algorithm shows that it has complexity O(NS^2). The running times of OMP and CoSaMP derive from the literature [13]. Be mindful that some mathematical techniques could be employed to reduce the calculation cost of OMP and CoSaMP; in general, however, the running times in this table are computed in the standard way. Meanwhile, the time consumption on two concrete identification tasks is also listed in Table I. From these results we can see that when the size of the data matrix is small, the SBL algorithm is not as efficient as the CS-based algorithms. However, the computation cost of SBL is comparable to that of the CS-based algorithms when the size of the data matrix is large enough.

IV. EXPERIMENT

In this section, open-loop DPD with the SBL estimation algorithm is used to compensate a GaN PA that works at 3.5 GHz. The details of this PA can be found in [14]. In addition, linearization performance comparisons between SBL-DPD and LS-DPD are also presented with this PA. The peak output power and the gain of this PA are 43 dBm and 40 dB, respectively. The input of this PA is an LTE signal, which is a 16-QAM OFDM signal with 1200 occupied subcarriers within 18 MHz occupied bandwidth and 20 MHz channel bandwidth. The PAPR of this signal is 10.93 dB. The test bed and the DPD structure are shown in Fig. 3, jointly with the PA under test. At first, the original signal generated in MATLAB is fed to the PA through the SMJ100A, and the FSP7 provides the output samplings to the PC. Then the output signal is processed and the predistorted signal is generated in MATLAB. ACPR and EVM are used to evaluate the nonlinearity of the PA from the frequency-domain and time-domain points of view, respectively.

Changes of the PA characteristic between the characterization and linearization steps have a negative impact on the performance of DPD. To overcome this problem, Multiple-Step Iterative
[Figure residue: output spectra over 3.44–3.56 GHz comparing the original input, the original output, and the LS- and SBL-linearized outputs, together with an EVM (%) comparison of LS and SBL; captions not recoverable.]
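The ACPR and EVM metrics used to evaluate the experiment above can be illustrated with a minimal sketch. This is an assumed single-capture, rectangular-window formulation with hypothetical parameter names, not the instrument's definition; real ACPR/EVM measurements use standard-specific filtering, windowing, and averaging.

```python
import numpy as np

def acpr_db(x, fs, ch_bw=18e6, offset=20e6, nfft=4096):
    """Upper adjacent-channel power ratio from one rectangular-window
    periodogram: adjacent-band power over main-band power, in dB."""
    f = np.fft.fftshift(np.fft.fftfreq(nfft, 1.0 / fs))
    psd = np.abs(np.fft.fftshift(np.fft.fft(x[:nfft]))) ** 2
    main = np.abs(f) <= ch_bw / 2            # occupied channel around DC
    adj = np.abs(f - offset) <= ch_bw / 2    # upper adjacent channel
    return 10.0 * np.log10(psd[adj].sum() / psd[main].sum())

def evm_percent(ref, meas):
    """RMS error vector magnitude of measured vs. reference symbols,
    normalized by the RMS reference magnitude, in percent."""
    return 100.0 * np.linalg.norm(meas - ref) / np.linalg.norm(ref)
```

For example, a clean in-band tone yields a strongly negative ACPR, and scaling every reference symbol by 1.1 yields an EVM of 10%.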