fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JLT.2020.3029271, Journal of
Lightwave Technology
Manuscript submitted XX, XX. This work was supported in part by the R&D contract (FY2017~2020) “Wired-and-Wireless Converged Radio Access Network for Massive IoT Traffic” for radio resource enhancement (JPJ000254) by the Ministry of Internal Affairs and Communications (MIC), Japan.
P. Zhu is with The Graduate School for the Creation of New Photonics Industries (GPI), Hamamatsu, Japan (e-mail: zpk@gpi.ac.jp).
Y. Yoshida is with Network System Research Institute, National Institute of Information and Communications Technology (NICT), Koganei, Japan (e-mail: yuki@nict.go.jp).
K. Kitayama is with GPI, Japan, and also with Network System Research Institute, NICT, Japan (e-mail: kitayama@gpi.ac.jp).
0733-8724 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Glasgow. Downloaded on November 02,2020 at 02:18:50 UTC from IEEE Xplore. Restrictions apply.
TABLE I
SUMMARY OF A-ROF, ADX-ROF, AND D-ROF/CPRI
RoF-based FH scheme | A-RoF | ADX-RoF (this work) | D-RoF/CPRI
Req. FH bandwidth (notes 1, 2) | Down to 1M∙R | 1.5M∙R ~ 3M∙R | 30M∙R
Fidelity of radio waveform | lowest | medium | highest
DSP latency | WDM A-RoF: 0; FDM A-RoF [16]: <1µs | <500ns + framing latency (note 3) | framing latency
Source of FH cost | WDM A-RoF: WDM TRx array; FDM A-RoF: high-speed DAC & ADC, 1 TRx | ADX DSP block, 1 TRx | TRx array (or 1 ultra-broadband TRx)
Note 1: Here only payload is calculated.
Note 2: Assume PAM4 is used as optical interface.
Note 3: Framing latency is different from CPRI case, since data has been significantly compressed.
most existing ADX techniques are still verified by simulation or offline experiment, in which the processing delay of compressor/decompressor is difficult to evaluate. Real-time hardware demonstration is of particular importance to quantify ADX latency.
In a previous work [25], we introduced spatial-domain compression in the multi-user and/or massive MIMO scenarios, and showed the potential CR of 1/10 with EVM<1% by the space-time ADX concept. Recently, a preliminary field-programmable gate array (FPGA) based ADX was reported yet with only 5.5MHz throughput [21]. Note that if circuit throughput is not sufficient, large buffering latency would be induced when handling large wireless signal bandwidth (e.g., 5G NR) and/or heavy traffic load. Therefore, both low latency and high throughput have to be realized for a future-proof ADX-RoF based FH.
In this paper, we present the first real-time ADX enabled 40km RoF transport, with verified low latency and high throughput. We provide detailed ADX hardware design targeting low-latency, high-throughput and high-fidelity, which leads to a real-time ADX supporting 16-channel MIMO with 61.44MHz per-channel throughput. Based on the ADX prototyped on a single-chip field-programmable radio platform (i.e., Xilinx RFSoC), we experimentally demonstrate 16-channel MIMO analog reception of 5G NR-bandwidth signals, real-time ADX processing, and RoF transport over 40km in the 1.55µm band. With <500ns ADX latency overhead, <1.5% average EVM is achieved for 16-channel MIMO 5G NR-class radio signals.
The rest of the paper is organized as follows. In Section II, we describe the concept of fronthaul based on ADX-RoF. Section III presents in detail the key approaches of hardware design to realize low-latency, high-throughput and high-fidelity […]
[…] analog-to-digital converters (ADC). Subsequently, baseband MIMO data are converted to digital stream by the ADX, with […] Moreover, such conversion and line (i.e., optical channel) coding/modulation are jointly optimized, forming a converged wired-wireless link [25].
Fig. 1. Schematic of FH based on ADX-RoF.
The radio waveforms encapsulated in digital optical signals are then transported over FH network. To efficiently accommodate many cell sites, next-generation FH might be an Ethernet-like network, e.g., a layer-2 network [4, 22]. The ADX-RoF concept with digital optical format is highly compatible with packetization, multiplexing and switching. The signal received at baseband unit (BBU) is processed by ADX decoding for MIMO waveform reconstruction, followed by physical-layer DSP.
Table I shows a summary of A-RoF, ADX-RoF and D-RoF/CPRI based FH schemes. We assume the RRU is equipped with M antennas, while M/4 radio signal streams each with sampling rate of R are received; single-mode fiber is used for FH. ADX-RoF can be regarded as an intermediate of A-RoF and CPRI. It can significantly ease optical FH due to bandwidth efficiency close to A-RoF, while having higher signal fidelity thanks to the high-fidelity ADX DSP and reliable digital optical format. Moreover, in 5G and beyond, ADX-RoF instead of CPRI could provide a more practical performance reference for designing A-RoF schemes.
Figs. 2(a) and (b) show 2 possible ADX architectures: since MIMO data can be represented by a 2-dimensional (space and
[…] optimal from the information theoretic point of view. However, in this work, we mainly discuss the latter architecture from […] of optical link (e.g., by multi-level PAM), which completes the concept and makes joint optimization of compression and optical link feasible.
III. HARDWARE DESIGN OF LOW-LATENCY, HIGH-THROUGHPUT, HIGH-FIDELITY ADX
As discussed in Section I, it is crucial to quantify processing latency and throughput of FH ADX by real-time hardware […] fidelity ADX for real-time ADX-RoF based on FPGA hardware platform.
Fig. 3. (a) Schematic of FH spatial compressor based on subspace tracking SF. (b) Schematic of designed high-throughput spatial compressor hardware.
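As a software reference for the subspace-tracking spatial filter of Fig. 3, the following is a minimal floating-point sketch of the standard PAST recursion (the algorithm of [28], as it is commonly stated). It is not the authors' fixed-point FPGA design. The function names, the channel vector `a`, the noise level, and β=0.99 are assumptions chosen for illustration only. The parameter `T` mimics the decoupling of Fig. 3(b) in software: every sample is compressed, but W is updated only on every T-th sample.

```python
import random

def compress(W, x):
    """Forward filtering y = W^H x; W is stored as M rows of K complex taps."""
    M, K = len(W), len(W[0])
    return [sum(W[m][k].conjugate() * x[m] for m in range(M)) for k in range(K)]

def past_update(W, P, x, y, beta):
    """One PAST filter update; returns the new (W, P)."""
    M, K = len(W), len(P)
    h = [sum(P[i][j] * y[j] for j in range(K)) for i in range(K)]         # h = P y
    gamma = beta + sum(y[k].conjugate() * h[k] for k in range(K)).real   # beta + y^H h
    g = [hk / gamma for hk in h]                                           # gain vector
    e = [x[m] - sum(W[m][k] * y[k] for k in range(K)) for m in range(M)]  # e = x - W y
    W_new = [[W[m][k] + e[m] * g[k].conjugate() for k in range(K)] for m in range(M)]
    P_new = [[0j] * K for _ in range(K)]
    for i in range(K):
        for j in range(i + 1):  # lower triangle only, mirrored so P stays Hermitian
            v = (P[i][j] - g[i] * h[j].conjugate()) / beta
            P_new[i][j] = v
            if i != j:
                P_new[j][i] = v.conjugate()
    return W_new, P_new

def run(samples, K, beta=0.99, T=1):
    """Compress every sample; update W only on every T-th sample."""
    M = len(samples[0])
    W = [[1 + 0j if m == k else 0j for k in range(K)] for m in range(M)]
    P = [[1 + 0j if i == j else 0j for j in range(K)] for i in range(K)]
    errs = []  # relative reconstruction error ||x - W y||^2 / ||x||^2 per sample
    for t, x in enumerate(samples):
        y = compress(W, x)
        res = sum(abs(x[m] - sum(W[m][k] * y[k] for k in range(K))) ** 2 for m in range(M))
        errs.append(res / sum(abs(v) ** 2 for v in x))
        if t % T == 0:
            W, P = past_update(W, P, x, y, beta)
    return W, errs

# Toy data (assumed): one QPSK stream over a fixed 1x4 channel plus weak noise.
random.seed(1)
a = [1 + 0j, 0.5j, -0.3 + 0.2j, 0.7 + 0j]
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
samples = []
for _ in range(3000):
    s = random.choice(qpsk) / 2 ** 0.5
    samples.append([am * s + complex(random.gauss(0, 0.01), random.gauss(0, 0.01)) for am in a])

W, errs = run(samples, K=2, beta=0.99, T=1)
start = sum(errs[:20]) / 20      # average error while the filter is still adapting
end = sum(errs[-200:]) / 200     # average error after adaptation
print(f"relative error: first 20 samples {start:.4f}, last 200 samples {end:.6f}")
```

Calling `run(samples, K=2, T=10)` exercises the decoupled mode, in which the compressed output is still produced for every input sample while the update runs at one tenth of the sample rate.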
[…] matrix X, L being the number of signal samples in each compression operation. The best-known solution (in the least square sense) is to perform singular value decomposition (SVD) of X such as by principal component analysis (PCA) algorithm [27], but its feasibility for FH is hindered by the high complexity and latency due to batch-type processing.
We point out that it is possible to achieve low-complexity, even real-time implementable K-L transform for FH spatial compression, owing to the feature of this task: it’s not necessary to obtain individual principal components; instead, extracting the signal subspace as a whole suffices (i.e., only a part of eigen-structure is needed [28]). Based on this idea, we have proposed adaptive FH spatial compression (or “spatial filter”, SF) with complexity of O(MK) per iteration based on fast subspace tracking (FST) [25].
Fig. 3(a) shows the schematic of FST-based SF, which consists of a filtering/compression part and a feedback filter update part. The input MIMO signal at each time index is modeled as an M-by-1 vector x, while the SF is an M-by-K matrix W. The SF outputs the compressed K-by-1 (P≤K<M) vector $\mathbf{y} = \mathbf{W}^{H}\mathbf{x}$. Larger K corresponds to more spatial diversity gain (thus improved fidelity), but also increased complexity of SF and CR [25]. Coefficients of SF W are updated by different FST algorithms, such as by iteratively solving a certain minimization problem, e.g., Eq. (1) [28, 31], where E(*) and ||*|| denote expectation and 2-norm respectively.
[…]
$$\mathbf{W}(t) = \mathbf{W}(t-1) + \left[\mathbf{x}(t) - \mathbf{W}(t-1)\mathbf{y}(t)\right]\mathbf{g}(t)^{H} \qquad (5)$$
Although a lot of theoretical efforts have been made on FST techniques [28-36], to the authors’ knowledge, few works have investigated the latency or throughput of FST techniques from a hardware perspective. In the following, we present the design procedure of FST-based SF circuit.
First, in order to select a suitable algorithm for hardware implementation, we investigate computation details of candidate FST algorithms. In particular, we focus on the latency and multiplier consumption of filter update part, since the former largely impacts the throughput of SF while the latter is the dominant measure of complexity.
The summary of analysis is given in Table II. For the latency of filter update (column 5), the numbers of division (column 3) and square root (column 4) operations have the most significant impact. This is because to achieve high computation accuracy and fidelity of SF, division is usually implemented by Radix-2 algorithm [37] and square root is usually implemented by CORDIC algorithm [38]. As a result, their latencies can be >10 times higher than that of other arithmetic operations like multiplication and addition. On the other hand, the number of multipliers used in filter update is given in column 7. Conventionally, the complexity of FST was evaluated by the order of the number of (complex) multiplications. However, for hardware implementation, more accurate analysis is needed. Our estimation indicates that although all FST algorithms have
TABLE II
SUMMARY OF COMPUTATION DETAILS OF STATE-OF-THE-ART FAST SUBSPACE TRACKING (FST) ALGORITHMS
Algorithm | Ref. | Num. of sequential division | Num. of square root | Estimated latency of filter update (unit: clock cycle) | Num. of complex multipliers for filter update** | Estimated num. of DSP48 consumed (M=16, K=4)
PAST | [28] | 1 | 0 | 18* | 2MK+2.5K²+K | 688
OPAST | [29] | 2 | 1 | 40 | 3MK+2K²+1.5M+2.5K | 1032
CPAST | [30] | 1 | 0 | 26 | 3MK+5.5K²+2K | 1152
FAPI | [31] | 3 | 1 | 66 | 2MK+1.5M+5K²+6.5K | 1032
NIC | [32] | 1 | 0 | 19 | 2MK+3.5K²+K | 752
FDPM | [33] | 2 | 1 | 58 | 5.5MK+M+2K | 1504
FSDPM | [34] | 2 | 2 | 63 | 2MK+3M+1.5K | 728
FOOJA | [35] | 1 | 1 | 58 | 6MK+2.5K | 1576
SOOJA | [36] | 2 | 1 | 63 | 2MK+3M+1.5K | 728
Assuming 1 division needs 10 clock cycles [34]; 1 square root needs 16 clock cycles [35]; 1 complex multiplier equals 4 real multipliers.
*Experimental value. **Excluding “forward filtering” block which consumes the same complex multipliers (MK) for all algorithms.
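The DSP48 column of Table II can be reproduced from the multiplier formulas in the column before it. The sketch below transcribes those formulas and applies the table's stated rule that 1 complex multiplier equals 4 real multipliers. It further assumes one DSP48 per real multiplier, which is what the tabulated numbers imply. The names `FORMULAS` and `dsp48_estimate` are illustrative.

```python
# Complex-multiplier counts for the filter-update part, transcribed from Table II.
FORMULAS = {
    "PAST":  lambda M, K: 2 * M * K + 2.5 * K**2 + K,
    "OPAST": lambda M, K: 3 * M * K + 2 * K**2 + 1.5 * M + 2.5 * K,
    "CPAST": lambda M, K: 3 * M * K + 5.5 * K**2 + 2 * K,
    "FAPI":  lambda M, K: 2 * M * K + 1.5 * M + 5 * K**2 + 6.5 * K,
    "NIC":   lambda M, K: 2 * M * K + 3.5 * K**2 + K,
    "FDPM":  lambda M, K: 5.5 * M * K + M + 2 * K,
    "FSDPM": lambda M, K: 2 * M * K + 3 * M + 1.5 * K,
    "FOOJA": lambda M, K: 6 * M * K + 2.5 * K,
    "SOOJA": lambda M, K: 2 * M * K + 3 * M + 1.5 * K,
}

REAL_PER_COMPLEX = 4  # Table II: 1 complex multiplier equals 4 real multipliers

def dsp48_estimate(M, K):
    """Estimated DSP48 count per algorithm (assumes 1 DSP48 per real multiplier)."""
    return {name: round(REAL_PER_COMPLEX * f(M, K)) for name, f in FORMULAS.items()}

est = dsp48_estimate(M=16, K=4)
spread = max(est.values()) - min(est.values())
for name, count in sorted(est.items(), key=lambda kv: kv[1]):
    print(f"{name:6s} {count}")
print("spread between most and least expensive:", spread)
```

For M=16 and K=4 the spread between FOOJA and PAST is 888 DSP48s, which is the ">800" difference discussed in the text even though every algorithm is O(MK).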
the same order of complexity O(MK), the difference in consumed FPGA multipliers (i.e., DSP48) can be >800. As reference, medium-class FPGAs may have only about 600~2000 DSP48s in total.
Based on the analysis, we select PAST as the basic FST algorithm for SF hardware implementation, which has the lowest latency of filter update and lowest complexity. However, the direct implementation of PAST (based on Fig. 3(a)) results in a throughput of only 5.5 MHz [21], which is still much smaller than 5G NR bandwidth [39]. It thus requires buffers (e.g., FIFO) at ADX inputs with high latency proportional to the wireless packet size. In practice, high circuit throughput is crucial to eliminate such buffering latency. To achieve 5G NR-class throughput, we propose to decouple the forward filtering part and feedback filter update part, as shown in Fig. 3(b). The forward filtering part (i.e., matrix-vector multiplication) operates at a latency of 2 clock cycles (CC) or ≈17ns with a throughput of FPGA clock rate. PAST-based filter update part works at a throughput of 6.144MHz. SF is updated every T input sample period. T is positively correlated with throughput of the spatial compressor, and negatively correlated with converging speed of the filter update part. We selected T=10 for the throughput of 61.44MHz. The filter update part converges within ≈102T samples, or ≈20µs in our implementation, which is still much faster than the variation speed of MIMO channel (typically, ms-level [40]). Thus, the SF circuit can deal with dynamic cases.
Next, the internal precision of the SF circuit is designed. There is a trade-off between computation precision and FPGA resource consumption. Also, it is found that the precision requirement of matrices W and P is higher than other internal variables. In this work, 25-digit precision is designed for matrices W and P, and 18-digit precision is used for other internal variables.
Based on the discussions above, we designed a 16-by-4 SF circuit with processing latency of 17ns and per-channel throughput of 61.44MHz at 122.88MHz clock. The detailed block diagram of designed filter update part is shown in Fig. 4(a). Before hardware prototyping, we assess the performance of the designed SF circuit by Xilinx Vivado 2018.3 FPGA Simulator. The 16-channel inputs of the SF circuit are generated in Matlab through P OFDM signal generation, P×16 MIMO channel propagation, noise adding, and 12-bit digitizing. Spatial multiplexing MIMO technology without precoding is assumed. Independent and identically distributed (i.i.d.) Rayleigh fading is assumed for MIMO channel, emulating a […]
Fig. 4. (a) Block diagram of designed PAST-based filter update circuit of the spatial compressor (or SF). Matrices, vectors and scalars are marked by bold upper-case letters, bold lower-case letters and regular letters. Operator Tri{*} indicates that only the lower triangular part of the matrix is calculated, and its Hermitian transposed version is copied to the upper triangular part [25]. β=0.9999. (b) Simulation results of EVM versus wireless SNR of the designed SF circuit. Wireless SNR: ratio of the total power of 16-channel signals to the total power of added noise. Dashed lines: demodulated EVM without SF compression.
TABLE III
NUMEROLOGY OF THE TEST 5G-NR-LIKE OFDM SIGNAL
Parameter | Value / description
Num. of independent streams (P) | 1~4
Modulation format | 1024QAM
IFFT size | 4096
Num. of data subcarriers | 3300
Cyclic prefix length (samples) | 512
Subcarrier spacing (kHz) | 15
Signal bandwidth (MHz) | 50
Payload length | 2 OFDM symbols
MIMO training symbol (TS) format | QPSK-OFDM [39], space-time block coding (STBC) structure [41]
[…] measuring the total power of 16-channel signals, calculating the noise power based on the wireless SNR value, and then generating and adding white Gaussian noise component to each sample of each channel. […] decompression and demodulation are based on Matlab. The numerology of test signals is given in Table III. 768-sample preamble (copy of a portion of payload) is added in front of TS to assist the convergence of SF. The result of EVM versus wireless SNR is shown in Fig. 4(b). The EVM curves without any compression are also plotted as reference. The SF circuit shows negligible compression penalty until the EVM region of <0.4%, which nevertheless has an insignificant impact regarding the EVM target of 2.5% for 1024QAM.
B. Temporal compressor design
[…] ADPCM is shown in Fig. 5. However, existing ADPCM hardware has throughput of only <1MHz (commercial) [43] or ~7MHz (home-made) [44]. This would induce severe buffering latency for wideband 5G NR-class radio signals.
To achieve high-throughput temporal compressor, 2 approaches are proposed. The first is to use a bank of subbanding filters (SBF) before quantizer. In audio field, SBF […]
Fig. 6. (a) Magnitude response of low-pass (solid) and high-pass (dotted) filter of 2-band 16-tap Johnston type-B QMF. (b) Schematic of ADPCM encoder with halved-rate prediction. Reg.: register. (c) Schematic of the entire temporal compressor.
[Fig. 7, panels (b) and (c): axes “Relat. power of input signal (dB)” and “Largest power diff. (dB)”.]
[…] high-pass SBFs.
The second approach is to improve the throughput of single ADPCM circuit. As shown in Fig. 5, the throughput of ADPCM is limited by the latency in 2 loops: (i) (inner) loop of quantization--gain adaptation and (ii) (outer) loop of quantization--prediction. To reduce latency in loop (i), a simplified gain-adaptive quantizer (compared with standard
Fig. 8. Experimental setup. IF: intermediate frequency. RAM: random access memory. Inset (i): photo of RFSoC-based prototype; (ii): Measured spectra of 3 IF
channels among 16.
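The ≈4Gb/s line rate of the experimental link and the 13.3% compression ratio quoted for it can be cross-checked with simple arithmetic. The sketch below assumes that each of the K=4 compressed channels carries 8-bit I and 8-bit Q samples at 61.44MS/s, and that the uncompressed reference is 15-bit I/Q on all 16 antennas (a CPRI-style resolution, consistent with the 30M∙R entry of Table I). Both bit widths are assumptions of this example, and line-coding and framing overhead are ignored.

```python
def line_rate_bps(channels, bits_per_component, sample_rate_hz):
    """Payload bit rate for complex (I and Q) samples."""
    return channels * 2 * bits_per_component * sample_rate_hz

FS_HZ = 61.44e6  # per-channel sample rate used in the paper

adx_bps = line_rate_bps(channels=4, bits_per_component=8, sample_rate_hz=FS_HZ)
ref_bps = line_rate_bps(channels=16, bits_per_component=15, sample_rate_hz=FS_HZ)
cr = adx_bps / ref_bps

print(f"ADX payload rate : {adx_bps / 1e9:.3f} Gb/s")
print(f"Reference rate   : {ref_bps / 1e9:.3f} Gb/s")
print(f"Compression ratio: {100 * cr:.1f} %")
```

Under these assumptions the payload rate is about 3.93Gb/s and the ratio is 32/240, i.e., 13.3%.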
G.726 audio ADPCM) is designed [47], in which exponential and logarithm operations on the signal are removed. The quantization latency is reduced from 5 CC [44] to 2 CC. To reduce latency in loop (ii), we halve the rate of prediction from once per sample to once every 2 samples (Fig. 6(b)). Overall, 30.72MHz ADPCM throughput can be achieved at fClk=122.88MHz. Combined with 2-band SBF, 61.44MHz throughput is achieved. In addition, the temporal compressor circuit has μs-level convergence time, which is also much shorter than the MIMO channel coherence time.
In addition, after the simplification described above, we design a higher-resolution compressor to support 5G formats like 1024QAM. We use a Matlab model to seek a suitable quantizer resolution for 1024QAM. Fig. 7(a) shows Matlab simulation results of single-channel NR-like OFDM signal compression by using the designed 61.44MHz-throughput temporal compressor. The performance of low-throughput ADPCM [44] with FIFO is also plotted. It is seen that 1024QAM cannot be supported by conventional 5 or 6-bit ADPCM [42, 44]. This highlights the importance of our high-resolution design. Considering proper EVM margin, 8-bit resolution is selected.
Based on the discussions above, we designed the 61.44MHz temporal compressor circuit under the clock rate of 122.88MHz. Its schematic is shown in Fig. 6(c). The internal precisions are 18-digit and 20-digit. The FPGA simulation result of the 8-bit temporal compressor is also plotted in Fig. 7(a). The performance compromise that appears in the 6-bit quantization case is due to the techniques used to achieve NR-class throughput (reduced prediction rate of ADPCM and reduced effectiveness of differential operation after SBF).
The performance of the temporal compressor circuit is evaluated by Vivado FPGA simulator. Fig. 7(b) shows EVM versus input signal power of the 8-bit temporal compressor, where 0-dB relative power of input signal corresponds to the lowest EVM. Input power variation of >36dB can be tolerated without exceeding 1024QAM threshold. Such a wide dynamic range is important for our ADX architecture, since the temporal compressor processes SF outputs, which have different power among channels. Fig. 7(c) depicts the largest power difference among 4 output channels of 16-by-4 SF. Specifically, 500 MIMO channel realizations were simulated to obtain 500×4=2000 power values and the “largest power difference” is the difference between the largest value and the smallest value. The largest power differences of cases of 1~4 signal streams are 33.1dB, 21dB, 15.4dB and 7.4dB respectively, which can all be handled by the >36dB dynamic range of the temporal compressor. In Figs. 7(a)-7(c), the signal bandwidth is 50MHz as in Table III. In addition, Fig. 7(d) shows the EVM performance versus signal bandwidth, varied by changing the number of data subcarriers and maintaining 15kHz subcarrier spacing. Although the current temporal compressor architecture becomes quite different from conventional ADPCM, certain bandwidth adaptivity can still be enjoyed as EVM is improved when the signal bandwidth is less than 30MHz.
IV. EXPERIMENTAL SETUP, RESULTS AND DISCUSSION
By cascading the designed SF and temporal compressor, a 16-channel ADX with 61.44MHz per-channel throughput is designed. The ADX is then prototyped on the real-time hardware. The resource consumption is given in Table IV. The hardware selected in this work is Xilinx RFSoC (XCZU29DR-2FFVF1760E), which is a single-chip field-programmable radio platform that integrates 16-channel, GHz-bandwidth digital-to-analog converter (DAC) and analog-to-digital converter (ADC) arrays, digital radio frequency (RF) chain and FPGA. By leveraging this platform, we were able to demonstrate in real time not only the ADX but also the 16-channel analog radio reception with real effects of DAC and ADC [55]. Moreover, the on-chip integrated DACs and ADCs are tightly synchronized with good guarantee, which largely resolves the synchronization, timing offset and jitter issues that usually exist in prototypes using discrete components/devices [49].
TABLE IV
RESOURCE CONSUMPTION OF THE REAL-TIME ADX
 | LUT | Block RAM | DSP48
Spatial compressor | 21027 | 15 | 1164
Temporal compressor | 72741 | 0 | 288
Total | 93768 | 15 | 1452
Available | 425280 | 1080 | 4272
Utilization | 22% | 1.4% | 34%
We conducted ADX-RoF-based FH transport experiment based on the real-time ADX prototype. The setup is depicted in Fig. 8. In offline emulation, P independent 5G NR-like OFDM streams were transmitted over P×16 MIMO i.i.d. Rayleigh fading channel, resulting in 16-channel signals. The signal
[Plot residue: panels (a) and (b) with x-axis “Rx Optical Power”; y-axes “BER”, “EVM (%)” and “20*log10(EVM) (dB)”; curves for BtB and 40km; 1024QAM threshold marked.]
Fig. 9. Experimental results. (a) EVM vs. wireless SNR. Matlab floating-point
simulation results are also plotted. (b) Constellations of 4 signal streams
(wireless SNR=50dB).
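The results are quoted as EVM in percent, and the plot axes also use 20*log10(EVM) in dB. The helper below sketches one common way to compute both. It assumes RMS EVM normalized to the average power of the reference constellation points, which is a convention chosen for this example; the function names and the toy sample values are illustrative.

```python
import math

def evm_percent(received, reference):
    """RMS EVM in percent, normalized to the average reference power."""
    err = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    ref = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(err / ref)

def evm_db(evm_pct):
    """Convert EVM in percent to 20*log10(EVM) in dB."""
    return 20.0 * math.log10(evm_pct / 100.0)

# Toy example: two unit-power symbols, each displaced by 0.01.
rx = [1.01 + 0j, -1 + 0.01j]
tx = [1 + 0j, -1 + 0j]
toy_evm = evm_percent(rx, tx)
target_db = evm_db(2.5)  # the 2.5% EVM target used for 1024QAM in the text

print(f"toy EVM = {toy_evm:.3f} %")
print(f"2.5 % EVM corresponds to {target_db:.2f} dB")
```

With this conversion, the 2.5% target for 1024QAM sits at about -32dB on the logarithmic axis.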
Fig. 11. Experimental results: latency break-down of the demonstrated ADX-RoF link.
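The latency figures reported for the link can be tallied as a short worked example. The sketch below uses only values stated in the text (131ns compression split into its four parts, 359ns decompression, 1.98µs RF chain, 16ns line coding, ~260ns GTY, ~196.15µs fiber, 250µs budget, 122.88MHz clock). The variable names are illustrative, and the GTY and fiber values are approximate in the source.

```python
F_CLK_HZ = 122.88e6  # FPGA clock rate

def cycles_to_ns(cycles):
    """Duration of a number of clock cycles at F_CLK_HZ, in nanoseconds."""
    return cycles / F_CLK_HZ * 1e9

# ADX compression latency, itemized as in the text (ns).
compress_ns = {"spatial ADX": 16, "SBF": 74, "ADPCM": 33, "register": 8}
adx_compress_ns = sum(compress_ns.values())           # 131 ns
adx_decompress_ns = 359                               # from FPGA simulation
adx_total_ns = adx_compress_ns + adx_decompress_ns    # one-way ADX overhead

# Link latency break-down (us), excluding the offline BBU part.
link_us = {
    "RF chain": 1.98,
    "ADX compression": adx_compress_ns / 1000,
    "line coding": 0.016,
    "GTY": 0.260,
    "fiber (40km)": 196.15,
}
link_total_us = sum(link_us.values())
BUDGET_US = 250.0
adx_share_of_fiber = (adx_total_ns / 1000) / link_us["fiber (40km)"]

print(f"16 CC = {cycles_to_ns(16):.1f} ns, 44 CC = {cycles_to_ns(44):.1f} ns")
print(f"ADX one-way overhead: {adx_total_ns} ns")
print(f"link total: {link_total_us:.3f} us of a {BUDGET_US:.0f} us budget")
print(f"ADX overhead relative to fiber delay: {100 * adx_share_of_fiber:.2f} %")
```

The cycle counts convert to about 130.2ns and 358.1ns, within roughly 1ns of the quoted 131ns and 359ns, and the ADX overhead is about 0.25% of the fiber propagation delay.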
[…] 6 link BER without exceeding 1024QAM EVM threshold. Thus, one can adopt simple link FEC or even omit FEC as a joint design of wireless part (compression) and wired part (channel coding). In addition, Fig. 10(c) shows complementary cumulative distribution function (CCDF) of EVM when received optical power is -17dBm after 40km ADX-RoF transport. Although the impact of MIMO fading channel has been included in this demonstration, EVM distribution was reasonably concentrated. All measured EVMs met 1024QAM requirement.
Finally, the ADX latency experimentally measured by Vivado Integrated Logic Analyzer (ILA) is shown on the left side of Fig. 11. The ADX had a latency of 131ns (i.e., 16 CC), including 16ns from spatial ADX, 74ns from SBF, 33ns from ADPCM, and 8ns register. Moreover, according to Vivado FPGA simulation, the latency of ADX decompression at BBU is 359ns (i.e., 44 CC). In total, the ADX-induced one-way latency overhead is (131+359)ns<500ns. Fig. 11 also shows latency break-down of the ADX-RoF link (except the offline BBU part), including measured RF chain delay (1.98µs), line coding latency (16ns), GTY latency (~260ns), and fiber propagation delay of ~196.15µs as shown by the captured waveforms in optical BtB and 40km transmission cases. It is clear that our real-time ADX has negligible latency overhead compared with either fiber propagation delay or 3GPP one-way transport network latency budget of 250µs [5].
In the experiments, we demonstrated that even for the most challenging case of 1024QAM (highest modulation order in 5G NR), the real-time ADX can support its fronthauling with a CR of only 13.3% and ultra-low processing latency of <500ns. To support this challenging scenario, not only high-fidelity ADX was needed, but also high-performance DACs and ADCs were used [51, 52]. If the DACs/ADCs with lower effective number of bits (ENoB) are employed, it would be beneficial for the cost and power consumption of the wireless system. Meanwhile, the received signal quality prior to FH transport would be degraded. In this case, the wireless system may adopt lower-order modulation format, also allowing more tolerance to ADX compression error. Thus, simpler ADX with smaller CR can be used such as the 7-bit or 6-bit temporal compressor shown in Fig. 7(a), which further relaxes FH bandwidth requirement. On the other hand, an extreme scenario toward ENoB reduction is MIMO systems with few-bit (e.g., 1~3-bit) data converters [53, 54]. The ADX design for such scenario is an interesting research direction.
In this work, we focus on ADX-RoF in uplink fronthaul, considering fading, noise, and DAC/ADC effects. For the downlink on the other hand, if BBU-side digital precoding and Option 8 split are assumed, in principle MIMO data compression can be modelled as a problem of multivariate compression [23]. In this case, similar ADX technique is also applicable for downlink, and the achievable CR depends on the number of MIMO layers, specific precoding technique and modulation order. Compared with the uplink, the spatial compression in the downlink may not be done blindly, since BBU already possesses MIMO channel information via reference signals. Detailed investigation of downlink ADX-RoF will be a subject of future work.
V. CONCLUSION
RoF technologies have been actively investigated in the application of FH in C-RAN. Targeting 5G and beyond FH, this paper focuses on an ADX-RoF solution based on low-latency MIMO data compression, from a hardware design perspective. We have proposed and discussed in detail the ADX hardware design, which targets low-latency, high-throughput, high-fidelity, and large-scale MIMO support. With the newly-designed ADX prototyped on a single-chip field-programmable radio platform, we have succeeded in real-time 16-channel MIMO radio reception, ADX processing and RoF transport, where the ≈4Gb/s OOK signal encapsulates 1024QAM radio signals with 5G-NR-class bandwidth at a compression ratio of 13.3%. The decompressed EVM after 40km transport is well below 3GPP 1024QAM threshold, while one-way ADX latency overhead (including compression and decompression latency) was <500ns. The real-time demonstration results have shown a good potential of ADX-RoF for next-generation FH, as well as other applications such as dedicated indoor access network based on distributed antenna system (DAS) [48].
REFERENCES
[1] Y. Yoshida, “Mobile Xhaul evolution: enabling tools for a flexible 5G Xhaul network,” in Proc. OFC, San Diego, USA, 2018, paper Tu2K.1.
[2] X. Ge, S. Tu, G. Mao, C.-X. Wang, and T. Han, “5G ultra-dense cellular networks,” IEEE Wireless Commun., vol. 23, no. 1, pp. 72-79, 2016.
[3] 3GPP, “Study on scenarios and requirements for next generation access technologies,” TR 38.913, V15.0.0, 3GPP, Jun. 2018.
[4] C.-L. I, H. Li, J. Korhonen, J. Huang, L. Han, “RAN revolution with NGFI (xhaul) for 5G,” J. Lightw. Technol., vol. 36, no. 2, pp. 541-550, 2018.
[5] 3GPP, “Study on new radio access technology: Radio access architecture and interfaces,” TR 38.801, V14.0.0, Mar. 2017.
[6] CPRI V7.0, technical report, 2015.
[7] N. Shibata, T. Tashiro, S. Kuwano, N. Yuki, Y. Fukada, J. Terada, and A. Otaka, “Performance evaluation of mobile front-haul employing Ethernet-based TDM-PON with IQ data compression,” J. Opt. Comm. Netw., vol. 7, no. 11, pp. B16-B22, 2015.
[8] M. Xu, F. Lu, J. Wang, L. Cheng, D. Guidotti, and G.-K. Chang, “Key technologies for next-generation digital RoF mobile fronthaul with statistical data compression and multiband modulation,” J. Lightw. Technol., vol. 35, no. 17, pp. 3671-3679, 2017.
[9] S. Kim, H. Chung, and S. Kim, “Experimental demonstration of CPRI data compression based on partial bit sampling for mobile front-haul link in C-RAN,” in Proc. OFC, Anaheim, USA, 2016, paper W1H.5.
[10] L. Ramalho, M. N. Fonseca, A. Klautau, C. Lu, M. Berg, E. Trojer, and S. Höst, “An LPC-based fronthaul compression scheme,” IEEE Commun. Lett., vol. 21, no. 2, pp. 318-321, 2017.
[11] H. Li, X. Li, and M. Luo, “Improving performance of differential pulse coding modulation based digital mobile fronthaul employing noise shaping,” Opt. Express, vol. 26, no. 9, pp. 11407-11417, 2018.
[12] L. Zhang, A. Udalcovs, R. Lin, O. Ozolins, X. Pang, L. Gan, R. Schatz, M. Tang, S. Fu, D. Liu, W. Tong, S. Popov, G. Jacobsen, W. Hu, S. Xiao, and J. Chen, “Toward terabit digital radio over fiber systems: architecture and key technologies,” IEEE Commun. Magazine, vol. 57, no. 4, pp. 131-137, 2019.
[13] J. Zhang, M. Xu, J. Wang, F. Lu, L. Cheng, H. Cho, K. Ying, J. Yu, and G.-K. Chang, “Full-duplex quasi-gapless carrier aggregation using FBMC in centralized radio-over-fiber heterogeneous networks,” J. Lightw. Technol., vol. 35, no. 4, pp. 989-996, Feb. 2017.
[14] P. T. Dat, A. Kanno, and T. Kawanishi, “Radio-on-radio-over-fiber: efficient fronthauling for small cells and moving cells,” IEEE Wireless Commun., vol. 22, no. 5, pp. 67-75, 2015.
[15] S. Ishimura, A. Bekkali, K. Tanaka, K. Nishimura, and M. Suzuki, “1.032-Tb/s CPRI-equivalent rate IF-over-fiber transmission using a parallel IM/PM transmitter for high-capacity mobile fronthaul links,” J. Lightw. Technol., vol. 36, no. 8, pp. 1478-1484, 2018.
[16] X. Liu, H. Zeng, N. Chand, and F. Effenberger, “Efficient mobile fronthaul via DSP-based channel aggregation,” J. Lightw. Technol., vol. 34, no. 6, pp. 1556-1564, 2016.
[17] B. G. Kim, H. Kim, and Y. C. Chung, “Impact of multipath interference in the performance of RoF-based mobile fronthaul network implemented by using DML,” J. Lightw. Technol., vol. 35, no. 2, pp. 145-151, Jan. 2017.
[18] C. Lim, Y. Tian, C. Ranaweera, T. A. Nirmalathas, E. Wong, and K.-L. Lee, “Evolution of radio-over-fiber technology,” J. Lightw. Technol., vol. 37, no. 6, pp. 1647-1656, Mar. 2019.
[19] A. Kipnis, Y. Eldar, and A. Goldsmith, “Analog-to-digital compression: A new paradigm for converting signals to bits,” IEEE Signal Processing Magazine, vol. 35, no. 3, pp. 16-39, Mar. 2018.
[20] 3GPP TS 36.104, V15.2.0, Mar. 2018.
[21] P. Zhu, Y. Yoshida, and K. Kitayama, “FPGA demonstration of adaptive space-time compression towards high-fidelity, low-latency 5G fronthaul,” in Proc. ECOC, Dublin, Ireland, 2019, paper M1C.2.
[22] P. Assimakopoulos, J. Zou, K. Habel, J.-P. Elbers, V. Jungnickel, and N. J. Gomes, “A Converged Evolved Ethernet Fronthaul for the 5G Era,” IEEE J. Selected Areas in Commun., vol. 36, no. 11, pp. 2528-2537, Nov. 2018.
[23] S.-H. Park, O. Simeone, O. Sahin, and S. Shamai, “Fronthaul compression for cloud radio access networks: signal processing advances inspired by network information theory,” IEEE Signal Processing Magazine, vol. 31, no. 6, pp. 69-79, Nov. 2014.
[24] L. Theis, W. Shi, A. Cunningham, and F. Huszár, “Lossy image compression with compressive autoencoders,” arXiv preprint arXiv:1703.00395, 2017.
[25] P. Zhu, Y. Yoshida, and K. Kitayama, “Adaptive space-time compression for efficient massive MIMO fronthauling,” Opt. Express, vol. 26, no. 18, pp. 24098-24113, 2018.
[26] E. Bjornson, E. G. Larsson, and M. Debbah, “Massive MIMO for maximal spectral efficiency: how many users and pilots should be allocated?” IEEE Trans. Wireless Commun., vol. 15, no. 2, pp. 1293-1308, 2016.
[27] J. Choi, B. L. Evans, and A. Gatherer, “Space-time fronthaul compression of complex baseband uplink LTE signals,” in Proc. ICC, Kuala Lumpur, Malaysia, 2016, pp. 1-6.
[28] B. Yang, “Projection approximated subspace tracking,” IEEE Trans. Sig. Proc., vol. 43, no. 1, pp. 95-107, 1995.
[29] K. Abed-Meraim, A. Chkeif, and Y. Hua, “Fast orthonormal PAST algorithm,” IEEE Sig. Proc. Lett., vol. 7, no. 3, pp. 60-62, 2000.
[30] A. Valizadeh and M. Karimi, “Fast subspace tracking algorithm based on the constrained projection approximation,” EURASIP Journal on Advances in Signal Processing, ID 576972, 2009.
[31] R. Badeau, B. David, and G. Richard, “Fast approximated power iteration subspace tracking,” IEEE Trans. Sig. Proc., vol. 53, no. 8, pp. 2931-2940, 2005.
[32] Y. Miao and Y. Hua, “Fast subspace tracking and neural network learning by NIC,” IEEE Trans. Sig. Proc., vol. 46, no. 7, pp. 1967-1979, 1998.
[33] X. Doukopoulos and G. Moustakides, “The fast data projection method for stable subspace tracking,” in Proc. EUSIPCO, Antalya, Turkey, 2005.
[34] R. Wang, M. Yao, D. Zhang, and H. Zou, “A novel orthonormalization matrix based fast and stable DPM algorithm for principal and minor subspace tracking,” IEEE Trans. Sig. Proc., vol. 60, no. 1, pp. 466-472, 2012.
[35] S. Bartelmaos and K. Abed-Meraim, “Principal and minor subspace tracking: algorithms & stability analysis,” in Proc. ICASSP, Toulouse, France, 2006, vol. III, pp. 560-563.
[36] R. Wang, M. Yao, D. Zhang, and H. Zou, “Stable and orthonormal OJA algorithm with low complexity,” IEEE Sig. Proc. Lett., vol. 18, no. 4, pp. 211-214, 2011.
[37] Xilinx, Divider Generator v5.1 LogiCORE IP Product Guide, 2016.
[38] Xilinx, CORDIC v6.0 LogiCORE IP Product Guide, 2017.
[39] 3GPP, “NR; Physical channels and modulation,” TS 38.211, V15.3.0, Sept. 2018.
[40] E. Bjornson, L. Van der Perre, S. Buzzi, and E. Larsson, “Massive MIMO in sub-6 GHz and mmWave: Physical, practical, and use-case differences,” IEEE Wireless Commun., vol. 26, no. 2, pp. 100-108, Apr. 2019.
[41] G. Stuber, J. Barry, S. Mclaughlin, Y. Li, M. Ann Ingram, and T. Pratt, “Broadband MIMO-OFDM wireless communications,” Proceedings of the IEEE, vol. 92, no. 2, pp. 271-294, Feb. 2004.
[42] ITU-T Recommendation G.726, 1990.
[43] https://www.oreganosystems.at/products/ip-cores/adpcm
[44] P. Zhu, Y. Yoshida, and K. Kitayama, “Real-time FPGA demonstration of low-latency adaptive fronthaul compression based on adaptive differential pulse code modulation,” in Proc. ACP, Hangzhou, China, 2018, paper S3K.7.
[45] M. Smyth and S. Smyth, “APT-X100: A low-delay, low bit-rate, sub-band ADPCM audio coder for broadcasting,” in Proc. Audio Engineering Society Conference: 10th International Conference, 1991.
[46] U. Meyer-Baese, Digital signal processing with field programmable gate arrays, vol. 65. Berlin: Springer, 2007.
[47] P. Zhu, Y. Yoshida, and K. Kitayama, “FPGA demonstration of adaptive low-latency high-fidelity analog-to-digital compression for beyond-5G wireless-wired conversion,” in Proc. OECC/PSC, Fukuoka, Japan, 2019, paper TuF3-4.
[48] J. Kim, M. Sung, E.-S. Kim, S.-H. Cho, and J. H. Lee, “4×4 MIMO architecture supporting IFoF based analog indoor distributed antenna system for 5G mobile communications,” Opt. Express, vol. 26, no. 22, pp. 28216-28227, Oct. 2018.
[49] B. Yang, Z. Yu, J. Lan, R. Zhang, J. Zhou, and W. Hong, “Digital beamforming-based massive MIMO transceiver for 5G millimeter-wave communications,” IEEE Transactions on Microwave Theory and Techniques, vol. 66, no. 7, pp. 3403-3418, Jul. 2018.
[50] D. Tse and P. Viswanath, Fundamentals of wireless communication. Cambridge University Press, 2005.
[51] Xilinx, “Understanding key parameters for RF-sampling data converters,” WP509, Feb. 2019.
[52] Xilinx, “Zynq UltraScale+ RFSoC data sheet: DC and AC switching characteristics,” DS926, Jun. 2019.
[53] S. Jacobsson, G. Durisi, M. Coldrey, U. Gustavsson, and C. Studer, “Throughput analysis of massive MIMO uplink with low-resolution ADCs,” IEEE Trans. Wireless Commun., vol. 16, no. 6, pp. 4038-4051, Jun. 2017.
[54] S.-N. Hong and S. Kim, “Machine learning-based nonlinear MIMO detector,” in Machine Learning for Future Wireless Communications, John Wiley & Sons, 2020, pp. 181-195.
[55] P. Zhu, Y. Yoshida, and K. Kitayama, “<500ns Latency overhead analog-to-digital-compression radio-over-fiber (ADX-RoF) transport of 16-channel MIMO, 1024QAM signals with 5G NR bandwidth,” in Proc. OFC 2020, paper M2F.7.