Contents
Introduction
W-CDMA
Channel Estimation and Detection
DSP Implementation
ASIC Implementation
Other Current Projects
Future Work
Algorithms Implementation issues
Implementation Issues
Important because of:
Real-time operation
Low power
Mobility / size
DSPs
Signal Processing Communications
ASICs / FPGAs
Speed / Size
W-CDMA
[Figure: User 1 and User 2 transmitting, with the direct path of each marked]
Channel Estimation
The channel must be known for proper detection
Delays and amplitudes of every user and every path
Detection
Use knowledge of channel for detection of Data bits
W-CDMA Standards
Downlink
Channel Estimation: Common Pilot
Detection: Rake Receivers / Equalizers
Channel Estimation
Uplink
Time Multiplexed
Maximum Likelihood, Subspace
Downlink
Continuous
LMS Based Adaptive
Multiuser Detection
Optimal
MLSE (Viterbi)
Sub-optimal
Linear
MAI Whitening
Decorrelating, MMSE
Interference Cancellation
Serial (SIC), Parallel (PIC)
Neural Network
Base-Station Receiver
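[Figure: base-station receiver block diagram with Antenna, Demodulator, Demux (Data and Pilot outputs), Channel Estimator, Multiuser Detector and Decoder]
[Figure: the received signal R(t) is filtered into outputs y1, y2, ..., yK, which pass through the Multiuser Detector and Channel Decoder to give the decisions d1', d2', ..., dK' for Users 1 to K]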
Channel Estimator
Send a time-multiplexed preamble (pilot).
Extract the channel properties: compare the received signal with the known pilot and form the estimate.
Keep the estimate for the remaining data bits (static channel).
Repeat the preamble every frame if there is no tracking.
$Y = R_{br} R_{bb}^{-1}$
where $R_{br} = \frac{1}{L} \sum_{i=1}^{L} r_i b_i^{H}$ and $R_{bb} = \frac{1}{L} \sum_{i=1}^{L} b_i b_i^{H}$
Offline
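The pilot-based estimate $Y = R_{br} R_{bb}^{-1}$ can be sketched numerically as follows. This is a minimal illustration only, assuming real-valued ±1 pilot bits, a noiseless channel and tiny dimensions; the helper names (`matmul`, `invert`, `estimate_channel`) and the example channel matrix are hypothetical and do not come from the slides.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("R_bb is singular: pilot sequence too short")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def estimate_channel(pilots, received):
    """pilots: L known bit vectors (length K); received: L vectors (length N)."""
    L, K, N = len(pilots), len(pilots[0]), len(received[0])
    # R_bb = (1/L) sum b b^T depends only on the known pilot,
    # so its inverse could be precomputed.
    R_bb = [[sum(b[i] * b[j] for b in pilots) / L for j in range(K)]
            for i in range(K)]
    # R_br = (1/L) sum r b^T correlates the received signal with the pilot.
    R_br = [[sum(r[i] * b[j] for r, b in zip(received, pilots)) / L
             for j in range(K)] for i in range(N)]
    return matmul(R_br, invert(R_bb))

# Hypothetical 3x2 channel (N = 3 chips, K = 2 users) and 4 pilot symbols.
Y_true = [[1.0, 0.5], [0.2, -0.3], [0.0, 0.8]]
pilots = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
received = [[sum(Y_true[n][k] * b[k] for k in range(2)) for n in range(3)]
            for b in pilots]
Y_est = estimate_channel(pilots, received)
```

With noise present, the estimate would only approximate the true channel, and a longer preamble would average the noise down.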
[Equations not recoverable: per-user expressions involving U_k, R_k and y]
Real Square roots. Solving quadratic equation for least squares fit.
$R = D + S + S^{T}$

$z^{(l)} = z^{(l-1)} - (S + S^{T}) A\, x^{(l-1)}$
where $x^{(l-1)} = d^{(l-1)} - d^{(l-2)}$ and $x_k \in \{0, +2, -2\}$

$B = (S + S^{T}) A$

Dot Product:
$z_i^{(l)} = z_i^{(l-1)} - \sum_{j} B_{ij}\, x_j^{(l-1)}$
Computed iteratively
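The saving from differencing can be illustrated with a small sketch. It assumes hard ±1 decisions, a matched-filter output `y`, and a precomputed matrix `B` standing in for $(S + S^{T})A$; the function names, the operation counter and the example numbers are hypothetical. The conventional version recomputes the full interference term at every stage, while the differencing version only touches columns whose decision changed.

```python
def sign(v):
    return 1 if v >= 0 else -1

def conventional_stages(y, B, stages):
    """Each stage recomputes z = y - B d from scratch."""
    K = len(y)
    d = [sign(v) for v in y]              # stage-0 matched-filter decisions
    ops = 0
    for _ in range(stages):
        z = []
        for i in range(K):
            acc = y[i]
            for j in range(K):
                acc -= B[i][j] * d[j]
                ops += 1
            z.append(acc)
        d = [sign(v) for v in z]
    return d, ops

def differencing_stages(y, B, stages):
    """Each stage updates z with x = d(l-1) - d(l-2), skipping zero entries."""
    K = len(y)
    d_prev = [0] * K                      # so the first update uses x = d(0)
    d = [sign(v) for v in y]
    z = list(y)
    ops = 0
    for _ in range(stages):
        x = [d[j] - d_prev[j] for j in range(K)]
        for j in range(K):
            if x[j] == 0:                 # unchanged decision: nothing to do
                continue
            for i in range(K):
                z[i] -= B[i][j] * x[j]
                ops += 1
        d_prev, d = d, [sign(v) for v in z]
    return d, ops

# Hypothetical 3-user example with a zero-diagonal B.
B = [[0.0, 0.5, 0.25], [0.5, 0.0, 0.5], [0.25, 0.5, 0.0]]
y = [1.0, 0.25, 1.0]
d_conv, ops_conv = conventional_stages(y, B, 3)
d_diff, ops_diff = differencing_stages(y, B, 3)
```

Both routines reach the same decisions; the differencing one does less work because most decisions stop changing after the first stages.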
TI Tools Used
C Compiler ver 3.0 from Code Generation Tools
Code Composer ver 4.02 for profiling
Floating point implementation found more feasible due to matrix inversions and square-roots.
Code optimized for the DSP
Use of specialized approximate instructions: approximate reciprocal square roots, approximate reciprocals
Use of Assembly Code for critical part
TI's C67 floating point benchmarks for Matrix-Vector Multiplication & Dot Product
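An approximate reciprocal square root is typically used as a seed that a few Newton-Raphson steps then sharpen. The sketch below imitates that idea in Python. `coarse_rsqrt` is a hypothetical stand-in for an approximate hardware instruction (emulated here by truncating the exact value to about 8 significant bits) and does not reproduce TI's actual instruction semantics.

```python
import math

def coarse_rsqrt(a, bits=8):
    """Hypothetical stand-in for an approximate instruction:
    1/sqrt(a) truncated to roughly `bits` significant bits."""
    exact = 1.0 / math.sqrt(a)
    m, e = math.frexp(exact)              # exact = m * 2**e with 0.5 <= m < 1
    m = math.floor(m * (1 << bits)) / (1 << bits)
    return math.ldexp(m, e)

def refine_rsqrt(a, x, steps=2):
    """Newton-Raphson iteration for f(x) = 1/x**2 - a.
    Each step roughly squares the relative error."""
    for _ in range(steps):
        x = x * (1.5 - 0.5 * a * x * x)
    return x

a = 7.0
seed = coarse_rsqrt(a)
refined = refine_rsqrt(a, seed)
exact = 1.0 / math.sqrt(a)
seed_error = abs(seed - exact) / exact
refined_error = abs(refined - exact) / exact
```

The refinement uses only multiplies and subtractions, which is why a coarse seed plus a couple of iterations can replace a full-precision square root and division.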
[Figure: results versus number of users (0 to 15), annotated with a 10% improvement and a 100% improvement]
16-bit Fixed Point C Code
Code optimized for the DSP
Use of Assembly Code for critical part
TI's C62 fixed point assembly benchmarks for Dot Product
Flops Count
[Figure: number of flops (scale ×10^4) for conventional versus differencing detection; K = 15 users, SNR = 6 dB; differencing gives a 2X speedup]
Real-Time Requirements
[Figure: real-time performance versus number of users (8 to 14); SNR = 10 dB, window size = 12; annotated point: 12 users at 150 kb/s]
Lower-voltage operation (1.2 V in the C5402) is useful for reducing power consumption in the mobile.
ASIC Implementation
MOSIS Tiny-Chip (40-pin DIP)
8 synchronous users
12-bit fixed point implementation
6000 transistors
1.2 µm CMOS technology
190 kb/s for each user (@ 12.5 MHz)
Advantages of ASICs
Highly parallel instructions, 4 RISC IPC (instructions per cycle): accumulating while shifting, loading and storing; recoding while loading
Application-specific architecture: faster I/O; smaller on-chip memory; smaller ALU
[Figure: datapath with a register (REG), shift unit, recoder, ALU and control logic; the decisions d^(l) and d^(l-1) are recoded and combined with (L + L^T)A to update z^(l) to z^(l+1)]
$z^{(l+1)} = z^{(l)} - (L + L^{T}) A\, x^{(l)}$, where $x^{(l)} = d^{(l)} - d^{(l-1)}$
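Because each entry of x can only be 0, +2 or -2, this update needs no multiplier: every changed decision selects one column of $(L + L^{T})A$, shifted left by one bit, to add or subtract. The following is a minimal fixed-point sketch of that recoding, assuming integer-scaled matrix entries; the names `recode` and `update_z` and the example values are hypothetical, and the sketch does not model the chip's 12-bit word length or its timing.

```python
def recode(d_new, d_old):
    """Map each pair of +/-1 decisions to (active, negate).
    active: the decision changed, so x is +2 or -2.
    negate: x is -2 (decision went from +1 to -1)."""
    return [(dn != do, dn < do) for dn, do in zip(d_new, d_old)]

def update_z(z, B, d_new, d_old):
    """One stage of z <- z - B x using only shifts and add/subtract.
    B holds integer fixed-point entries of (L + L^T)A."""
    codes = recode(d_new, d_old)
    out = []
    for i in range(len(z)):
        acc = z[i]
        for j, (active, negate) in enumerate(codes):
            if not active:
                continue                  # x_j = 0 contributes nothing
            term = B[i][j] << 1           # multiply by 2 with a shift
            acc = acc + term if negate else acc - term
        out.append(acc)
    return out

# Hypothetical integer example: x = d_new - d_old = [0, -2, +2].
B = [[0, 3, -1], [3, 0, 2], [-1, 2, 0]]
z = [10, -4, 7]
z_next = update_z(z, B, d_new=[1, -1, 1], d_old=[1, 1, -1])
```

Unchanged decisions are skipped entirely, so the work per stage shrinks as the detector converges.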
Chip Layout
[Figure: chip layout, 2.0 mm across, with labeled regions: Recoding logic, Soft Decisions, Cross-Correlation, 12-bit ALU, Output Valid]
System Timing
[Figure: timing diagram with phases Load R, 1st Stage and Interference Cancellation; bit patterns shown: 10100000, 00100000, 00100000]
12-bit partial carry look-ahead adders: 6K, 1.5 Mb/s, layout
Three 16-bit full carry look-ahead adders: 100K, 3.0 Mb/s, VHDL synthesis
DSP-ASIC Comparison
8 users
               DSP (C6201)      ASIC (Tiny Chip)
Clock          200 MHz          12.5 MHz
Precision      16-bit           12-bit
Speed          300 kb/s/user    190 kb/s/user
Complexity
Design Cycle
Simulation Testbed
Entire Chain of Algorithms
Simulink - RTW
Rapid Prototyping
Matlab to DSP
Copper Contest
Implementation of Multistage Detector using 0.15 micron Copper Technology
Future Work
Fixed Point Implementations on DSPs/ASICs
Uplink & Downlink Algorithms