
Department of Information Technology and Electrical Engineering
Integrated Systems Laboratory

Semester Thesis
Noise Variance Estimation
for MIMO-OFDM Testbed
Dominik Bischoff
Advisors: Markus Wenk
Thomas Koch
Patrick Mächler
Winter Term 2008
Preface
Abstract
This semester thesis deals with the problem of estimating the noise variance (or
equivalently the signal-to-noise ratio, SNR) in a MIMO-OFDM system. In the first part, the
properties of MIMO-OFDM systems are presented. In a next step, existing algorithms
are evaluated. Most of the applicable algorithms work in the frequency domain. The
performance of those algorithms is therefore highly dependent on the employed channel
estimator. As it is not desirable to increase the performance of the channel estimator
in a real system due to the costs in terms of throughput, a novel algorithm working in
the time domain is developed. The only prerequisite for this novel algorithm is that
periodic short preambles are available.
The performance of the proposed algorithm is evaluated in a MIMO-OFDM simulation
environment. The performance in the simulation is near the optimum and, as the short
preambles are transmitted anyway, there is no loss in throughput.
In a next step, the algorithm is implemented in VHDL and mapped onto an FPGA. The
hardware costs are small compared to the area occupied by the other MIMO-OFDM
signal processing blocks.
In the last part of this thesis, measurements were conducted with the offline and
the online testbed. In the case of the offline testbed, the algorithm performs better than
the previously employed constant 30 dB estimator. There is some loss in performance in
the high SNR region due to transmit noise. A way to solve this problem is proposed.
The measurements with the online testbed show that the frequency
offset between the transmitting and the receiving board causes a problem. A possible
solution is presented but not yet implemented.
Overview
The thesis is split into the following chapters:
Task Description: The official task description for this semester thesis.
Introduction: The terms MIMO and OFDM are explained and several channel models are presented.
Literature Review: Already existing papers with relevant information for this thesis are presented.
Simulations: The limits of an SNR estimator are elaborated.
Algorithm Design: Existing algorithms are evaluated and a novel algorithm is developed and presented.
Implementation: The implementation of the novel algorithm on an FPGA is described.
Measurements: Some measurements of the algorithm with the offline and the online testbed are presented.
Summary, Conclusion and Outlook: A short summary of the thesis and an outlook are given.
Author: Dominik Bischoff dominikb@ee.ethz.ch
Advisors: Markus Wenk mawenk@iis.ee.ethz.ch
Thomas Koch koch@iis.ee.ethz.ch
Patrick Mächler maechler@iis.ee.ethz.ch
Supervisors: Hubert Kaeslin kaeslin@iis.ee.ethz.ch
Norbert Felber felber@iis.ee.ethz.ch
Professor: Wolfgang Fichtner fichtner@iis.ee.ethz.ch
Acknowledgments
I thank the Integrated Systems Laboratory (IIS) at ETH Zurich for the opportunity
to realize this project and for providing all the infrastructure. Special thanks go to my
advisors for offering help whenever needed while at the same time leaving me the
freedom to follow my own ideas wherever possible. As this thesis uses a lot of previous
work done by different persons (MIMO-OFDM simulation environment, offline testbed,
online testbed), I also thank everyone who was involved in developing them. I further
thank Hubert Kaeslin and Norbert Felber for the VHDL code samples from the VLSI 1
lecture that were extremely helpful while writing the hardware code. I also thank the
Communication Technology Laboratory (IKT) at ETH Zurich for allowing me to use
their measurement equipment. And finally, special thanks also go to my family and
my friends who supported me during the whole time!
Table of Contents
1 Task Description
2 Introduction
2.1 Why Using MIMO?
2.1.1 Antenna Arrays
2.1.2 Spatial Multiplexing
2.2 Why Using OFDM?
2.2.1 The Standard Approach
2.2.2 Frequency Division Multiplexing
2.2.3 Orthogonal FDM (OFDM)
2.2.4 OFDM: What Are Orthogonal Signals?
2.2.5 OFDM: How to Find Orthogonal Signals
2.2.6 OFDM: Cyclic Prefix
2.2.7 OFDM: Noise Considerations
2.2.8 Existing Systems Using OFDM
2.3 Channel Models
2.3.1 Notation
2.3.2 SISO Channel Model
2.3.3 OFDM Channel Model for C Channels (SISO)
2.3.4 MIMO Channel Model for a 4×4 System
2.3.5 MIMO-OFDM Channel Model
2.3.6 The TGn Channels
2.4 Reconstruction of the Original Data
3 Literature Review
3.1 Method
3.2 Papers
3.2.1 Aldana et al. 2000
3.2.2 Athanasios et al. 2005
3.2.3 Athanasios et al. 2006
3.2.4 Beaulieu et al. 2000
3.2.5 Boumard 2003
3.2.6 Pauluzzi et al. 2000
3.2.7 Ren et al. 2005
3.2.8 Ren et al. 2008
3.2.9 Schmidl et al. 1997
3.2.10 Shin et al. 2001
3.2.11 Xu et al. 2005
3.2.12 Xu et al. 2005
3.2.13 Yücek et al. 2006
3.3 Other Related Papers
4 Simulations
4.1 Description of the Simulation Environment
4.2 Best and Worst Cases
4.3 Perfect SNR Shifted
5 Algorithm Design
5.1 Several Approaches and Why They Don't Work (...Too Well)
5.1.1 Using Only the FFT Output
5.1.2 Using the Channel Matrix
5.2 Proposed Algorithm
5.2.1 General Idea
5.2.2 Mathematical Formulation and Analytical Results
5.2.3 Simulation of the Proposed Algorithm
5.2.4 The Influence of the Number of Samples
5.2.5 The Mean Value of the Estimated SNR
5.2.6 Frequency Offset
5.2.7 Ignore Frequency Offset and Save Hardware Costs
5.2.8 Limited Precision
5.2.9 Proposed Algorithm: Further Ideas and Simulations
6 Implementation
6.1 Requirements and Limitations
6.2 First Approach
6.3 Second Approach
6.4 Final Approach
6.4.1 SNR_EST_ENT
6.4.2 TOTAL_POWER_ENT
6.4.3 AVERAGE_SIGNAL_ENT
6.4.4 FULL_CYCLE_FINISHED_ENT
6.4.5 NUMBER_OF_FULL_CYCLES_ENT
6.4.6 INITIALIZE_ENT
6.4.7 VALID_DATA_ENT
6.4.8 CONT_AV_SIG_ENT
6.4.9 NR_DIVISION_ENT
6.4.10 Mapping Onto FPGA
6.4.11 Testing
7 Measurements
7.1 Measurements With Offline Testbed
7.1.1 DC Carrier Removal
7.1.2 Four SNR Values Estimated but Only One Required
7.1.3 Scaling All Streams to Equal Noise
7.1.4 Transmit Noise
7.2 Measurements With Online Testbed
8 Summary, Conclusion and Outlook
8.1 Summary
8.2 Conclusion and Outlook
Bibliography
List of Figures
2.1 Comparison of a single carrier spectrum and an FDM spectrum.
2.2 OFDM system using twice a DFT
2.3 A standard approach for a MIMO system with 4 transmitting and 4 receiving antennas.
2.4 A channel model for a MIMO-OFDM system.
4.1 Perfect and constant SNR estimation (FDMLE channel estimator)
4.2 Perfect and constant SNR estimation (ideal channel estimator)
4.3 Ideal SNR estimator with offset
4.4 Ideal SNR estimator with offset - zoomed version.
5.1 Ren2008 and an adapted EVM algorithm
5.2 Mean SNR values for different M.
5.3 Simulated BER for the proposed algorithm with M = 9
5.4 Simulated BER for the proposed algorithm with changing M
5.5 Simulated BER for the proposed algorithm with changing M - zoomed version
5.6 Estimated mean SNR values for the proposed algorithm.
5.7 Proposed algorithm with a frequency offset.
5.8 Proposed algorithm using absolute value of input signal.
5.9 Proposed algorithm using limited precision.
6.1 Second approach, datapath of estimated signal power
6.2 SNR_EST_ENT - top level design entity.
6.3 TOTAL_POWER_ENT - calculating the power of a stream of data.
6.4 AVERAGE_SIGNAL_ENT - averaging all samples that belong together.
6.5 NUMBER_OF_FULL_CYCLES_ENT and FULL_CYCLE_FINISHED_ENT - counting subsignals.
6.6 INITIALIZE_ENT - initializes the rest of the circuit as soon as the AGC freezes.
6.7 VALID_DATA_ENT - monitors the state of the arriving samples.
6.8 CONT_AV_SIG_ENT - control for the estimated signal datapath.
6.9 A numerical example for the digital Non-Restoring division algorithm.
6.10 NR_DIVISION_ENT - the division entity.
6.11 Overview over all signals for the final estimator entity.
7.1 A picture of the MIMO-OFDM testbed with 4 antennas.
7.2 Measurement of the BER with the offline testbed.
7.3 Estimated SNR values with offline testbed compared to expected SNR values. The expected values were approximated by taking the best performing curves from Fig. 7.2 for each output setting.
7.4 A transmit noise model.
7.5 Estimated SNR for several transmit SNR values
7.6 Simulation showing the BER for several estimators with 30 dB transmit SNR.
7.7 Frequency offset compensation in online testbed.
List of Tables
6.1 SNR estimation block input and output signals.
6.2 Approximate hardware costs for approach 1.
6.3 Approximate hardware costs for approach 2.
6.4 Overview over the hardware costs for the implementation of the SNR estimator.
1 Task Description
Institut für Integrierte Systeme
Integrated Systems Laboratory
Semester Thesis at the Department of
Information Technology and Electrical Engineering
Autumn Term 2008
Dominik Bischoff
Noise Estimation
for MIMO-OFDM Testbed
Advisors: Markus Wenk, ETZ J69.2, Tel. 632 57 27, mawenk@iis.ee.ethz.ch
Thomas Koch, ETZ J69.2, Tel. 632 54 33, koch@iis.ee.ethz.ch
Patrick Mächler, ETZ J69.2, Tel. 632 65 69, maechler@iis.ee.ethz.ch
Handout: September 15, 2008
Due: December 19, 2008
Three copies of the written report are to be turned in. All copies remain property of the
Integrated Systems Laboratory.
1 Project Description
In wireless systems, knowledge of the noise variance or the signal-to-noise ratio (SNR) helps
to improve performance. In particular, the preprocessing and detection stages in the receiver
benefit from the knowledge of the noise variance. So far, the multi-user MIMO-OFDM testbed
developed at the Integrated Systems Laboratory (IIS) in close collaboration with the
Communication Technology Laboratory (CTL) lacks such a noise estimator. Currently, the MMSE
receiver implemented in the testbed uses a constant noise variance to carry out the MMSE
algorithm. Fig. 1 shows the noise estimation block in a MIMO-OFDM system.
[Block diagram: transmitter, channel (y = Hs + n), and receiver with MIMO detection (e.g. MMSE), channel estimation, noise estimation, and VGA interface blocks.]
Figure 1: Overview of a MIMO system highlighting the channel estimation and the noise
estimation blocks.
2 Noise Estimation
The estimation of the noise variance or the SNR can be carried out in the time or frequency domain.
A good overview of SNR estimation in OFDM systems is given in [5]. Several frequency-domain
algorithms were presented in the open literature in the last few years, e.g., estimators based on
two training symbols (preamble) [2, 4] or for different noise statistics [6]. Most of the published
estimators work in the frequency domain.
3 Goals
The main goal of this thesis is the analysis and implementation of a noise estimation block for
the MIMO-OFDM testbed in order to improve the MIMO detection stage in the testbed. The
following tasks should be accomplished during this project:
- Evaluation and analysis of noise variance estimation algorithms (time domain, frequency
  domain) in order to understand the impact on the error rate performance.
- Integration of a noise estimator into the testbed.
4 Milestones
The following milestones should be achieved during this semester thesis. However, some
milestones can be added or skipped, depending on the project's status. The tentative calendar in
Fig. 2 shows all milestones.
1. Establish a project plan.
2. Get familiar with the noise estimation parameters, the literature on noise variance and
SNR estimation algorithms, and the Matlab simulation environment.
3. Implementation and evaluation of different noise variance estimation algorithms in Matlab
and on the offline testbed.
4. VHDL implementation of a noise variance estimation block on the Virtex-4 FPGA.
5. BER measurements by using the PropSim channel emulator to verify the proper operation
of the implemented algorithm.
6. Write the final report.
5 General Recommendations
The following are some recommendations for this semester thesis:
- While coding VHDL, use the IIS standard coding style [3] documented by the Design
  Zentrum (DZ) website [1].
- VHDL coding is greatly simplified and accelerated using the Emacs editor and its famous
  and widely adopted VHDL mode. The Emacs installation at the institute supports, among
  other powerful features, VHDL syntax highlighting, signal and component declaration and
  instantiation, code beautifying, and automated sensitivity list updates based on the VHDL
  standard. Since most assistants at the IIS are quite familiar with this editor, they can read
  and evaluate your VHDL code (and help to solve problems) much faster. Please consult
  the corresponding FAQ under the following link:
  http://www.dz.ee.ethz.ch/support/ic/emacs/index.en.html
6 Project Realization
6.1 Project Plan
Within the first week of the project you will be asked to prepare a project plan. This plan
should identify the tasks to be performed during the project and set deadlines for those tasks.
The prepared plan will be a topic of discussion at the first week's meeting between the student
and the advisors. Note that the project plan should be updated constantly depending on the
project's status.
6.2 Meetings
Weekly meetings will be held between the student and the advisors. The exact weekly meeting
time and location will be determined to fit the schedule of the assistants. These meetings will
be used to evaluate the status and progress of the project.
6.3 Reports
Documentation is an important and often overlooked aspect of engineering. One short
intermediate report and one final report (the semester thesis) are to be completed within this study.
Note that the intermediate report should be designed to be part of the final report.
The common language of engineering is de facto English. Therefore, the intermediate and final
reports are preferably written in English. Any form of word processing software
is allowed for writing the reports; nevertheless, the use of LaTeX with Tgif (for block diagrams)
is strongly encouraged by the IIS staff.
First Intermediate Report This report should be written in such a way that it can become the first
part of your final report. It should contain general information about the topic, a description
of the problem, explanations of related terminology, and descriptions of similar approaches in
literature (with corresponding references to books, papers etc.).
Final Report The final report has to be presented at the end of this project and two copies
need to be handed in and remain property of the IIS. These reports are only accepted when
the keys for the ETZ building have been properly returned. Note that this task description is
part of your thesis and has to be attached to your final report. A data disc (e.g., CD or DVD)
containing all essential files of your project should also be added to the final report.
6.4 Presentation and Demonstration
There will be a presentation (15 min presentation and 5 min Q&A) at the end of this project
to present your results to a wider audience. The exact date has to be determined.
7 Calendar (Tentative)
[Gantt chart covering September to December with the tasks: literature study and simulation environment; Matlab and offline testbed algorithm analysis and evaluation; implementation of a noise estimator block in VHDL; measurements and verification; documentation. Milestones 1 to 6 are marked on the chart.]
Figure 2: Tentative Calendar
References
[1] Design Zentrum website and VHDL naming conventions. World Wide Web electronic
publication, 2008. http://www.dz.ee.ethz.ch, http://dz.ee.ethz.ch/support/ic/vhdl.
[2] S. Boumard. Novel Noise Variance and SNR Estimation Algorithm for Wireless MIMO
OFDM Systems. Global Telecommunications Conference, GLOBECOM '03. IEEE, 3:1330-1334,
Dec. 2003.
[3] Hubert Kaeslin. Digital Integrated Circuit Design: From VLSI Architectures to CMOS
Fabrication (Mps-Siam Series on Optimizatio). Cambridge Univ Press, 1 edition, May 2008.
[4] GuanLiang Ren, YiLin Chang, and HuiNing Zhang. SNR estimation algorithm based on
the preamble for wireless OFDM systems. Science in China Series F: Information Sciences,
51(7):965-974, Jul. 2008.
[5] He Shousheng and M. Torkelson. Effective SNR estimation in OFDM system simulation.
IEEE GLOBECOM '98, 2:945-950, 1998.
[6] T. Yücek and H. Arslan. MMSE noise power and SNR estimation for OFDM systems. IEEE
Sarnoff Conference, Princeton, March 2006.
Zurich, September 15 Prof. Dr. Wolfgang Fichtner
The thesis will not be accepted without returning the keys!
2 Introduction
2.1 Why Using MIMO?
MIMO stands for Multiple-Input Multiple-Output. In communication systems, this
usually means that several transmitting and receiving antennas are employed.
2.1.1 Antenna Arrays
A special case of MIMO systems are antenna arrays, which have been in use for a long
time: Several antennas can be used with a specific phase and amplitude setting to
transmit the same signal. This setup produces a higher gain in a certain direction and
is called beamforming. It also increases the diversity of the channel: If the signal
transmitted from one of the antennas suffers destructive interference at the receiver, then
there is a high probability that at least one signal transmitted from another antenna of
the array is decodable. Using antenna arrays neither increases the used bandwidth
nor decreases the data throughput [1].
2.1.2 Spatial Multiplexing
The main difficulty today is that users demand higher data rates for their applications
whereas the usable spectrum is limited (both technically and by regulations). This is
due to the increasing popularity of mobile applications such as cell phones
or wireless internet access. Wireless systems do not provide the option of just adding
an additional cable as in wire or fibre optics based systems. Therefore, the spectral
efficiency needs to be increased in order to enable a higher throughput. But customers
do not only want fast data access - this access also needs to be reliable (QoS - quality of
service) [1].
MIMO systems seem to be able to solve that problem at least temporarily. Instead of
just transmitting one single signal over the air from the transmitter to the receiver (as
done in most systems today), several independent signals are sent over the common
channel "air" by using multiple antennas for transmitting and receiving. The idea seems
fairly trivial - but sending signals in the same frequency band over a common channel is
generally not possible. This is because the signals interfere with each other and cannot
be easily decoded at the receiver [2].
In most applications, every signal sent from a transmitting antenna reaches the
receiving antenna over multiple paths. This phenomenon, called multipath propagation,
is produced by electromagnetic waves that are reflected off walls and other objects.
The signal arriving at the receiver is therefore generally a superposition of scaled and
delayed versions of the original signal. Multipath propagation is generally considered
a nuisance as it distorts the signal, and common systems try to circumvent it by
establishing a line of sight (LOS) connection [2].
Instead of seeing multipath propagation as a factor that decreases the system
performance, clever approaches use it as an advantage in MIMO systems. One can imagine
the following setup:
- a transmitter using antennas $T_1$ and $T_2$
- a receiver using antennas $R_1$ and $R_2$
- $T_1$ and $T_2$ transmit different signals
- both are placed inside a building (assuming no LOS for simplicity)
A signal sent from $T_1$ and received at $R_1$ follows a different path compared to the signal
sent from $T_2$ and received at $R_2$. The same is true for the signals from $T_1$ to $R_2$ and
from $T_2$ to $R_1$. If one assumes that the different paths are known at the receiver, clever
calculations can remove the effect of the superposition and decode both streams. In
that case, the data rate would have been doubled without using additional spectrum.
Due to the spatial distribution of the antennas, the reliability of the link should be
increased at the same time [2].
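One simple instance of such a calculation is to invert the channel matrix (often called zero forcing). The following minimal sketch assumes a noise-free, frequency-flat 2x2 channel that is perfectly known at the receiver; the channel gains, the symbols and the helper names are hypothetical values chosen for illustration, and the detector actually used in the testbed (MMSE) is more elaborate.

```python
# Hypothetical 2x2 example: all numbers are made up for illustration.

def invert_2x2(h):
    """Invert a 2x2 complex matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("channel matrix is singular; streams cannot be separated")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


# H[i][j] is the path gain from transmit antenna T_(j+1) to receive antenna R_(i+1).
H = [[0.8 + 0.1j, 0.3 - 0.4j],
     [0.2 + 0.5j, 0.9 - 0.2j]]

s = [1 + 1j, -1 + 1j]                # two independent symbols, one per transmit antenna
y = mat_vec(H, s)                    # each receive antenna sees a superposition of both
s_hat = mat_vec(invert_2x2(H), y)    # undo the superposition using the known channel
```

With noise present, plain inversion amplifies the noise whenever the channel matrix is badly conditioned, which is one reason why detectors such as MMSE also take the noise variance into account.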
The critical question is how to know what those different paths look like - or in other
words: How to find the channel matrix? This is generally done in a training phase
where known signals are sent by the transmitter. This does of course decrease the
overall throughput as no real information is transmitted during that phase. This loss is
generally smaller than the additional capacity gained by using a second stream.
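The training idea can be sketched as follows. The example assumes a noise-free 2x2 channel and a very simple training scheme in which only one antenna transmits a known pilot per training slot; the scheme, the frame length and all numbers are hypothetical and are not the training sequence used in the testbed.

```python
def estimate_channel(received_slot1, received_slot2, pilot):
    """Estimate a 2x2 channel from two training slots.

    Slot 1: only T_1 sends `pilot`; slot 2: only T_2 sends `pilot`.
    Each slot's received vector is then one column of H times the pilot.
    """
    col1 = [r / pilot for r in received_slot1]
    col2 = [r / pilot for r in received_slot2]
    return [[col1[0], col2[0]], [col1[1], col2[1]]]


# Hypothetical true channel, unknown to the receiver.
H_true = [[0.8 + 0.1j, 0.3 - 0.4j],
          [0.2 + 0.5j, 0.9 - 0.2j]]
pilot = 1 + 0j

# Noise-free received vectors for the two training slots.
rx1 = [H_true[0][0] * pilot, H_true[1][0] * pilot]
rx2 = [H_true[0][1] * pilot, H_true[1][1] * pilot]
H_est = estimate_channel(rx1, rx2, pilot)

# Throughput accounting for a hypothetical frame of 100 slots, 2 of them spent on training.
frame_slots, training_slots, streams = 100, 2, 2
symbols_two_streams = streams * (frame_slots - training_slots)  # 196 data symbols
symbols_one_stream = frame_slots                                # 100, even with no training at all
```

In this made-up frame, the two training slots cost 4 symbols in total, while the second stream adds 98, which illustrates why the training loss is usually acceptable.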
This procedure is called spatial multiplexing as the criterion used to distinguish the
different streams is the space each stream has to travel through. The number of streams
is theoretically limited by the smaller number of antennas on either the transmitter or
the receiver side. As a tradeoff between detection complexity and additional throughput,
a practical upper limit seems to be four spatial streams used at the same time.
It is also difficult to place a large number of antennas in a wireless device.
Most of today's commercially available systems therefore only use two spatial streams.
It is further possible to use additional antennas on either the receiver or the transmitter
side to increase the diversity gain [2].
To use spatial multiplexing in outdoor systems where one has a direct LOS, other tricks
have to be used. One possibility is to use special antennas with a 90 degree shifted
polarization [2].
2.2 Why Using OFDM?
2.2.1 The Standard Approach
The standard approach to modulate information onto a carrier is by varying the
frequency, the phase or the amplitude. As the data rate increases, the time a single
symbol (one or several bits) is on air is decreased. In case of impulse noise or other
short period noise with high energy, it is likely that a symbol gets distorted to such a
high extent that it cannot be recovered. The shorter the period in which the symbol is
available, the higher is the probability that the symbol is fully destroyed by bursts of
noise [3].
2.2.2 Frequency Division Multiplexing
To solve this problem, one can use frequency division multiplexing (FDM). Instead
of using a single carrier that occupies the whole available frequency band, several
subcarriers are employed within the available frequency band. The data stream is
distributed over all available subcarriers. This increases the symbol period and therefore
decreases susceptibility to noise bursts. It also adds immunity to narrowband
noise, as such noise only affects a few of the subcarriers and not the entire
signal [3].
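The effect of the longer symbol period can be made concrete with a small calculation. The sketch below assumes that the total symbol rate is split evenly over all subcarriers and ignores guard bands; the rate, the burst length and the function name are hypothetical.

```python
def fraction_of_symbol_hit(burst_duration_s, total_symbol_rate_hz, n_subcarriers):
    """Fraction of one symbol period that a single noise burst can cover (at most 1)."""
    # Splitting the stream over n subcarriers stretches each symbol by a factor of n.
    symbol_period_s = n_subcarriers / total_symbol_rate_hz
    return min(1.0, burst_duration_s / symbol_period_s)


RATE = 1e6    # hypothetical total symbol rate: 1 Msymbol/s
BURST = 1e-6  # hypothetical noise burst lasting 1 microsecond

single_carrier = fraction_of_symbol_hit(BURST, RATE, 1)   # the whole symbol is hit
fdm_64 = fraction_of_symbol_hit(BURST, RATE, 64)          # about 1/64 of each symbol is hit
```

With 64 subcarriers the same burst touches all 64 parallel symbols, but only a small part of each, so every symbol remains largely intact.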
FDM comes at the cost of a lower data rate as a guard interval has to be inserted between
the different subcarriers and therefore a part of the available frequency spectrum is
wasted. FDM also adds some complexity to the hardware by using several streams. At
the same time it also removes some of the complexity by slowing down the bit rate of
each subcarrier [3].
Figure 2.1: Comparison of (a) a single carrier spectrum and (b) an FDM spectrum [3].
2.2.3 Orthogonal FDM (OFDM)
If one can choose a set of subcarriers that are orthogonal to each other, then there is
no need to use a guard interval to separate the subcarriers. This would increase the
spectral efficiency of the system [3].
2.2.4 OFDM: What Are Orthogonal Signals?
Two signals u(t) and v(t) are said to be orthogonal to each other iff:
\[
\langle u, v \rangle = \int_{-\infty}^{\infty} u(t)\, v(t)\, dt =
\begin{cases}
0, & \text{if } u \neq v, \\
\text{const}, & \text{if } u = v.
\end{cases}
\]
2.2.5 OFDM: How to Find Orthogonal Signals
There are several possible ways to create orthogonal signals. The solution presented
here uses the Discrete Fourier Transform (DFT). A hardware efficient implementation
of the DFT is the Fast Fourier Transform (FFT). All the sinusoids of the DFT form an
orthogonal basis. If a time discrete signal is transformed with the DFT, it is essentially
correlated with those base sinusoids. Furthermore, the DFT is invertible. Using the
inverse DFT (or the inverse FFT - IFFT), the original signal can be reconstructed [3].
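The orthogonality of the DFT sinusoids can be checked numerically. The sketch below uses the discrete counterpart of the inner product, in which the second sequence is complex-conjugated because the base sinusoids are complex; the length N = 8 and the helper names are arbitrary choices for illustration.

```python
import cmath


def dft_basis(k, n):
    """k-th DFT base sinusoid of length n: exp(j*2*pi*k*i/n) for i = 0..n-1."""
    return [cmath.exp(2j * cmath.pi * k * i / n) for i in range(n)]


def inner(u, v):
    """Discrete inner product <u, v> = sum_i u[i] * conj(v[i])."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


N = 8
# Gram matrix of all pairwise inner products between the N base sinusoids.
gram = [[inner(dft_basis(k, N), dft_basis(m, N)) for m in range(N)]
        for k in range(N)]
```

Up to rounding errors, the Gram matrix is N times the identity: every sinusoid has the same constant inner product with itself and a zero inner product with every other one.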
The mathematical background is well described in [4]. A sample system is shown in
Fig. 2.2. The basic ideas behind that system are: A whole collection of (complex) source
symbols is considered to be in the frequency domain. They get translated by the
IDFT into the time domain. Those discrete samples are transformed into a continuous
signal that can be transmitted over the channel. The receiver samples the signal and
transforms it back into the frequency domain by the use of the DFT. If there is no noise
present and the channel is perfect, the symbols at the receiver are the same as the ones
that were transmitted.
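The chain described above can be sketched in a few lines, assuming an ideal channel that passes the time-domain samples unchanged. The direct O(N^2) transforms stand in for the FFT and IFFT, the conversion to and from a continuous signal is skipped, and the QPSK symbols are arbitrary illustrative values.

```python
import cmath


def dft(x):
    """Direct DFT: time domain -> frequency domain."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]


def idft(X):
    """Direct inverse DFT: frequency domain -> time domain."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * i / n) for k in range(n)) / n
            for i in range(n)]


# One complex source symbol per subcarrier (hypothetical QPSK values).
symbols = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]

tx_samples = idft(symbols)    # transmitter: symbols are placed in the frequency domain
rx_samples = tx_samples       # ideal channel: no noise, no distortion
recovered = dft(rx_samples)   # receiver: back to the frequency domain
```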
As the base functions of the DFT overlap each other without interfering, the spectral
efficiency of the signal is a lot higher than in the case of a simple FDM and approaches
the case of the single carrier system.
Figure 2.2: OFDM system using twice a DFT [4]. Note: Instead of using an IDFT and a
DFT (or an IFFT and an FFT), one can use two DFTs. This is because the DFT and the IDFT
are very similar. In that case, several adaptations need to be made to the datapath of the
transmitter!
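One well-known way to obtain an IDFT from a forward DFT is to conjugate the input, apply the DFT, conjugate the result and divide by N. Whether this is the adaptation meant in [4] is an assumption; the sketch below only demonstrates that the identity holds, using hypothetical symbol values.

```python
import cmath


def dft(x):
    """Direct forward DFT."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]


def idft_direct(X):
    """Reference inverse DFT computed from its definition."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * i / n) for k in range(n)) / n
            for i in range(n)]


def idft_via_dft(X):
    """Inverse DFT built from a forward DFT: conj(DFT(conj(X))) / N."""
    n = len(X)
    return [v.conjugate() / n for v in dft([c.conjugate() for c in X])]


X = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]   # hypothetical frequency-domain symbols
via_dft = idft_via_dft(X)
reference = idft_direct(X)
```

In hardware, the two conjugations only flip the sign of the imaginary part and the division by N is a shift when N is a power of two, so the same transform block can serve both directions.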
2.2.6 OFDM: Cyclic Prefix
In a multipath channel, several delayed versions of the original signal appear at the
receiver. One speaks of intersymbol interference (ISI) if an OFDM symbol gets
distorted by the previous one. In general, only the first few samples of the signal
get distorted. The problem can be solved by waiting a specific time between transmitting
two consecutive symbols. The length of this guard interval (in the time domain) depends on the
channel [3].
The other problem is that a single OFDM symbol can interfere with itself. This is called
intrasymbol interference. The reason is the following: A convolution in the time domain is
equivalent to a multiplication in the frequency domain iff the signal is either periodic
or infinitely long. Neither condition is fulfilled for a standard OFDM system [3].
The solution is to make the OFDM symbol appear periodic. This is done by using a
cyclic prefix (CP): The last few samples of the signal are copied to the beginning of the
signal, where originally the guard interval would be. The cyclic prefix contains only
redundant data and can therefore be discarded at the receiver together with the ISI
it has absorbed [3].
Using a cyclic prefix leads to a significant simplification of the receiver: Instead of
having to remove a convolution in time (between the signal and the channel), it is only
necessary to remove a multiplication in the frequency domain [3].
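This simplification can be verified numerically. The sketch below assumes a noise-free channel with three hypothetical taps, a prefix of three samples (at least as long as the channel memory) and an arbitrary time-domain OFDM symbol of length 8; none of these values come from the testbed.

```python
import cmath


def dft(x):
    """Direct forward DFT."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]


def linear_convolve(x, h):
    """Linear convolution, modelling a multipath channel with taps h."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


N, CP = 8, 3
x = [complex(v) for v in (1, -1, 1, 1, -1, 1, -1, -1)]  # time-domain OFDM symbol
h = [1.0, 0.5, 0.25]                                    # channel taps, memory <= CP

tx = x[-CP:] + x              # copy the last CP samples to the front
rx = linear_convolve(tx, h)   # the channel convolves the transmitted block
rx_symbol = rx[CP:CP + N]     # receiver discards the prefix (and the tail)

X = dft(x)
H = dft(h + [0.0] * (N - len(h)))   # channel frequency response on the N subcarriers
Y = dft(rx_symbol)
equalized = [Y[k] / H[k] for k in range(N)]   # one complex division per subcarrier
```

Because of the prefix, the retained samples equal the circular convolution of the symbol with the channel, so that Y[k] = H[k] X[k] holds for every subcarrier k and the channel is removed by a single division per subcarrier.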
2.2.7 OFDM: Noise Considerations
The most common noise source in a wireless system is thermal noise - usually manifesting
itself as Additive White Gaussian Noise (AWGN). As the noise spectrum is uniform
in the frequency domain, this kind of noise impairs the overall system in the same way
as it impairs a single carrier system [3].
Another common type of noise is impulse noise. This type of broadband noise is generally
only present during a short period. As described before, the OFDM system performs
better under impulse noise than a single carrier system [3].
Colored noise is difficult to handle as, unlike AWGN, it does not have a constant
spectrum. A simple solution for high noise environments is to lower the data rate [3].
If there are other systems present, carrier interference can occur. An OFDM system can
handle that by disabling the affected subcarriers [3].
Another type of imperfection emerges from the local oscillator. There are two effects that
have to be considered: Phase noise (sometimes called phase jitter) and the frequency
offset. Phase noise originates from the fact that the oscillator frequency changes
randomly within a small range. The same argument in the frequency domain is that
the oscillator does not produce a single peak but rather a smeared out peak. Phase
noise affects every subcarrier. As the spectral width of a subcarrier is smaller than in
a single carrier system, phase noise affects OFDM systems more severely than single
carrier systems [3].
The frequency offset of an oscillator can be understood as the deviation of the average
oscillator frequency from the expected frequency. Clock quality, temperature and other
effects are generally responsible for this offset. A solution to this problem is to
introduce pilot subcarriers for synchronization. It has to be noted that introducing
pilot subcarriers reduces the maximum data rate [3].
2.2.8 Existing Systems Using OFDM
Two of the most prominent systems using OFDM are ADSL (Asymmetric Digital Subscriber
Line) and DVB-T (Digital Video Broadcasting - Terrestrial). The first is used for high
speed internet connections and the second for European digital television [3].
A system that uses both MIMO and OFDM will be the next generation Wireless LAN
(WLAN 802.11n). The final specifications are not yet available - but there are already
devices on the market based on a draft (e.g. [5]). Those new devices promise
a significantly higher data rate than previous generations.
2.3 Channel Models
2.3.1 Notation
The following notation will be used:
x(t)   signal leaving the transmitter (time domain)
X      signal vector in frequency domain: Input of IFFT
y(t)   signal reaching the receiver (time domain)
Y      received signal vector in frequency domain: Output of FFT
h(t)   channel impulse response (time domain)
H      channel response matrix in frequency domain
n(t)   additive noise (time domain)
N      noise vector in frequency domain
T      total number of transmitting antennas
τ      number of the transmitting antenna
R      total number of receiving antennas
r      number of the receiving antenna
C      total number of subcarriers
c      number of the subcarrier
2.3.2 SISO Channel Model
The simplest possible system is a SISO (Single-Input, Single-Output) system. In the time
domain, it can be written as:
\[ y(t) = x(t) * h(t) + n(t) \]
where $*$ denotes the convolution. This is equivalent to the following notation in the
frequency domain:
\[ Y(f) = X(f) \cdot H(f) + N(f) \]
2.3.3 OFDM Channel Model for C Channels (SISO)
An OFDM system can be represented by the following model in frequency domain:
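\[
\underbrace{\begin{bmatrix} y_{c=1} \\ y_{c=2} \\ y_{c=3} \\ \vdots \\ y_{c=C} \end{bmatrix}}_{Y_{C \times 1}}
=
\begin{bmatrix} H_{c=1} \, x_{c=1} \\ H_{c=2} \, x_{c=2} \\ H_{c=3} \, x_{c=3} \\ \vdots \\ H_{c=C} \, x_{c=C} \end{bmatrix}
+
\underbrace{\begin{bmatrix} n_{c=1} \\ n_{c=2} \\ n_{c=3} \\ \vdots \\ n_{c=C} \end{bmatrix}}_{N_{C \times 1}},
\qquad
X_{C \times 1} = \begin{bmatrix} x_{c=1} \\ x_{c=2} \\ x_{c=3} \\ \vdots \\ x_{c=C} \end{bmatrix}
\]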
Each line corresponds to one of the orthogonal tones.
2.3.4 MIMO Channel Model for a 4×4 System
To simplify the notation for a MIMO system with T transmitting and R receiving
antennas, it is assumed that T = R = 4. Such a general setup is shown in Fig. 2.3. It
is straightforward to change the number of transmitting or receiving antennas.
[Figure: transmitter with antennas T1 to T4 and receiver with antennas R1 to R4; the
paths between them are labeled h11, h21, h31, h41, ..., h44.]
Figure 2.3: A standard approach for a MIMO system with 4 transmitting and 4 receiving
antennas.
The system shown in Fig. 2.3 can be written in the frequency domain the following way:
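\[
\underbrace{\begin{bmatrix} y_{r=1} \\ y_{r=2} \\ y_{r=3} \\ y_{r=4} \end{bmatrix}}_{Y_{R \times 1}}
=
\underbrace{\begin{bmatrix}
h_{r=1,\tau=1} & h_{r=1,\tau=2} & h_{r=1,\tau=3} & h_{r=1,\tau=4} \\
h_{r=2,\tau=1} & h_{r=2,\tau=2} & h_{r=2,\tau=3} & h_{r=2,\tau=4} \\
h_{r=3,\tau=1} & h_{r=3,\tau=2} & h_{r=3,\tau=3} & h_{r=3,\tau=4} \\
h_{r=4,\tau=1} & h_{r=4,\tau=2} & h_{r=4,\tau=3} & h_{r=4,\tau=4}
\end{bmatrix}}_{H_{R \times T}}
\cdot
\underbrace{\begin{bmatrix} x_{\tau=1} \\ x_{\tau=2} \\ x_{\tau=3} \\ x_{\tau=4} \end{bmatrix}}_{X_{T \times 1}}
+
\underbrace{\begin{bmatrix} n_{r=1} \\ n_{r=2} \\ n_{r=3} \\ n_{r=4} \end{bmatrix}}_{N_{R \times 1}}
\]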
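It can be assumed that only one antenna is transmitting and all the others are not
sending any signal at all. In that case, the equation simplifies to:
\[
\underbrace{\begin{bmatrix} y_{r=1} \\ y_{r=2} \\ y_{r=3} \\ y_{r=4} \end{bmatrix}}_{Y_{R \times 1}}
=
\underbrace{\begin{bmatrix}
h_{r=1,\tau=1} & h_{r=1,\tau=2} & h_{r=1,\tau=3} & h_{r=1,\tau=4} \\
h_{r=2,\tau=1} & h_{r=2,\tau=2} & h_{r=2,\tau=3} & h_{r=2,\tau=4} \\
h_{r=3,\tau=1} & h_{r=3,\tau=2} & h_{r=3,\tau=3} & h_{r=3,\tau=4} \\
h_{r=4,\tau=1} & h_{r=4,\tau=2} & h_{r=4,\tau=3} & h_{r=4,\tau=4}
\end{bmatrix}}_{H_{R \times T}}
\cdot
\underbrace{\begin{bmatrix} x_{\tau=1} \\ 0 \\ 0 \\ 0 \end{bmatrix}}_{X_{T \times 1}}
+
\underbrace{\begin{bmatrix} n_{r=1} \\ n_{r=2} \\ n_{r=3} \\ n_{r=4} \end{bmatrix}}_{N_{R \times 1}}
\]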
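This can be further simplified to:
\[
\underbrace{\begin{bmatrix} y_{r=1} \\ y_{r=2} \\ y_{r=3} \\ y_{r=4} \end{bmatrix}}_{Y_{R \times 1}}
=
\underbrace{\begin{bmatrix} h_{r=1,\tau=1} \\ h_{r=2,\tau=1} \\ h_{r=3,\tau=1} \\ h_{r=4,\tau=1} \end{bmatrix}}_{H_{R \times 1}}
\cdot \, x_{\tau=1}
+
\underbrace{\begin{bmatrix} n_{r=1} \\ n_{r=2} \\ n_{r=3} \\ n_{r=4} \end{bmatrix}}_{N_{R \times 1}}
\]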
2.3.5 MIMO-OFDM Channel Model
As can be seen from the SISO OFDM channel model, the different OFDM subchannels
can be treated separately. This makes it possible to formulate a simple model for a
MIMO-OFDM system: The whole system can be seen as a stack of C different MIMO
systems. A graphic showing such a system is presented in Fig. 2.4.
[Figure: the relation Y (R×1) = H (R×T) · X (T×1) + N (R×1), drawn once per subchannel
(subchannel 1, 2, 3, 4, ..., C); the stacked arrays have the dimensions R×1×C, R×T×C,
T×1×C and R×1×C.]
Figure 2.4: A channel model for a MIMO-OFDM system.
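The stacked model can be written down compactly in code. The following is a minimal sketch, assuming random placeholder values for the channel, the transmitted symbols and the noise (array names and the noise scaling are illustrative). It applies one R×T matrix per subchannel and also shows the special case from Section 2.3.4 in which only the first antenna transmits.

```python
import numpy as np

rng = np.random.default_rng(1)
R, T, C = 4, 4, 64   # receiving antennas, transmitting antennas, subcarriers

def crandn(*shape):
    """Complex Gaussian samples with unit variance (placeholder data)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(C, R, T)        # one R x T channel matrix per subchannel
X = crandn(C, T)           # one transmit vector per subchannel
N = 0.1 * crandn(C, R)     # one noise vector per subchannel

# Y[c] = H[c] @ X[c] + N[c] for every subchannel c, without an explicit loop
Y = np.einsum('crt,ct->cr', H, X) + N

# The same computation as an explicit stack of C independent MIMO systems
Y_loop = np.stack([H[c] @ X[c] + N[c] for c in range(C)])

# Only antenna 1 transmits: the product reduces to the first column of H
X_single = np.zeros((C, T), dtype=complex)
X_single[:, 0] = X[:, 0]
Y_single = np.einsum('crt,ct->cr', H, X_single)

print(Y.shape, np.allclose(Y, Y_loop))
```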
2.3.6 The TGn Channels
In 2004, the Task Group N (TGn)¹ published a set of channel models applicable
to indoor MIMO WLAN systems. "The model[s] can be used for both 2 GHz and
5 GHz frequency band[s]." There are six different channel models: A, B, C, D, E and
F. Model A is an optional model and should not be used for system performance
comparisons [6].
The following steps are taken for models B to F²:
- Start with delay profiles of models B-F.
- Manually identify clusters in each of the five models.
- Extend clusters so that they overlap, determine tap powers (see Appendix A).
- Assume PAS [power angular spectrum] shape of each cluster and corresponding
  taps (Laplacian).
- Assign AS [angular spread] to each cluster and corresponding taps.
- Assign mean AoA [angle of arrival] (AoD [angle of departure]) to each cluster
  and corresponding taps.
- Assume antenna configuration.
- Calculate correlation matrices for each tap.
¹ IEEE P802.11 - TASK GROUP N;
  http://www.ieee802.org/11/Reports/tgn_update.htm
² quoted directly from [6] to show the complexity of the models
The TGn also calculated the mean capacity in bits per second per Hz for all models. The
results show that model C has the lowest capacity of all proposed models. This suggests
that channel C is the most challenging of the channel models. This is the reason that
TGn C is used for the simulations in this thesis.
2.4 Reconstruction of the Original Data
The MIMO-OFDM channel model suggests that if the exact channel matrix and the
exact noise vector were known, the original data could be reconstructed perfectly. It
is obvious that in any real system with a limited amount of training data, one can
perfectly estimate neither the channel matrix nor the noise vector. The amount of
training data is limited because more training data reduces the throughput and because
any real wireless channel is time-varying. These imperfections can cause errors in the
detected symbols. By improving the performance of the receiver, the number of errors
can be minimized. This thesis deals with the estimation of the noise variance (or
equivalently the SNR), as the noise variance is an important parameter for decoding
the received signals.
3 Literature Review
3.1 Method
This part of the thesis presents a selection of papers that might be relevant to the topic
of interest. The papers are sorted alphabetically by the family name of the author.
As the methods and parameters used for simulation vary highly between the different
papers, numerical comparisons of algorithms are omitted in this section.
An algorithm is suitable if the following points are satisfied:
- better or equally accurate as other algorithms of similar setup and complexity
- adaptable to MIMO-OFDM
- well enough documented to be implementable in a reasonable amount of time
- complexity of calculations within reasonable limits and therefore suitable for
  hardware implementation
Any additions not present in the paper and added by the author of this thesis are written
in italics.
3.2 Papers
3.2.1 Aldana et al. 2000: Accurate Noise Estimates in Multicarrier
Systems
Aldana et al. [7] presented in their work two different algorithms to estimate the noise
variance in multicarrier systems. Those algorithms would therefore be suitable for
OFDM systems. The two presented algorithms do not use any known training signals.
The first algorithm presented is the EM (Expectation Maximization) algorithm. The
algorithm is iterative and converges only slowly. Those two facts make this algorithm
unsuitable for application in a real system.
The second algorithm is a decision-directed algorithm. Similar to the previous
algorithm, this one is suitable for OFDM signals, operates in the frequency domain and
does not need any training data.
\[ \hat{N}_k = Y_k - H_k \hat{X}_k \]
\[ \hat{\sigma}_k^2 = \frac{1}{L} \sum_{n=1}^{L} |\hat{N}_k|^2 \]
\[ \mathrm{SNR}_{\mathrm{QAM}} = \frac{|H_k|^2 \, d^2 \, (M^2 - 1)}{6 \, \hat{\sigma}_k^2} \]
$Y_k$ is the received signal of the k-th tone. $H_k$ is the gain of subchannel k and is
assumed to be known (or at least accurately guessed). $\hat{X}_k$ is the estimate of the
transmitted symbol of the k-th tone. Known training symbols might improve the quality
of the estimated SNR. M is the number of symbols (M-ary QAM) and L is the block length.
d is the distance between symbols. The authors come to the conclusion that their
algorithm underestimates the true SNR and that, in order to get reliable results, a
look-up table (LUT) depending on the modulation scheme should be implemented.
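The decision-directed estimator can be illustrated for a single tone. The following is a minimal sketch, not the implementation of Aldana et al.: it assumes QPSK with unit average symbol energy (so the SNR is simply |H_k|² divided by the noise variance instead of the general QAM expression), a known subchannel gain and an arbitrarily chosen noise variance. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def qpsk_decide(z):
    """Hard decision: nearest QPSK point with unit average energy."""
    return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)

def decision_directed_noise_var(Y, H):
    """Noise variance estimate for one tone from L received symbols."""
    X_hat = qpsk_decide(Y / H)       # equalise, then guess the sent symbol
    N_hat = Y - H * X_hat            # residual after removing the guess
    return np.mean(np.abs(N_hat) ** 2)

L = 10000                            # block length
H_k = 0.8 * np.exp(1j * 0.3)         # known subchannel gain (assumption)
sigma2 = 0.01                        # true noise variance (assumption)

X = qpsk_decide(rng.standard_normal(L) + 1j * rng.standard_normal(L))
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
Y = H_k * X + noise

sigma2_hat = decision_directed_noise_var(Y, H_k)
snr_hat_db = 10 * np.log10(np.abs(H_k) ** 2 / sigma2_hat)
print(sigma2_hat, snr_hat_db)
```

In this setting the true SNR is about 18 dB and practically all decisions are correct, so the estimate is close to the true value. At low SNR, wrong decisions distort the residual, which is where a correction such as the LUT mentioned above becomes relevant.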
3.2.2 Athanasios et al. 2005: SNR Estimation Algorithms in AWGN for
HiperLAN/2 Transceiver
Athanasios et al. [8] present two different algorithms for the HiperLAN/2 system that
employs OFDM. Both algorithms estimate the SNR in a 64-QAM system.
The first algorithm is called MMSE (Minimum Mean Square Error). This algorithm uses
training signals a and works in the frequency domain.
\[ a = \{a_1, a_2, \ldots, a_L\} \]
\[ C = Y a^H \]
\[ E = |Y|^2 \]
\[ \mathrm{SNR} = \frac{|C|^2}{|a|^2 E - |C|^2} \]
The authors state that it is also possible to only use the real or the imaginary part of the
received data to reduce the complexity of the calculation, whereas the drop in precision
should be only minimal.
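The MMSE expressions above can be tried out on synthetic data. The following is a minimal sketch, assuming a single flat channel coefficient `h`, QPSK training symbols and a true SNR of 10 dB (these values and all names are illustrative and not taken from [8]).

```python
import numpy as np

rng = np.random.default_rng(3)

def mmse_snr(Y, a):
    """Data-aided SNR estimate from received samples Y and training symbols a."""
    C = np.vdot(a, Y)                 # sum(conj(a) * Y), i.e. Y a^H
    E = np.sum(np.abs(Y) ** 2)        # energy of the received samples
    A = np.sum(np.abs(a) ** 2)        # energy of the training symbols
    return np.abs(C) ** 2 / (A * E - np.abs(C) ** 2)

L = 4096
a = (rng.choice([-1, 1], L) + 1j * rng.choice([-1, 1], L)) / np.sqrt(2)
h = 0.5 - 0.5j                        # flat channel coefficient (assumption)
snr_true = 10.0                       # linear scale, i.e. 10 dB
sigma2 = np.abs(h) ** 2 / snr_true

noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
Y = h * a + noise

snr_est_db = 10 * np.log10(mmse_snr(Y, a))
print(snr_est_db)
```

The estimator does not need the channel coefficient itself: the projection of Y onto the known training sequence captures the signal part, and the remaining energy is attributed to noise.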
The second algorithm is called EVM (Error Vector Magnitude). It estimates the sent
symbols and calculates the average and the variance of them. It is not specified in detail
how those symbols should be estimated and the algorithm seems to exhibit a rather
poor performance compared to the MMSE algorithm.
3.2.3 Athanasios et al. 2006: SNR Estimation for Low Bit Rate OFDM
Systems in AWGN channels
Athanasios et al. [9] present two different algorithms for OFDM systems. The second
one is the MMSE algorithm already presented in [8].
The first algorithm is called SNV (Squared Signal to Noise Variance). Again, this
estimator needs estimates of the received symbol and the performance seems to be
inferior to the MMSE algorithm.
3.2.4 Beaulieu et al. 2000: Comparison of Four SNR Estimators for
QPSK Modulations
Beaulieu et al. [10] present four different estimators for QPSK modulations in the time
domain. $X_i$ is the in-phase component and $Y_i$ is the quadrature component. The
algorithm with the best performance is:
\[ \widehat{\mathrm{SNR}}_2 = L \left[ \sum_{i=1}^{L} \frac{(|X_i| - |Y_i|)^2}{X_i^2 + Y_i^2} \right]^{-1} \]
It has to be further investigated if and how this algorithm could be used for an OFDM
system. The same algorithm is also presented in the frequency domain by Hong et
al. [11].
3.2.5 Boumard 2003: Novel Noise Variance and SNR Estimation
Algorithm for Wireless MIMO OFDM Systems
Boumard [12] presents an algorithm to estimate the SNR in a 2x2 MIMO-OFDM system
in the frequency domain. The algorithm needs some well defined training symbols (two
per antenna - sent individually) and the results from a channel estimator. The algorithm
is able to calculate both the SNR per subcarrier and the overall SNR. The algorithm
seems to perform well as long as the channel is reasonably slow fading. It needs to be
further investigated how this algorithm can be adapted for a 4x4 MIMO-OFDM system
with predefined training symbols. The principal challenges are the use of given training
symbols and the expansion to a 4x4 system.
3.2.6 Pauluzzi et al. 2000: A Comparison of SNR Estimation
Techniques for the AWGN Channel
Pauluzzi et al. [13] present five different SNR estimation techniques for PSK modulation
in an AWGN channel.
The first algorithm is called SSME (Split Symbol Moments Estimator) and is only valid
for BPSK modulation.
The second algorithm is the ML (Maximum Likelihood) estimator. There are two
versions of that algorithm: One that uses known training symbols and one that uses
guesses of the transmitted symbols. The data-aided version seems to perform near
the optimum and the non-data-aided version performs equally well for high SNRs. To use
this algorithm, it has to be adapted to the MIMO-OFDM system as the system used by
Pauluzzi et al. is quite different.
The third algorithm is the SNV estimator that is also presented in [14] and [9].
The fourth algorithm is the $M_2M_4$ (Second- and Fourth-Order Moments) estimator.
This estimator seems to perform similarly to the ML algorithm except in low SNR
environments, where it performs worse.
The fifth algorithm presented is the SVR (Signal to Variance Ratio) estimator. It
performs significantly worse than the ML estimator, especially in high SNR environments.
3.2.7 Ren et al. 2005: A New SNRs Estimator for QPSK Modulations
in an AWGN Channel
Ren et al. [15] present the $M_2M_4$ algorithm from [13] and an improved version of
this algorithm. The improved version seems to perform better than the original and
also better than the ML in high noise environments (SNR < 0dB). As this region is not
suitable for fast wireless communication anyway, the algorithm doesn't offer any advantage
over the ML algorithm.
3.2.8 Ren et al. 2008: SNR Estimation Algorithm Based on the
Preamble for Wireless OFDM Systems
Ren et al. [16] analyze the algorithm presented by Boumard [12] and come to the
conclusion that the performance of this algorithm depends highly on the frequency
selectivity of the channel. They propose an improved version of Boumard's algorithm to
solve that problem. The authors also present several simulations that seem to confirm
that fact.
\[ \hat{W} = \frac{4}{N} \sum_{k=0}^{N-1} \left( \mathrm{Im}\left[ \frac{Y_{0,k} \, c_{0,k}^{*} \, \hat{H}_k^{*}}{|\hat{H}_k|} \right] \right)^2 \]
\[ \hat{S} = \hat{M}_2 - \hat{W}, \qquad \hat{M}_2 = \frac{1}{N} \sum_{k=0}^{N-1} |Y_{0,k}|^2 \]
\[ \mathrm{SNR}_{av} = \frac{\hat{S}}{\hat{W}} \]
\[ \mathrm{SNR}_{subch\,k} = \frac{|\hat{H}_k|^2}{\hat{W}} \]
N is the size of the IFFT/FFT. $Y_{m,k}$ is the m-th symbol of the k-th subcarrier after the
FFT at the receiver. $c_{m,k}$ is the m-th symbol on the k-th subcarrier. $\hat{H}_k$ is the
channel coefficient estimate.
3.2.9 Schmidl et al. 1997: Robust Frequency and Timing
Synchronization for OFDM
Schmidl et al. [17] present a time domain approach for synchronizing transmitter and
receiver. As a by-product they suggest an SNR estimator working in the time domain.
"This estimator works well for the SNR below 20 dB. Above this level, $M(d_{opt})$ is so
close to 1 that an accurate estimate of the SNR can not be determined, but only that
the SNR is high."
3.2.10 Shin et al. 2001: Simple SNR Estimation Methods for QPSK
Modulated Short Bursts
Shin et al. [18] present two algorithms to estimate the SNR in a QPSK modulated
system.
The first algorithm is the EVM algorithm also presented by Athanasios et al. [8]. The
algorithm is rather simple and doesn't need any estimates at all (at least for the QPSK
case and not too low SNR). The authors also attribute a higher accuracy to this algorithm
than in [8].
1. check if Re{Y} > 0 and if Im{Y} > 0
2. for a given time period, collect the values for each of the four regions
3. estimate the SNR by: SNR = |average|^2 / variance
4. repeat to get an average
As this algorithm is simple to implement and independent of any other hardware, it should
also be easy to transform to the OFDM case.
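The four steps can be sketched as follows. This is a minimal illustration under the assumption of QPSK symbols with unit energy in AWGN at 15 dB; the function name and the way the four per-region results are averaged are choices made for this sketch and are not taken from [18].

```python
import numpy as np

rng = np.random.default_rng(4)

def quadrant_snr(Y):
    """SNR estimate from the mean and variance of the samples in each quadrant."""
    estimates = []
    for re_pos in (True, False):
        for im_pos in (True, False):
            # Steps 1 and 2: sort the received samples into the four regions
            mask = ((Y.real > 0) == re_pos) & ((Y.imag > 0) == im_pos)
            pts = Y[mask]
            if pts.size < 2:
                continue
            # Step 3: |average|^2 / variance for this region
            estimates.append(np.abs(pts.mean()) ** 2 / pts.var())
    # Step 4: average over the regions
    return np.mean(estimates)

L = 8000
snr_db = 15.0
sigma2 = 10 ** (-snr_db / 10)

X = (rng.choice([-1, 1], L) + 1j * rng.choice([-1, 1], L)) / np.sqrt(2)
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
Y = X + noise

snr_est_db = 10 * np.log10(quadrant_snr(Y))
print(snr_est_db)
```

No knowledge of the transmitted symbols is needed, but the sketch relies on the samples falling into the correct quadrant, which is why it is restricted to SNRs that are not too low.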
The second algorithm presented is the MMSE that is also presented by Athanasios et
al. [8]. Interestingly, the MMSE algorithm is considered to be inferior to the EVM
algorithm by Shin et al., whereas Athanasios et al. come to the opposite conclusion.
3.2.11 Xu et al. 2005: Subspace-Based Noise Variance and SNR
Estimation for OFDM Systems
Xu et al. [19] present a subspace based algorithm for SNR estimation in OFDM
systems. The algorithm is computationally quite complex: 1) Make an eigenvector
decomposition of the correlation matrix $\hat{R}$.
3.2.12 Xu et al. 2005: A Novel SNR Estimation Algorithm for OFDM
Xu et al. [20] present a broad range of algorithms. Among them are the ML, the MMSE
and the $M_2M_4$ algorithms already presented in other papers.
Based on Boumard's algorithm [12], they develop a new algorithm that should perform
better in time varying channels.
\[ R_G(l) = \frac{1}{J} \sum_{j=0}^{J-1} y(i,j) \, y^{*}(i,l+j) \tag{3.1} \]
\[ \hat{S}_G \approx R_G(1) + \frac{R_G(1) - R_G(2)}{3} \tag{3.2} \]
\[ \hat{N}_G = \frac{1}{J} \sum_{j=0}^{J-1} y(i,j) \, y^{*}(i,j) - \hat{S}_G \tag{3.3} \]
\[ \mathrm{SNR} = \frac{\hat{S}_G}{\hat{N}_G} \tag{3.4} \]
$y(i,j)$ is the j-th symbol on the i-th subcarrier.
3.2.13 Yücek et al. 2006: MMSE Noise Power and SNR Estimation for
OFDM Systems
Yücek et al. [21] propose to use an estimator with a two dimensional filter over
time and frequency. To reduce the computational complexity, they propose to use
a rectangular window for the filter. The authors come to the conclusion that their
approach significantly improves the SNR estimation in colored noise. The paper
continues work proposed in an earlier paper by the same authors [22]. If colored
noise should be a problem, this algorithm could be further investigated - despite its high
computational complexity.
3.3 Other Related Papers
The following papers are related to the problem but are too far away from
the actual problem to be adapted with a reasonable amount of work:
- Alagha 2001: Cramer-Rao Bounds of SNR Estimates for BPSK and QPSK Modulated
  Signals [23]
  This paper presents the theoretical bounds that can be achieved by the best
  possible algorithm.
- Benedict et al. 1967: The Joint Estimation of Signal to Noise from the Sum
  Envelope [24]
  This paper provides some basic theory about estimating noise in narrowband
  AWGN systems.
- He et al. 1998: Effective SNR Estimation in OFDM System Simulation [25]
  Some basic principles about using OFDM without the DFT are presented. But more
  important is the following quote: "Disregarding the form of distortions/interferences,
  by the virtual of the central limit theorem, the noise part in eqn. (10) tends to
  approach a Gaussian process, and it has been shown that if n(t) is a Wide-Sense
  Stationary (WSS) process, the noise part in eqn. (10) tends to be white." This
  indicates that it might be reasonable to assume that SNR estimation has a higher
  probability of success if done in the frequency domain.
  Further, a rather basic algorithm for SNR estimation is presented.
- Jeruchim et al. 1989: Estimation of the Signal-to-Noise Ratio (SNR) in
  Communication Simulation [26]
  A very basic paper providing some estimator theory.
- Kerr 1966: On Signal and Noise Level Estimation in a Coherent PCM Channel [27]
  A basic paper that is too far away from the actual problem to be of any direct use.
- Türkboylari et al. 1998: An Efficient Algorithm for Estimating the Signal-to-
  Interference Ratio in TDMA Cellular Systems [28]
  A rather complex algorithm for TDMA systems.
- Wiesel et al. 2002: Data-Aided Signal-to-Noise-Ratio Estimation in Time Selective
  Fading Channels [29]
  A time selective channel model is presented and a generalized class of ML
  detectors for that model is derived.
- Wiesel et al. 2002: Non-Data-Aided Signal-to-Noise-Ratio Estimation [30]
  A non-data-aided version of the ML detector is presented along with an $M_2M_4$
  estimator. Further, a non-data-aided iterative algorithm is presented.
- Wiesel et al. 2006: SNR Estimation in Time-Varying Fading Channels [31]
  The Cramer-Rao bound (CRB) is derived for data-aided SNR estimation. It is
  shown that the data-aided CRB is the same for time constant and time varying
  channels. But this doesn't mean that all the algorithms perform equally well in
  time varying channels. A generalized ML detector is derived for a polynomial-
  in-time, time-varying fading channel. This algorithm is iterative. If time variation
  should be found to be a problem in the real system, it would probably be worth
  considering this algorithm - even though iterative behavior usually means high
  computational costs.
4 Simulations
4.1 Description of the Simulation Environment
The simulation environment performs the following tasks for each sweep:
1. generate a data stream in the time domain, consisting of:
   - 2 short preambles (64 samples + 16 samples for the CP each)
   - 2 long preambles (a total of 128 samples + 32 samples for the CP)
   - MIMO training (320 samples - 80 per transmitting antenna)
   - random data to transmit (64 samples + 16 samples for the CP)
2. transmit the data (apply channel matrix)
3. generate AWGN noise corresponding to the SNR setting (all channels equal
   amount of noise)
4. add the generated noise to the received data
5. estimate SNR
6. configure receiver and decode data bits
7. calculate the BER
8. repeat steps 2 to 7 for all SNR steps
At the end of all sweeps, the average of the BER is calculated for each channel SNR.
It has to be noted that a real system should send more data in order to increase
the throughput. This is not done here because the focus is on the SNR estimation.
In order to get reliable results with a reasonable amount of computation time, it is
preferred to increase the number of sweeps rather than the amount of data
per sweep. The estimated SNR is the average of the four SNRs calculated for each
receiving antenna.
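Steps 3 and 4 of the list above can be sketched as follows. This is a minimal illustration and not the code of the simulation environment: it assumes that the SNR setting refers to the average power of the received signal, and the function name `add_awgn` and the test signal are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

def add_awgn(signal, snr_db):
    """Add complex AWGN so that signal power / noise power matches snr_db."""
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)
    noise = np.sqrt(p_noise / 2) * (
        rng.standard_normal(signal.shape) + 1j * rng.standard_normal(signal.shape)
    )
    return signal + noise, noise

# Placeholder for the received data stream after the channel (unit magnitude)
signal = np.exp(2j * np.pi * rng.random(20000))

noisy, noise = add_awgn(signal, snr_db=20.0)
measured_snr_db = 10 * np.log10(
    np.mean(np.abs(signal) ** 2) / np.mean(np.abs(noise) ** 2)
)
print(measured_snr_db)
```

The real and imaginary parts each carry half of the noise power, which is why the standard deviation per component is the square root of half the noise power.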
4.2 Best and Worst Cases
The simulation environment¹ was used to generate a plot of the BER using the exact
SNR of the channel as the SNR estimate. This curve is expected to be the lower bound
that can be achieved. To investigate the potential benefit of a good SNR estimator,
several BER curves were plotted using constant SNR estimators. It was expected that
those curves touch the ideal curve at the points where the estimated SNR is equal to
the channel SNR. In all other cases, they should lead to a higher BER. This plot can be
seen in Fig. 4.1.
As can be seen from the plot, the constant SNR estimators perform slightly better at
certain points than the one directly using the SNR of the channel. How can it be that
the simplest of all SNR estimators performs better at certain points than the ideal
estimator? Is it a bug in the simulation environment? The answer can be found by
repeating the same simulation² - but this time using a perfect channel estimator instead
of the FDMLE channel estimator. In this case, the plot looks as expected (see Fig. 4.2).
The reason therefore seems to be that the channel estimator adds additional noise to
the signal. This is not surprising as the FDMLE channel estimator is fairly simple and
basically takes only one sample for each channel matrix entry (which still results in
the transmission of four OFDM symbols for a 4x4 system!). In order to get a perfect
SNR estimator, one therefore has to take into account that the estimated SNR has to be
lower (i.e. higher noise) than the actual channel SNR. From the intersections of the
constant 10dB and 20dB curves with the ideal curve, it can be assumed that the
estimate should be approximately 2dB lower than the actual channel SNR. It has further
to be noted that this 2dB difference has only a small influence on the BER.
The constant SNR estimators perform better in the region where they overestimate the
channel SNR than in the region where they underestimate the SNR (if the channel
SNR for example is 30dB, it is better to estimate 50dB than to estimate 10dB). It
therefore follows that a constant SNR estimator should be chosen in a way that it
always overestimates the actual channel SNR. Comparing the constant 50dB curve
with the ideal curve shows that an average of 5dB of SNR can be gained by using a good
SNR estimator instead of a constant (high) SNR estimator. This is equivalent to a decrease
in the BER by about a factor of two (if the channel SNR is above 5dB). The benefit is
lower if compared to the 30dB curve, but still significant. It is therefore worth investing
some time to find a good SNR estimator.
¹ SNR range: 0-30 [dB] (step: 1 [dB]); number of sweeps: 20000 (seed=0..9);
  channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64;
  channel estimator: FDMLE; demapper: MMSE; modulation: QPSK
² SNR range: 0-30 [dB] (step: 1 [dB]); number of sweeps: 20000 (seed=0..9);
  channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64;
  channel estimator: ideal; demapper: MMSE; modulation: QPSK
[Plot: BER (logarithmic axis, 10^-3 to 10^0) versus SNR (channel) [dB] (0 to 30).
Curves: SNRest = SNR of channel; SNRest = 10dB; SNRest = 20dB; SNRest = 30dB;
SNRest = 50dB.]
Figure 4.1: This simulation shows the differences between an SNR estimator using the
actual channel SNR and several constant SNR estimators. The channel estimator used
is FDMLE.
[Plot: BER (logarithmic axis, 10^-4 to 10^0) versus SNR (channel) [dB] (0 to 30).
Curves: SNRest = SNR of channel; SNRest = 10dB; SNRest = 20dB; SNRest = 30dB;
SNRest = 50dB.]
Figure 4.2: This simulation shows the differences between an SNR estimator using the
actual channel SNR and several constant SNR estimators. The ideal channel estimator
(i.e. perfect channel knowledge) is used.
4.3 Perfect SNR Shifted
Fig. 4.1 suggests that it is generally better to overestimate the SNR than to
underestimate it. This is certainly true for large deviations from the actual channel SNR.
The effects of slightly over- or underestimating the channel SNR are explored³ in
Fig. 4.3 and 4.4.
Fig. 4.3 and Fig. 4.4 confirm that the channel estimator adds approximately 2dB of
noise. They also show that approximately half a decibel is lost if the estimate is in
the range of -5...+1 dB of the actual channel SNR and that around one decibel is lost
for the range of -6...+2 dB of the actual channel SNR.
The second interesting result from Fig. 4.3 and Fig. 4.4 is that the loss in performance
increases quite fast for higher deviations. If one assumes -2dB to be the optimal case,
then a deviation of 3dB results in half a decibel of performance loss, whereas a
deviation of 4dB leads to a full decibel of performance loss!
³ SNR range: 0-30 [dB] (step: 1 [dB]); number of sweeps: 20000 (seed=0..9);
  channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64;
  channel estimator: FDMLE; demapper: MMSE; modulation: QPSK
[Plot: BER (logarithmic axis, 10^-3 to 10^0) versus SNR (channel) [dB] (0 to 30).
Curves: SNRest = SNR channel; SNRest = SNR channel + 6dB, + 3dB, + 2dB, + 1dB;
SNRest = SNR channel - 1dB, - 2dB, - 3dB, - 4dB, - 5dB, - 6dB; SNRest = 50dB.]
Figure 4.3: This figure shows the simulation results of an SNR estimator using the
actual channel SNR with an offset of several decibels.
[Plot: BER (logarithmic axis, around 10^-2) versus SNR (channel) [dB] (21 to 24).
Same curves as in Fig. 4.3.]
Figure 4.4: This figure shows the simulation results of an SNR estimator using the
actual channel SNR with an offset of several decibels. Detailed version of the plot in
Fig. 4.3.
5 Algorithm Design
5.1 Several Approaches and Why They Don't Work (...Too Well)
5.1.1 Using Only the FFT Output
The simplest approach would be to use the output of the FFT directly - without any
correction terms from the channel matrix. This generally does not produce reliable
results, as every tone on every possible channel generally experiences a different
influence from the channel itself (phase shift and amplitude change - multiplication
with a complex channel matrix coefficient). To use an EVM-style algorithm, one would
have to apply the algorithm for every transmitter-receiver-tone combination. It would
therefore be necessary to send the same known symbol several times in series. This is
obviously not a good solution as a lot of potential channel capacity is wasted.
5.1.2 Using the Channel Matrix
Every approach that employs the inverse of the channel matrix is doomed: The channel
matrix is generally not invertible. Inverting the channel matrix can be circumvented by
rewriting the algorithm or using known training signals where no tone is sent by more
than one antenna at any moment.
But not only the inversion is a problem: Using the channel matrix itself is highly
problematic. To estimate the channel matrix in a 4x4 system, each antenna has to
transmit each tone once alone. It is then possible to fill in the channel matrix with the
values at the receiver. This results in four complete OFDM symbols that have to be sent
including their CP. Compared to other setup steps, this step is quite costly and should
therefore not be repeated - at least not in a 4x4 system.
If an algorithm - for example¹ the one presented by Ren et al. [16] - uses this estimated
channel matrix, the measured noise is zero. This is because the estimation of the
channel matrix assumes that there is no noise: the noise contained in the training
symbols is absorbed into the channel estimate. If the signal power is then divided by the
noise power, the result is a very large number which has nothing to do with the actual SNR.
¹ The same problem exists for Aldana et al. [7], Athanasios et al. [9], Boumard [12] and others.
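The vanishing noise estimate can be reproduced with a few lines. The following is a minimal sketch, simplified to one transmitting and one receiving antenna per tone; it assumes a one-shot channel estimate of the form Y/X in place of the FDMLE estimator, and all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
C = 64          # number of tones
sigma2 = 0.1    # true noise power (assumption)

# Known QPSK training symbol on every tone
X = np.exp(1j * np.pi / 4 * (2 * rng.integers(0, 4, C) + 1))
H = (rng.standard_normal(C) + 1j * rng.standard_normal(C)) / np.sqrt(2)
N = np.sqrt(sigma2 / 2) * (rng.standard_normal(C) + 1j * rng.standard_normal(C))
Y = H * X + N

# One-shot channel estimate from a single training symbol per tone:
# the noise of that symbol ends up inside the estimate
H_est = Y / X

# Noise "measured" with the same data and the estimated channel
residual = Y - H_est * X
noise_power_est = np.mean(np.abs(residual) ** 2)
true_noise_power = np.mean(np.abs(N) ** 2)

print(noise_power_est, true_noise_power)
```

The residual is zero up to rounding errors although the true noise power is about 0.1, so any SNR computed from it is meaningless.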
As mentioned before, it would be possible to get a better estimate of the channel matrix
- but this is not an option in a real system. It is also not desirable to have an SNR
estimator that is dependent on the performance of the channel estimator. SNR estimators
that need the channel matrix are not generally bad - some of them (e.g. the one from Ren
et al. [16]) have a performance near the optimum for a perfect channel estimator. They
can therefore be a valid solution if an extremely accurate channel estimator is used. A
plot² showing the performance of the Ren2008 and an adapted EVM algorithm can be
seen in Fig. 5.1.
² SNR range: 0-30 [dB] (step: 1 [dB]); number of sweeps: 20000 (seed=0..9);
  channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64;
  channel estimator: FDMLE/ideal; demapper: MMSE; modulation: QPSK
[Plot: BER (logarithmic axis, 10^-3 to 10^0) versus SNR (channel) [dB] (0 to 30).
Curves: Ren2008 perfect channel estimator; Ren2008 FDMLE channel estimator; EVM ideal
channel estimator; EVM FDMLE channel estimator; constant 50dB; channel SNR.]
Figure 5.1: This plot shows the high performance of the Ren2008 and an adapted EVM
algorithm for an ideal channel estimator. It further shows the bad performance when
using the FDMLE channel estimator. It is not entirely clear why the Ren2008 algorithm
performs badly at low channel SNR in combination with the ideal channel estimator. It
can further be noted that with the FDMLE channel estimator, both algorithms perform
slightly worse than the 50dB constant algorithm. This suggests that 50dB is not enough
to be the upper limit but it is close enough for the 0..30dB range.
5.2 Proposed Algorithm
5.2.1 General Idea
The algorithm uses the short preambles transmitted in the training phase. The system
transmits a clearly defined number of short preambles (generally two or four). One
short preamble consists of a repeating signal part of 16 samples plus another 16 samples
for the CP. In the ideal case, this leads to a series of five 16-sample-signals (subsignals)
per short preamble that are identical. For four short preambles, this results theoretically
in twenty identical subsignals that can be compared to estimate the signal power and
the noise power. It has to be noted that at least the first subsignal is heavily distorted
due to the setup of filters and the automatic gain control (AGC) and therefore cannot be
used.
To estimate the SNR, an average of all available subsignals is taken. This average
signal should be nearly identical to the signal received without noise, as long as the
noise is additive and has a mean value near zero (this is the case for AWGN). From
this estimated subsignal, the signal power $P_{s,est}$ can be calculated. Using the original
received signal, the power of the signal plus noise $P_{s+n}$ can be calculated. Those results
can be used to estimate the SNR:
$$SNR_{est} = \frac{P_{s,est}}{P_{s+n} - P_{s,est}} = \frac{P_{s,est}}{P_{n,est}}$$
This algorithm will be denoted "proposed algorithm" to distinguish it from other algorithms.
The numbers provided are specific to the system used but can easily be
adapted for other configurations.
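The general idea can be illustrated with a minimal sketch in Python. The function name `estimate_snr`, the helper `complex_awgn`, the test subsignal and the noise level are assumptions made for illustration only; they are not taken from the testbed. The estimated signal power is scaled by the number of subsignals M so that it covers the same number of samples as the total power.

```python
import cmath
import math
import random

def estimate_snr(y, period=16):
    """Estimate the SNR of a signal built from M noisy repetitions of one subsignal."""
    m = len(y) // period                      # number of complete subsignals M
    y = y[:m * period]
    # Average the M subsignals position by position.
    s_avg = [sum(y[l + period * i] for i in range(m)) / m for l in range(period)]
    p_signal = m * sum(abs(s) ** 2 for s in s_avg)   # estimated signal power over M subsignals
    p_total = sum(abs(v) ** 2 for v in y)            # signal plus noise power
    return p_signal / (p_total - p_signal)

def complex_awgn(rng, sigma2):
    """One complex Gaussian noise sample with total variance sigma2."""
    std = math.sqrt(sigma2 / 2)
    return complex(rng.gauss(0, std), rng.gauss(0, std))

rng = random.Random(0)
M = 9
# Hypothetical unit-power subsignal with random phases.
z = [cmath.exp(2j * math.pi * rng.random()) for _ in range(16)]
sigma2 = 0.01                                 # noise variance, i.e. a true SNR of 20 dB
y = [z[k % 16] + complex_awgn(rng, sigma2) for k in range(16 * M)]
snr_db = 10 * math.log10(estimate_snr(y))
print(f"estimated SNR: {snr_db:.2f} dB")
```

With only M = 9 subsignals the estimate fluctuates around the true value from run to run, which is the behaviour analyzed in the following sections.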
5.2.2 Mathematical Formulation and Analytical Results
Original Signals
All formulas provided are written in the discrete time domain - i.e. directly after the
IDFT at the transmitter and directly before the DFT at the receiver.
The 16-sample subsignal $c_t[l]$ that is part of the short preamble transmitted by antenna $t$ is defined as:
$$c_t[l] = \begin{cases} c_{t,l} \in \mathbb{C}, & \text{if } l = 0 \ldots 15, \\ 0, & \text{else.} \end{cases}$$
The transmitted signal $s_t[k]$ can then be written as a concatenation of several instances
of the signal $c_t[l]$, where $m$ is the number of transmitted short preambles (five subsignals per short preamble):
$$s_t[k] = \sum_{i=0}^{5m-1} c_t[k - i \cdot 16]$$
The received signal $y_r[k]$ for receiving antenna $r$ is then the following:
$$y_r[k] = \sum_{t=1}^{4} (s_t * h_{r,t})[k] + n[k]$$
$$(s_t * h_{r,t})[k] = \sum_{j=-\infty}^{\infty} s_t[j] \cdot h_{r,t}[k - j] = \sum_{j=-\infty}^{\infty} s_t[k - j] \cdot h_{r,t}[j]$$
$n$ is assumed to be IID AWGN and $h$ is the channel impulse response. Due to the
convolution, the received signal $y_r[k]$ is generally not periodic anymore.
The Received Signal Rewritten
It is shown that if the first and the last 8 samples of $y$ are cut away, the remaining signal
is periodic again. The important points are:
- $h_{r,t}[k] = 0$ if $k < 0$ due to the causality of the channel.
- The cyclic prefix is 16 samples long and assumed to be chosen carefully to avoid ISI. Therefore the impulse response $h_{r,t}[k]$ is zero if $k \geq 8$.
- $h$ is assumed to be constant during the whole transmission (slow enough fading channel).
- The channel does in the worst case distort the first 8 samples of the next 16-sample subsignal. This is done in a periodic manner.
- The sum of multiple signals with the same period is periodic again.
These facts lead to the conclusion that if the first and the last 8 samples are cut
away, the rest of the signal is periodic again. This holds for all
receiving antennas. Every receiving antenna can therefore be treated individually.
This leads to a modified received signal $\tilde{y}[k]$ that can be written in the following
way:
$$\tilde{y}[k] = \left( \sum_{i=0}^{M-1} z[k - i \cdot 16] \right) + \tilde{n}[k]$$
The newly introduced signal $z[l]$ is defined as:
$$z[l] = \begin{cases} z_{l} \in \mathbb{C}, & \text{if } l = 0 \ldots 15, \\ 0, & \text{else.} \end{cases}$$
It is possible to calculate the different components $z_{l}$, but it is in this case not necessary.
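$M$ is the number of available 16-sample subsignals. The noise signal $\tilde{n}[k]$ is a
truncated version of the former noise signal $n[k]$ and can be defined (assuming AWGN)
in the following way:
$$\mathrm{Re}\{\tilde{n}[k]\} = \begin{cases} n_{kr} \in \mathbb{R} \text{ so that } n_{kr} \sim N\left(0, \frac{\sigma_n^2}{2}\right), & \text{if } k = 0 \ldots 16 \cdot M - 1, \\ 0, & \text{else (cut away).} \end{cases}$$
$$\mathrm{Im}\{\tilde{n}[k]\} = \begin{cases} n_{ki} \in \mathbb{R} \text{ so that } n_{ki} \sim N\left(0, \frac{\sigma_n^2}{2}\right), & \text{if } k = 0 \ldots 16 \cdot M - 1, \\ 0, & \text{else (cut away).} \end{cases}$$
$$E\left[ \mathrm{Re}\{\tilde{n}[k]\}^2 + \mathrm{Im}\{\tilde{n}[k]\}^2 \right] = \sigma_n^2$$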
The Received Signal as a Random Variable
Each sample of the received subsignal can also be interpreted as a random variable:
$$\tilde{y}[k] \sim N\left( z[\mathrm{mod}_{16}(k)], \sigma_n^2 \right)$$
The Averaged Signal
In a next step, the average $\bar{s}[l]$ of all 16-sample subsignals in $\tilde{y}[k]$ is calculated. If an
infinite number of such subsignals were available, the average would be expected to be
$z[l]$, as the noise terms cancel out according to the law of large numbers:
$$\bar{s}[l] = \frac{1}{M} \sum_{i=0}^{M-1} \tilde{y}[l + 16 \cdot i]$$
$$= \frac{1}{M} \left[ \tilde{y}[l] + \tilde{y}[l + 16] + \ldots + \tilde{y}[l + (M-1) \cdot 16] \right]$$
$$= z[l] + \frac{1}{M} \sum_{i=0}^{M-1} \tilde{n}[l + 16 \cdot i]$$
The Averaged Signal as a Random Variable
The expectation of this average signal is calculated:
$$E[\bar{s}[l]] = z[l] + \frac{1}{M} \sum_{i=0}^{M-1} E[\tilde{n}[l + 16 \cdot i]] = z[l]$$
The following property was used:
$$E[X + Y] = E[X] + E[Y]$$
Further, the variance of the average signal is calculated.
$$\mathrm{var}(\bar{s}[l]) = E\left[ (\bar{s}[l] - z[l])^2 \right] = \frac{1}{M^2} E\left[ \Big( \underbrace{\sum_{i=0}^{M-1} \tilde{n}[l + 16 \cdot i]}_{\sim N(0, M \sigma_n^2)} \Big)^2 \right] = \frac{\sigma_n^2}{M}$$
The following formulas were used:
$$X, Y \sim N(0, \sigma^2), \text{ IID}$$
$$X + Y \sim N(0, \sigma^2 + \sigma^2)$$
$$\mathrm{var}(Z) = \sigma_z^2 = E[(Z - E[Z])^2]$$
The average signal $\bar{s}[l]$ can then be written as a random variable:
$$\bar{s}[l] \sim N\left( z[l], \frac{\sigma_n^2}{M} \right)$$
This result is plausible as the mean value is as expected and the variance decreases
in inverse proportion to the number of subsignals.
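The variance result can be checked with a short Monte Carlo sketch. The function name `averaged_noise_variance`, the number of trials and the seed are illustrative assumptions; the sketch only averages M complex Gaussian noise samples and measures the mean squared magnitude of the result.

```python
import math
import random

def averaged_noise_variance(m, sigma2, trials, rng):
    """Empirical variance of the average of m complex AWGN samples with variance sigma2."""
    std = math.sqrt(sigma2 / 2)               # standard deviation per real dimension
    acc = 0.0
    for _ in range(trials):
        mean = sum(complex(rng.gauss(0, std), rng.gauss(0, std)) for _ in range(m)) / m
        acc += abs(mean) ** 2
    return acc / trials

rng = random.Random(1)
var_m1 = averaged_noise_variance(1, 1.0, 20000, rng)   # expected near sigma2
var_m8 = averaged_noise_variance(8, 1.0, 20000, rng)   # expected near sigma2 / 8
print(var_m1, var_m8)
```

The two printed values should lie close to 1 and 0.125, in line with a variance of $\sigma_n^2 / M$.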
The Signal Power
In a next step, the signal power (see footnote 3) is calculated:
$$P_{\bar{s}} = R_{\bar{s}\bar{s}}[0] = \sum_{i=0}^{15} |\bar{s}[i]|^2 = \sum_{i=0}^{15} \bar{s}[i] \cdot (\bar{s}[i])^*$$
In those formulas, $(\bar{s}[i])^*$ denotes the complex conjugate and $R_{\bar{s}\bar{s}}$ denotes the autocorrelation
function of the signal $\bar{s}[i]$.
The Signal Power as a Random Variable
The mean and the variance of this signal power are calculated. In order to do this, the
following formulas for the noncentral chi-square distribution (random variable $Z$) and
the expectation of a random variable are used:
$$X_i \sim N(\mu_i, \sigma_i^2)$$
$$Z = \sum_{i=0}^{k-1} \left( \frac{X_i}{\sigma_i} \right)^2$$
$$\lambda_z = \sum_{i=0}^{k-1} \left( \frac{\mu_i}{\sigma_i} \right)^2$$
$$\mathrm{mean}(Z) = k + \lambda_z$$
$$\sigma_z^2 = \mathrm{var}(Z) = 2 \cdot (k + 2 \lambda_z)$$
$$E[a \cdot X_n] = a \cdot E[X_n]$$
$$\mathrm{var}(a \cdot X) = a^2 \cdot \mathrm{var}(X)$$
Footnote 3: It has to be noted that the power of $\bar{s}$ is only equal to the signal power in the limit $M \to \infty$. The
algorithm assumes that the signal power is equal to the power of $\bar{s}$ for all $M > 1$. This is justified by
the fact that at the end the SNR is estimated and not calculated.
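From those formulas it can be seen that the power of $\bar{s}$ is noncentral chi-square
distributed. This can be written in the following way:
$$P_{\bar{s}} = \frac{\sigma_n^2}{M} \cdot \underbrace{\sum_{i=0}^{15} \left( \frac{|\bar{s}[i]|}{\sigma_n / \sqrt{M}} \right)^2}_{:=Z}$$
$$\lambda_Z = \sum_{i=0}^{15} \left( \frac{|z[i]|}{\sigma_n / \sqrt{M}} \right)^2 = \frac{M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2$$
$$\mathrm{mean}(Z) = 16 + \frac{M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2$$
One could argue that this is not true, as $|z[i]|$ is not Gaussian distributed. But this does
not matter as the square is taken anyway. The following property holds:
$$|z^2| = |z|^2$$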
The mean signal power is then written as:
$$\mathrm{mean}(P_{\bar{s}}) = \frac{\sigma_n^2}{M} \cdot \mathrm{mean}(Z) = \frac{\sigma_n^2}{M} \left( 16 + \frac{M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2 \right) = \frac{16 \sigma_n^2}{M} + \sum_{i=0}^{15} |z[i]|^2$$
This result makes sense, as it is exactly the signal power for $M \to \infty$ (many samples)
or for $\sigma_n \to 0$ (no noise). Next, the variance is calculated:
$$\mathrm{var}(P_{\bar{s}}) = \frac{\sigma_n^4}{M^2} \cdot \mathrm{var}(Z) = \frac{\sigma_n^4}{M^2} \cdot 2 \cdot \left( 16 + \frac{2M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2 \right) = \frac{32 \sigma_n^4}{M^2} + \frac{4 \sigma_n^2}{M} \sum_{i=0}^{15} |z[i]|^2$$
As before, the variance is zero as expected for the cases $M \to \infty$ (many samples) or
$\sigma_n \to 0$ (no noise). It is slightly confusing that the signal power has an influence on the
variance of the signal power. The following example helps to clarify the situation.
Assume that the noise amplitude is in the range $[-1, 1]$ (not AWGN anymore). If the signal
amplitude is equal to 1, then the resulting signal power is distributed in the range $[0, 4]$.
If the signal amplitude is assumed to be 3, then the resulting signal power is distributed
in the range $[4, 16]$. A higher average signal power therefore leads to a
higher variance in the total signal power.
The Signal Plus Noise Power
In the next step, the total power is calculated.
$$P_{\tilde{y}} = R_{\tilde{y}\tilde{y}}[0] = \sum_{i=0}^{M \cdot 16 - 1} |\tilde{y}[i]|^2 = \sum_{i=0}^{M \cdot 16 - 1} \tilde{y}[i] \cdot (\tilde{y}[i])^*$$
The Signal Plus Noise Power as a Random Variable
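As before, the total power is noncentral chi-square distributed (with the same
argumentation for $|\tilde{y}[i]|$ as before):
$$P_{\tilde{y}} = \sigma_n^2 \cdot \underbrace{\sum_{i=0}^{M \cdot 16 - 1} \frac{|\tilde{y}[i]|^2}{\sigma_n^2}}_{:=Z}$$
$$\lambda_Z = \sum_{i=0}^{M \cdot 16 - 1} \left( \frac{|z[\mathrm{mod}_{16}(i)]|}{\sigma_n} \right)^2 = \frac{M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2$$
$$\mathrm{mean}(Z) = 16 \cdot M + \frac{M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2$$
$$\mathrm{var}(Z) = 2 \cdot \left( 16 \cdot M + 2 \cdot \frac{M}{\sigma_n^2} \sum_{i=0}^{15} |z[i]|^2 \right)$$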
This leads to the following mean power value:
$$\mathrm{mean}(P_{\tilde{y}}) = \sigma_n^2 \cdot \mathrm{mean}(Z) = M \left( 16 \sigma_n^2 + \sum_{i=0}^{15} |z[i]|^2 \right)$$
This is the expected result, as it is the sum of the signal power and the noise power.
The variance can be calculated as:
$$\mathrm{var}(P_{\tilde{y}}) = \sigma_n^4 \cdot \mathrm{var}(Z) = M \sigma_n^2 \cdot \left( 32 \sigma_n^2 + 4 \sum_{i=0}^{15} |z[i]|^2 \right)$$
As expected, the variance goes to zero for $\sigma_n \to 0$ (no noise). It is slightly confusing
to have a factor of $M$ in front of the variance term. But again, an example shows the
reason: Assume that the noise is in the interval $[-1, 1]$. The signal amplitude is assumed
to be 1. If only one sample is taken, the signal power is in the region $[0, 4]$. If two
samples are taken, the total signal power is in the region $[0, 8] = 2 \cdot [0, 4]$.
The Noise Signal
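It is also possible to calculate the estimated noise signal $\hat{n}[k]$ directly:
$$\hat{n}[k] = \tilde{y}[k] - \sum_{i=0}^{M-1} \bar{s}[k - 16 \cdot i]$$
$$= \tilde{n}[k] - \frac{1}{M} \left( \sum_{i=0}^{M-1} \tilde{n}[\mathrm{mod}_{16}(k) + i \cdot 16] \right)$$
$$= \frac{M-1}{M} \tilde{n}[k] - \frac{1}{M} \underbrace{\sum_{\substack{i=0 \\ \mathrm{mod}_{16}(k) + i \cdot 16 \neq k}}^{M-1} \tilde{n}[\mathrm{mod}_{16}(k) + i \cdot 16]}_{(M-1) \text{ samples}}$$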
The Noise Signal as a Random Variable
This result can also be written as a random variable:
$$\hat{n}[k] \sim N\left( \frac{M-1}{M} \tilde{n}[k], \frac{(M-1)}{M^2} \sigma_n^2 \right)$$
This result is plausible as for $M \to \infty$ the estimated value is equal to the exact value.
The Noise Power
As before, the noise power is defined as:
$$P_{\hat{n}} = R_{\hat{n}\hat{n}}[0]$$
The Noise Power as a Random Variable
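Similar to the other cases, the noise power is chi-square distributed:
$$P_{\hat{n}} = \frac{(M-1)}{M^2} \sigma_n^2 \cdot \underbrace{\sum_{i=0}^{M \cdot 16 - 1} \frac{|\hat{n}[i]|^2}{\frac{(M-1) \sigma_n^2}{M^2}}}_{:=Z}$$
$$\lambda_z = \sum_{i=0}^{M \cdot 16 - 1} \frac{\left| \frac{M-1}{M} \tilde{n}[i] \right|^2}{\frac{(M-1) \sigma_n^2}{M^2}} = \frac{(M-1)}{\sigma_n^2} \cdot \sum_{i=0}^{M \cdot 16 - 1} |\tilde{n}[i]|^2 \overset{E[|n|^2] = \sigma_n^2}{=} M \cdot 16 \cdot (M-1)$$
$$\mathrm{mean}(Z) = M^2 \cdot 16$$
$$\mathrm{var}(Z) = \sigma_z^2 = 2 \cdot 16 \cdot M (2M - 1)$$
This leads to the following mean noise power value:
$$\mathrm{mean}(P_{\hat{n}}) = 16 \cdot (M-1) \cdot \sigma_n^2$$
For $M \to \infty$, this results in a mean value of $\sigma_n^2$ per sample as expected. The variance
can be calculated as:
$$\mathrm{var}(P_{\hat{n}}) = 32 \sigma_n^4 \cdot \frac{(M-1)^2 (2M-1)}{M^3}$$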
Summary of the Mean Power Terms Normalized Per Sample
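As an overview, the mean values of the different power terms are presented here -
averaged per sample:
$$\mathrm{mean}(P_{\tilde{y}}) = \sigma_n^2 + \frac{1}{16} \sum_{i=0}^{15} |z[i]|^2$$
$$\mathrm{mean}(P_{\bar{s}}) = \frac{\sigma_n^2}{M} + \frac{1}{16} \sum_{i=0}^{15} |z[i]|^2$$
$$\mathrm{mean}(P_{\hat{n}}) = \sigma_n^2 \cdot \frac{M-1}{M}$$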
Those results indicate that the following property is true (for the power terms normalized per sample):
$$P_{\hat{n}} = P_{\tilde{y}} - P_{\bar{s}}$$
The property is not proven here. Numerical examples strongly indicate that the
property holds - and the mean values indicate it too. This property is important because it
makes it unnecessary to calculate the estimated noise signal, so hardware costs can be
saved. It also makes sense from a physical point of view: The total power is the power
of the signal plus the power of the noise. So if from this total power the signal power is
subtracted, the remaining power is the noise power.
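One way to see why the property is plausible is the usual sum-of-squares decomposition around a sample mean: for each position $l$, $\sum_i |\tilde{y}[l + 16 i]|^2 = M |\bar{s}[l]|^2 + \sum_i |\tilde{y}[l + 16 i] - \bar{s}[l]|^2$. The following sketch checks the resulting unnormalized form $P_{\tilde{y}} = M \cdot P_{\bar{s}} + P_{\hat{n}}$ numerically on random data. The helper name `power_terms` and the test data are assumptions for illustration.

```python
import random

def power_terms(y, period=16):
    """Return (M, total power, power of the averaged subsignal, estimated noise power)."""
    m = len(y) // period
    y = y[:m * period]
    s_avg = [sum(y[l + period * i] for i in range(m)) / m for l in range(period)]
    n_est = [y[k] - s_avg[k % period] for k in range(m * period)]
    p_y = sum(abs(v) ** 2 for v in y)
    p_s = sum(abs(v) ** 2 for v in s_avg)
    p_n = sum(abs(v) ** 2 for v in n_est)
    return m, p_y, p_s, p_n

rng = random.Random(2)
# Arbitrary complex test data: five blocks of 16 samples.
y = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(16 * 5)]
m, p_y, p_s, p_n = power_terms(y)
residual = p_y - m * p_s - p_n        # should vanish up to rounding errors
print(residual)
```

The residual stays at the level of floating point rounding, independent of the data used.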
Calculation of the SNR
The last step is to estimate the SNR. This is done in the following way:
$$\widehat{SNR} := \frac{P_{\bar{s}} \cdot M}{P_{\tilde{y}} - P_{\bar{s}} \cdot M}$$
It is interesting to see what the average SNR looks like:
$$\text{mean SNR} = \frac{M \cdot \mathrm{mean}(P_{\bar{s}})}{\mathrm{mean}(P_{\tilde{y}}) - M \cdot \mathrm{mean}(P_{\bar{s}})} = \frac{1 + \frac{M}{16 \sigma_n^2} \sum_{i=0}^{15} |z[i]|^2}{M - 1} = \frac{1 + M \cdot SNR_{true}}{M - 1} \neq E[\widehat{SNR}]$$
It has to be noted that this result is not equal to the expectation of the SNR, as the
following equation is generally not true:
$$A, B: \text{arbitrary random variables}$$
$$E\left[ \frac{A}{B - A} \right] = \frac{E[A]}{E[B] - E[A]}$$
It is not easily possible to calculate the expectation value of the ratio of two
noncentral chi-square variables. Therefore, the approximated values of the mean SNR
were calculated for several $M$ and various SNR, as they should show a tendency. The
results can be seen in Fig. 5.2. As expected, the results get better with a higher $M$ and
higher channel SNR.
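The approximated mean can be evaluated directly. The sketch below computes the offset between the true SNR and the approximation $(1 + M \cdot SNR_{true}) / (M - 1)$ in decibels; the function name `approx_snr_offset_db` and the chosen values of M are illustrative assumptions.

```python
import math

def approx_snr_offset_db(snr_db, m):
    """True SNR minus approximated mean estimate, both in dB."""
    if m < 2:
        raise ValueError("at least two subsignals are needed")
    snr = 10 ** (snr_db / 10)
    approx_mean = (1 + m * snr) / (m - 1)
    return snr_db - 10 * math.log10(approx_mean)

# Offsets at 0 dB channel SNR for a few values of M.
offsets = {m: approx_snr_offset_db(0.0, m) for m in (2, 4, 9, 100)}
for m, off in offsets.items():
    print(f"M={m:3d}: {off:6.2f} dB")
```

The offset is always negative (the approximation overestimates the SNR) and shrinks towards zero as M or the channel SNR grows.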
[Figure 5.2 plot: SNR - estimated SNR [dB] (range -5 to 0) versus SNR (channel) [dB] (range 0 to 30).]
Figure 5.2: This figure shows the difference between the expected SNR and the mean
calculated SNR. The lowest curve is for M = 2. Each higher curve increases the value
of M by one - the highest curve is for M = 100.
5.2.3 Simulation of the Proposed Algorithm
The proposed algorithm was tested using the simulation environment (see footnote 4). No nonidealities
were considered in this run. The noise was purely AWGN. The results of the simulation
are shown in Fig. 5.3. It can be seen that the algorithm performs near the optimum
for M = 9. The result of the SNR estimation is independent of the channel estimator,
whereas the BER depends on the estimated channel matrix.
5.2.4 The Influence of the Number of Samples
In a next step, the influence of the number of available subsignals M was investigated (see footnote 5).
Fig. 5.2 together with Fig. 4.4 suggests that the influence of the number of subsignals
should be rather small - at least for M > 4. The results can be seen in Fig. 5.4 and
Fig. 5.5.
5.2.5 The Mean Value of the Estimated SNR
As mentioned before, Fig. 5.2 only shows an approximation of the estimated SNR. The
exact curves were calculated using the simulation environment (see footnote 6). The results can be seen
in Fig. 5.6. Qualitatively, the curves look the same, which suggests that the approximation
made is quite accurate. The most obvious difference is an offset of around
one decibel that can be seen by comparing the two figures.
Footnotes 4, 5 and 6 (identical simulation settings): SNR range: 0-30 dB (step: 1 dB); number of sweeps: 20000 (seed=0-9); channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64; channel estimator: FDMLE; demapper: MMSE; modulation: QPSK
[Figure 5.3 plot: BER versus SNR (channel) [dB]. Curves: proposed algorithm (M=9), const 50dB, channel SNR.]
Figure 5.3: This figure shows the simulation results that were obtained using the
proposed algorithm with M = 9. The simulated BER is close to the best possible BER
and is, as already discussed, better than the BER obtained with the exact channel SNR.
[Figure 5.4 plot: BER versus SNR (channel) [dB]. Curves: channel SNR, constant 50dB, proposed algorithm for M = 9, 8, 7, 6, 5, 4, 3, 2.]
Figure 5.4: This figure shows the simulation results that were obtained using the
proposed algorithm with different M. As expected, the performance is better for high
M.
[Figure 5.5 plot: BER versus SNR (channel) [dB], range 22 to 25 dB. Curves: channel SNR, constant 50dB, proposed algorithm for M = 9, 8, 7, 6, 5, 4, 3, 2.]
Figure 5.5: This figure shows the same results as Fig. 5.4, restricted to the range from 22 to 25 dB. It can be seen that the BER is
near the optimum for M > 4 and even the results with smaller M are still acceptable
(losing less than 1dB in the worst case M = 2).
[Figure 5.6 plot: mean(SNR - estimated SNR) [dB] versus SNR (channel) [dB]. Curves: proposed algorithm for M = 9, 8, 7, 6, 5, 4, 3, 2.]
Figure 5.6: This figure shows the simulated mean SNR values of the algorithm for
several M.
5.2.6 Frequency Offset
In this simulation (see footnote 7), the effects of a frequency offset between transmitter and receiver
were investigated. As can be seen in Fig. 5.7, the effects of a frequency offset seem
to be negligible as long as the offset is below 100ppm (parts per million). Even at
200ppm, the performance is nearly ideal for the whole investigated range. A frequency
offset above 200ppm results in a visible degradation of the performance that eventually
gets worse than the 50dB constant estimator.
5.2.7 Ignore Frequency Offset and Save Hardware Costs
It would be possible to ignore the frequency offset and save hardware costs at the same
time by using the absolute values of the input samples instead of the real and imaginary
parts:
$$y_n = \text{received sample, complex}$$
$$y_n \cdot e^{2 \pi i \Delta f t} = \text{received sample with frequency offset, complex}$$
$$|y_n \cdot e^{2 \pi i \Delta f t}| = |y_n| = \text{absolute value of the received sample, real}$$
This scenario was tested in a simulation (see footnote 8). The results can be seen in Fig. 5.8. The
loss is around one decibel compared to the optimal case. This approach is therefore
interesting if a high frequency offset is present or additional hardware costs are to be
saved.
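The invariance of the absolute value under a frequency offset can be shown with a few lines of Python. The sample values, the helper name `rotate` and the normalized offset of 0.01 cycles per sample are arbitrary assumptions for illustration.

```python
import cmath
import math

def rotate(samples, offset_per_sample):
    """Apply a frequency offset as a phase rotation that grows with the sample index."""
    return [y * cmath.exp(2j * math.pi * offset_per_sample * n)
            for n, y in enumerate(samples)]

samples = [complex(1, 2), complex(-3, 0.5), complex(0, -1), complex(2, 2)]
rotated = rotate(samples, 0.01)

# The complex values change, the absolute values do not.
max_abs_change = max(abs(abs(a) - abs(b)) for a, b in zip(samples, rotated))
max_complex_change = max(abs(a - b) for a, b in zip(samples, rotated))
print(max_abs_change, max_complex_change)
```

Because only the phase is affected, the subsignals stay comparable in magnitude even when their complex values drift apart from one repetition to the next.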
5.2.8 Limited Precision
To efficiently implement the algorithm in hardware, one can only use a fixed number of
bits per sample. The influence of the number of effective bits on the SNR estimation
Footnotes 7 and 8 (identical simulation settings): SNR range: 0-30 dB (step: 1 dB); number of sweeps: 20000 (seed=0-9); channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64; channel estimator: FDMLE; demapper: MMSE; modulation: QPSK
[Figure 5.7 plot: BER versus SNR (channel) [dB]. Curves: const 50dB, channel SNR, proposed algorithm (M=9) with 20ppm, 50ppm, 100ppm, 200ppm, 500ppm, 1000ppm and 10000ppm frequency offset.]
Figure 5.7: This figure shows the simulated BER values for M = 9 under different
frequency offset scenarios.
[Figure 5.8 plot: BER versus SNR (channel) [dB]. Curves: const 50dB, channel SNR, proposed algorithm (input=|input|).]
Figure 5.8: This figure shows the simulation results of the proposed algorithm, where
each complex input sample was replaced by the absolute value of each sample (M = 9).
was investigated using a simulation (see footnote 9). The results can be seen in Fig. 5.9. The plot
shows that there is no visible impact on the accuracy in the range from 0 to 30 dB
channel SNR as long as at least eight effective bits are used. Six effective bits already
show a distinct deviation from the ideal case and four effective bits are definitely
not enough to store small noise terms.
5.2.9 Proposed Algorithm: Further Ideas and Simulations
Until now, all simulations of the proposed algorithm were conducted in an ideal
environment without any hardware effects or changing channel coefficients. Nonidealities
could include phase noise, a DC-offset, a small amplitude modulation or other types
of noise (e.g. shot noise). In some of those cases, it might be possible to adapt the
algorithm to take care of certain hardware effects - for example removing the offset by
a high pass filter.
But is it really a good idea to remove such effects? If the aim is to estimate the channel
SNR as accurately as possible, it would be favorable to do so. On the other hand, the
removal of those effects would likely decrease the performance of the whole system if
those effects are not removed for the rest of the received signal too. One can argue that
they probably don't originate in the channel, but can be treated as if they did.
Footnote 9: SNR range: 0-30 dB (step: 1 dB); number of sweeps: 20000 (seed=0-9); channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of tones: 64; channel estimator: FDMLE; demapper: MMSE; modulation: QPSK
[Figure 5.9 plot: mean(SNR - estimated SNR) [dB] versus SNR (channel) [dB]. Curves: 10 bit effective, 8 bit effective, 6 bit effective, 4 bit effective.]
Figure 5.9: This figure shows the simulation results of the proposed algorithm, where
each sample has only a limited precision (M = 9). For example, 8 effective bits mean
that the absolute value of the largest received sample can be stored in an 8 bit unsigned
integer. The other samples are scaled proportionally.
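The scaling described in the caption of Fig. 5.9 can be sketched as follows. The function name `quantize`, the rounding mode and the sample values are assumptions; the simulation environment may quantize differently in detail.

```python
def quantize(values, effective_bits):
    """Scale so that the largest magnitude maps to 2**effective_bits - 1, then round."""
    full_scale = 2 ** effective_bits - 1
    peak = max(abs(v) for v in values)
    if peak == 0:
        return [0 for _ in values]
    return [round(v / peak * full_scale) for v in values]

# Hypothetical real parts of received samples; 0.003 stands for a small noise term.
values = [0.8, -0.5, 0.003, 0.1]
q8 = quantize(values, 8)
q4 = quantize(values, 4)
print(q8)
print(q4)
```

With 8 effective bits the small term still maps to a nonzero integer, whereas with 4 effective bits it is rounded to zero, which illustrates why few bits cannot store small noise terms.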
6 Implementation
6.1 Requirements and Limitations
The algorithm is implemented on a Xilinx Virtex4 FPGA (type: xc4vsx55 - see [32]).
As the algorithm has to share the space on the FPGA with other components, it is crucial
that the implementation is optimized for minimal hardware usage. The clock frequency
is given - so there is no point in optimizing the implementation for speed, as long as the
design is able to run with the given 80 MHz clock signal. The maximal allowed latency
is defined by the requirement that the results have to be ready before the subsequent
long preamble has fully arrived. The long preamble is 128 samples long and one sample
arrives every fourth clock cycle. This results in a maximal allowed latency of 512
clock cycles.
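The latency budget follows from simple arithmetic, shown here as a short Python calculation. The constant names are illustrative; the numbers are the ones stated above.

```python
CLOCK_HZ = 80e6                 # given clock frequency
LONG_PREAMBLE_SAMPLES = 128     # length of the long preamble in samples
CYCLES_PER_SAMPLE = 4           # one sample arrives every fourth clock cycle

max_latency_cycles = LONG_PREAMBLE_SAMPLES * CYCLES_PER_SAMPLE
max_latency_us = max_latency_cycles / CLOCK_HZ * 1e6

print(max_latency_cycles, "cycles =", max_latency_us, "microseconds")
```

At 80 MHz, the 512 clock cycles correspond to 6.4 microseconds.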
The SNR estimation has to be done for each receiving antenna individually. Besides
the SNR, it is further necessary to compute the noise variance. The data arrives
separated into a real and an imaginary part. One new data sample pair arrives
every fourth clock cycle. Both signals are 10 bits wide and use one of those 10 bits for
the sign (two's complement). The automatic gain control is adjusted in a way that
it approximately scales the largest signal parts of the short preamble to half of the
possible amplitude. This is done in order to prevent data loss as the peak to average
power ratio (PAPR) in MIMO-OFDM systems is potentially large [33]. This results in
an effectively used data width of 8 bits for both the real and the imaginary part. As
discussed in the previous chapter, 8 effective bits are accurate enough for the expected
SNR range of 0 to 30 dB.
A complete list of all input and output signals with their definitions can be found in
table 6.1.
For hardware cost analysis, the following assumptions that are consistent with the
testbed are made: The relevant data of the short preamble consists of a maximum of
10 complex subsignals with 16 samples each, and 10 bits are used for each of the real and
the imaginary part.
| NAME | FORMAT | DESCRIPTION |
|---|---|---|
| INPUTS | | |
| DATA_STREAM_REAL | signed, 10 bit | one new sample each 4 clock cycles, real part of the sample |
| DATA_STREAM_IMAG | signed, 10 bit | one new sample each 4 clock cycles, imaginary part of the sample |
| AGC_CONSTANT | 1 bit | is 0 if the AGC didn't freeze yet and 1 if AGC frozen |
| SHORT_PREAM_FINISHED | 1 bit | is 1 if the last sample of the short preamble arrives and 0 otherwise |
| NEW_SAMPLE_READY | 1 bit | is 1 if a new sample arrived and 0 otherwise |
| SELECT_OPERATION_MODE | 1 bit | is 0 for constant SNR and 1 for estimated SNR |
| CLOCK | 1 bit | clock signal |
| RESET | 1 bit | active low reset |
| OUTPUTS | | |
| SNR | >10 bit, unsigned | the calculated SNR value |
| SNR_READY | 1 bit | is 1 as soon as the SNR calculation finished, otherwise 0 if no valid value present |
| SIGMA_S | >10 bit, unsigned | the calculated value for the noise variance (sigma squared) per complex sample |
| SIGMA_S_READY | 1 bit | is 1 as soon as the sigma square calculation finished, otherwise 0 if no valid value present |

Table 6.1: This table shows all incoming and outgoing signals from the SNR estimation
block.
6.2 First Approach
The most direct way would be to store all received values. Once all values have arrived, one
could then calculate the desired values in parallel. This approach is equivalent to a
direct implementation of standard software code and needs very little control logic. An
approximation of the hardware costs for this approach can be found in table 6.2.

| TYPE | NUMBER NEEDED | USE |
|---|---|---|
| 10 bit registers | 2 · 160 = 320 | store incoming values |
| 10 bit multipliers | 320 | square each signal sample |
| 20,21,... bit adders | 320 | adder tree for the total signal plus noise power |
| .. | .. | ... and so on... |

Table 6.2: Approximate hardware costs for approach 1. Only about half of the parts are
listed as it is obvious that this approach is not a good one considering the constraints.
6.3 Second Approach
As the datapaths for the total signal plus noise power and the estimated signal power
have very different needs, it seems sensible to separate them.
The total signal plus noise power datapath needs two registers, two multipliers and
two adders. First, the real and the imaginary part of the sample are both squared and then added.
This sum is the power of the current sample. This power is added
to the power of the previous samples. Every 16 samples, the total power is written into
the second register to ensure that no partially finished cycles contribute to the final
result. Some additional control logic is needed to enable the two registers.
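The behaviour of this datapath can be modelled in software. The following Python sketch is a behavioural model only, not the VHDL implementation; the class name `TotalPowerAccumulator` and the test samples are assumptions.

```python
class TotalPowerAccumulator:
    """Behavioural model of the total signal plus noise power datapath."""

    def __init__(self, period=16):
        self.period = period
        self.running = 0      # first register: power of all samples seen so far
        self.latched = 0      # second register: power of complete subsignals only
        self.count = 0        # number of samples received

    def push(self, re, im):
        # Square the real and imaginary part and accumulate the sample power.
        self.running += re * re + im * im
        self.count += 1
        # Copy the running power into the second register after each full subsignal.
        if self.count % self.period == 0:
            self.latched = self.running

acc = TotalPowerAccumulator()
for _ in range(40):           # two full subsignals plus 8 samples of a third one
    acc.push(3, -4)           # hypothetical constant sample with power 25
print(acc.latched, acc.running)
```

After 40 samples, the latched register only contains the power of the first 32 samples, so the unfinished third subsignal does not contribute to the result.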
The estimated signal power can also be simplified. It is not necessary to save all
samples, but only the present ones and the average of the previous ones. A schematic
using that approach and minimizing additional control by implementing a shift register
is presented in Fig. 6.1. Compared to the first approach, this version already saves large
amounts of hardware. The total costs are still high as can be seen in the approximation
in table 6.3.
| TYPE | NUMBER NEEDED | USE |
|---|---|---|
| 10 bit multipliers | 2 | total signal plus noise power datapath |
| 20/30 bit adder | 1 each | total signal plus noise power datapath |
| 30 bit register | 2 | total signal plus noise power datapath |
| 15 bit multipliers | 2 · 16 = 32 | SMART_ADDER - estimated signal datapath |
| 15 bit register | 2 · 16 = 32 | SMART_ADDER - estimated signal datapath |
| 15 bit adders | 2 · 16 = 32 | SMART_ADDER - estimated signal datapath |
| 30, 31, 32, 33, 34 bit adders | 16, 8, 4, 2, 1 | adder tree - estimated signal datapath |
| 10 bit registers | 2 · 16 = 32 | shift register - estimated signal datapath |
| additional adders, multipliers, logic, dividers | | to calculate the SNR and the noise variance |

Table 6.3: Approximate hardware costs for second approach.
6.4 Final Approach
Additional hardware costs can be saved in the estimated signal datapath. The adder
and the multiplier from the SMART_ADDER entity can be shared among all 16 samples,
as they don't produce any relevant data most of the time. As new samples only arrive
[Figure 6.1 schematic: sixteen SMART_ADDER_ENT and SAMPLE_SQUARE_ENT blocks with registers (D, Q, Clk, Rst, En). Signals: DATA_STREAM_REAL, DATA_STREAM_IMAG, REAL_IN, IMAG_IN, REAL_OUT, IMAG_OUT, SAMPLE_SQUARE, INITIALIZE, FULL_CYCLE, SIG_POW.]
Figure 6.1: Second approach: Schematic of the datapath for the estimated signal
power. Compared to the first approach, the costs are highly reduced.
every fourth clock period, they could theoretically also be shared between the real and
the imaginary samples. This is not done as the amount of hardware to be saved is small
compared to the control overhead.
About half of the registers in the estimated signal datapath can be omitted by realizing
that it is not necessary to save all the old averaged samples. Those were introduced to
ensure that only full cycles are considered. Otherwise, one would need a counter for each
sample and a division by the number of samples that were added. This is not a
good option, as the division is more costly than a few registers. The trick is not
to save all the averaged samples but the total power of those instead.
The different parts of the design are introduced in the following subsections.
6.4.1 SNR_EST_ENT
The SNR_EST_ENT entity is the top level design entity. A schematic can be seen in
Fig. 6.2. The thick lines represent the datapath and all other lines the control path. The
datapath is separated into two parts: the part for the total signal plus noise power
and the part for the estimated signal power.
The total signal plus noise power datapath consists of:
- The TOTAL_POWER_ENT(d=10,e=9): calculating the total power of M 16-sample subsignals including all the noise.
- The NUMBER_OF_FULL_CYCLES_ENT: counting the number of finished 16-sample subsignals M.
- The multiplier linking both of them - resulting in: $M^2 \cdot P_{(16\text{-sample subsignal}) + \text{noise}}$
The estimated signal power datapath consists of:
- Both AVERAGE_SIGNAL_ENT: adding up all M subsignals for both the real and the imaginary part.
- The TOTAL_POWER_ENT(d=15,e=4): calculating the power of the added up real and imaginary 16-sample subsignals - resulting in: $M^2 \cdot P_{(16\text{-sample subsignal})}$
- The calculation of the SNR and the noise variance:
[Figure 6.2 schematic: SNR_EST_ENT, SNR estimation per receiving antenna. Inputs: DATA_STREAM_REAL, DATA_STREAM_IMAG, AGC_CONSTANT, SHORT_PREAM_FINISHED, NEW_SAMPLE_READY, SELECT_OPERATION_MODE, CLOCK, RESET. Outputs: SNR, SNR_READY, SIGMA_S, SIGMA_S_READY. Entities: TOTAL_POWER_ENT(d=10,e=9), NUMBER_OF_FULL_CYCLES_ENT, two AVERAGE_SIGNAL_ENT, TOTAL_POWER_ENT(d=15,e=4), two NR_DIVISION_ENT, FULL_CYCLE_FINISHED_ENT, INITIALIZE_ENT, VALID_DATA_ENT, CONT_AV_SIG_ENT, pipeline registers. Annotation on the number of output bits: 10 -> 0...30dB, 14 -> 0...40dB, 17 -> 0...50dB.]
Figure 6.2: SNR_EST_ENT - top level design entity.
- The adder: subtracts the signal power from the signal-plus-noise power, which
  yields the noise power.
- The first divider (NR_DIVISION_ENT): divides the noise power first by 16 and
  then by $M^2$ to get the noise variance per sample, $\sigma_n^2$.
- The second divider (NR_DIVISION_ENT): divides the signal power by the noise
  power, which yields the SNR.
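The arithmetic of the datapath can be illustrated with a small floating-point reference model. This is only a sketch under stated assumptions, not the VHDL implementation: $M$ is taken to be the number of complete 16-sample subsignals, and the signal-plus-noise power is assumed to be scaled by $M$ before the subtraction so that both powers are on the same scale (for a noise-free periodic input the difference is then exactly zero). The function name and its parameters are hypothetical.

```python
def estimate_noise_and_snr(samples, period=16):
    """Floating-point sketch of the estimator datapath (assumptions in comments)."""
    m = len(samples) // period              # assumed meaning of M: complete subsignals
    if m == 0:
        raise ValueError("need at least one complete subsignal")
    used = samples[: m * period]

    # Signal-plus-noise power: sum of |y|^2 over all used samples.
    total_power = sum(abs(y) ** 2 for y in used)

    # Samples that belong together are summed, not averaged.
    summed = [sum(used[k::period]) for k in range(period)]
    signal_power = sum(abs(s) ** 2 for s in summed)

    # Assumption: the total power is scaled by M to match the summed signal.
    noise_power = m * total_power - signal_power

    # Divide first by the subsignal length (16) and then by M^2.
    noise_variance = noise_power / period / m**2
    snr = signal_power / noise_power if noise_power > 0 else float("inf")
    return noise_variance, snr
```

With these assumptions the noise variance equals the biased sample variance of the repeated samples, and the SNR is a plain ratio that still has to be converted to decibels if needed.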
There is some additional logic at the output of the datapath that switches to constant
SNR and noise variance values instead of the estimated ones, and some logic that sets
the ready signals high as soon as the calculations are finished.
The two registers with gray background inserted into the datapath and the control path
do not have any functional task. They are pipeline registers that ensure that the desired
clock period is met.
The control path consists of several small units. Their general tasks are:
- FULL_CYCLE_FINISHED_ENT: counts the samples that have already arrived in the
  current 16-sample subsignal and notifies other entities when a complete subsignal
  has arrived.
- INITIALIZE_ENT: initializes all other entities at the beginning.
- VALID_DATA_ENT: has a 1 at its output as long as the arriving data is valid.
- CONT_AV_SIG_ENT: control unit for the estimated signal datapath.
6.4.2 TOTAL_POWER_ENT
The TOTAL_POWER_ENT entity calculates the power of a stream of data that arrives
separated into its real and imaginary parts. The schematic can be seen in Fig. 6.3.
The implementation is straightforward: once a new data sample arrives, the real
and the imaginary part are squared and added. The result is then added to the
previously stored power. Once a full subsignal has arrived, the total power is stored
in the second register. The first register can further be initialized with the value
zero to start a new computation.
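The behaviour described above can be mimicked by a short behavioural model. It is a sketch for illustration only: the class and method names are hypothetical, and bit widths, pipelining and clocking of the real entity are ignored.

```python
class TotalPower:
    """Behavioural sketch of TOTAL_POWER_ENT (names are illustrative)."""

    def __init__(self):
        self.accumulator = 0   # first register: running sum of |sample|^2
        self.total_power = 0   # second register: latched when a subsignal is complete

    def initialize(self):
        # Clear the first register to start a new computation.
        self.accumulator = 0

    def new_sample(self, real, imag, full_cycle_finished=False):
        # Square real and imaginary part, add them, add to the stored power.
        self.accumulator += real * real + imag * imag
        if full_cycle_finished:
            # A full subsignal has arrived: store the total power.
            self.total_power = self.accumulator
        return self.total_power
```

The output only changes when a subsignal is complete, so a partially received subsignal never influences the value that is read downstream.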
Figure 6.3: TOTAL_POWER_ENT - calculating the power of a stream of data.
[Schematic not reproduced. Inputs: INITIALIZE, DATA_STREAM_REAL, DATA_STREAM_IMAG,
FULL_CYCLE_FINISHED, NEW_SAMPLE_READY; output: TOTAL_POWER. Bit widths along the
datapath: d, 2*d, 2*d+1 and 2*d+1+e. Annotation in the figure: "worst case:
16(samples)*5(subsignals)*4(OFDM-symbols) = 320 runs, 2^9 = 512 ==> 30 bit ==> e=9".]
6.4.3 AVERAGE_SIGNAL_ENT
The AVERAGE_SIGNAL_ENT entity sums up all the samples that belong together (e.g.
every first sample of a 16-sample subsignal is stored in the register s0). The samples
are simply added up and not averaged in this entity to save the cost and latency of a
divider. The schematic can be seen in Fig. 6.4.
Every arriving sample is added to the corresponding sample from the previous
subsignals. As only one sample arrives at a time, the adder can be shared among
all 16 samples. To ensure that only finished subsignals have an influence on the final
result, the values are read out before they are overwritten by newly arriving values.
As before, some additional circuitry was added to initialize the entity.
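A behavioural sketch of this accumulation is given below. It only illustrates the idea of 16 sum registers that share one adder; the class and method names are hypothetical, and the read-out of finished subsignals before overwriting is not modelled.

```python
class AverageSignal:
    """Behavioural sketch of AVERAGE_SIGNAL_ENT (names are illustrative)."""

    def __init__(self, period=16):
        self.sums = [0] * period   # registers s0 .. s15
        self.index = 0             # position inside the current subsignal

    def initialize(self):
        self.sums = [0] * len(self.sums)
        self.index = 0

    def new_sample(self, sample):
        # One sample arrives at a time, so a single addition per call suffices.
        self.sums[self.index] += sample
        self.index = (self.index + 1) % len(self.sums)
```

Because the sums are not divided by the number of subsignals, the scaling has to be taken into account later in the datapath.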
6.4.4 FULL_CYCLE_FINISHED_ENT
The FULL_CYCLE_FINISHED_ENT entity counts the arriving samples within a
16-sample subsignal and notifies the NUMBER_OF_FULL_CYCLES_ENT entity when a
complete subsignal has arrived. The schematic can be seen in Fig. 6.5.
The implementation of the FULL_CYCLE_FINISHED_ENT consists of a counter and
some additional hardware. The delay register and the AND gate at the output ensure
that this entity confirms a finished subsignal only if all 16 samples have
arrived. The other difficulty is to reset the counter once it has reached the value 15.
This can be done either by a controlled overflow or via the multiplexer. The second
version is implemented.
6.4.5 NUMBER_OF_FULL_CYCLES_ENT
The NUMBER_OF_FULL_CYCLES_ENT entity counts the number of complete subsignals
that have arrived. The schematic can be seen in Fig. 6.5. The implementation of the
NUMBER_OF_FULL_CYCLES_ENT is straightforward: it consists of a counter with an
initialization mechanism.
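The interplay of the two counters can be sketched as follows. This is a simplified behavioural model with hypothetical names; the delay register, the AND gate and the VALID_DATA gating of the real entities are left out.

```python
class CycleCounters:
    """Sketch of FULL_CYCLE_FINISHED_ENT plus NUMBER_OF_FULL_CYCLES_ENT."""

    def __init__(self, period=16):
        self.period = period
        self.sample_counter = 0          # 0 .. period-1 inside a subsignal
        self.number_of_full_cycles = 0   # complete subsignals seen so far

    def initialize(self):
        self.sample_counter = 0
        self.number_of_full_cycles = 0

    def new_sample(self):
        """Register one arriving sample; return True if a subsignal is complete."""
        finished = self.sample_counter == self.period - 1
        if finished:
            # Explicit reset, corresponding to the multiplexer variant in the text.
            self.sample_counter = 0
            self.number_of_full_cycles += 1
        else:
            self.sample_counter += 1
        return finished
```

Resetting the counter explicitly keeps the model valid for subsignal lengths that are not a power of two, whereas a controlled overflow relies on the counter width matching the length.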
Figure 6.4: AVERAGE_SIGNAL_ENT - averaging all samples that belong together.
[Schematic not reproduced. Inputs: INITIALIZE, DATA_STREAM, NEW_SAMPLE_READY,
SAMPLE_COUNTER, SIG_SEL; output: AV_SIG_SERIAL. Sixteen registers s0 to s15 are
enabled through a 4 to 16 demultiplexer. Annotations in the figure: "rest of the enable
entries as the first one - but with different demux outputs!" and "worst case: 20 runs
==> 2^5".]
Figure 6.5: NUMBER_OF_FULL_CYCLES_ENT and FULL_CYCLE_FINISHED_ENT -
counting subsignals.
[Schematic not reproduced. FULL_CYCLE_FINISHED_ENT: inputs NEW_SAMPLE_READY,
INITIALIZE, VALID_DATA; outputs FULL_CYCLE_FINISHED and the 4 bit SAMPLE_COUNTER.
NUMBER_OF_FULL_CYCLES_ENT: inputs FULL_CYCLE_FINISHED, INITIALIZE; output
NUMBER_OF_FULL_CYCLES (5 bit). Annotation in the figure: "worst case: 20 runs ==> 2^5".]
6.4.6 INITIALIZE_ENT
The INITIALIZE_ENT entity is a simple rising edge detector. The schematic can be
seen in Fig. 6.6.
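A rising edge detector of this kind can be modelled with one stored bit and an AND operation. The following sketch uses hypothetical names and processes one input value per clock cycle.

```python
class RisingEdgeDetector:
    """Sketch of a rising edge detector: one flip-flop and one AND gate."""

    def __init__(self):
        self.previous = 0   # flip-flop holding the input of the last cycle

    def clock(self, agc_constant):
        current = 1 if agc_constant else 0
        # Output is 1 only in the cycle in which the input changes from 0 to 1.
        initialize = current & (1 - self.previous)
        self.previous = current
        return initialize
```

Feeding the sequence 0, 0, 1, 1, 1, 0, 1 produces a one-cycle pulse at each 0 to 1 transition and zero otherwise.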
Figure 6.6: INITIALIZE_ENT - initializes the rest of the circuit as soon as the AGC
freezes.
[Schematic not reproduced. Input: AGC_CONSTANT; output: INITIALIZE; the circuit
consists of one D flip-flop and an AND gate.]
6.4.7 VALID_DATA_ENT
The VALID_DATA_ENT entity is a small automaton that produces a logic one at its
output as long as valid data samples from the short preamble are arriving. This
means that the AGC has to be frozen and the short preambles are not finished yet. The
schematic can be seen in Fig. 6.7.
6.4.8 CONT_AV_SIG_ENT
The CONT_AV_SIG_ENT entity is responsible for controlling the estimated signal
datapath. As soon as a complete subsignal has arrived, the automaton produces the
control signal to read out the correct values in the correct order from the
AVERAGE_SIGNAL_ENT entity. It further initializes and controls the second
TOTAL_POWER_ENT entity. The automaton is also responsible for starting the division
in the NR_DIVISION_ENT entity. The schematic can be seen in Fig. 6.8.
6.4.9 NR_DIVISION_ENT
The NR_DIVISION_ENT is a parametrized division entity. Division by a variable is
generally a rather complex operation in hardware, but it cannot be circumvented in this
case. There exist different algorithms, each with its own advantages and disadvantages.
Figure 6.7: VALID_DATA_ENT - monitors the state of the arriving samples.
[State diagram not reproduced. Moore automaton with the states ST0, ST1 and ST2, the
inputs (AGC_CONSTANT, SHORT_PREAM_FINISHED) and the output VALID_DATA, which is 1
in one state and 0 in the other two. Transition labels in the figure: "0,x <- AGC still
changing", "1,0 <- AGC froze", "AGC froze -> 1,0, valid data arriving", "1,1 <- AGC froze
on last short pream sample", "0,0 <- strange", "0,x", "x,1" and "1,x".]
The requirements for the division algorithm to be implemented are:
- division: $A = Q \cdot B + R$, i.e. $Q = \lfloor A/B \rfloor$. The residual $R$ is not
  needed, which can be justified in the following way: the SNR value has a sufficiently
  high resolution if only the integer value is taken, except in the extremely low dB
  range that is usually not of much interest: 1 → 0dB, 2 → 3dB, 3 → 4.8dB,
  4 → 6dB, ..., 100 → 20dB, 101 → 20.04dB, ...
- both $A$ and $B$ are $p$-bit unsigned values
- latency small enough, ideally below 100 clock cycles
- small hardware costs, preferably parametrizable
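The resolution argument in the first requirement can be checked with a few lines of code. The helper below is only an illustration (its name is an assumption); it converts an integer SNR ratio to decibels.

```python
import math


def integer_snr_to_db(q):
    """Convert an integer SNR ratio (quotient without residual) to decibels."""
    if q < 1:
        raise ValueError("the ratio must be at least 1")
    return 10.0 * math.log10(q)


# Step size between neighbouring integer quotients at a low and a high ratio.
low_step = integer_snr_to_db(2) - integer_snr_to_db(1)
high_step = integer_snr_to_db(101) - integer_snr_to_db(100)
```

The step between neighbouring integers is about 3dB at the very bottom of the range but only a few hundredths of a decibel around 20dB, which is why dropping the residual costs little resolution where it matters.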
There exist four algorithms that are sometimes called slow division algorithms: the
Restoring, Non-Performing, Non-Restoring and SRT division algorithms. If the residual
is not needed, the Non-Restoring algorithm is faster than the Restoring and the
Non-Performing algorithms. The SRT algorithm uses a look-up table (LUT) and is the one
responsible for the Intel Pentium bug that was discovered in 1994 [34]. A LUT is
undesirable for this application, as the hardware costs are not negligible and a LUT
makes it harder to parametrize the algorithm.
Figure 6.8: CONT_AV_SIG_ENT - control for the estimated signal datapath.
[State diagram not reproduced. Mealy automaton with the input FULL_CYCLE_FINISHED
and the outputs SIG_SEL (4 bit), SIG_READY and SIG_INIT. States: "waiting" and
"processed sample 0" to "processed sample 14". Transition labels in the form
FULL_CYCLE_FINISHED / SIG_SEL, SIG_READY, SIG_INIT: "0/x,0,1" while waiting,
"1/0,1,0" when a complete subsignal has arrived, followed by "x/1,1,0" up to "x/15,1,0".]
There exist two algorithms that are sometimes called fast division algorithms: The
Newton-Raphson and the Goldschmidt algorithms. Both of them need a LUT [35, 36].
The latter is used in AMD processors [37].
Out of these algorithms, the Non-Restoring digital division algorithm looks most
promising. It is therefore further investigated. The algorithm is rather simple:
1. $A$, $B$ are unsigned $p$-bit values
2. set: $r_0 = A$
3. start with $i = 1$ and repeat until $i = p + 1$
4. $r_i = \begin{cases} r_{i-1} - B \cdot 2^{p+1-i}, & \text{if } r_{i-1} \geq 0 \\ r_{i-1} + B \cdot 2^{p+1-i}, & \text{if } r_{i-1} < 0 \end{cases}$
5. $q_{p+1-i} = 1$ if $r_i \geq 0$, and $q_{p+1-i} = 0$ otherwise
6. when finished, $Q = [q_p, \ldots, q_1, q_0]$ is the desired result
A numerical example for clarification can be found in Fig. 6.9.
$A = 105$, $B = 5$, $A / B = ?$
$r_0 = 105$
$r_1 = 105 - 5 \cdot 2^7 = -535 \Rightarrow q_7 = 0$
$r_2 = -535 + 5 \cdot 2^6 = -215 \Rightarrow q_6 = 0$
$r_3 = -215 + 5 \cdot 2^5 = -55 \Rightarrow q_5 = 0$
$r_4 = -55 + 5 \cdot 2^4 = 25 \Rightarrow q_4 = 1$
$r_5 = 25 - 5 \cdot 2^3 = -15 \Rightarrow q_3 = 0$
$r_6 = -15 + 5 \cdot 2^2 = 5 \Rightarrow q_2 = 1$
$r_7 = 5 - 5 \cdot 2^1 = -5 \Rightarrow q_1 = 0$
$r_8 = -5 + 5 \cdot 2^0 = 0 \Rightarrow q_0 = 1$
$Q = 2^4 + 2^2 + 2^0 = 21$ as expected
Figure 6.9: A numerical example for the digital Non-Restoring division algorithm.
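The Non-Restoring division can also be written as a short software model, which is convenient for generating test vectors. The sketch below uses arbitrary-precision integers instead of a fixed-width adder; the function name and the argument check are assumptions made for illustration.

```python
def nr_divide(a, b, p):
    """Non-restoring division of two unsigned p-bit values, residual discarded."""
    if not (0 <= a < (1 << p) and 0 < b < (1 << p)):
        raise ValueError("a and b must be p-bit values and b must be non-zero")
    r = a                                # r_0 = A
    q = 0
    for i in range(1, p + 2):            # i = 1 .. p+1
        shift = p + 1 - i
        if r >= 0:
            r -= b << shift              # subtract B * 2^(p+1-i)
        else:
            r += b << shift              # add B * 2^(p+1-i)
        if r >= 0:
            q |= 1 << shift              # quotient bit is 1 if the new r is not negative
    return q
```

For example, `nr_divide(105, 5, 7)` reproduces the quotient 21 of the numerical example. The multiplication by a power of two is a plain left shift, and the sign test corresponds to inspecting the most significant bit in a two's complement hardware implementation.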
The NR-Division algorithm needs a $(2p + 1)$-bit adder, three $p$-bit registers for the
inputs and the output, and a multiplication by a power of two. The latter can be done by
simply shifting the desired value left (logical shift left, LSL). Checking whether a value
is smaller than zero is easy, as this information is stored in the most significant bit
(MSB) if the two's complement number representation is used. Additionally, some control
logic is needed.
This NR-Division seems to fulfill the low hardware requirements. The latency of this
algorithm is $p$ clock cycles and therefore no problem for this application, as $p = 35$ is
much smaller than the allowed number of latency cycles. The only critical question left
is whether the NR-Algorithm with its $(2p + 1)$-bit adder is fast enough to meet the clock
cycle requirements. A parametrized 35 bit version with some adaptations to meet the
clock cycle requirement is presented in Fig. 6.10. The parts with the gray background
fulfill no functional task; they are simply present to shorten the longest path and
therefore allow for a higher clock frequency. It is not necessary to initialize this circuit,
as the algorithm changes all values automatically. Another possibility would be to make
the divider smaller and cut away the last few bits of the input values (i.e. dividing both
$A$ and $B$ by a power of two). This could lead to a loss in precision.
6.4.10 Mapping Onto FPGA
The different entities were programmed in VHDL and then mapped onto the FPGA. An
overview of the hardware costs can be seen in Table 6.4. This overview shows that the
implementation uses only a small part of the available FPGA resources. The number
of flip-flops and 4LUTs can be further reduced by omitting one of the dividers at the
output. One could either share one divider for both outputs or, if one of the outputs is
not needed, simply omit that divider. The results also show that most of the slices use
either logic or storage, but rarely both. It has to be further noted that the gate count is
rather a marketing number and is not suitable for direct comparison with standard ASIC
designs.
6.4.11 Testing
The system was tested using several sets of test vectors. It seems to perform as expected.
For a better understanding of the signal waveforms, a sample run can be seen in Fig. 6.11.
Figure 6.10: NR_DIVISION_ENT - the division entity.
[Schematic not reproduced. Inputs: DATA_A, DATA_B, BOTH_VALUES_READY; outputs:
RESULT, RESULT_READY. Internal signals: "B", "R", "SHIFT_B", "SUM", "MSB", "Q",
"DELAY", "DELAY_OUT", "Q_ENABLE", "CNT", "INTERNAL_SHIFT_TEMP". Notes in the figure:
non restoring digital division algorithm, A, B > 0, Q = rounddown(A/B); the adder is
2p+1 bit wide ("theoretically 2p+2, but no overflow possible due to restrictions of
inputs"); one register is marked as pipeline register.
Numerical example in the figure: (A=100 / B=3) = 33; precision: 7 bit
r0 = A = 100 > 0
r1 = r0 - 2^7*B = -284 < 0 ==> q7 = 0 (128)
r2 = r1 + 2^6*B = -92 < 0 ==> q6 = 0 (64)
r3 = r2 + 2^5*B = 4 >= 0 ==> q5 = 1 (32)
r4 = r3 - 2^4*B = -44 < 0 ==> q4 = 0 (16)
r5 = r4 + 2^3*B = -20 < 0 ==> q3 = 0 (8)
r6 = r5 + 2^2*B = -8 < 0 ==> q2 = 0 (4)
r7 = r6 + 2^1*B = -2 < 0 ==> q1 = 0 (2)
r8 = r7 + 2^0*B = 1 >= 0 ==> q0 = 1 (1)
==> Q = 010 0001 = 33 as expected]
Figure 6.11: Overview of all signals for the final estimator entity.
Resource                                              used     available   utilization
flip-flops                                            1,236    49,152      2.5%
4-input look-up tables (4LUT)                         2,409    49,152      4.9%
slices (= two 4LUT and two FF plus connections
  to adjacent slices)                                 1,644    24,576      6.7%
digital signal processing blocks (DSP48,
  used for multipliers)                               6        512         1.2%
total equivalent gate count for design                29,436
Table 6.4: Overview of the hardware costs for the implementation of the SNR
estimator.
7 Measurements
7.1 Measurements With Offline Testbed
In a first part, several measurements were made with the offline testbed. An image of
the testbed can be seen in Fig. 7.1.
Figure 7.1: A picture of the MIMO-OFDM testbed with 4 antennas.
The measurement setup was the following: one of the testbeds transmits a packet of
data. This packet is then sent over the channel simulator (simulating a TGn C channel)
and received by the second testbed. The received data points are read out by a software
environment and the SNR is calculated in that software. In a next step, the data is
decoded separately for the constant SNR estimators and the proposed estimator. The
BER is calculated for each case. This step is repeated for several output power settings
of the channel simulator. Changing the output power is equivalent to changing
the channel SNR. The whole procedure was repeated for 200 different TGn C channels
(1000 bit data each). At the end, the BER values were averaged. A plot with the results
can be seen in Fig. 7.2.
As can be seen, the proposed algorithm performs better than the constant 60dB SNR
estimator over the whole range. In the low SNR range, the proposed algorithm is
superior to the constant 30dB estimator. In the high SNR range, the constant 30dB
estimator seems to perform slightly better than the proposed algorithm.
The main reason for this behavior can be found in Fig. 7.3: the estimated SNR curve
flattens in the high SNR region and the estimate is therefore too low. The reasons for
this loss in performance are investigated in the following sections.
7.1.1 DC Carrier Removal
Inspection of the received signal showed a slight offset that seemed to vary slowly
over time. This offset was removed with a high-order digital high-pass filter. It was
necessary to periodically extend the received signal in order to avoid border effects.
Measurements showed that the use of such a high-pass filter had no visible influence on
the performance of the SNR estimator.
7.1.2 Four SNR Values Estimated but Only One Required
Each receiving antenna calculates one SNR. But which of those SNRs should be
forwarded to the decoding components? The most obvious idea would be to take the
average. This does not seem to be the best choice. A few measurements indicate that
taking the highest of all four SNRs leads to a better performance. This option was
used in the measurement from Fig. 7.2. One explanation could be that it is safer to
overestimate the SNR than to underestimate it. Taking the largest of the four values
might also suppress some hardware nonidealities. But there are also other
possibilities, e.g. taking the average of the two highest values. This topic needs to be
further investigated and possibly also depends on the employed hardware platform.
Figure 7.2: Measurement of the BER with the offline testbed.
[Plot not reproduced. x-axis: output power of the channel simulator [dB]; y-axis: BER
(logarithmic); legend entries: Estimator, 0, 5, 10, 15, 20, 25, 30, 35, 60.]
Figure 7.3: Estimated SNR values with offline testbed compared to expected SNR
values. The expected values were approximated by taking the best performing curves
from Fig. 7.2 for each output setting.
[Plot not reproduced. x-axis: output power of the channel simulator [dB]; y-axis:
estimated SNR [dB]; curves: proposed algorithm (HW), optimal guess (rough estimation).]
7.1.3 Scaling All Streams to Equal Noise
It is possible to scale each receiving stream such that all of the streams have the same
noise power. There are implementation issues with this idea, as a signal overflow
due to upscaling and a loss in precision due to downscaling have to be avoided. First
measurements suggest that there is no increase in performance due to the scaling.
Further measurements would be necessary to reliably determine the effects of scaling
the streams to equal noise power.
7.1.4 Transmit Noise
Another issue is transmit noise. A model describing transmit noise can be found in
Fig. 7.4.
Figure 7.4: A transmit noise model.
[Block diagram not reproduced. Signal chain: transmit data s, addition of transmit
noise n1, channel H, addition of channel noise n2, AGC, addition of receive noise n3,
received data y.]
The received signal can be described in the following way:
$y = \alpha \left( H (s + n_1) + n_2 \right) + n_3$
where $\alpha$ denotes the gain of the AGC.
It has to be noted that even if the transmit noise $n_1$ is assumed to be AWGN, the
corresponding noise at the receiver is not white anymore.
The influence of transmit noise was investigated by means of a simulation (footnote 1).
The following assumptions were made: $n_1$ was set such that a desired transmit SNR is
reached, $\alpha$ was set to one and $n_3$ was set to zero. $n_2$ was set such that the
desired channel SNR was achieved. The estimated SNR for several given transmit SNRs can
be seen in Fig. 7.5. The flattening in the high SNR region that was observed in the
measurement can also be seen in the simulation. Comparing the measured and the simulated
SNR curves, it seems that the testbed has a transmit SNR of approximately 27dB, which is
close to the value that was known beforehand. Transmit noise therefore seems
to be sufficient to explain the flattening of the SNR curve in the high SNR region.
Footnote 1: SNR range: 0-30 [dB] (step: 1 [dB]); number of sweeps: 20000 (seed=0..9);
channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of
tones: 64; channel estimator: FDMLE/ideal; demapper: MMSE; modulation: QPSK.
It is obvious that one cannot achieve a total SNR that is higher than the transmit SNR.
As the simulated curves are still visibly rising at the point where the channel SNR
reaches the transmit SNR, one could compensate the flattening by the use of a LUT.
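The ceiling imposed by the transmit noise can be made concrete with a deliberately simplified scalar model. The sketch below is not the MIMO-OFDM simulation used in this thesis: it assumes a single flat channel in which the transmit noise experiences the same gain as the signal, independent noise sources and no receive noise, so that the inverse SNRs simply add. The function name is hypothetical.

```python
import math


def total_snr_db(transmit_snr_db, channel_snr_db):
    """Combined SNR of two independent noise sources (scalar simplification)."""
    tx = 10 ** (transmit_snr_db / 10)    # linear transmit SNR
    ch = 10 ** (channel_snr_db / 10)     # linear channel SNR
    # Noise powers add, hence the inverse SNRs add.
    return 10 * math.log10(1 / (1 / tx + 1 / ch))
```

In this model the total SNR is about 3dB below the transmit SNR when the channel SNR equals the transmit SNR, and it approaches the transmit SNR only for much higher channel SNRs. The curve is therefore still rising at that point, which is the property a LUT-based compensation would rely on.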
It remains to investigate whether this flattening is responsible for the increased BER. A
simulation (footnote 2) with 30dB transmit SNR can be seen in Fig. 7.6. It is clearly
visible that the proposed estimator is as good as optimal in the low SNR region. In the
high SNR region, the constant estimators that estimate a higher SNR than the proposed
estimator perform slightly better. The flattening due to the transmit noise therefore
seems to be sufficient to explain the loss in performance of the proposed estimator in the
high SNR region.
7.2 Measurements With Online Testbed
The estimator block was inserted into the testbed once for each receiving antenna.
Measurements show that the SNR value from the hardware estimator is significantly
lower than the value obtained from the offline testbed. There are basically two issues:
the beginning and the end of the valid data are not easily detected, and the frequency
offset between the clocks seems to be a major problem. The first problem is a timing
problem that can be solved by inserting the appropriate delays. The second problem
can be reproduced with the offline testbed by switching off the frequency offset
compensation. It therefore seems that the frequency offset is sufficient to explain the
loss in performance of the online testbed estimator.
Footnote 2: SNR range: 0-30 [dB] (step: 1 [dB]); number of sweeps: 20000 (seed=0..9);
channel model: TGn C; transmitting antennas: 4; receiving antennas: 4; number of
tones: 64; channel estimator: FDMLE/ideal; demapper: MMSE; modulation: QPSK.
Figure 7.5: Estimated SNR for several transmit SNR values.
[Plot not reproduced. x-axis: SNR (channel) [dB], 0 to 50; y-axis: estimated SNR [dB],
0 to 40; curves: proposed algorithm with 20dB, 30dB and 50dB transmit SNR, and the
reference line channel SNR = estimated SNR.]
Figure 7.6: Simulation showing the BER for several estimators with 30dB transmit SNR.
[Plot not reproduced. x-axis: SNR (channel) [dB], 0 to 30; y-axis: BER (logarithmic);
curves: proposed algorithm, const=10dB, const=15dB, const=20dB, const=25dB,
const=30dB.]
The problem with the frequency offset can be solved by placing the estimator in a
different position on the testbed. Instead of directly using the downsampled data streams,
the estimator could be placed further back, after the synchronization block where the
frequency offset is compensated. A block diagram of the online testbed can be seen in
Fig. 7.7.
Figure 7.7: Left image: Block diagram of the online testbed without frequency offset
compensation. Right image: Alternative block diagram of the online testbed that solves
the frequency offset problem.
[Block diagrams not reproduced. Both panels show the same components: BAT board with
FPGA 1 (Virtex 2 Pro: PowerPC subsystem, ethernet subsystem, ethernet plug, channel
coding and decoding, buffers), FPGA 2 (Virtex 2 Pro: modulation, FFT/IFFT,
demodulation, MIMO processing, synchronization, buffers) and FPGA 3 (Virtex 4: up- &
down-sampling), the VAMP board with four DAC/AGC pairs, and the WING boards (RF).
The panels differ in the position of the noise estimator block.]
8 Summary, Conclusion and Outlook
8.1 Summary
The first chapter of this thesis presents the official task description for this semester
thesis. The aim is to implement a noise variance estimator (or equivalently an SNR
estimator) for a MIMO-OFDM testbed. In the following, the term SNR estimator is used
instead of noise variance estimator, as the SNR is a value that is easier to understand.
The two can be converted into each other by the following equation:
$\mathrm{SNR} = \frac{\text{signal power}}{\text{noise variance}}$
In the second chapter of this thesis, the basics of MIMO-OFDM communication are
presented. It is further justified why it is worth increasing the hardware costs in order
to use MIMO-OFDM instead of a simple SISO system. The principal arguments are
a higher throughput and more robustness against noise. Several channel models are
presented, including the TGn channels.
In the third chapter, an overview of the current state of research on noise
estimation is presented. There exists a large number of different algorithms. Most
of the presented algorithms are not directly applicable to MIMO-OFDM. Some of the
remaining ones have high hardware costs or exhibit poor performance. There are a few
algorithms that look promising for estimating the SNR in a MIMO-OFDM system.
The fourth chapter introduces the simulation environment. It is justified why
implementing a good SNR estimator is worth the effort. It is shown that a good SNR
estimator can lower the BER in a MIMO-OFDM system compared to the constant SNR
estimator that was used beforehand. It is also elaborated why the best performance is
attained by estimating the SNR two decibels lower than the actual channel SNR.
At the beginning of the fifth chapter, two of the most promising SNR estimation
algorithms are implemented and tested. It is found that algorithms working in the
frequency domain are highly dependent on the quality of the channel estimator. This
is not desired for two reasons: firstly, it is not desirable to be dependent on another
component. And even more importantly, a high quality channel estimator significantly
decreases the throughput of the system. Based on those considerations, the author proposes
a novel algorithm working in the time domain. The mathematical description of the
algorithm is elaborated. In a next step, the algorithm is simulated on a TGn C channel
using several scenarios. The algorithm seems to perform quite well in situations that
might be found in a real system.
In the sixth chapter, the proposed algorithm is implemented on an FPGA. The different
blocks of the final version are described in detail. One of the most critical blocks was
the divider, as cell libraries usually do not provide dividers. The non-restoring digital
division algorithm was found to be the best suited solution. At the end, the algorithm
was successfully mapped onto an FPGA.
The seventh chapter presents measurements with the offline testbed. The results show
that the presented SNR estimator is superior to the previously used constant SNR. There
is one issue with the flattening of the estimated SNR curve in the high SNR region.
The reason for this behavior is the transmit noise present in the transmitter hardware.
An idea for a solution using a LUT is presented. In a next step, measurements with
the online testbed were conducted. The estimated SNR is significantly lower than
expected. The main issue here is the frequency offset between the transmitter and the
receiver. This problem should be solvable by putting the SNR estimator block behind
the synchronization block already present on the testbed.
8.2 Conclusion and Outlook
In the offline testbed, the proposed algorithm performs visibly better than the previous
constant 30dB SNR estimator. If one assumes that the region of interest starts
somewhere above 0dB SNR and ends at around 27dB SNR (limited by hardware noise), the
algorithm has an acceptable performance over the whole range. As already mentioned
before, one could further try to compensate the flattening in the high SNR region.
In this work, it was not investigated how the algorithm performs if the SNR is below
0dB. It seems that this region is not of interest in the given case. If SNRs between zero
and ten decibels were expected regularly, it would probably be worth supplementing
the hardware implementation with some extra precision. It could also be interesting
to further decrease the hardware costs by sharing one divider for both outputs (or
leaving one divider out) or by removing some precision in the high SNR region.
The hardware implementation still needs to be fully incorporated into the online testbed,
and further measurements have to be conducted in order to evaluate the exact benefit
of implementing the SNR estimator.
Further, the algorithm could still be enhanced. It has not yet been investigated thoroughly
which combination of the four estimated SNRs should be taken. It also remains to
investigate what effects the scaling of all streams to equal noise power has. It
could further be interesting to investigate whether it would be worth subtracting the two
decibels of noise that are added by the channel estimator. There are also some ideas
available about how to further increase the performance of the algorithm itself. Scaling
all samples to the same range, removing the offset or filtering them are only a few
ideas. One has to take care that one of the strong points of the algorithm does not get
destroyed: its simplicity, which shows in the small hardware costs as well as in the use
of the already present short preambles.
There also exist further possibilities for simulations: the effects of having different
noise and signal powers on each of the receiving streams or the effects of a slowly
fading channel could be further investigated.
Bibliography
[1] Helmut Blcskei, MIMO-OFDM Wireless Systems: Basics, Perspectives and Chal-
lenges, IEEE Wireless Communications, Volume 13, Issue 4, August 2006.
[2] LANCOM Systems GmbH, LANCOM Techpaper: 802.11n im Uberblick,
www.lancom.de, 2008.
[3] Luis Litwin and Michael Pugel, The Principles of OFDM, RF signal processing,
January 2001.
[4] S. B. Weinstein, Data Transmission by Frequency-Division Multiplexing Using
the Discrete Fourier Transform, IEEE Transactions on Communication Technology,
Volume 19, Issue 5, October 1971.
[5] Cisco Systems GmbH, Überblick über die Wireless-Technologie 802.11n,
www.cisco.com, 2007.
[6] Vinko Erceg, Laurent Schumacher, Persefoni Kyritsi, et al., TGn Channel Models,
doc.: IEEE 802.11-03/940r4, May 2004.
[7] Carlos H. Aldana, Atul A. Salvekar, Jose Tellado and John Cioffi, Accurate Noise
Estimates in Multicarrier Systems, IEEE Vehicular Technology Conference, 2000.
[8] Doukas Athanasios and Grigorios Kalivas, SNR Estimation Algorithms in AWGN
for HiperLAN/2 Transceiver, Applied Electronics Laboratory, Department of Electrical
and Computer Engineering, University of Patras, 2005.
[9] Doukas Athanasios and Grigorios Kalivas, SNR Estimation for Low Bit Rate OFDM
Systems in AWGN Channel, Proceedings of the International Conference on Networking,
International Conference on Systems and International Conference on Mobile
Communications and Learning Technologies, 2006.
[10] Norman C. Beaulieu, Comparison of Four SNR Estimators for QPSK Modulations,
IEEE Communications Letters, Volume 4, Issue 2, February 2000.
[11] Dae-Ki Hong, Cheol-Hee Park, Min-Chul Ju, Kyu-Jung Youn, Sun-Do Jun and Jin-
Woong Cho, SNR Estimation in Frequency Domain Using Circular Correlation,
IEEE Electronics Letters, Volume 38, Issue 25, December 2002.
[12] Sandrine Boumard, Novel Noise Variance and SNR Estimation Algorithm for Wireless
MIMO OFDM Systems, Global Telecommunications Conference, GLOBECOM '03,
Volume 3, 2003.
[13] David R. Pauluzzi and Norman C. Beaulieu, A Comparison of SNR Estimation
Techniques for the AWGN Channel, IEEE Transactions on Communications, Volume
48, Issue 10, October 2000.
[14] Bin Li, Robert DiFazio and Ariela Zeira, A Low Bias Algorithm to Estimate
Negative SNRs in an AWGN Channel, IEEE Communications Letters, Volume 6,
Issue 11, November 2002.
[15] GuangLiang Ren, YiLin Chang and Hui Zhang, A New SNRs Estimator for QPSK
Modulations in an AWGN Channel, IEEE Transactions on Circuits and Systems II:
Express Briefs, Volume 52, Issue 6, June 2005.
[16] GuangLiang Ren, YiLin Chang and HuiNing Zhang, SNR Estimation Algorithm
Based on the Preamble for Wireless OFDM Systems, Science in China Series F:
Information Sciences, Volume 51, Issue 7, July 2008.
[17] Timothy M. Schmidl and Donald C. Cox, Robust Frequency and Timing Synchronization
for OFDM, IEEE Transactions on Communications, Volume 45, Issue 12,
December 1997.
[18] Dong-Joon Shin, Wonjin Sung and In-Kyung Kim, Simple SNR Estimation Methods
for QPSK Modulated Short Bursts, Global Telecommunications Conference,
GLOBECOM '01, 2001.
[19] Xiaodong Xu, Ya Jing, Xiaohu Yu, Subspace-Based Noise Variance and SNR
Estimation for OFDM Systems, IEEE Wireless Communications and Networking
Conference, 2005.
[20] Huilin Xu, Guo Wei and Jinkang Zhu, A Novel SNR Estimation Algorithm for
OFDM, IEEE Vehicular Technology Conference, 2005.
[21] Tevfik Yücek and Hüseyin Arslan, MMSE Noise Power and SNR Estimation for
OFDM Systems, IEEE Transactions on Vehicular Technology, Volume 56, Issue 6,
2006.
[22] Tevfik Yücek and Hüseyin Arslan, Noise Plus Interference Power Estimation in
Adaptive OFDM Systems, IEEE Transactions on Vehicular Technology, Volume 56, Issue 6, 2005.
[23] Nader S. Alagha, Cramer-Rao Bounds of SNR Estimates for BPSK and QPSK
Modulated Signals, IEEE Communications Letters, Volume 5, Issue 1, January
2001.
[24] Thomas R. Benedict, The Joint Estimation of Signal and Noise From the Sum
Envelope, IEEE Transactions on Information Theory, Volume 13, Issue 3, July 1967.
[25] Shousheng He and Mats Torkelson, Effective SNR Estimation in OFDM System
Simulation, Global Telecommunications Conference, GLOBECOM '98, Volume 2,
1998.
[26] M. C. Jeruchim and R. J. Wolfe, Estimation of the Signal-to-Noise Ratio (SNR) in
Communication Simulation, Global Telecommunications Conference, GLOBECOM
'89, 1989.
[27] R. B. Kerr, On Signal and Noise Level Estimation in a Coherent PCM Channel,
IEEE Transactions on Aerospace and Electronic Systems, Volume 2, Issue 4, March
1966.
[28] Mustafa Türkboylari and Gordon L. Stüber, An Efficient Algorithm for Estimating
the Signal-to-Interference Ratio in TDMA Cellular Systems, IEEE Transactions on
Communications, Volume 46, Issue 6, June 1998.
[29] Ami Wiesel, Jason Goldberg and Hagit Messer, Data-Aided Signal-to-Noise-Ratio
Estimation in Time Selective Fading Channels, IEEE International Conference on
Acoustics, Speech and Signal Processing, 2002.
[30] Ami Wiesel, Jason Goldberg and Hagit Messer, Non-Data-Aided Signal-to-Noise-Ratio
Estimation, IEEE International Conference on Communications, 2002.
[31] Ami Wiesel, Jason Goldberg and Hagit Messer-Yaron, SNR Estimation in Time-
Varying Fading Channels, IEEE Transactions on Communications, Volume 54, Issue
5, May 2006.
[32] Xilinx Inc., Virtex-4 Family Overview, Product Specification, 2007.
[33] Volker Jungnickel and Eduard Jorswiek et al., White Paper: MIMO-OFDM in the
TDD Mode, Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut,
2005.
[34] A. Strey, Computer Arithmetik, SS 2005, Lecture Slides - Universität Ulm, April
1998.
[35] Peter Markstein, Software Division and Square Root Using Goldschmidt's Algorithm,
Real Numbers and Computers 6, 146-157, November 2004.
[36] Reto Zimmermann, Computer Arithmetic: Principles, Architectures and VLSI
Design, Lecture Notes; Integrated Systems Laboratory - Swiss Federal Institute of
Technology (ETH), 1999.
[37] Taek-Jun Kwon and Jeff Draper, Floating-Point Division and Square Root Implementation
Using a Taylor-Series Expansion Algorithm With Reduced Look-Up Tables,
Symposium on Circuits and Systems, 2008.