
DEGREE PROJECT IN ELECTRICAL ENGINEERING,

SECOND CYCLE, 30 CREDITS


STOCKHOLM, SWEDEN 2017

Analysis of Wi-Fi performance data for a Wi-Fi throughput prediction approach

DAN PAN

KTH ROYAL INSTITUTE OF TECHNOLOGY


SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY

Master of Science Thesis


Major in Communication System, KTH.
June 2017

KTH Examiner: Professor Anders Västberg
KTH Supervisor: Professor Ki Won Sung
Telenor Supervisor: Rius i Riu Jaume
KTH School of Information and Communications Technology (ICT)
Communication System
TRITA-ICT-EX-2010:X
© Dan Pan, June 2017
Printed by: Universitetsservice AB
Abstract

Owing to the low cost and portability of Wi-Fi technology, wireless networks have been widely
deployed in residential environments. Evaluating the performance of customers' home wireless
networks gives operators a reference for improving network capacity to meet the growing demand
for Wi-Fi services. However, the dynamic nature of Wi-Fi networks makes performance analysis
difficult. In this thesis, a Wi-Fi parameter visualization tool is implemented to show users'
Wi-Fi performance graphically. The tool can help operators investigate customers' Wi-Fi
environments to see whether performance degradation exists. In addition, a machine learning
method is used to predict Wi-Fi throughput. An SVM-based classification model is proposed as the
prediction function. It takes Wi-Fi parameters of both the target AP and nearby interfering APs
as input and outputs a throughput category: good, medium, poor or very poor. The proposed model
is evaluated with different SVM kernel functions, and the results show that classification
accuracy can reach 0.88. This demonstrates that Wi-Fi throughput can be classified using a simple
measurement method and a limited set of Wi-Fi physical parameters.

Summary (Sammanfattning)

Owing to the low cost and high portability of Wi-Fi technology, wireless networks have become very
common in residential environments. The heavy use of Wi-Fi services means that operators want to
improve their network services by knowing how their customers' home wireless networks perform.
However, the dynamic characteristics of Wi-Fi networks make it difficult to analyze Wi-Fi data.

In this thesis, a Wi-Fi parameter visualization tool is implemented to show users' Wi-Fi
performance graphically. This tool can help operators investigate customers' Wi-Fi environments
to see whether performance degrades.

In addition, an SVM-based classification model is proposed to predict Wi-Fi throughput. This
classification model works as a prediction function that takes Wi-Fi parameters of both the own
access point and the interference from nearby access points as input, and its output categorizes
the data rate as good, medium, poor or very poor. Different SVM kernel functions are used to
evaluate the proposed model, and the results show that the classification accuracy can reach
0.88. This shows that the Wi-Fi data rate can be classified with a simple measurement tool and
knowledge of a limited number of Wi-Fi parameters.

Acknowledgements

I would like to thank my supervisor at Telenor, Rius i Riu Jaume, for the opportunity to conduct
this valuable master thesis project. He helped me through all aspects of the project: guiding me
in the right direction, arranging meetings with other Telenor colleagues who could help with my
project, and giving feedback on my reports and presentations.

Furthermore, I would like to thank Professor Ki Won Sung, my KTH academic supervisor, and
Professor Anders Västberg, my KTH academic examiner, for organizing the monthly seminars
throughout the project and providing useful feedback from an academic point of view.

I would also like to thank my other Telenor colleagues, Tingsborg Fredrik, Wistedt Anna-Clara and
Roos Christer, for providing the data collection tool and experimental equipment and for helping
with technical problems.

Contents

1 Introduction
1.1 Overview
1.2 Related Work
1.3 Problem Statement
1.4 Research objectives
1.5 Methodology
1.6 Benefits and Social Impact
1.7 Thesis Structure

2 Background
2.1 Wi-Fi Network
2.1.1 IEEE 802.11 Architecture
2.1.2 Overview of IEEE 802.11 standards
2.1.3 Wi-Fi network performance parameter
2.2 Wi-Fi data measurement and analysis tool
2.2.1 Data measurement method
2.2.2 Data measurement tool
2.2.3 Data analysis tool
2.3 Machine Learning Overview

3 Visualization of Wi-Fi parameters data analysis and evolution over time
3.1 Parameter affecting Wi-Fi performance
3.2 Analysis and visualizing Wi-Fi parameters evolution over time
3.2.1 End-user results
3.2.2 Accesspoint results
3.3 Conclusion

4 Proposed Estimation Model for Wi-Fi Performance
4.1 Estimation model function
4.2 Machine learning based Modeling
4.2.1 Support Vector Machine (SVM) modeling
4.2.2 Feature Normalization
4.2.3 Model parameters selection
4.2.4 Model performance evaluation metric
4.3 Conclusion

5 Experiments and Results
5.1 Experiment 1: No control with Neighbor traffic
5.1.1 Experimental Testbed
5.1.2 Data Collection
5.1.3 Experiment Results
5.2 Experiment 2: Control with neighbor traffic
5.2.1 Experimental Testbed
5.2.2 Data Collection
5.2.3 Experiment Results

6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work

Bibliography
List of Tables

2.1 IEEE 802.11 PHY Standards [14]

5.1 Traffic sender and receiver specifications
5.2 AP Configurations
5.3 iPhone 6 throughput classification
5.4 Accuracy with different kernel functions
5.5 Gaussian Accuracy Score
5.6 Traffic sender and receiver specifications
5.7 APs Configurations
5.8 SVC kernel Function
5.9 Accuracy Score for Testing set
List of Figures

1.1 Problem Statement
1.2 Methodology
1.3 Performance predictive Model

2.1 802.11 basic service set (BSS) infrastructure
2.2 2.4 GHz band with 20 MHz channel band [15]
2.3 2.4 GHz band with 40 MHz channel band [15]
2.4 5 GHz band [15]
2.5 Classification example
2.6 Regression example

3.1 Specified STA parameter
3.2 all STAs rssi
3.3 all STAs transmit physical data rate
3.4 all STAs receive physical data rate
3.5 2.4 GHz channel parameter
3.6 2.4 GHz channel noise
3.7 2.4 GHz channel neighbor

4.1 Proposed Wi-Fi performance prediction model
4.2 SVC hyperplane concept

5.1 Experimental Deployment
5.2 Testbed Place
5.3 SVC Feature
5.4 Screenshot for dataset
5.5 Performance for different measurement points
5.6 Accuracy with different input features
5.7 Experimental Deployment
5.8 SVM Feature
5.9 Performance for different dataset
5.10 Performance for different features
Chapter 1

Introduction

1.1 Overview
In today's digital environment, Wi-Fi, the wireless networking technology based on the
IEEE 802.11 specifications, plays a significant role in providing access to the Internet. It lets
anyone within coverage connect to the network without wires. Moreover, according to a Cisco
report [1], traffic from Wi-Fi and mobile devices will account for two-thirds (66 percent) of
total IP traffic by 2020. In other words, customers of Wi-Fi operators are online more than ever.
An important goal for operators is therefore to improve customers' broadband experience by
delivering a reliable wireless service.

Because wireless users have such high expectations, broadband operators focus on monitoring the
performance of wireless networks and on improving it through a deep understanding of customers'
wireless environments.
For these reasons, Telenor Sverige AB supported three master thesis projects, which aim to:

• Study the Wi-Fi data collected from access points, visualize the data and propose a Wi-Fi
performance prediction model to identify the factors that may affect Wi-Fi performance. This
activity is carried out by the author.

• Propose an appropriate performance optimization approach based on the results of the Wi-Fi
data analysis. This activity is carried out by Diego Alonso Landa Torrejon [2].
• Develop a GUI tool to present customers' relevant data. This tool is intended to show
performance metrics at the appropriate aggregation and complexity level, as requested by the end
user. This activity is carried out by Yuqing Gu [3].

This thesis makes two contributions. First, a Wi-Fi data visualization tool was developed to
show how physical layer metrics vary over a given time interval. Second, the thesis analyzes the
relation between Wi-Fi performance parameters and a Wi-Fi performance indicator, the saturated
throughput. It describes how a limited set of Wi-Fi parameters can be used to accurately estimate
Wi-Fi throughput in a controlled radio communication environment using Support Vector Machine
(SVM) learning techniques.

1.2 Related Work


A number of previous works have focused on understanding the wireless network environment. The
authors of [4] proposed a frequency analysis method based on the sensor facility of an intelligent
Wi-Fi access point, a solution for continuous evaluation of Wi-Fi QoS in enterprise and academic
environments. In [5], access point loads in a university building were analyzed; the paper also
discussed clients' periodic behavior and peak-hour efficiency. The residential wireless
environment has also attracted many researchers because of the increasingly widespread deployment
of home wireless networks. The work in [6] studied the average download and upload rates of home
access networks. A non-Wi-Fi interference detection system using custom hardware was developed
in [7]. An active home-WLAN analysis tool that detects wireless problems, such as low SNR and
hidden terminals, was proposed in [8]. The WiSe project in [9] analyzed a diverse set of home
environments by equipping OpenWrt-based access points with specialized measurement and monitoring
software; this deployment could observe all traffic to and from connected devices and thus
obtain a complete picture of the home wireless environment.
All of this previous work provides valuable resources for understanding wireless networks, but
some studies require extra hardware installed in the Wi-Fi access point. A common and easy method
is therefore needed for service operators to monitor and analyze Wi-Fi network performance.

1.3 Problem Statement


Customers' home Wi-Fi environments are complex and differ from one another: for example, a
single house or an apartment with multiple floors, with different services provided to each end
user. A detailed understanding of each individual environment is needed to deliver proper
performance to each user, including increased Wi-Fi coverage and speed and reduced interference.
Two research questions are addressed in the author's part of this master thesis project:

1. How can the evolution of Wi-Fi performance parameters over time be analyzed?

2. How can Wi-Fi throughput be predicted?

Figure 1.1: Problem Statement



1.4 Research objectives


To help broadband operators gain insight into residential wireless networks in an easy way, this
thesis has two objectives. The first is to develop a time-dependent Wi-Fi parameter analysis and
visualization tool that presents wireless network performance parameters graphically. With this
visibility, operators can gain insight into performance degradation issues. The second is to
develop a Wi-Fi performance prediction model that estimates Wi-Fi throughput in a customer's
home environment.

1.5 Methodology
This thesis uses quantitative and experimental research methods [10], including data collection,
visualization and analysis. Given the research objectives, a sequential (non-iterative) design
process is a suitable methodology for this study. Its phases are illustrated in Figure 1.2:

Figure 1.2: Methodology

• Wi-Fi data measurement

Wi-Fi information is reported periodically through Ubus [11] to assess network performance and
status. Ubus, which is used here for data collection, is a lightweight message bus system on the
OpenWrt platform. All wireless network state and statistics data from the access point are
represented in JSON format. An important aspect of this approach is that the measurement does
not affect the real traffic on the Wi-Fi network. The measurement method and tool are described
in detail in Chapter 2.
• Implement a Wi-Fi parameter analysis and visualization tool

In this step, a Wi-Fi parameter time evolution analysis and visualization tool is deployed to
present Wi-Fi parameters graphically over time. The visualization results are discussed in
Chapter 3.
• Wi-Fi performance prediction model building

Figure 1.3: Performance predictive Model

Here, the Wi-Fi link is treated as a black box. An estimation model of Wi-Fi saturated
throughput is proposed that takes as input the neighbor APs' information (traffic volume, signal
strength, noise floor and channel) together with the target AP's signal strength and noise
floor. The model building, experimental setup and performance evaluation are presented in
Chapters 4, 5 and 6.
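
The inputs listed above can be pictured as one flat numeric vector per observation. The following is a minimal sketch of that idea only: the class names, field names, units and the ordering of the vector are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NeighborAP:
    """Hypothetical record for one interfering AP (field names are illustrative)."""
    traffic_mbps: float   # traffic volume carried by the neighbor
    rssi_dbm: float       # neighbor signal strength
    noise_dbm: float      # noise floor on the neighbor's channel
    channel: int          # neighbor's operating channel


@dataclass
class TargetAP:
    """Hypothetical record for the AP whose throughput is estimated."""
    rssi_dbm: float
    noise_dbm: float


def to_feature_vector(target: TargetAP, neighbors: List[NeighborAP]) -> List[float]:
    """Flatten one observation into the numeric input of the black-box model."""
    vector = [target.rssi_dbm, target.noise_dbm]
    for ap in neighbors:
        vector += [ap.traffic_mbps, ap.rssi_dbm, ap.noise_dbm, float(ap.channel)]
    return vector


x = to_feature_vector(
    TargetAP(rssi_dbm=-48.0, noise_dbm=-92.0),
    [NeighborAP(traffic_mbps=20.0, rssi_dbm=-60.0, noise_dbm=-90.0, channel=6)],
)
print(x)
```

A fixed number of neighbors per observation is assumed here so that every vector has the same length, which a classifier would require.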

1.6 Benefits and Social Impact


• Benefits for Wi-Fi service providers and customers
Wi-Fi performance analysis can help service providers identify Wi-Fi problems and find the
best deployment for a customer's router or access point.
• Social impact
With the rapid growth of Wi-Fi infrastructure, the social implications are evident in many fields.
A good example is Wi-Fi-enabled education. Education based on Wi-Fi technology offers more
opportunities for people who want to learn anywhere and at any time. In addition, educational
wireless applications provide different kinds of learning tools that make learning easier and
improve educational outcomes. Another social impact is that Wi-Fi technology connects people
around the world and changes the way they communicate. A better understanding of Wi-Fi
technology will therefore make Wi-Fi solutions more efficient [12].

1.7 Thesis Structure


This thesis is organized as follows:
• Chapter 2 provides background information. First, the fundamentals of Wi-Fi based
communication are introduced. Then, the data processing tools used in this thesis are described.
Finally, the machine learning concepts and the algorithm relevant to the modeling in later
chapters are explained.
• Chapter 3 presents a Wi-Fi parameter time evolution analysis and visualization tool that shows
the time-dependent characteristics of Wi-Fi metrics.
• Chapter 4 proposes a learning model to predict Wi-Fi saturated throughput.
• Chapter 5 describes two experiments and their results.
• Chapter 6 concludes the thesis and outlines future work.


Chapter 2

Background

2.1 Wi-Fi Network

This section gives a basic overview of Wi-Fi networks, introducing the wireless protocol, its
specifications and performance parameters.

2.1.1 IEEE 802.11 Architecture

IEEE 802.11 uses a cellular architecture in which a wireless network is divided into several
cells. Each cell, called a basic service set (BSS), is controlled by an access point (AP) that
links two or more stations (STAs), as shown in Figure 2.1; this is called infrastructure mode.
Multiple infrastructure BSSs can be joined together into an extended service set (ESS) that
provides larger continuous service coverage [13].

Figure 2.1: 802.11 basic service set(BSS) infrastructure

• Access point: Packets delivered inside the wireless network are 802.11 frames. When the
wireless network communicates with the outside, e.g., the Internet, the access point, a piece of
networking hardware similar to a hub or a switch, works as a frame converter.
• Station: All devices that can connect to the access point via a wireless network interface are
stations (STAs). STAs use the wireless medium to transfer frames between each other.

2.1.2 Overview of IEEE 802.11 standards

The earliest IEEE 802.11 version was released in 1997 and provided wireless communication at a
maximum data rate of 2 Mbits/s based on DSSS/FHSS modulation schemes.

Because of the low data rate of the first version, the 802.11 working group published two new
protocols in 1999, 802.11b and 802.11a, which use different frequency bands. Compared with the
original 802.11 standard, the maximum data rate rises to 11 and 54 Mbits/s respectively.

In 2003, a new standard called 802.11g came onto the market. It uses the same spectrum band as
802.11b, but adds the OFDM modulation scheme to achieve higher theoretical throughput.

802.11n was published in 2009. It supports both the 2.4 GHz and 5 GHz spectrum bands. 802.11n
was the first to introduce advanced antenna technology, multiple-input multiple-output (MIMO)
with up to 4 spatial streams. As a result, the maximum throughput of 802.11n can reach
600 Mbits/s.

802.11ac was published in 2013 and provides very high throughput on 5 GHz by using MIMO with up
to 8 spatial streams. Compared with 802.11n, it offers more options for the channel bandwidth:
40/80/160 MHz.

Table 2.1 summarizes the IEEE 802.11 specifications described above.

Table 2.1: IEEE 802.11 PHY Standards[14]


Year  Standard  Frequency band (GHz)  Bandwidth (MHz)  Modulation  Antenna technologies                    Maximum physical data rate
1999  802.11b   2.4                   20               DSSS        N/A                                     11 Mbits/s
1999  802.11a   5                     20               OFDM        N/A                                     54 Mbits/s
2003  802.11g   2.4                   20               DSSS, OFDM  N/A                                     54 Mbits/s
2009  802.11n   2.4/5                 20/40            OFDM        MIMO, up to 4 spatial streams           600 Mbits/s
2013  802.11ac  5                     40/80/160        OFDM        MIMO, MU-MIMO, up to 8 spatial streams  6.93 Gbits/s

As the 802.11 PHY standards show, wireless networks typically use two unlicensed spectrum bands,
at 2.4 GHz and 5 GHz.

Figure 2.2: 2.4 GHz band with 20 MHz channel band [15]

Figure 2.3: 2.4 GHz band with 40 MHz channel band [15]

Figure 2.2 shows the 2.4 GHz spectrum with 20 MHz channel width. There are 13 channels in the
2.4 GHz spectrum that may be used in Europe; channel 14 is allowed in some other countries, e.g.,
Japan. The channel centers are separated by 5 MHz. However, there are only three non-overlapping
channels that do not interfere with each other: channels 1, 6 and 11.
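
The channel arithmetic behind this statement can be checked with a short sketch. It assumes the usual 2.4 GHz channel plan (center frequency 2407 + 5n MHz for channels 1 to 13, with channel 14 at 2484 MHz) and an occupied bandwidth of 22 MHz, the width of the 802.11b DSSS spectral mask. The 22 MHz figure is an assumption that is not stated in the text above; with a strict 20 MHz width the same test would also admit channels 1, 5, 9 and 13.

```python
CHANNEL_SPACING_MHZ = 5
# Assumption: 22 MHz occupied bandwidth (802.11b DSSS spectral mask).
OCCUPIED_WIDTH_MHZ = 22


def center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz channel in MHz."""
    if channel == 14:  # channel 14 sits off the regular 5 MHz grid
        return 2484
    if 1 <= channel <= 13:
        return 2407 + CHANNEL_SPACING_MHZ * channel
    raise ValueError(f"not a 2.4 GHz channel: {channel}")


def overlaps(a: int, b: int, width_mhz: int = OCCUPIED_WIDTH_MHZ) -> bool:
    """Two channels overlap if their centers are closer than one channel width."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


def non_overlapping_from(start: int = 1, last: int = 13) -> list:
    """Greedily pick channels, each clear of the previously chosen one."""
    chosen = [start]
    for ch in range(start + 1, last + 1):
        if not overlaps(chosen[-1], ch):
            chosen.append(ch)
    return chosen


print(non_overlapping_from())
```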

Figure 2.3 also describes the 2.4 GHz channel allocation, but unlike Figure 2.2, it shows
40 MHz channels formed by joining two neighboring channels. As can be seen from Figure 2.3,
there is no independent (non-overlapping) channel at 40 MHz channel width. A 40 MHz channel
width is therefore not a good choice for deployments with multiple access points.

Figure 2.4: 5GHz band[15]

Because the 2.4 GHz band has few independent channels, the IEEE 802.11 working group defines
more non-overlapping 20 MHz channels in the 5 GHz band, in the frequency range from 5170 MHz to
5835 MHz. Channels 36 to 144 are allowed in Europe and are divided among three Unlicensed
National Information Infrastructure (UNII) bands: UNII-1, UNII-2 and UNII-3. Moreover, these
20 MHz channels can simply be bonded into 40 MHz, 80 MHz or even 160 MHz channels, as in Figure
2.4. A 5 GHz deployment is therefore suitable for high-density wireless environments, since it
has more non-overlapping channels.

However, the question of which spectrum band to choose for a wireless environment is not easy to
answer. With respect to interference, the 5 GHz band is better than the 2.4 GHz band, but
2.4 GHz signals travel farther than 5 GHz signals and therefore provide more coverage. Wireless
network deployment should therefore vary according to the requirements.

Although new 802.11 standards emerge continuously, 802.11n is still prevalent in today's wireless
networks. The experiments in Chapter 5 therefore model performance for 802.11n on the 2.4 GHz
band. The approach can be extended to other standards and bands in future work if needed.

2.1.3 Wi-Fi network performance parameter

The common parameters used to indicate wireless network performance are throughput, latency,
jitter and packet loss rate [16].

• Throughput: The amount of data successfully delivered per unit time between two wireless
nodes, measured in bits/second, Kbits/second or Mbits/second.
• Latency: In the network context, latency typically means the time it takes for a data packet
to travel from one network node to another. In some settings, such as TCP traffic, latency is
measured as round-trip time (RTT): the delay between sending a packet to the destination and
receiving an acknowledgment from it.
• Jitter: The variation in delay between packets, i.e., the difference between the delays of
successively arriving messages. It can be an issue for voice traffic: the lower the jitter, the
more stable the VoIP communication.
• Packet loss: Also known as drop rate, packet loss occurs when packets fail to be delivered
from sender to receiver. It is typically caused by network congestion: when the wireless medium
is not available to send a packet, the packet is dropped. Other causes, such as errors during
transmission, can also result in packet loss.
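
As a rough illustration of how three of these indicators can be computed from raw counts, consider the sketch below. The function names and sample values are illustrative only, and jitter is taken here as the mean absolute difference between consecutive packet delays, which is one simple definition among several.

```python
from statistics import mean


def throughput_mbps(payload_bytes: int, duration_s: float) -> float:
    """Delivered bits per second, expressed in Mbits/s."""
    return payload_bytes * 8 / duration_s / 1e6


def mean_jitter_ms(delays_ms):
    """Mean absolute difference between consecutive packet delays."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return mean(diffs) if diffs else 0.0


def packet_loss_rate(sent: int, received: int) -> float:
    """Fraction of sent packets that never reached the receiver."""
    return (sent - received) / sent


# Made-up sample: 1.25 MB delivered in one second, four one-way delays in ms.
delays_ms = [10.0, 12.0, 11.0, 15.0]
print(throughput_mbps(1_250_000, 1.0))
print(mean_jitter_ms(delays_ms))
print(packet_loss_rate(sent=1000, received=990))
```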

According to [17], data rate is what most people care about, and this dimension of performance
is the main driver of wireless network deployment. Data rate is therefore the parameter studied
in Chapters 4, 5 and 6.

2.2 Wi-Fi data measurement and analysis tool


2.2.1 Data measurement method

There are two common network data measurement methods [18]: active measurement and passive
measurement.

• Active Measurement
Active measurement injects additional probe packets into the wireless network. Network
performance indicators (such as end-to-end response time, transmit error rate and network
capacity) can then be calculated by tracking the probe packets. Active measurements can better
characterize client-perceived service quality because they simulate actual traffic behavior
using a few test packets. However, since this method introduces additional traffic, it shares
the network bandwidth with actual traffic and may disturb normal traffic flows.
• Passive Measurement
In passive network measurement, data is collected by capturing traffic at monitored network
nodes, e.g., wireless routers. Most wireless routers have pre-installed passive measurement
tools, which provide an easy way to record different types of network data (such as traffic
volume and packet loss). In addition, passive measurements [9][19] are the most widely used in
wireless communication environments. The passive measurement method is therefore selected in
this thesis.

2.2.2 Data measurement tool

Three built-in passive measurement tools are described in this section: Ubus [11], Wlctl [21]
and Uci [20].
• Ubus
Ubus is a command-line tool on OpenWrt-based wireless routers that allows interaction between
the ubus server and all registered services. It calls procedures with parameters and returns
responses in a user-friendly JSON format.
• Wlctl
Wlctl is a common command-line interface on wireless gateways for wireless measurement, which
can be used to determine the effects of changes in the wireless network.
• Uci
Uci (Unified Configuration Interface) is OpenWrt's centralized configuration interface, which
can modify the configuration of the wireless access point (such as Wi-Fi channel, channel width
and transmit power).

In this thesis, an executable shell script that calls ubus and wlctl is used to scan the access
point periodically for the experimental data collection in Chapter 5, and uci is used to change
the wireless access point configuration.
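
Because the collected reports are JSON, they can be turned into flat records with a few lines of Python. The sketch below is only an illustration: the report layout and every field name in it are hypothetical, since the actual ubus output depends on the router firmware and on the services registered on it.

```python
import json

# Hypothetical report; real ubus output depends on the router firmware.
SAMPLE_REPORT = """
{
  "timestamp": 1490000000,
  "radio": {"band": "2.4GHz", "channel": 6, "noise": -92},
  "stations": [
    {"mac": "aa:bb:cc:00:00:01", "rssi": -48},
    {"mac": "aa:bb:cc:00:00:02", "rssi": -71}
  ]
}
"""


def flatten_report(text: str) -> list:
    """Turn one JSON report into one flat record per connected station."""
    report = json.loads(text)
    radio = report["radio"]
    rows = []
    for sta in report["stations"]:
        rows.append({
            "timestamp": report["timestamp"],
            "channel": radio["channel"],
            "noise_dbm": radio["noise"],
            "mac": sta["mac"],
            "rssi_dbm": sta["rssi"],
            # Signal-to-noise ratio in dB: signal level minus noise floor.
            "snr_db": sta["rssi"] - radio["noise"],
        })
    return rows


rows = flatten_report(SAMPLE_REPORT)
for row in rows:
    print(row)
```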

2.2.3 Data analysis tool

• Iperf
Iperf [22] is a network performance measurement tool for the TCP and UDP protocols. It allows
various parameters, such as test duration and packet size, to be set for the network under test.
It has a client mode and a server mode and can measure throughput between two network nodes,
either one-way or two-way. The output of iperf is a time-stamped report that includes the
throughput and the amount of data transferred in each time interval. In this thesis, iperf is
used to generate experimental traffic flows with different data rates in Chapter 5, including
saturated and unsaturated traffic.
• Pandas
Pandas [23] is a powerful data analysis tool for the Python programming language. Its flexible
data structures make labeling and presenting data easier and faster. There are two data
structures, Series and DataFrame. DataFrame, which is used in this thesis, is an Excel-like data
structure with ordered columns, each of which can hold a different value type (such as string
or numeric).
• Matplotlib
Matplotlib [24] is a Python toolkit for data visualization. It is a 2D plotting library that can
produce figures in different formats, such as PDF, JPG and PNG. In this thesis, after the
collected data is structured with Pandas, the Matplotlib library is used to develop a Wi-Fi
parameter visualization tool in Chapter 3.
• Scikit-learn
Scikit-learn [25] is a Python module for machine learning. It integrates various features,
including classification, regression, model selection and preprocessing. In this thesis,
Scikit-learn is used to implement the machine learning classification, which comprises data
scaling, modeling and performance evaluation.
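
To give a feel for the DataFrame structure described above, the following sketch arranges a few per-station samples into a time-indexed table. The column names and values are made up for illustration and do not come from the thesis data set.

```python
import pandas as pd

# Hypothetical samples: one row per station per scan (values are made up).
records = [
    {"time": 0, "sta": "phone", "rssi": -50},
    {"time": 0, "sta": "laptop", "rssi": -65},
    {"time": 60, "sta": "phone", "rssi": -54},
    {"time": 60, "sta": "laptop", "rssi": -67},
]

df = pd.DataFrame(records)
df["time"] = pd.to_datetime(df["time"], unit="s")

# One column per station, indexed by time: the shape a line plot expects.
rssi_by_sta = df.pivot(index="time", columns="sta", values="rssi")
mean_rssi = rssi_by_sta.mean()

print(rssi_by_sta)
print(mean_rssi)
```

Calling `rssi_by_sta.plot()` on such a table would then draw one line per station through Matplotlib.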

2.3 Machine Learning Overview

Machine learning (ML) solves problems by having a computer learn the correlations between input
and output from a collected data set. ML algorithms are normally applied when no exact
mathematical relationship can be observed between the input and the output.

This section gives a brief introduction to ML algorithms based on [26][27]. The specific ML
algorithm chosen for modeling the Wi-Fi environment is introduced in Chapter 4.

There are some basic concepts that help to understand ML programming as below:
• Data : There are two types of data in ML, training data and testing data. Both two data are
generated by testbed or simulation, containing input vectors xi and corresponding output vectors
yi . Training data is used for learning in order to build model. Testing data is used for building
model performance evaluation.
• Feature : The concept of input vectors illustrated before is called features, describing properties
of the studied problem.

• Classification : Classification tries to find an optimal classifier on the training data. In other
words, in the training step, the training data is separated into several classes. These
classes are then used to predict which class each testing sample belongs to. Figure 2.5 illustrates
a simple linear classification problem.

• Regression : Unlike classification, regression works on the continuous values of the training
data. The purpose is to find an optimal mapping function, represented by a curve or a line, that fits
all the data samples. Figure 2.6 illustrates a simple linear regression example.

Figure 2.5: Classification example Figure 2.6: Regression example

Besides these basic concepts, ML is divided into two broad categories: supervised machine learning
and unsupervised machine learning.

• Supervised machine learning : Supervised machine learning learns from labeled data,
a.k.a. training data samples, which include input vectors x = [x1 , · · · , xi ] and output vectors y =
[y1 , · · · , yi ]. This process is known as model building. The model is then used to make predictions
on new data, a.k.a. testing data, in order to test how good the model is, i.e.,
the predictive accuracy is calculated to evaluate model performance.
• Support vector machine (SVM) : SVM is a supervised machine learning method. It is
divided into two core groups, Support vector classification (SVC) and Support vector regression
(SVR). SVC performs classification by finding a decision boundary between categorical labels
that is maximally far from the samples of any class. SVR performs regression to predict a continuous
trend line for ordered points in the training data.
• Unsupervised machine learning : Unsupervised machine learning works on unlabeled data,
which only contains input vectors x = [x1 , · · · , xi ]. This type of machine learning algorithm tries
to find the hidden structure in the data samples and to group them accordingly.

In this thesis, the Wi-Fi performance analysis focuses on predicting the Wi-Fi throughput based
on the channel condition, which is a supervised machine learning problem.
Chapter 3

Visualization of Wi-Fi parameters data analysis and evolution over time

This chapter describes a Wi-Fi parameter analysis and visualization tool developed in Python
to quickly diagnose Wi-Fi quality at any time. With this tool, the data of any access point and its
associated STAs, as reported by the AP via the UBUS interface, can be selected and visualized graphically.

3.1 Parameters affecting Wi-Fi performance

Studies in [9][28][29] show that Wi-Fi quality is affected by the wireless channel condition,
i.e., factors such as Wi-Fi signal strength (RSSI), traffic volume, resource contention (both
internal and external) and noise level can affect Wi-Fi performance. Therefore, this analysis and
visualization tool aims to analyze and present the features below to reveal Wi-Fi performance
degradation:

• signal strength (RSSI): RSSI indicates the Wi-Fi signal level at a given location. If the RSSI value
drops dramatically for a period, it indicates a weak signal level for the wireless end-user around that
position.
• traffic volume: packets transmitted and received during a period, providing information about local
wireless network activity. If no packet is transmitted or received over a time period, it
indicates either no active end-user or a coverage problem around that location.
• data rate per client: it depends heavily on user activity, e.g., browsing web pages,
downloading a file or watching a video.
• noise level: it indicates non-Wi-Fi interference sources, such as Bluetooth devices, cordless phones
and microwave ovens, operating in the same radio band (2.4 GHz) as the local wireless network. A high
noise value means a lower signal-to-noise ratio (SNR), which may lead to a reduced available data
rate for wireless clients.
• information on neighbor Wi-Fi activity: it reveals how many other wireless networks exist and what
the signal strength of those neighbor networks is. This external Wi-Fi contention may also
result in a reduced data rate for the local wireless clients.
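
The relation between noise level and SNR mentioned above can be made concrete with a small sketch. It assumes that RSSI and noise floor are both reported in dBm, so that the SNR in dB is simply their difference; the function name and the sample values are hypothetical.

```python
def snr_db(rssi_dbm: float, noise_dbm: float) -> float:
    """Signal-to-noise ratio in dB, given signal and noise power in dBm."""
    return rssi_dbm - noise_dbm


# Same signal strength, two hypothetical noise floors.
quiet = snr_db(rssi_dbm=-60.0, noise_dbm=-95.0)  # low noise
noisy = snr_db(rssi_dbm=-60.0, noise_dbm=-80.0)  # high noise

print(quiet, noisy)
```

With the same RSSI, the higher noise floor leaves a smaller SNR, which is the situation in which the available data rate may drop.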

3.2 Analysis and visualizing Wi-Fi parameters evolution over time

The tool is developed with the Pandas and Matplotlib libraries, which were introduced in Chapter 2. The
graphical results produced by the tool are presented in the following subsections. The
Wi-Fi data used as input to the visualization tool is collected from real Wi-Fi users' environments.

3.2.1 End-user results

Figure 3.1: Specified STA parameter



Figure 3.1 presents STA-related information during a specific time range, i.e., from 15:55 to
21:50, as described below:
1. Title: 5c:a3:9d:00:5a:e2 802.11n 2X2 WMM AMPDU 2.4G traffic
• 5c:a3:9d:00:5a:e2 : STA MAC address
• 802.11n 2X2 WMM AMPDU : STA capabilities; it supports 2x2 MIMO, Quality
of Service (QoS) and frame aggregation.
2. First graph
• red line : STA RSSI over time
• orange bar : STA received packets over time
• blue bar : STA sent packets over time
3. Second graph (STA transmit direction)
• black line : STA transmit PHY data rate over time, in Kbps
• purple line : STA actual transmit data rate over time, in Kbps; according to the Technicolor
paper [30], if the actual data rate is less than 1 Kbps, the recorded value is 0.
• blue bar : STA transmitted packets over time
4. Third graph (STA receive direction); the parameters are the same as in the second graph, but for
the receive direction.
• black line : STA receive PHY data rate over time, in Kbps
• purple line : STA actual receive data rate over time, in Kbps; according to the
Technicolor paper [30], if the actual data rate is less than 1 Kbps, the recorded value is 0.
• blue bar : STA received packets over time
5. Fourth graph: STA transmit no-ACK failures (percentage), i.e., transmissions for which no
acknowledgement was received from the receiver.
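
A plot in the style of the first graph of Figure 3.1 could be produced along the following lines. This is only a sketch with made-up sample values and hypothetical variable names, not the tool's actual code; it assumes Matplotlib is installed and uses the non-interactive Agg backend so that it also runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen rendering, no display needed
import matplotlib.pyplot as plt

# Made-up samples for one STA, taken every 5 minutes (illustrative only).
minutes = [0, 5, 10, 15]
rssi_dbm = [-58, -61, -70, -64]
rx_packets = [120, 300, 80, 150]
tx_packets = [90, 250, 60, 110]

fig, ax_rssi = plt.subplots(figsize=(8, 3))
ax_packets = ax_rssi.twinx()  # second y-axis sharing the time axis

# Bars for packet counts, shifted so both series stay visible.
ax_packets.bar([m - 1 for m in minutes], rx_packets, width=2,
               color="orange", label="received packets")
ax_packets.bar([m + 1 for m in minutes], tx_packets, width=2,
               color="blue", label="sent packets")

# Line for RSSI over time.
ax_rssi.plot(minutes, rssi_dbm, color="red", label="RSSI")

ax_rssi.set_xlabel("time (minutes)")
ax_rssi.set_ylabel("RSSI (dBm)")
ax_packets.set_ylabel("packets")

# Write the figure as PNG into memory; a file name would work as well.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

Using a twin axis keeps the RSSI scale (dBm) and the packet-count scale independent while both share the same time axis.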

Figure 3.2: all STAs rssi



Figure 3.3: all STAs transmit physical data rate

Figure 3.4: all STAs receive physical data rate

Figure 3.2, Figure 3.3 and Figure 3.4 show how many STAs contend for the same access point
resources during the same time period, together with the individual parameters of those STAs, i.e.,
RSSI over time, transmit physical rate over time and receive physical rate over time.

3.2.2 Access point results

Figure 3.5: 2,4GHz channel parameter

Figure 3.5 describes the 2.4 GHz channel usage of the access point during a specific time period as follows:
• blue bar : channel width (MHz) over time
• green line : access point physical data rate over time
• red dot : channel used by the access point over time

Figure 3.6: 2,4GHz channel noise

Figure 3.6 shows the 2.4 GHz band background noise (dBm) during a specific time period. The
shade of green runs from light to dark as the noise level goes from low to high.

Figure 3.7: 2,4GHz channel neighbor

Figure 3.7 describes neighbor channel information as follows:

• First graph: shows how many neighbor APs exist on the different channels, e.g., 3
neighbors on channel 1, 2 neighbors on channel 5, etc.
• Second graph: shows how many of the surrounding neighbor APs are interfering APs.
• Third graph: received signal strength from neighbor APs on the different channels. The shade of
green runs from light to dark as the signal level goes from low to high.

3.3 Conclusion
As shown by the evolution over time of the Wi-Fi performance parameters in this chapter, the
analysis and visualization tool lets operators monitor the Wi-Fi environment and gain visibility
into overall performance, which helps service providers see whether performance degradation exists and
what the potential reasons are.
Chapter 4

Proposed Estimation Model for Wi-Fi Performance

4.1 Estimation model function

In this chapter, a Wi-Fi performance estimation model based on a machine learning
method is proposed, which allows service providers to predict Wi-Fi saturated throughput¹ from simple
measurements on the access point.

Based on the factors affecting Wi-Fi performance discussed in Chapter 3, the proposed
Wi-Fi throughput prediction model treats throughput as a function of signal strength, resource contention
and noise level, as shown in equation (4.1):

$$\mathrm{WiFi}_{\mathrm{throughput}} = f\left(\mathrm{WiFi}_{\mathrm{RSSI}},\ \mathrm{WiFi}_{\mathrm{contention}},\ \mathrm{WiFi}_{\mathrm{noise}}\right) \tag{4.1}$$

4.2 Machine learning based Modeling


As described in Section 4.1, the proposed Wi-Fi throughput estimation model expresses saturated
throughput as a function of device RSSI, noise floor and resource contention. The key idea of this
modeling is to use a machine learning method to classify the device saturated throughput level as good,
medium, poor or very poor (discussed in Chapter 5).

In this approach, Wi-Fi performance parameters (such as RSSI, noise level and contention information)
extracted with the measurement collection tools described in Chapter 2 are defined as the
machine learning input features, and the saturated throughput collected in the same way is used as the
machine learning output label. The steps are illustrated in Figure 4.1:
¹ In this thesis, saturated throughput is selected as the Wi-Fi performance indicator in the proposed
estimation model, as described in Chapter 2.


Figure 4.1: Proposed Wi-Fi performance prediction model

Firstly, each measurement is labeled with its input features and output label in the data collection
process. Secondly, the input features are preprocessed through feature normalization. Thirdly, the
selected machine learning method is applied to the features for model building. Finally, the
performance of the prediction model is evaluated.
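
The four steps above can be sketched with Scikit-learn as follows. The data is synthetic (generated with make_classification) and only stands in for the measured features and throughput classes; the hyperparameter values C=10 and gamma=0.01 are placeholders, and the code is an illustration rather than the thesis implementation.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Step 1: labeled data. Synthetic stand-in for (features, throughput class).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 2: normalize features; the scaler is fitted on training data only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Step 3: build the model (placeholder hyperparameters).
model = SVC(kernel="rbf", C=10, gamma=0.01).fit(X_train_s, y_train)

# Step 4: evaluate on the held-out testing set.
y_pred = model.predict(X_test_s)
accuracy = accuracy_score(y_test, y_pred)
print("test accuracy:", accuracy)
```

Fitting the scaler on the training set only keeps information from the testing set out of the model building step.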

4.2.1 Support Vector Machine (SVM) modeling


The machine learning algorithm selected in this thesis is the Support Vector Machine (SVM) [27][31],
which is a powerful pattern recognition tool in the data learning field and has already been widely used
in 802.11 wireless communication environments [28][32]. SVM is a supervised machine learning method
that can be used for both classification and regression. In this thesis, SVM classification, called
Support Vector Classification (SVC), is studied and used for model building. The basic machine
learning techniques are discussed in Chapter 2.

SVC was originally designed for simple two-class classification, which uses an optimal classification
line, i.e., a hyperplane, to separate two classes. However, by converting a multi-class classification
problem into several two-class problems, SVC can be applied to Wi-Fi saturated throughput
estimation, which is a multi-class classification problem. The concept of finding a hyperplane is shown
in Figure 4.2:

Figure 4.2: SVC hyperplane concept

Consider a set of training vectors x = {x1 , x2 , · · · , xi }, represented by dots and crosses in Figure 4.2,
as input features. The purpose of SVC is to find an optimal classifier, i.e., a hyperplane (the red line),
which is given by equation (4.2) when y = 0:
$$y = \omega \cdot x + b \tag{4.2}$$
where ω is the weight vector, x is the input feature vector and b is the bias. The two classes to be
predicted are the values of the output label, i.e., y = ±1.

Therefore, the main problem for SVC is to find the optimal hyperplane, i.e., to find the position
of the red line in Figure 4.2 that minimizes the misclassification probability. This can be converted into
maximizing the margin m [31] between two other hyperplanes, i.e., ω · x + b = +1 and ω · x + b = −1,
which is defined in equation (4.3); the data samples that lie on these two hyperplanes are called
support vectors.
$$m = \frac{2}{\lVert \omega \rVert} \tag{4.3}$$
where ‖ω‖ is the norm of ω. The margin m is maximized subject to (4.4):
$$\begin{cases} \omega \cdot x_i + b \geq 1, & \text{if } y_i = 1 \\ \omega \cdot x_i + b \leq -1, & \text{if } y_i = -1 \end{cases} \tag{4.4}$$
where xi is the ith training vector and yi is the correct output of the SVC classification for sample xi .
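
Equations (4.3) and (4.4) can be checked numerically on a toy example. The weight vector, bias and sample points below are made up for illustration and do not come from the thesis.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def margin(w):
    """Margin m = 2 / ||w|| from equation (4.3)."""
    return 2.0 / math.sqrt(dot(w, w))


def satisfies_constraints(w, b, samples):
    """Check y_i * (w . x_i + b) >= 1 for every (x_i, y_i), as in (4.4)."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in samples)


# Hypothetical hyperplane and labeled points in two dimensions.
w, b = [3.0, 4.0], -1.0
samples = [([1.0, 1.0], +1), ([0.0, 0.0], -1), ([-1.0, 0.5], -1)]

print(margin(w))
print(satisfies_constraints(w, b, samples))
```

Multiplying each constraint in (4.4) by its label yi gives the single condition yi (ω · xi + b) ≥ 1 that the helper function tests.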

Equation (4.3) and the two constraints in (4.4) can be combined into (4.5):
$$\min \lVert \omega \rVert, \quad \text{subject to: } y_i(\omega \cdot x_i + b) \geq 1 \tag{4.5}$$
Minimizing ‖ω‖ is equivalent to minimizing ½‖ω‖², so (4.5) is a quadratic programming problem for
linear classification. Since a dataset is not always linearly separable, the linear classifier (4.5) is
modified to use the dual function [31] as the optimization function, which can solve non-linear
problems as well:
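$$\text{Maximize: } L(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$
$$\text{Subject to: } 0 \leq \alpha_i \leq C, \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \omega = \sum_{i=1}^{N} \alpha_i y_i x_i \tag{4.6}$$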

where K(xi , xj ) is the kernel function. The purpose of the kernel function is to map the original data
vectors into a higher-dimensional space, in which a linear classification solution can be found. In
equation (4.6), C is an upper-bound constant which controls the trade-off between the training error
and the width of the margin.

The optimization problem (4.6) can be solved by a solver [26], and the parameters of the hyperplane
classifier, i.e., equation (4.2), are obtained as follows:
$$\omega = \sum_{i=1}^{N} \alpha_i y_i x_i$$
$$b = \frac{1}{N_S} \sum_{i \in S} \Big( y_i - \sum_{j \in S} \alpha_j y_j K(x_i, x_j) \Big) \tag{4.7}$$

where S is the set of support vectors and N_S is the number of support vectors.


Wi-Fi throughput prediction is thus converted into solving equation (4.6) to find an optimal classifier
that determines the saturated throughput level of the Wi-Fi device.
Equation (4.6) is complicated and not easy to solve by hand; however, solvers for it have been
implemented in many programming languages. The Python library Scikit-learn [25] is selected in this
thesis, and its use can be summarized in a simple way as shown in equation (4.8):

prediction model = sklearn.svm.SVC(kernel, C, gamma).fit(input features, output labels) (4.8)


From equation (4.8), the optimal classifier problem reduces to finding appropriate parameters,
namely the kernel function, C and gamma. The kernel specifies the classification function used in
the algorithm, which can be linear as in equation (4.9), Gaussian as in equation (4.10) or polynomial
as in equation (4.11); C is the regularization parameter that helps avoid overfitting; gamma is the kernel
coefficient γ of the non-linear kernels, which in this thesis is tuned only for the Gaussian kernel.

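$$\text{Linear: } K(x_i, x_j) = x_i \cdot x_j \tag{4.9}$$

$$\text{Gaussian: } K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right) \tag{4.10}$$

$$\text{Polynomial: } K(x_i, x_j) = \left(\gamma (x_i \cdot x_j) + r\right)^d \tag{4.11}$$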
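
To make the three kernel definitions (4.9) to (4.11) concrete, the sketch below evaluates them in plain Python on two made-up feature vectors. The function names and the parameter values are chosen for illustration only.

```python
import math


def linear_kernel(xi, xj):
    """Equation (4.9): plain dot product."""
    return sum(a * b for a, b in zip(xi, xj))


def gaussian_kernel(xi, xj, gamma):
    """Equation (4.10): exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(xi, xj, gamma, r, d):
    """Equation (4.11): (gamma * (xi . xj) + r)^d."""
    return (gamma * linear_kernel(xi, xj) + r) ** d


# Two made-up feature vectors.
xi, xj = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(xi, xj))
print(gaussian_kernel(xi, xj, gamma=0.01))
print(polynomial_kernel(xi, xj, gamma=1.0, r=1.0, d=3))
```

The Gaussian kernel equals 1 for identical vectors and decays towards 0 as the vectors move apart, with γ controlling how fast it decays.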

4.2.2 Feature Normalization


Feature normalization is part of data preprocessing [25] and is a common requirement for
many machine learning estimators implemented in Scikit-learn. This step standardizes the
range of the input features to better suit SVC classification. Some machine learning algorithms may
perform badly if the input features do not look more or less like standard normally distributed data,
i.e., Gaussian with zero mean and unit variance. Therefore, in this thesis, the StandardScaler function
is used to scale the features in an early step of the machine learning modeling.
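
The standardization described above can be sketched in plain Python for a single feature. It mirrors what StandardScaler computes per feature (mean removal and division by the population standard deviation); the function name and the RSSI values are hypothetical, and the sketch assumes the feature is not constant.

```python
import math


def standardize(values):
    """Scale values to zero mean and unit variance (population std)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


# Hypothetical RSSI samples in dBm.
rssi = [-40.0, -50.0, -60.0, -70.0, -80.0]
scaled = standardize(rssi)
print(scaled)
```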

4.2.3 Model parameters selection


As described in subsection 4.2.1, parameters such as γ, C and the kernel need to be specified in order
to find an optimal SVC classifier. For this purpose, the GridSearchCV method [25], which stands for
grid search with cross-validation in Scikit-learn, is used for parameter tuning to maximize the accuracy
of the model.
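
A parameter search of this kind might look as follows. The data is synthetic, and the candidate values in the grid as well as the choice of 5-fold cross-validation are assumptions made for this sketch; the thesis does not list the grid that was actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic four-class data standing in for the measured data set.
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scaling and SVC in one pipeline, so scaling is refitted in every fold.
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical search grid; one sub-grid per kernel.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10],
     "svc__gamma": [0.01, 0.1, 1]},
    {"svc__kernel": ["poly"], "svc__degree": [3], "svc__C": [0.1, 1, 10]},
]

search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=5)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print("test accuracy:", search.score(X_test, y_test))
```

The search is run on the training portion only, so the testing set remains unseen until the final evaluation.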

4.2.4 Model performance evaluation metric


To evaluate the performance of the proposed model, the accuracy score [25] is used to measure
the quality of the prediction model. It is the fraction of correct predictions over the
testing set of n samples, as defined in (4.12):
$$\text{Accuracy score} = \frac{1}{n} \sum_{i=1}^{n} I(y_i, \hat{y}_i) \tag{4.12}$$

where ŷi is the predicted value of the ith test sample and yi is its true value. I(·) is the
indicator function, defined by (4.13):
$$I(y_i, \hat{y}_i) = \begin{cases} 1 & \text{if } y_i = \hat{y}_i \\ 0 & \text{if } y_i \neq \hat{y}_i \end{cases} \tag{4.13}$$
Therefore, the higher the accuracy score, the better the classifier.
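
The accuracy score of (4.12) and (4.13) can be written out directly. The sketch below is a plain-Python equivalent for illustration, with made-up class labels; it is not the Scikit-learn implementation.

```python
def accuracy_score(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up class labels (0 = very poor ... 3 = good).
y_true = [3, 2, 2, 1, 0]
y_pred = [3, 2, 1, 1, 0]
score = accuracy_score(y_true, y_pred)
print(score)
```

Here four of the five predictions match the true class, so the score is 4/5.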

4.3 Conclusion
SVM-based modeling can produce robust classification results from relevant input information in
a convenient way. The classification problem may be linear or non-linear. To predict Wi-Fi saturated
throughput, a few steps are needed to obtain the device throughput level. Firstly, normalize
the relevant input information (noise, signal strength, resource contention) as described in subsection
4.2.2. Secondly, select the model parameters as described in subsection 4.2.3 to find an optimal
prediction model in equation (4.8). Finally, evaluate the prediction model as described in subsection
4.2.4. For the above reasons, SVM-based modeling is selected in this thesis to predict Wi-Fi device
saturated throughput.
Chapter 5

Experiments and Results

In this chapter, two experiments are conducted to predict the throughput of a wirelessly connected
device in a real Wi-Fi environment. The first experiment is performed with no knowledge of
the surrounding neighbor APs' traffic volume. The second experiment is performed with knowledge
of the nearby interfering APs' traffic volume. The dominant factors that affect a Wi-Fi device's
throughput are discussed in the experimental results.

5.1 Experiment 1: No control with Neighbor traffic


In this section, the saturated throughput of a connected Wi-Fi device in a real Wi-Fi environment
is investigated with no knowledge about the surrounding neighbor APs' traffic volume.

5.1.1 Experimental Testbed

Figure 5.1: Experimental Deployment


Figure 5.1 shows the network diagram of the testbed experiment. There is only one AP, connected
to two nodes: one node is used as the traffic generator, and the other, a wireless node, acts as the
traffic receiver. The saturated TCP throughput between a client (iPhone 6) and its AP (target AP)
is investigated under controlled wireless communication conditions. The specifications of all the nodes
are shown in Table 5.1:

Table 5.1: Traffic sender and receiver specifications

(a) Traffic Sender

Traffic Sender | Processor             | Memory              | Operating System
Mac Pro        | 2.5 GHz Intel Core i7 | 16 GB 1600 MHz DDR3 | macOS Sierra

(b) Traffic Receiver

Traffic Receiver | Capability for 2.4 GHz
iPhone 6         | 802.11n 1x1

On the traffic receiver in Table 5.1, a network tool application is installed
to run iperf in server mode, and Auto-Lock is turned off to prevent the receiver from
entering sleep mode. Iperf in client mode is used on the traffic sender to generate traffic from the AP
to the iPhone 6.
The experiments are conducted in an indoor environment at Sodra Langgatan 36, 169 59 Solna, in an
apartment in a four-floor building with many surrounding Wi-Fi APs.

5.1.2 Data Collection


• Measurement points

Table 5.2: AP Configurations

Parameter      | Values
Frequency Band | 2.4 GHz
Area Size      | 7 m x 10 m
Target AP      | IPERF (TCP) saturates the link between target AP and its client
Channels       | [1, 6, 11]
Channel width  | 20 MHz

The experiments are designed to investigate what affects Wi-Fi throughput in the 2.4 GHz band.
For this purpose, measurement points are generated by periodically sampling the target
AP under the different configurations shown in Table 5.2; uci, described in Chapter 2, is used to
modify the AP configuration.
For the measurements in the testbed environment of Figure 5.1, the iPhone 6 is moved around the
experiment area (7 m x 10 m) to the positions marked as blue points in Figure 5.2.

Figure 5.2: Testbed Place

Ubus and Wlctl, described in Chapter 2, are used to periodically collect all the measurement
points, which are saved as txt files.

(i) ubus call wireless.ssid.accesspoint.station get
    reports statistics of the wirelessly connected STAs, e.g., RSSI, throughput
(ii) ubus call wireless.radio get
    reports AP radio band information, e.g., channel used, channel bandwidth
(iii) wl scan
    reports real-time information on neighbor APs, e.g., neighbor AP RSSI, neighbor AP noise
    level

5.1.3 Experiment Results


• Features
According to the experiment deployment in Figure 5.1, the saturated throughput of the
wireless link Tl between the target AP and the iPhone 6 is predicted while the traffic volume of
the surrounding APs is unknown.
The input features selected to predict Wi-Fi throughput should be measurable at the target AP
and should have an impact on Wi-Fi performance [9][28]:
– The power received by the target AP from the client iPhone 6. There is one such power feature.
– The power received by the target AP from neighbor APs. There are sixteen such power features.
– Target AP noise floor. There is one such noise feature.
– Neighbor AP noise floor. There are sixteen such noise features.
These input features for iPhone 6 throughput prediction are shown in Figure 5.3.

Figure 5.3: SVC Feature

• Performance Evaluation
To use the SVC-based estimation approach, a dataset consisting of all the features illustrated
in Figure 5.3 needs to be established. Python code, introduced in Chapter 2, filters all
necessary features from the raw txt files generated in the testbed environment of Figure 5.1. The
dataset includes 600 data points, as shown in Figure 5.4.

Figure 5.4: Screenshot for dataset

The dataset is divided into two parts: 30% of the dataset is randomly selected as the testing set,
and the other 70% is the training set used for modeling. SVC in Python is used to
build a prediction model on the training set. This classification model is then assessed using the
testing set. The testing set is separated from the dataset before the estimation model is built.
Therefore, the accuracy obtained by validating the estimation model with the testing set gives a
reliable performance evaluation result.
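
The random 70%/30% split described above can be sketched in plain Python as follows. The function name, the fixed seed and the use of integer indices as placeholder samples are assumptions made for illustration; the thesis does not state how the split was implemented.

```python
import random


def split_dataset(samples, test_fraction=0.3, seed=0):
    """Randomly split samples into (training set, testing set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 600 placeholder sample indices, matching the size of the data set.
train, test = split_dataset(range(600))
print(len(train), len(test))
```

For 600 data points this gives 420 training samples and 180 testing samples, with no sample appearing in both sets.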

In Figure 5.4, the throughput in column A is the iPhone 6 saturated throughput measured
under different wireless communication conditions. It is classified as shown in Table 5.3:

Table 5.3: iPhone 6 throughput classification

iPhone 6 throughput (Mbps) | Classification | Performance Rate
> 30                       | 3              | good
20-30                      | 2              | medium
10-20                      | 1              | poor
≤ 10                       | 0              | very poor
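
The mapping from measured throughput to class in Table 5.3 can be expressed as a small function. The table does not say which class the boundary values 20 and 30 Mbps belong to; the sketch below assumes they fall into the lower class, consistent with 10 Mbps counting as very poor.

```python
def throughput_class(throughput_mbps: float) -> int:
    """Map saturated throughput (Mbps) to the class index of Table 5.3."""
    if throughput_mbps > 30:
        return 3  # good
    if throughput_mbps > 20:
        return 2  # medium
    if throughput_mbps > 10:
        return 1  # poor
    return 0      # very poor


LABELS = {3: "good", 2: "medium", 1: "poor", 0: "very poor"}

for value in (42.0, 25.0, 15.0, 8.0):
    print(value, throughput_class(value), LABELS[throughput_class(value)])
```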

To choose the best kernel function for predicting the iPhone 6 throughput, GridSearchCV
is applied to the 70% training portion of the dataset, and three different kernel functions (linear,
Gaussian, polynomial) are tested. The linear kernel corresponds to a linear classifier on the training
data set. The Gaussian kernel represents a transformation of the input space via a Gaussian
function. The polynomial kernel maps the input space over polynomials of the
original features. The Gaussian and polynomial kernels are non-linear models. These three
kernel functions are defined in subsection 4.2.1. The result is shown in
Table 5.4. The accuracy score in the result indicates how accurate the prediction is,
as described in subsection 4.2.4. The accuracy score ranges from 0 to 1, where 1
means that every prediction matches the true class. Therefore, the higher the accuracy score,
the better the prediction result.

Table 5.4: Accuracy with different kernel functions

Kernel Function       | Accuracy Score | Run time (seconds) | C     | gamma
Gaussian kernel       | 0.53           | 4                  | 10    | 0.01
Linear kernel         | 0.52           | 183                | 10    | N/A
Polynomial (degree 3) | 0.50           | 6                  | 0.001 | N/A

From Table 5.4, the overall accuracy of all three kernel functions is low, between 0.50 and 0.53. The
Gaussian kernel has the highest accuracy and the shortest running time, and is therefore selected for
estimation modeling. Finally, this model is evaluated on the test data, i.e., the remaining 30% of the
dataset, as shown in Table 5.5:

Table 5.5: Gaussian Accuracy Score

Kernel Function | Training Data Accuracy | Testing Data Accuracy
Gaussian kernel | 0.53                   | 0.56

From Table 5.5, the accuracy of predicting unseen data is very low, only 0.56. The experiment
is then repeated with an increased number of measurement points, and the resulting prediction accuracy
for different numbers of data points is shown in Figure 5.5.

Figure 5.5: Performance for different measurement points

As shown in Figure 5.5, the prediction accuracy remains around 0.54 even when the number of data
points is increased to 1355.
• Prediction accuracy with different input features

Figure 5.6: Accuracy with different input features

The impact of different combinations of input features on prediction accuracy is illustrated in
Figure 5.6. With the iPhone 6 RSSI as the only input feature, the prediction accuracy is 0.38, and
it only grows to 0.54 with all four input features. Therefore, in a multiple
dwelling unit (MDU) environment, adding the surrounding neighbors'
RSSI and noise level as input features is not enough to predict a Wi-Fi device's throughput. Besides
these four features, the neighbor APs' traffic volume [33] also affects the target Wi-Fi device's
throughput. Another experiment that adds the neighbor traffic load is therefore discussed in the next
section.

5.2 Experiment 2: Control with neighbor traffic


In this section, the saturated throughput of a connected Wi-Fi device is investigated with knowledge
of the nearby interfering APs' traffic volume.

5.2.1 Experimental Testbed

Figure 5.7: Experimental Deployment

Figure 5.7 shows the network diagram of the testbed experiment. There are 3 APs: one is the target AP,
and the other two are neighbor APs that introduce competing traffic to the target AP. Two
nodes are connected to each AP: one node is used as the traffic generator, and the other, a wireless
node, acts as the traffic receiver. The saturated TCP throughput between a client (iPhone 6) and
its AP (target AP) is investigated under controlled wireless communication conditions. The
specifications of all the nodes are shown in Table 5.6:

Table 5.6: Traffic sender and receiver specifications

(a) Traffic Sender

Traffic Sender | Processor                            | Memory              | Operating System
Mac Pro        | 2.5 GHz Intel Core i7                | 16 GB 1600 MHz DDR3 | macOS Sierra
Leno           | Intel Core i5 CPU M560 @ 2.67 GHz x4 | 7.6 GiB             | Ubuntu 16.04 LTS
Mac air        | 1.7 GHz Intel Core i5                | 4 GB 1333 MHz DDR3  | macOS Sierra

(b) Traffic Receiver

Traffic Receiver | Capability for 2.4 GHz
iPhone 6         | 802.11n 1x1
iPad mini 4      | 802.11n 2x2
iPhone 7 Plus    | 802.11n 2x2

The network tool applications installed on the traffic senders and receivers are the same as those
introduced in subsection 5.1.1 for Experiment 1.
The experiments are again conducted in an indoor environment at Sodra Langgatan 36, 169 59 Solna, in an
apartment in a four-floor building. In addition, to minimize interference from other surrounding
neighbors, all the experiments are carried out between 2 am and 4 am, and a Windows application, Wi-Fi
Ch analyzer [34], is installed to check Wi-Fi channel utilization.

5.2.2 Data Collection


• Measurement points

Table 5.7: APs Configurations

Parameter       | Values
Frequency Band  | 2.4 GHz
Area Size       | 7 m x 10 m
Target AP       | IPERF (TCP) saturates the link between target AP and its client
Interference AP | IPERF (UDP) [3 Mbps, 45 Mbps], step = 3 Mbps
Channels        | channel(target) = 1, channel(interference) = [1, 2, 3]
Channel width   | 20 MHz on both target AP and interference AP

The experiment follows the same process described in subsection 5.1.2 for
Experiment 1, but with the AP configurations shown in Table 5.7.
• Measurement synchronization
Unlike Experiment 1 in Section 5.1, all the measurements in Experiment 2 are passively
collected from both the target AP and the neighbor APs. Therefore, the measurement points in the
dataset need to be synchronized. This is done by using the Linux date command [35] on all the APs
to set the same clock time.
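
To give a feeling for the size of the configuration space in Table 5.7, the sketch below enumerates the interference settings. It assumes that every listed interference channel is combined with every traffic rate, which the thesis does not state explicitly; the variable names are hypothetical.

```python
from itertools import product

# Interference traffic rates from Table 5.7: 3 to 45 Mbps in 3 Mbps steps.
interference_rates_mbps = list(range(3, 46, 3))

# Channels used by the interference APs; the target AP stays on channel 1.
interference_channels = [1, 2, 3]

# Every (channel, rate) pair, assuming all combinations are exercised.
configurations = list(product(interference_channels, interference_rates_mbps))

print(len(interference_rates_mbps))
print(len(configurations))
```

Under this assumption there are 15 rate levels and 45 channel and rate combinations per interference AP.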

5.2.3 Experiment Results


• Features
According to the experiment deployment in Figure 5.7, the saturated throughput of the
wireless link Tl between the target AP and the iPhone 6 is predicted given a set Nl of 2 neighboring
links with arbitrary traffic load and configuration.
Compared to the features of Experiment 1 in subsection 5.1.3, the input features selected to predict
Wi-Fi throughput additionally contain the neighbor AP traffic load:
– The power received by the target AP from the client iPhone 6. There is one such power feature.
– The power received by the target AP from neighbor APs. There are two such power features.
– Target AP noise floor. There is one such noise feature.
– Neighbor AP noise floor. There are two such noise features.
– Neighbor AP traffic volume. There are two such traffic features.
These input features for iPhone 6 throughput prediction are shown in Figure 5.8.

Figure 5.8: SVM Feature
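
The feature list of Experiment 2 corresponds to a vector of 1 + 2 + 1 + 2 + 2 = 8 values per measurement point. A possible way to assemble such a vector is sketched below; the dictionary keys, the ordering and the sample values are hypothetical and only illustrate the structure.

```python
def build_feature_vector(measurement):
    """Flatten one synchronized measurement into an ordered feature list."""
    return (
        [measurement["client_rssi"]]
        + list(measurement["neighbor_rssi"])
        + [measurement["target_noise"]]
        + list(measurement["neighbor_noise"])
        + list(measurement["neighbor_traffic"])
    )


# Hypothetical measurement with two neighbor APs (values are made up).
measurement = {
    "client_rssi": -52,            # dBm, iPhone 6 seen by the target AP
    "neighbor_rssi": [-60, -71],   # dBm, one value per neighbor AP
    "target_noise": -92,           # dBm
    "neighbor_noise": [-90, -88],  # dBm
    "neighbor_traffic": [12, 30],  # Mbps offered by each neighbor AP
}

features = build_feature_vector(measurement)
print(len(features), features)
```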

• Performance Evaluation
The whole process of building and evaluating the prediction model is the same as in the performance
evaluation of Experiment 1 in subsection 5.1.3. There are 721 measurement points in total in this
experiment, with a data structure similar to the Experiment 1 dataset shown in Figure 5.4, except
for the additional input feature, the neighbor traffic load.
The dataset is divided into two parts: 30% of the dataset is randomly selected as the testing set,
and the other 70% is used as the training set for building the prediction model.
The prediction accuracy on the training set with three different kernel functions (linear,
Gaussian, polynomial) is shown in Table 5.8; the kernel functions and the
accuracy score are the same as described for Experiment 1 in Section 5.1.

Table 5.8: SVC kernel Function

Kernel Function       | Accuracy Score | Run time (seconds) | C    | gamma
Gaussian kernel       | 0.82           | 3                  | 10   | 0.01
Linear kernel         | 0.79           | 25                 | 100  | N/A
Polynomial (degree 3) | 0.74           | 8                  | 0.01 | N/A

From Table 5.8, the Gaussian kernel has the highest accuracy and the shortest running time, and
is therefore selected for estimation modeling. The testing set is then used to evaluate the Gaussian
prediction model, and the result is shown in Table 5.9:

Table 5.9: Accuracy Score for Testing set

Kernel Function | Training Data Accuracy | Testing Data Accuracy
Gaussian kernel | 0.82                   | 0.80

• Performance Accuracy with different measurement size

Figure 5.9: Performance for different dataset

Figure 5.9 shows the accuracy as a function of data set size. The
prediction accuracy grows as the number of measurement points increases. Moreover, using
1304 measurements is enough to obtain a good prediction accuracy of 0.87.

• Performance Accuracy with different features



Figure 5.10: Performance for different features

The impact of the different input features used in Experiment 2 (control with neighbor traffic)
is shown in Figure 5.10. The blue bars represent the accuracy score for different feature
combinations. With the iPhone 6 RSSI as the only feature, the prediction accuracy is 0.39, and it is
only 0.47 with the iPhone 6 noise level as an additional feature. However, with the combination
of iPhone 6 RSSI, iPhone 6 noise level and neighbor traffic load, the accuracy reaches
0.86, a large improvement over the two input features (iPhone 6 RSSI
and iPhone 6 noise level). Moreover, the accuracy only increases to 0.88 when another two
features (neighbor RSSI and neighbor noise level) are added, which means that the RSSI and noise
level of the two neighbors in the experimental deployment of Figure 5.7 do not have much impact on the
prediction accuracy.

Therefore, in an environment where the nearby neighbors' traffic is controlled, the neighbor traffic
load feature plays the most important role in predicting a Wi-Fi device's throughput.
Chapter 6

Conclusion and Future Work

6.1 Conclusion
This master thesis aims to investigate Wi-Fi performance in the residential environment. It involves
the development of a visualization tool for Wi-Fi parameters and Wi-Fi throughput prediction
modeling.
• A Wi-Fi parameter visualization tool has been implemented in Python. Time-dependent graphs
of different Wi-Fi parameters are shown in a clear way. All these graphs are plotted from
real users' traffic data and can reveal whether there is Wi-Fi performance degradation
in the residential wireless environment.
• A machine learning based classification model has been proposed, which allows service providers
to predict Wi-Fi saturated throughput from passive measurements on the access point. In this
classification model, Wi-Fi throughput is considered a function of several input features, namely
RSSI, noise level and contention traffic. The model is built and evaluated in a testbed with
different network traffic and configurations. The result shows that this prediction method can
reach an accuracy of up to 0.88 when the traffic load of nearby interfering APs is known.

6.2 Future Work

Limitation
The features selected for the Wi-Fi throughput prediction model are not only measurable but also
easily accessible from a Wi-Fi vendor's router (e.g. RSSI, noise level and transmit data rate). This
choice is a trade-off between complexity and estimation accuracy. As a consequence, the measurements
do not capture more detailed environment characteristics such as packet size, 802.11n frame
aggregation size or signal reflection (multi-path fading), which might improve prediction accuracy.

To improve this project, the following future work is proposed:
• Add a filter function to the visualization tool that categorizes STAs by RSSI level, traffic
volume or data rate. This feature would give service providers an at-a-glance overview of
performance degradation when many STAs connect to the same access point.


• Improve and extend the throughput prediction model by introducing local Wi-Fi contention in the
experiment testbed. In this thesis, only external Wi-Fi contention is considered. In addition, more
input features that affect Wi-Fi performance could be included to improve the accuracy of the
prediction model.
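
The STA filter proposed in the first item could look roughly like the sketch below. The RSSI thresholds, the level names and the function names are hypothetical and are not part of the implemented tool; a real filter would take its input from the measurement data collected in the access point.

```python
from collections import defaultdict

# Hypothetical thresholds in dBm, checked from strongest to weakest.
RSSI_LEVELS = [(-60.0, "strong"), (-75.0, "fair")]


def rssi_level(rssi_dbm):
    """Map an RSSI value in dBm to a coarse level name."""
    for threshold, label in RSSI_LEVELS:
        if rssi_dbm >= threshold:
            return label
    return "weak"


def group_stations_by_rssi(stations):
    """Group STA identifiers by RSSI level.

    `stations` maps an STA identifier (e.g. a MAC address) to its RSSI in dBm.
    """
    groups = defaultdict(list)
    for sta_id, rssi_dbm in stations.items():
        groups[rssi_level(rssi_dbm)].append(sta_id)
    return dict(groups)


# Example with made-up STAs connected to one access point.
example = {"sta-a": -52.0, "sta-b": -68.5, "sta-c": -81.0, "sta-d": -74.0}
print(group_stations_by_rssi(example))
```

The same grouping pattern would apply to traffic volume or data rate by swapping the threshold table and the measured quantity.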
TRITA TRITA-ICT-EX-2017:63

www.kth.se
