
SAI Intelligent Systems Conference 2015

November 10-11, 2015 | London, UK

Applying Regression Models to Calculate the Q Factor of Multiplexed Video Signal based on Optisystem
Ronit Rudra
School of Electronics Engineering
VIT University
Vellore, India
ronit.rudra2012@vit.ac.in

Ankan Biswas
School of Electrical Engineering
VIT University
Vellore, India
post2ankan_95@yahoo.com

Praneet Dutta
School of Electronics Engineering
VIT University
Vellore, India
praneet.dutta2012@vit.ac.in

Prof. Aarthi G
School of Electronics Engineering
VIT University
Vellore, India
aarthi.g@vit.ac.in

Abstract: The objective is to analyze the input parameters of a Dense Wavelength Division Multiplexing system and accurately predict the output parameters using machine learning techniques, and to model their dependence on input parameters such as Frequency, Frequency Spacing, Bit Rate and Fiber length. The training data is mined from Optisystem 13.0 software and the machine learning algorithms are implemented using R and MATLAB. The algorithms used are multivariable regression models and neural networks. The accuracy of the two methods is compared. The predicted values correlate closely with the input parameters, and the cost function errors have been minimized using these techniques.
Keywords: Regression; Dense Wavelength Multiplexing System; Levenberg-Marquardt Back-propagation algorithm; Residuals; Q-Factor; Applied Machine Learning; Neural Networks

I. INTRODUCTION

Dense Wavelength Division Multiplexing (DWDM) is a technology that allows multiplexing of multiple optical carrier signals on a single optical fiber by using different wavelengths for transmission of various information sources. It forms a more efficient means of transmission than Time Division Multiplexing. The least amount of attenuation is achieved by transmitting at a wavelength of 1550 nm.
It also allows for the expansion of the existing capacity without laying additional fibers in optic cables. The capacity of the existing system is increased by using multiplexers and demultiplexers at the ends of the system.
Erbium Doped Fiber Amplifiers (EDFA) are used to successfully transmit optical signals over long distances. Erbium, a rare-earth element, emits light at a wavelength of about 1.54 µm when excited, which lies in the wavelength region where signal attenuation is lowest. [7-8]

Our aim is to calculate the quality of the signal transmitted by the optical fiber across the channel. The output parameters used are Quality factor, Bit Error Rate, Height and the Threshold value. The input video signal is fed in through the transmitter side, and up to 64 signals are multiplexed to one output. The apparatus for the experiment is shown in Figure 1. This works in the third optical window and the attenuation across the length of the channel is 0.2 dB/km. Once the signal reaches the receiver, demultiplexing is performed by the DWDM demultiplexer running at the same frequency as the multiplexer. The signal is further passed through a photo-detector, a low pass filter and a signal regenerator to recover the original signal. The BER analyzer provides the variables to be measured.
The data on the required input and output parameters are gathered, and analysis and modelling are done in the subsequent sections.
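As a rough illustration of how two of the quantities above relate, the sketch below uses the standard Gaussian-noise approximation BER = 0.5 erfc(Q / sqrt(2)), which is not stated in this paper and is assumed here, together with the 0.2 dB/km attenuation figure quoted above. The function names and the 50 km span length are hypothetical.

```python
import math

def ber_from_q(q: float) -> float:
    """Gaussian-noise estimate of the bit error rate for a given Q-factor (assumed model)."""
    return 0.5 * math.erfc(q / math.sqrt(2))

def span_loss_db(length_km: float, alpha_db_per_km: float = 0.2) -> float:
    """Total fiber attenuation in dB for a span of the given length."""
    return alpha_db_per_km * length_km

# Illustrative values only; none of these come from the simulations in the paper.
for q in (3.0, 6.0, 7.0):
    print(f"Q = {q:.1f} -> BER ~ {ber_from_q(q):.2e}")
print(f"Loss over 50 km: {span_loss_db(50):.1f} dB")
```

Under this approximation a higher Q-factor corresponds to a lower BER, which is why the two outputs are expected to move in opposite directions across simulations.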
II. DATA MINING AND ANALYSIS

The prerequisite to the application of any machine learning algorithm is data collection and formatting. Without suitably formatted data, the algorithm cannot be used to its fullest extent and may provide spurious and undesirable results or even outright reject the data being provided.
Data gathering and formatting is a multistage process and is elucidated as follows:
A. Data Mining
The first step in this process is searching for a suitable source from which data can be efficiently acquired. For the purpose of this paper the data mining source is Optisystem 13.0 running on Windows OS.
The DWDM system to be analyzed was constructed in the aforementioned software.

978-1-4673-7606-8/15/$31.00 2015 IEEE


Fig. 1. DWDM System under study [1]

The system shown was simulated repeatedly for differing input parameters. The output parameters for each simulation were saved as CSV (Comma Separated Values) text files for subsequent parsing and analysis.
The input parameters were Number of Channels, Frequency, Frequency Spacing, Power Level and Bitrate. Variation in any of the input parameters creates a unique class. For example: 32 channel, 190 THz, 100 GHz, 5 dB, 2.5 GHz is a different class than 32 channel, 195 THz, 2000 GHz, 5 dB, 2.5 GHz. (Input parameters in order as mentioned before)
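The class definition above can be sketched as follows: each distinct combination of the five input parameters is treated as a key, and a new label is issued the first time a key is seen. The type name, field names and label numbering are assumptions made for illustration; the two example combinations are the ones quoted in the text.

```python
from typing import NamedTuple

class InputClass(NamedTuple):
    channels: int
    frequency_thz: float
    spacing_ghz: float
    power_db: float
    bitrate: float  # unit as quoted in the text

def label_classes(combos):
    """Assign consecutive integer labels to distinct input combinations."""
    labels = {}
    for combo in combos:
        if combo not in labels:
            labels[combo] = len(labels) + 1
    return labels

a = InputClass(32, 190.0, 100.0, 5.0, 2.5)
b = InputClass(32, 195.0, 2000.0, 5.0, 2.5)
labels = label_classes([a, b, a])  # the repeated combination keeps its first label
print(labels[a], labels[b])
```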

Fig. 2. Graph of Q-Factor

The output parameters obtained from the BER Analyzer were Q-Factor, Minimum BER, Threshold and Maximum Eye Height. Thus, for the purpose of this paper, the input parameter combinations amount to 10 classes and each class has 10 simulations, with each simulation having 4 data files corresponding to the aforementioned output parameters. The number of classes, simulations as well as output parameters can be changed to suit specific needs. Figures 2 to 5 provide graphs of the output variables.
B. Data Parsing and Formatting
After the required data has been collected successfully, the subsequent stage is to parse and convert the data into a suitable form which can be accepted by the software or IDE running the classifier. In this case the classifier is being run on MATLAB, which can accept data files in .csv or .xlsx formats.
Therefore the objective of this stage is to read the collected data, extract the required information and format it for application.
The programming language R was chosen as a suitable candidate to perform this step as it is an efficient tool for data analysis. R version 3.1.3 was run on RStudio IDE for this purpose.

Fig. 3. Graph of Minimum BER


Fig. 6. Example of data obtained from simulation (text file)

The format of the directory containing the data is as follows:
Parent Directory containing sub-folders identified by input parameters.
Each sub-folder contains N folders named 1 to N, where N is the number of simulations run for that input parameter. For our purpose N was taken to be 10.
Each simulation sub-folder contains four text files corresponding to the data of the four output parameters.

Fig. 4. Graph of Threshold

Each text file has data points of the graph corresponding to an output parameter, i.e. the abscissa and ordinate values.
Thus the R function traverses the whole directory containing the data. The step-by-step procedure is as follows:
The directory of the required input parameter and its corresponding class label are passed as arguments to the function.
The function traverses each simulation folder in order.
For each simulation folder, it reads the text files and converts them into data frames, then merges them into a single data frame.
The data frame has columns corresponding to each of the output data.
The required values are extracted from the data frame.
Fig. 5. Graph of Eye Diagram Height

In brief, a function was written which would automatically read the CSV text files containing the data, create data frames, extract useful values, and add the correct class labels to the data. Then all the relevant data is stored in a single CSV text file.
Fig. 6 shows a text file of data points of BER versus Simulation time. The minimum BER, being one of the parameters, has to be extracted from this data. As shown, the data is quite cumbersome to analyze and there are a hundred such files.

The class label is appended to the last column of the data frame (Fig. 8.)
Another data frame is created which relates the class labels to the input parameters (Fig. 7.)
The two data frames comprising the input and output are merged together and the class labels are discarded (Fig. 9.)
The simulation data is exported and saved to a CSV text file.
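The traversal and merging procedure described in this section can be sketched as below. This is a Python illustration rather than the authors' R function. The file names, column names and reduction rules (minimum for BER, maximum for Q-factor) are assumptions, and only two of the four output files are used to keep the example short. A small synthetic directory tree is created so the sketch runs on its own.

```python
import csv
import os
import tempfile

def read_curve(path):
    """Read (abscissa, ordinate) pairs from a comma-separated text file."""
    with open(path, newline="") as handle:
        return [(float(x), float(y)) for x, y in csv.reader(handle)]

def parse_class_dir(class_dir, class_label, reducers):
    """Return one row per simulation folder: reduced outputs plus the class label."""
    rows = []
    sim_names = sorted((d for d in os.listdir(class_dir) if d.isdigit()), key=int)
    for sim in sim_names:
        row = {}
        for filename, (column, reduce_fn) in reducers.items():
            curve = read_curve(os.path.join(class_dir, sim, filename))
            row[column] = reduce_fn(y for _, y in curve)
        row["Class"] = class_label
        rows.append(row)
    return rows

def merge_inputs(rows, class_table):
    """Join output rows with their input parameters and drop the class label."""
    merged = []
    for row in rows:
        outputs = {k: v for k, v in row.items() if k != "Class"}
        merged.append({**class_table[row["Class"]], **outputs})
    return merged

def write_csv(rows, path):
    """Export the merged rows to a single CSV text file."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)

# Hypothetical file names and reduction rules (assumptions, not from the paper).
REDUCERS = {"min_ber.txt": ("Min.BER", min), "q_factor.txt": ("Q.Factor", max)}

def demo():
    """Build a tiny synthetic tree with two simulations, parse it and read the CSV back."""
    with tempfile.TemporaryDirectory() as root:
        class_dir = os.path.join(root, "class_1")
        for sim in (1, 2):
            sim_dir = os.path.join(class_dir, str(sim))
            os.makedirs(sim_dir)
            with open(os.path.join(sim_dir, "min_ber.txt"), "w") as f:
                f.write(f"0.0,1e-3\n0.5,{sim}e-9\n1.0,1e-4\n")
            with open(os.path.join(sim_dir, "q_factor.txt"), "w") as f:
                f.write(f"0.0,2.0\n0.5,{5 + sim}.0\n1.0,3.0\n")
        rows = parse_class_dir(class_dir, 1, REDUCERS)
        class_table = {1: {"Spacing": 100.0, "Bitrate": 2.5}}  # illustrative inputs
        out_path = os.path.join(root, "final.csv")
        write_csv(merge_inputs(rows, class_table), out_path)
        with open(out_path, newline="") as handle:
            return list(csv.DictReader(handle))

result = demo()
print(result)
```

Keeping the class label only as a join key mirrors the text: it links each simulation to its input parameters and is discarded once the two tables are merged.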


The data obtained in the previous step has the following properties:
Number of instances is 100.
Number of predictor variables, given by the first five columns, amounts to 5.

Fig. 7. Class Labels defined for input parameter combinations (data frame in R)

The last four columns are the target variables for which Regression models are to be designed.
Figures 11 to 14 show the plots of all the output parameters versus the simulation index. The regions bounded by the dashed demarcations indicate the output classes as shown in Fig. 7. After brief manual analysis of the data through plots, summaries, quantile estimations etc., the next phase of the design is tackled.

Fig. 8. Example of output data frame parsed in R

Fig. 11. Simulation index versus Eye Height

Fig. 9. Final Data Frame

Fig. 10. CSV text file of data in Fig. 9

It is clear from the figures that R is indeed a suitable tool for parsing and analysis of data as it is intuitive and manual analysis of data is easier (Fig. 9.).

Fig. 12. Simulation index versus Minimum BER

The final CSV file (Fig. 10.) will be fed as training data to the algorithm for generation of a linear model.
C. Exploratory Analysis
After successfully extracting the required information from the collected data, it makes sense to visualize the data to look for patterns. Exploratory analysis is useful as one can quickly analyze the obtained data using a large number of plotting methodologies such as scatter plots, histograms, bar plots, line plots, contours etc.
R has strong graphic capabilities and is a suitable tool to visualize data.

Fig. 13. Simulation index versus Q Factor


For a single independent variable, the mean of the response is modelled as $E(Y \mid x) = \beta_0 + \beta_1 x$, where Y denotes the response variable, x denotes a value of the independent variable, and the $\beta_i$ values denote the model parameters. The quantity $E(Y \mid x)$ is called the conditional mean or the expected value of Y given the value of x. Many distribution functions have been proposed for use in the analysis of a dichotomous response variable (Hosmer and Lemeshow, 1989; Agresti, 1984; Feinberg, 1980).
Logistic regression makes use of the Sigmoid Function. Unlike the Heaviside function, which steps instantaneously from 0 to 1 (which makes it difficult to deal with), the sigmoid changes gradually. Mathematically it is given by:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
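To make the contrast concrete, the following minimal sketch evaluates the Heaviside step and the standard logistic sigmoid, 1 / (1 + exp(-z)), at a few points. The function names and sample points are chosen for illustration, and the branch on the sign of z is only there to avoid overflow for large negative inputs.

```python
import math

def heaviside(z: float) -> float:
    """Step function: jumps from 0 to 1 at z = 0."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

for z in (-4, -1, 0, 1, 4):
    print(z, heaviside(z), round(sigmoid(z), 4))
```

The sigmoid passes through 0.5 at z = 0 and approaches 0 and 1 smoothly, so unlike the step function it has a usable derivative everywhere.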

Fig. 14. Simulation index versus Threshold

III. REGRESSION MODELS

A. Theory
Regression methods have become an integral component
of any data analysis concerned with the relationship between a
response variable and one or more explanatory variables. The
most common regression method is conventional regression
analysis (CRA), either linear or nonlinear, when the response
variable is continuous (IID or independent and identically
distributed). However, when the outcome (the response
variable) is discrete, CRA is not appropriate. Among several
reasons, the following two are the most significant:
1) The response variable in CRA must be continuous, and
2) The response variable in CRA can take non-negative
values.
These two primary assumptions are not satisfied when the
response variable is categorical.
TABLE I. STEPS FOR MODEL REALIZATION
Collection: Any method will suffice.
Preparation of Data: Numeric values are needed for a distance calculation. A structured data format is the best.
Analyze: Any method.
Train: Majority of time complexity is spent on this.
Test: This is relatively easy once the training step is done.
Use: The application applies regression calculation on input data and determines which class the input data should belong to. The application then takes some action on the calculated class.

It is important to understand that the goal of an analysis using logistic regression is the same as that of any model-building technique used in statistics: to find the best fit and the most parsimonious one [2]. What distinguishes a logistic regression model from a linear regression model is the response variable. In the logistic regression model, the response variable is binary or dichotomous.


For the given data we design regression models using Multiple Regression and Multivariate Multiple Regression.

B. Multiple Regression
Linear Regression creates a model of the outcome variable on the basis of a single predictor variable.
The Linear Regression model with predictor X and outcome Y is given by:
$$Y = B + C X \qquad (1)$$
where B is the bias and C is the weight.
Equation (1) is a straight line with B as the intercept and C as the slope. Hence Linear Regression determines a straight line for modelling the relation between Y and X.
Now, Multiple Regression means that the outcome variable Y is modelled on multiple predictor variables. This creates a hyperplane in a space whose dimension equals the number of predictor variables plus one for the outcome itself. One disadvantage is that the model cannot be visualized if the number of predictor variables is more than two, since visualization is impossible beyond three dimensions.
Therefore we have the model as follows:
$$Y = B + \sum_{i=1}^{N} C_i X_i \qquad (2)$$
where
B = Bias
C_i = Weight of the i-th predictor
N = Total number of predictors
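A minimal sketch of how the bias and weights of the multiple regression model could be estimated is given below. It assumes an ordinary least squares fit solved through the normal equations with plain Gaussian elimination. This is chosen for self-containment and is not necessarily the numerical method used by the software in this paper. The data are synthetic, generated from a known plane so the recovered coefficients can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: a predictor may be constant or collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_multiple_regression(xs, ys):
    """Least-squares estimate of [B, C1, ..., CN] via the normal equations."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 carries the bias B
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, row):
    """Evaluate Y = B + sum_i C_i * X_i for one row of predictors."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))

# Synthetic data generated from Y = 1 + 2*X1 - 3*X2 (illustrative only).
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
ys = [1 + 2 * x1 - 3 * x2 for x1, x2 in xs]
coeffs = fit_multiple_regression(xs, ys)
print([round(c, 6) for c in coeffs], round(predict(coeffs, (2, 2)), 6))
```

Note that a predictor with the same value in every row makes the system singular, because its column cannot be distinguished from the bias column.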

The difference between logistic and linear regression is reflected both in the choice of a parametric model and in the assumptions. Once this difference is accounted for, the methods employed in an analysis using logistic regression follow the same general principles used in linear regression analysis.

In any regression analysis the key quantity is the mean value of the response variable given the values of the independent variables.

Therefore, according to the data collected we have:
Predictors: Channels, Frequency, Spacing, Power, Bitrate
Outcome: Maximum Height, Minimum BER, Q Factor, Threshold

Another important parameter is the Residual or Cost. This quantifies how far the predicted model deviates from the actual data. A lower residual is of course desirable, as it means the linear model is more accurate. Figure 16 shows a table of residuals for each instance of each model. The objective is to minimize the residuals.
Now, the linear models are defined with the predictors and outcome variables. RStudio was used to model the data. The variables were passed to a function in the form of objects of the Formula class, each of which relates one output column to the input columns.

The function lm(formula, dataframe) creates a linear model based on the formula and the data frame passed to it. Hence, we have 4 linear models from the aforementioned formulae. lm() works on the dataset provided to it and outputs an object of class lm (internally a list) which contains all the data pertaining to the model such as coefficients, residuals, deviances, quantiles etc. Fig. 15 shows the bias (intercept) and weights (coefficients) of each linear model.

Figure 16 shows that the models for Height and Threshold have very low residual values while Min.BER and Q.Factor have fairly high residual values. This may lead someone to believe that the first two of the models mentioned are accurate while the latter two are inaccurate. This might not always be the case and therefore we need to examine another parameter, namely, Mean Square Error.
The function predict(linear model, dataframe) predicts the output based on the linear model and data frame passed to it. The output is a vector containing the predicted data corresponding to each row of the predictor data.

The next step requires calculation of the Mean Square Error, which is taken to be the cost function to determine the accuracy of the model. MSE is given by:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( X_{\mathrm{predicted},i} - X_{\mathrm{observed},i} \right)^2 \qquad (3)$$
where
Xpredicted = Predicted Output
Xobserved = Observed Output
N = Total number of observations
The squared error between the observed and predicted output was calculated for all four models and their graphs were plotted. Figures 17 to 20 show graphs of MSE versus the instance.
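The error calculation described here can be sketched as follows: the squared error is computed per instance (the quantity plotted against the instance index) and then averaged to give the mean square error. The observed and predicted values below are made up for illustration and do not come from the simulations.

```python
def squared_errors(predicted, observed):
    """Per-instance squared error between predicted and observed outputs."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return [(p - o) ** 2 for p, o in zip(predicted, observed)]

def mean_square_error(predicted, observed):
    """Average of the per-instance squared errors."""
    errors = squared_errors(predicted, observed)
    return sum(errors) / len(errors)

# Hypothetical values, for illustration only.
observed = [6.1, 5.8, 6.4, 6.0]
predicted = [6.0, 6.0, 6.0, 6.0]
per_instance = squared_errors(predicted, observed)
mse = mean_square_error(predicted, observed)
print([round(e, 4) for e in per_instance], round(mse, 4))
```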

Fig. 15. Coefficients and Intercepts of Linear Models

The variables Channels, Frequency and Power have no coefficients because they have a constant value in the given dataset, so their effect cannot be separated from the intercept. They have been included in the model to prevent incorrect predictions due to confounding variables, i.e. variables which indirectly affect the relationship between input and output. These values can be used in equation (2) to develop a prediction model.

Fig. 16. Residuals of each instance of Linear Model


Fig. 20. MSE versus Instance for Threshold


Fig. 17. MSE versus Instance for Height

In the above figure the blue curve signifies the LOWESS (Locally Weighted Scatterplot Smoothing) curve for the data, and the 95% confidence interval for the curve is given by the dark grey region. The LOWESS curve gives us the general trend in the data.
As mentioned earlier, the data set on which the operations are being performed has 100 simulations with 10 class labels, and each class label occupies 10 instances. Hence, we can divide the x axis of the plots into 10 regions of 10 instances each. On closer inspection it was observed that, for example, in instances 1 through 10, the MSE value was higher at first and then went down. This is visible in all the other regions as well. Thus, the MSE regresses towards a minimum mean value in all the plots.
The confidence intervals in Figures 19 and 20 occupy a broader region, indicating that prediction is not as accurate. Another concern is that within a class region, the MSE first decreases, reaches a minimum and then increases, or vice versa.

Fig. 18. MSE versus Instance for BER

IV. MULTIVARIATE MULTIPLE REGRESSION

Multivariate Multiple Regression, as the name suggests, is a technique for estimating multiple outcome variables which depend on multiple predictor variables. The outcome variables may or may not be independent of each other. This technique is similar to Multiple Regression with the sole addition of multiple outputs.
For estimating the model using this technique, we utilized Neural Networks and more specifically the Levenberg-Marquardt Back-propagation Algorithm.
A. Theory
The most fundamental processing unit of neural systems is a neuron. It receives various signals from the inputs, combines them and performs a non-linear operation to produce an output.

Fig. 19. MSE versus Instance for Q Factor

The Artificial Neural Network takes inspiration from this model. It consists of an input layer, an output layer and hidden layers. The number of hidden layers taken depends on the specific application and type of neural network being built.


The number of rows in the input layer is equal to the number of inputs. On moving from one layer to another, weights are assigned to each transition. An additional Bias Unit is added to the preceding layer while moving forward.

The input and output data is fed to the network.

The non-linear transfer functions which can be used are the Hyperbolic Tangent, Sine and Sigmoid functions (Fig. 21). A linear function can also be used but does not match the accuracy of a non-linear one. For our algorithm we make use of the Sigmoid Function, which is of the form:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

The data is divided into 3 parts: Training, Validation and Testing.
The data samples are randomized for this part; 70% is assigned to Training and the remaining 30% is equally divided between Validation and Testing.

The Neural Network is trained using the given data.
After training, validation and testing, the network's accuracy is gauged by its MSE and Regression Coefficient.
Retraining can be done if the results are not satisfactory by changing data, number of hidden layer neurons or by changing the percentage assigned to the three phases of training.
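The randomized division into the three phases can be sketched as below, assuming a 70/15/15 split of 100 instances. The function name and the fixed seed are assumptions added so the example is reproducible. MATLAB's own data division routine is not reproduced here.

```python
import random

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle instance indices and cut them into training, validation and test sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train)
    n_val = round(n * val)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Every instance lands in exactly one of the three sets, so the validation and test errors are measured on data the network was not trained on.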
We used 10, 100 and 1000 hidden layer neurons for the design, and Figures 22 to 24 show the MSE values for the three designs as the training progresses.
Figure 26 shows the Error Histogram for the network with 1000 hidden neurons. The result is satisfactory as the maximum frequency of instances lies in the minimum error region denoted by the yellow vertical line.

Fig. 21. Sigmoid Function plotted using R

B. Back-propagation algorithm
This algorithm was originally introduced in the 1970s. It is a very efficient technique for calculating the output variable compared to previous techniques. The initial weights are configured and random weights are associated with each transition.

Figure 27 shows the regression line for all three phases of training and an overall trend of the 1000 hidden layer network. The lines fit to 99.99% and hence the design is quite accurate in predicting new data. The blue fill corresponds to training, the green fill to validation and the red to testing.

Using feedforward propagation, the network progresses further and predicts an output based on these weights. In our training example (70% of the data set), we have a given output and we compare that with our predicted output. Now, the gradient of each step is computed starting from the error function and it propagates backwards.
This goes on till the first hidden layer. The weights are now re-assigned. This process continues for all the remaining training examples until it reaches its most optimal solution. 15% of the data set is reserved for cross-validation; as shown in the Figure, it fits with an accuracy of 99%. Finally, the network is ready to predict the output for any new value of the input parameters.
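The feedforward and backward passes described above can be sketched with a very small network. The example below uses plain gradient descent with a fixed learning rate. It does not implement the Levenberg-Marquardt variant used in this work. The layer sizes, learning rate, epoch count and the synthetic two-input, two-output data are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNetwork:
    """One sigmoid hidden layer followed by a linear output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Random initial weights on every transition; biases start at zero.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        """Feedforward pass: returns hidden activations and network outputs."""
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, output

    def train_step(self, x, target, rate=0.1):
        """One forward pass and one backward pass for a single training example."""
        hidden, output = self.forward(x)
        delta_out = [o - t for o, t in zip(output, target)]
        # Propagate the error back to the hidden layer before w2 is modified.
        delta_hidden = [h * (1.0 - h) * sum(self.w2[k][j] * delta_out[k]
                                            for k in range(len(delta_out)))
                        for j, h in enumerate(hidden)]
        for k, d in enumerate(delta_out):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= rate * d * h
            self.b2[k] -= rate * d
        for j, d in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= rate * d * xi
            self.b1[j] -= rate * d

def mean_error(net, data):
    """Average squared error over all examples, without updating weights."""
    total = 0.0
    for x, target in data:
        _, output = net.forward(x)
        total += sum((o - t) ** 2 for o, t in zip(output, target))
    return total / len(data)

# Synthetic data: outputs are (x1 + x2, 1 - x1); illustrative only.
data = [((0.0, 0.0), (0.0, 1.0)), ((0.0, 1.0), (1.0, 1.0)),
        ((1.0, 0.0), (1.0, 0.0)), ((1.0, 1.0), (2.0, 0.0))]
net = TinyNetwork(n_in=2, n_hidden=4, n_out=2)
error_before = mean_error(net, data)
for _ in range(2000):
    for x, target in data:
        net.train_step(x, target)
error_after = mean_error(net, data)
print(error_before, error_after)
```

The hidden-layer error terms are computed before the output weights are updated, so that the backward pass uses the same weights as the forward pass that produced the prediction.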

Fig. 22. MSE for 1000 Hidden Layers

C. Procedure
After finalizing the design parameters, the neural network (Fig. 25) is designed on MATLAB.
The design procedure is as follows:
The final data which was parsed is split into two parts containing the input and output variables.
The number of neurons in the hidden layer as well as the transfer function (sigmoid in this case) is decided.

Fig. 23. MSE for 100 Hidden Layers


V. CONCLUSION

From our simulations we have seen a co-variance between the input data set and the output. On comparing the efficiency of the perceptron model with that of neural networks, we see that the Neural Network Model converges at a faster rate than the regression model.

Fig. 24. MSE for 10 Hidden Layers

Thus the Multivariate Multiple Regression Model was a much better alternative than the Multiple Regression Model for the data we utilized. We were able to emulate only 5 inputs for the designed optical network, out of which only 2 were varied. The accuracy and efficiency may change depending on the number of input instances as well as on the number of attributes. It is necessary to vary all the inputs to obtain a more accurate model.
Thus, we conclude that given enough data spanning a wide parameter list of the simulated optical communication network, it is possible to create an independent prediction model which, after training, utilizes fewer resources and is faster at predicting outputs with a high degree of accuracy than the traditional simulation software, especially in the field where such licensed software is not available for quick analysis.

Fig. 25. Designed neural network in MATLAB

ACKNOWLEDGMENTS
We would like to thank VIT University, Department of Electronics, for providing us with the resources and lab facilities, especially licensed versions of MATLAB R2013b and Optisystem 13.0 for carrying out the project, and Professors Arulmozhivarman P. and Sankar Ganesh for their guidance in mining the data from the software.
REFERENCES
[1] S. Ilic, B. Jaksic, M. Petrovic, A. Markovic, V. Elcic, "Analysis of Video Signal Transmission Through DWDM Network Based on a Quality Check Algorithm", Vol. 3, No. 2, 2013, pp. 416-423
[2] M. T. Fatehi, M. Wilson, Optical Networking with WDM, McGraw-Hill, New York, 2001
[3] M. Stefanovic, D. Milic, "An approximation of filtered signal envelope with phase noise in coherent optical systems", Journal of Lightwave Technology, Vol. 19, No. 11, pp. 1685-1690, 2001
[4] I. Djordjevic, M. Stefanovic, "Performance of optical heterodyne PSK systems with Costas loop in multichannel environment for nonlinear second-order PLL model", Journal of Lightwave Technology, Vol. 17, No. 12, pp. 2470-2479, 1999
[5] R. Ramaswami, K. Sivarajan, Optical Networks: A Practical Perspective, 2nd ed., Morgan Kaufmann Publishers, San Francisco, 2002
[6] I. P. Kaminow, T. Li, A. Willner (Eds), Optical Fiber Telecommunications V, Elsevier/Academic Press, 2008
[7] G. Agrawal, Nonlinear Fiber Optics, 2nd Ed., Academic Press, 2001
[8] G. Agrawal, Fiber-Optic Communication Systems, 3rd Ed., Wiley, 2002
[9] E. G. Sauter, Nonlinear Optics, John Wiley & Sons, Inc., New York
[10] Peter Harrington, Machine Learning in Action, DreamTech Press, 2012
[11] http://neuralnetworksanddeeplearning.com/chap2.htm

Fig. 26. Error Histogram for 1000 Hidden Neurons

Fig. 27. Regression Plots for 1000 Hidden Neurons
