
SAI Intelligent Systems Conference 2015

November 10-11, 2015 | London, UK

Applying Regression Models to Calculate the Q Factor of Multiplexed Video Signal based on
Ronit Rudra
School of Electronics Engineering, VIT University, Vellore, India

Ankan Biswas
School of Electrical Engineering, VIT University, Vellore, India

Praneet Dutta
School of Electronics Engineering, VIT University, Vellore, India

Prof. Aarthi G
School of Electronics Engineering, VIT University, Vellore, India

Abstract—The objective is to analyze the input parameters of a Dense Wavelength Multiplexing System and accurately predict the output parameters using machine learning techniques, and to model their dependence on input parameters such as Frequency, Frequency Spacing, Bit Rate and Fiber Length. The training data will be mined from Optisystem 13.0 software and the machine learning algorithms will be implemented using R and MATLAB. The algorithms used are multivariable regression models and neural networks. The accuracy of the two methods is compared. The predicted values have a close correlation with the input parameters and cost-function errors have been minimized using these techniques.

Keywords—Regression; Dense Wavelength Multiplexing System; Levenberg-Marquardt Back-propagation algorithm; Residuals; Q-Factor; Applied Machine Learning; Neural Networks



Dense Wavelength Division Multiplexing (DWDM) is a technology that allows multiplexing of multiple optical carrier signals on a single optical fiber by using different wavelengths for the transmission of various information sources. It is a more efficient means of transmission than the Time Division Multiplexing technique. The least amount of attenuation is achieved by transmitting at a wavelength of 1550 nm.

It also allows for the expansion of existing capacity without laying additional fiber optic cables. By increasing the capacity of the existing system using multiplexers and demultiplexers at the ends of the system, the desired output can be achieved.

Erbium-doped fiber amplifiers (EDFA) are used to successfully transmit optical signals over long distances. Erbium, a rare-earth element, emits light at a wavelength of 1.54 µm when excited, which lies in the low-attenuation window of the fiber. [7-8]

Our aim is to calculate the quality of the signal transmitted by the optical fiber across the channel. The output parameters used are Quality Factor, Bit Error Rate, Height and the Threshold value. The input video signal is fed in through the transmitter side and up to 64 signals are multiplexed to one output. The given apparatus for the experiment is shown in Figure 1. This works in the third optical window and the attenuation across the length of the channel is 0.2 dB/km. Once it reaches the receiver, demultiplexing is performed by the DWDM demultiplexer running at the same frequency as the multiplexer. The signal is further passed through a photo-detector, low-pass filter and a signal regenerator to recover the original signal. The BER analyzer provides the variables to be measured.

The data on the required input and output parameters are gathered, and analysis and modelling are done in the subsequent sections.


The prerequisite to the application of any machine learning

algorithm is data collection and formatting. Without suitably
formatted data, the algorithm cannot be used to its fullest
extent and may provide spurious and undesirable results or
even outright reject the data being provided.
Data gathering and formatting is a multistage process and
is elucidated as follows:
A. Data Mining
The first step in this process is searching for a suitable
source from which data can be efficiently acquired. For the
purpose of this paper the data mining source is Optisystem
13.0 running on Windows OS.
The DWDM system to be analyzed was constructed in the
aforementioned IDE.

201 | P a g e
978-1-4673-7606-8/15/$31.00 2015 IEEE


Fig. 1. DWDM System under study [1]

The system shown was simulated repeatedly for differing input parameters. The output parameters for each simulation were saved as CSV (Comma Separated Values) text files for subsequent parsing and analysis.

The input parameters were Number of Channels, Frequency, Frequency Spacing, Power Level and Bitrate. Variation in any of the input parameters creates a unique class. For example: 32 channels, 190 THz, 100 GHz, 5 dB, 2.5 GHz is a different class than 32 channels, 195 THz, 200 GHz, 5 dB, 2.5 GHz (input parameters in the order mentioned before).

Fig. 2. Graph of Q-Factor

The output parameters obtained from the BER Analyzer were Q-Factor, Minimum BER, Threshold and Maximum Eye Height. Thus, for the purpose of this paper, the input parameter combinations amount to 10 classes and each class has 10 simulations, with each simulation having 4 sets of data files corresponding to the aforementioned output parameters. The number of classes, simulations as well as output parameters can be changed to suit specific needs. Figures 2 to 5 provide graphs of the output variables.
B. Data Parsing and Formatting
After the required data has been collected successfully, the subsequent stage is to parse and convert the data into a suitable form which can be accepted by the software or IDE running the classifier. In this case the classifier is being run on MATLAB, which can accept data files in .csv or .xlsx format. Therefore the objective of this stage is to read the collected data, extract the required information and format it for analysis. The programming language R was chosen as a suitable candidate to perform this step as it is an efficient tool for data analysis. R version 3.1.3 was run on the RStudio IDE for this purpose.

Fig. 3. Graph of Minimum BER


Fig. 6. Example of data obtained from simulation (text file)

The format of the directory containing the data is as follows:

- Parent Directory containing sub-folders identified by input parameters.
- Each sub-folder contains N folders named 1 to N, where N is the number of simulations run for that input parameter. For our purpose N was taken to be 10.
- Each simulation sub-folder contains four text files corresponding to the data of the four output parameters.

Fig. 4. Graph of Threshold

- Each text file has data points of the graph corresponding to an output parameter, i.e. the abscissa and ordinate values.
Thus the R function traverses the whole directory containing the data. The step-by-step procedure is as follows:

- The directory of the required input parameter and its corresponding class label are passed as arguments to the function.
- The function traverses each simulation folder in order.
- For each simulation folder, it reads the text files and converts them into data frames, then merges them into a single data frame.
- The data frame has columns corresponding to each of the output data.
- The required values are extracted from the data frame.
Fig. 5. Graph of Eye Diagram Height

In brief, a function was written which would automatically read the CSV text files containing the data, create data frames, extract useful values, and add the correct class labels to the data. Then all the relevant data is stored in a single CSV text file.

Fig. 6 shows a text file of data points of BER versus simulation time. The minimum BER, being one of the parameters, has to be extracted from this data. As shown, the data is quite cumbersome to analyze and there are hundreds of such files.

- The class label is appended to the last column of the data frame (Fig. 8).
- Another data frame is created which relates the class labels to the input parameters (Fig. 7).
- The two data frames comprising the input and output are merged together and the class labels are discarded (Fig. 9).
- The simulation data is exported and saved to a CSV text file.
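The parsing pipeline above was implemented in R; purely as an illustration, the same traversal can be sketched in Python. The directory layout and the file name convention used here are assumptions based on the description above, not the paper's actual code:

```python
import csv
import os
import tempfile

def parse_simulation_dir(parent, n_sims, output_names):
    """Traverse parent/<sim>/<output>.txt files and reduce each output
    file (rows of abscissa,ordinate pairs) to one value per simulation."""
    rows = []
    for sim in range(1, n_sims + 1):
        row = {}
        for name in output_names:
            path = os.path.join(parent, str(sim), name + ".txt")
            with open(path) as f:
                # each line is "abscissa,ordinate"; keep only the ordinates
                values = [float(y) for _, y in csv.reader(f)]
            row[name] = min(values)  # e.g. the minimum BER over the sweep
        rows.append(row)
    return rows

# demo on a synthetic two-simulation directory
root = tempfile.mkdtemp()
for sim in (1, 2):
    os.makedirs(os.path.join(root, str(sim)))
    with open(os.path.join(root, str(sim), "min_ber.txt"), "w") as f:
        f.write("0.0,1e-9\n0.1,5e-12\n")
print(parse_simulation_dir(root, 2, ["min_ber"]))
```

In the real pipeline each row would also receive its class label before all rows are written to a single CSV file, as the R function does.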


The data obtained in the previous step has the following characteristics:

- Number of instances is 100.
- Number of predictor variables, given by the first five columns, amounts to 5.

Fig. 7. Class Labels defined for input parameter combinations (data frame in R)

- The last four columns are the target variables for which Regression models are to be designed.

Figures 11 to 14 show the plots of all the output parameters versus the simulation index. The regions bounded by the dashed demarcations indicate the output classes as shown in Fig. 7. After brief manual analysis of the data through plots, summaries, quantile estimations etc., the next phase of the design is tackled.

Fig. 8. Example of output data frame parsed in R

Fig. 11. Simulation index versus Eye Height

Fig. 9. Final Data Frame

Fig. 10. CSV text file of data in Fig. 9

It is clear from the figures that R is indeed a suitable tool for parsing and analysis of data as it is intuitive and manual analysis of data is easier (Fig. 9).

Fig. 12. Simulation index versus Minimum BER

The final CSV file (Fig. 10) will be fed as training data to the algorithm for generation of a linear model.

C. Exploratory Analysis
After successfully extracting the required information from the collected data, it makes sense to visualize the data to look for patterns. Exploratory analysis is useful as one can quickly analyze the obtained data using a large number of plotting methodologies such as scatter plots, histograms, bar plots, line plots, contours etc.

R has strong graphic capabilities and is a suitable tool to visualize data.

Fig. 13. Simulation index versus Q Factor


For a single independent variable the model is E(Y|x) = β0 + β1x, where Y denotes the response variable, x denotes a value of the independent variable, and the βi values denote the model parameters. The quantity E(Y|x) is called the conditional mean or the expected value of Y given the value of x. Many distribution functions have been proposed for use in the analysis of a dichotomous response variable (Hosmer and Lemeshow, 1989; Agresti, 1984; Feinberg, 1980).
Regression makes use of the Sigmoid Function. Unlike the Heaviside function, which steps instantaneously from 0 to 1 (which makes it difficult to deal with), the sigmoid changes gradually. Mathematically it is given by:

σ(z) = 1 / (1 + e^(-z))
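The sigmoid is a one-liner in any language; a quick Python sketch for reference (the paper itself plots it in R, Fig. 21):

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: a smooth 0-to-1 transition,
    unlike the discontinuous Heaviside step."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))  # the midpoint of the transition
```

Its value at 0 is exactly 0.5, and it saturates toward 0 and 1 for large negative and positive arguments.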

Fig. 14. Simulation index versus Threshold



A. Theory
Regression methods have become an integral component
of any data analysis concerned with the relationship between a
response variable and one or more explanatory variables. The
most common regression method is conventional regression
analysis (CRA), either linear or nonlinear, when the response
variable is continuous (IID or independent and identically
distributed). However, when the outcome (the response
variable) is discrete, CRA is not appropriate. Among several
reasons, the following two are the most significant:
1) The response variable in CRA must be continuous, and
2) The response variable in CRA can take any value over an unbounded range.
These two primary assumptions are not satisfied when the response variable is categorical.
Preparation of Data:

- Collect: any method will suffice.
- Prepare: numeric values are needed for a distance calculation; a structured data format is best.
- Analyze: any method.
- Train: the majority of the time complexity is spent on this step.
- Test: this is relatively easy once the training step is done.
- Use: the application applies the regression calculation on input data and determines which class the input data should belong to. The application then takes some action on the calculated class.

[2] It is important to understand that the goal of an analysis using logistic regression is the same as that of any model-building technique used in statistics: to find the best fitting and most parsimonious model. What distinguishes a logistic regression model from a linear regression model is the response variable. In the logistic regression model, the response variable is binary or dichotomous.

For the given data we design Regression Models using Multiple Regression and Multivariate Multiple Regression.

B. Multiple Regression
Linear Regression creates a model of the outcome variable on the basis of a single predictor variable. The Linear Regression model with predictor X and outcome Y is given by:

Y = B + C·X    (1)

where B is the Bias and C is the weight. Equation (1) is a straight line with B as the intercept and C as the slope. Hence Linear Regression determines a straight line for modelling the relation between Y and X.
Now, Multiple Regression means that the outcome variable Y is modelled on multiple predictor variables. This creates a model in a higher-dimensional space whose dimension equals the number of predictor variables plus the outcome itself. One disadvantage is that the model cannot be visualized if the number of predictor variables is more than two, since visualization is impossible beyond three dimensions. Therefore we have the model as follows:

Y = B + C1·X1 + C2·X2 + … + CN·XN    (2)

where B = Bias, Ci = Weight of the ith Predictor, N = Total number of predictors.
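The paper fits these models with R's lm(); purely as an illustration of this form of model, a self-contained least-squares fit via the normal equations can be sketched in Python (not the authors' code, and numerically naive compared to lm()):

```python
def fit_multiple_regression(X, y):
    """Least-squares fit of y = B + sum_i C_i * x_i.
    X is a list of predictor rows; returns [B, C_1, ..., C_N]."""
    # design matrix with a leading 1 for the bias term
    A = [[1.0] + list(row) for row in X]
    n = len(A[0])
    # normal equations: (A^T A) w = A^T y
    ATA = [[sum(a[i] * a[j] for a in A) for j in range(n)] for i in range(n)]
    ATy = [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ATA[r][col]))
        ATA[col], ATA[piv] = ATA[piv], ATA[col]
        ATy[col], ATy[piv] = ATy[piv], ATy[col]
        for r in range(col + 1, n):
            f = ATA[r][col] / ATA[col][col]
            for c in range(col, n):
                ATA[r][c] -= f * ATA[col][c]
            ATy[r] -= f * ATy[col]
    # back substitution
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (ATy[r] - sum(ATA[r][c] * w[c] for c in range(r + 1, n))) / ATA[r][r]
    return w

# exact data generated from y = 2 + 3*x1 - 1*x2
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 3]]
y = [2, 5, 1, 4, 5]
print(fit_multiple_regression(X, y))  # recovers bias 2 and weights 3, -1
```

On noiseless data the fit recovers the generating coefficients exactly (up to floating-point error), which is a convenient sanity check before applying the model to real measurements.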

The difference between logistic and linear regression is reflected both in the choice of a parametric model and in the assumptions. Once this difference is accounted for, the methods employed in an analysis using logistic regression follow the same general principles used in linear regression.
In any regression analysis the key quantity is the mean value of the response variable given the values of the independent variables. Therefore, the regression models are designed according to the data collected.


Another important parameter is the Residual or Cost. This is a quantifier of how far the predicted model deviates from the actual data. A lower value of Residual is of course desirable, as the linear model becomes more accurate. Fig. 16 shows a table of residuals for each instance of each model. The objective is to minimize the residuals.

The outcome variables are:

- Maximum Height
- Minimum BER
- Q Factor
- Threshold

Now, the linear models are defined with the predictors and outcome variables. RStudio was used to model the data. The variables were passed to a function in the form of a Formula class which relates the output columns to the input columns.

Fig. 16 shows that the models for Height and Threshold have very low residual values while Min. BER and Q-Factor have fairly high residual values. This may lead one to believe that the first two of the models mentioned are accurate while the latter two are inaccurate. This might not always be the case, and therefore we need to examine another parameter, namely, the Mean Square Error.
The function predict(linear model, dataframe) predicts the output based on the linear model and the data frame passed to it. The output is a vector containing the predicted data corresponding to each row of the predictor data.

The next step requires calculation of the Mean Square Error, which is taken to be the cost function to determine the accuracy of the model. MSE is given by:

The function lm(formula, dataframe) creates a linear model based on the formula and the data frame passed to it. Hence, we have 4 linear models from the aforementioned formulae. lm() works on the dataset provided to it and outputs an object of class list which contains all the data pertaining to the model such as coefficients, residuals, deviances, quantiles etc. Fig. 15 shows the bias (intercept) and weights (coefficients) of each linear model.

MSE = (1/N) · Σi (Xpredicted,i − Xobserved,i)²    (3)

where Xpredicted = Predicted Output, Xobserved = Observed Output, N = Total number of observations.

The Square Error between the observed and predicted output was calculated for all four models and their graphs were plotted. Figures 17 to 20 show graphs of MSE versus the instance.
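The MSE cost function amounts to a few lines of code; a Python sketch for reference:

```python
def mse(predicted, observed):
    """Mean square error: average squared deviation between
    predicted and observed outputs."""
    if len(predicted) != len(observed):
        raise ValueError("length mismatch")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Applied per model, this gives one scalar summarizing how well the predicted outputs track the observations.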

Fig. 15. Coefficients and Intercepts of Linear Models

The variables Channels, Frequency and Power have no coefficients because they have a constant value in the given dataset. They have been included in the model to prevent incorrect predictions due to confounding variables, i.e. variables which indirectly affect the relationship between input and output. These values can be used in equation (2) to develop a prediction model.

Fig. 16. Residuals of each instance of Linear Model


Fig. 20. MSE versus Instance for Threshold

Fig. 17. MSE versus Instance for Height

In the above figure the blue curve signifies the LOWESS (Locally Weighted Scatterplot Smoothing) curve for the data, and the 95% confidence interval for the curve is given by the dark grey region. The LOWESS curve gives us the general trend in the data.
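The curves in the figures come from R's built-in smoother; the idea behind LOWESS, a local linear fit under a tricube weight, can be sketched in simplified form in Python. This is a single-point evaluator only, not R's full iterative, robustified algorithm:

```python
def lowess_point(xs, ys, x0, frac=0.5):
    """Locally weighted linear fit evaluated at x0 (simplified LOWESS)."""
    n = len(xs)
    k = max(2, int(frac * n))
    # bandwidth: distance to the k-th nearest neighbour of x0
    d = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
    # tricube weights, zero outside the bandwidth
    w = [(1 - min(abs(x - x0) / d, 1) ** 3) ** 3 for x in xs]
    # weighted least squares for y = a + b*x (2x2 normal equations)
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = sw * swxx - swx * swx
    b = (sw * swxy - swx * swy) / det
    a = (swy - b * swx) / sw
    return a + b * x0

xs = [float(i) for i in range(11)]
ys = [2 * x + 1 for x in xs]          # noiseless line for illustration
print(lowess_point(xs, ys, 5.0))       # recovers the line's value at x0 = 5
```

Evaluating this at every x produces the smooth trend curve; on noiseless linear data the local fit reproduces the line exactly.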
As mentioned earlier the data set on which the operations
are being performed have 100 simulations with 10 class labels
and each one of these class labels occupies 10 instances each.
Hence, we can divide the x axis of the plots into 10 regions of
10 instances each. On closer inspection it was concluded that,
for example, in instances 1 through 10, the MSE value was
higher at first and then went down. This is visible in all the
regions as well. Thus, the MSE regresses towards a minimum
mean value in all the plots.
The confidence intervals in Figures 19 and 20 occupy a
broader region indicating the fact that prediction is not as
accurate. Another concern is that for a class region, the MSE
decreases first, reaches the minimum and then increases or
vice versa.

Fig. 18. MSE versus Instance for BER



Multivariate Multiple Regression, as the name suggests, is

a technique for estimating multiple outcome variables which
depend on multiple predictor variables. The outcome variables
may or may not be independent of each other. This technique
is similar to Multiple Regression with the sole addition of
multiple outputs.
For estimating the model using this technique, we utilized Neural Networks and more specifically the Levenberg-Marquardt Back-propagation Algorithm.
A. Theory
The most fundamental processing unit of neural systems is the neuron. It receives various signals from the inputs, combines them and performs a non-linear operation to produce an output.

Fig. 19. MSE versus Instance for Q Factor

The Artificial Neural Network takes inspiration from this

model. It consists of an input layer, an output layer and hidden
layers. The number of hidden layers taken depends on the
specific application and type of neural network being built.


The number of rows in the input layer is equal to the number of inputs. On moving from one layer to another, weights are assigned to each transition. An additional Bias Unit is added to the preceding layer while moving forward.

The input and output data is fed to the network. The non-linear transfer functions which can be used are the Hyperbolic Tangent, Sine and Sigmoid functions. A linear function can also be used but does not match the accuracy of a non-linear one. For our algorithm we make use of the Sigmoid Function (Fig. 21), which is of the form:

σ(z) = 1 / (1 + e^(-z))

The data is divided into 3 parts: Training, Validation and Testing.

- The data samples are randomized for this part; 70% is assigned to Training and the remaining 30% is equally divided between Validation and Testing.
- The Neural Network is trained using the given data.
- After training, validation and testing, the network's accuracy is gauged by its MSE and Regression plots.
- Retraining can be done if the results are not satisfactory, by changing data, the number of hidden layer neurons, or the percentage assigned to the three phases of training.
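The randomized 70/15/15 division can be sketched in a few lines of Python (MATLAB's neural network tooling performs this division internally; the function below is only illustrative):

```python
import random

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle n sample indices and divide them into training,
    validation and testing sets (70/15/15 by default)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

tr, va, te = split_indices(100)
print(len(tr), len(va), len(te))  # 70 15 15
```

Shuffling before splitting matters here because consecutive instances in the dataset belong to the same class region, so an unshuffled split would leave some classes entirely unseen during training.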
We used 10, 100 and 1000 hidden layer neurons for the design, and Figures 22 to 24 show the MSE values for the three designs as the training progresses.

Figure 26 shows the Error Histogram for the 1000-neuron network. The result is satisfactory, as the maximum frequency of instances lies in the minimum-error region denoted by the yellow vertical line.

Fig. 21. Sigmoid Function plotted using R

B. Back-propagation algorithm
This algorithm was originally introduced in the 1970s. It is a very efficient technique for calculating the output variable compared to previous techniques. The initial weights are configured and random weights are associated with each connection.

Figure 27 shows the regression line for all three phases of training and the overall trend for the 1000-hidden-neuron network. The lines fit to 99.99% and hence the design is quite accurate in predicting new data. The blue fill corresponds to training, the green fill to validation and the red to testing.

Using feedforward propagation, the network progresses further and predicts an output based on these weights. In our training examples (70% of the data set), we have a given output and we compare that with our predicted output. Now, the gradient of each step is computed starting from the error function and propagates backwards.

This goes on till the first hidden layer. The weights are now re-assigned. This process continues for all the remaining training examples until the network reaches its most optimal solution. 15% of the data set is reserved for cross-validation; as shown in the figure, it fits with an accuracy of 99%. Finally, the network is ready to predict any new value of input.
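The paper trains its network with MATLAB's Levenberg-Marquardt back-propagation. To illustrate only the feedforward/back-propagation mechanics described above, here is a minimal one-hidden-layer network trained with plain gradient descent in Python, a toy sketch on synthetic data, not the authors' setup:

```python
import math
import random

def train_tiny_net(data, hidden=5, lr=0.1, epochs=2000, seed=1):
    """One-hidden-layer network (sigmoid hidden units, linear output)
    trained by gradient-descent back-propagation on (x, y) pairs."""
    rnd = random.Random(seed)
    w1 = [rnd.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden
    w2 = [rnd.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    for _ in range(epochs):
        for x, y in data:
            # feedforward pass
            h = [sig(w1[j] * x + b1[j]) for j in range(hidden)]
            pred = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = pred - y  # gradient of E = err^2 / 2 w.r.t. pred
            # back-propagate the gradient and re-assign the weights
            for j in range(hidden):
                dh = err * w2[j] * h[j] * (1 - h[j])
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * dh * x
                b1[j] -= lr * dh
            b2 -= lr * err
    return lambda x: sum(w2[j] * sig(w1[j] * x + b1[j]) for j in range(hidden)) + b2

# toy regression: learn y = 2x - 1 on [0, 1]
data = [(i / 10, 2 * (i / 10) - 1) for i in range(11)]
model = train_tiny_net(data)
loss = sum((model(x) - y) ** 2 for x, y in data) / len(data)
print(round(loss, 4))
```

The training MSE drops well below the variance of the target, showing the weight re-assignment loop at work; Levenberg-Marquardt replaces these first-order updates with approximate second-order steps for much faster convergence.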

Fig. 22. MSE for 1000 Hidden Neurons

C. Procedure
After finalizing the design parameters, the neural network (Fig. 25) is designed in MATLAB. The design procedure is as follows:

- The final data which was parsed is split into two parts containing the input and output variables.
- The number of neurons in the hidden layer as well as the transfer function (sigmoid in this case) is decided.

Fig. 23. MSE for 100 Hidden Neurons




From our simulations we have seen a covariance between the input data set and the output. On comparing the efficiency of the perceptron model with that of neural networks, we see that the Neural Network Model converges at a faster rate than the regression model.

Fig. 24. MSE for 10 Hidden Neurons

Thus the Multivariate Multiple Regression Model was a much better alternative than the Multiple Regression Model for the data we utilized. We were able to emulate only 5 inputs for the designed optical network, out of which only 2 were varied. The accuracy and efficiency may change depending on the number of input instances as well as on the number of attributes. It is necessary to vary all the inputs to obtain a more accurate model.

Thus, we conclude that given enough data spanning a wide parameter list of the simulated optical communication network, it is possible to create an independent prediction model which, after training, utilizes fewer resources and is faster at predicting outputs with a high degree of accuracy than the traditional simulation software, especially in the field where such licensed software is not available for quick analysis.

Fig. 25. Designed neural network in MATLAB

We would like to thank VIT University, Department of Electronics, for providing us with the resources and lab facilities, especially licensed versions of MATLAB R2013b and Optisystem 13.0 for carrying out the project, and Professors Arulmozhivarman P. and Sankar Ganesh for their guidance in mining the data from the software.
[1] S. Ilic, B. Jaksic, M. Petrovic, A. Markovic, V. Elcic, "Analysis of Video Signal Transmission Through DWDM Network Based on a Quality Check Algorithm", Vol. 3, No. 2, 2013, pp. 416-423
[2] M. T. Fatehi, M. Wilson, Optical Networking with WDM, McGraw-Hill, New York, 2001
[3] M. Stefanovic, D. Milic, "An approximation of filtered signal envelope with phase noise in coherent optical systems", Journal of Lightwave Technology, Vol. 19, No. 11, pp. 1685-1690, 2001
[4] I. Djordjevic, M. Stefanovic, "Performance of optical heterodyne PSK systems with Costas loop in multichannel environment for nonlinear second-order PLL model", Journal of Lightwave Technology, Vol. 17, No. 12, pp. 2470-2479, 1999
[5] R. Ramaswami, K. Sivarajan, Optical Networks: A Practical Perspective, 2nd ed., Morgan Kaufmann Publishers, San Francisco
[6] I. P. Kaminow, T. Li, A. Willner (Eds.), Optical Fiber Telecommunications V, Elsevier/Academic Press, 2008
[7] G. Agrawal, Nonlinear Fiber Optics, 2nd ed., Academic Press, 2001
[8] G. Agrawal, Fiber-Optic Communication Systems, 3rd ed., Wiley, 2002
[9] E. G. Sauter, Nonlinear Optics, John Wiley & Sons, Inc., New York
[10] P. Harrington, Machine Learning in Action, DreamTech Press, 2012

Fig. 26. Error Histogram for 1000 Hidden Neurons

Fig. 27. Regression Plots for 1000 Hidden Neurons
