
Zalal Uddin Mohammad Abusina is with the National Institute of Information and Communications Technology (NICT), Japan.
He works for the Japan Gigabit Network-II (JGN-II) Project at NICT's Tohoku University Office, housed in its Research Institute of Electrical Communications (RIEC).
Salahuddin Muhammad Salim Zabir joined the Department of Computer Science and Engineering of Bangladesh University of Engineering
and Technology in 1995. At present, he is with RIEC, Tohoku University. He is a member of the IEEE, BCS and BAAS.
Ahmed Ashir received his PhD in 1999 from Tohoku University, Japan. He was with the Japan Gigabit Network (JGN) Project of the Telecommunication Advancement Organization (TAO), Tohoku University Office.
Debasish Chakraborty received his PhD in 1999 from Tohoku University, Japan. He is currently with the Research Institute of Electrical Communications, Tohoku University, Sendai, Japan.
Takuo Suganuma is currently with the Research Institute of Electrical Communications (RIEC), Tohoku University, Sendai, Japan. He is a member of the IEEE.
Norio Shiratori is a professor at the Research Institute of Electrical Communication (RIEC), Tohoku University. He has been engaged in research on distributed processing systems and flexible intelligent networks. He is a Fellow of the IEEE, IEICE and IPSJ.
*Correspondence to: Zalal Uddin Mohammad Abusina, Research Institute of Electrical Communication, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, Japan.
E-mail: abusina@shiratori.riec.tohoku.ac.jp
Copyright 2005 John Wiley & Sons, Ltd.
An engineering approach to dynamic prediction of
network performance from application logs
By Zalal Uddin Mohammad Abusina*, Salahuddin Muhammad Salim Zabir,
Ahmed Ashir, Debasish Chakraborty, Takuo Suganuma and Norio Shiratori
Network measurement traces contain information regarding network
behavior over the period of observation. Research carried out from
different contexts shows predictions of network behavior can be made
depending on network past history. Existing works on network
performance prediction use a complicated stochastic modeling approach
that extrapolates past data to yield a rough estimate of long-term future
network performance. However, prediction of network performance in the
immediate future is still an unresolved problem. In this paper, we address
network performance prediction as an engineering problem. The main
contribution of this paper is to predict network performance dynamically
for the immediate future. Our proposal also considers the practical
implication of prediction. Therefore, instead of following the conventional
approach to predict one single value, we predict a range within which
network performance may lie. This range is bounded by our two newly
proposed indices, namely, Optimistic Network Performance Index (ONPI)
and Robust Network Performance Index (RNPI). Experiments carried out
using one-year-long traffic traces between several pairs of real-life
networks validate the usefulness of our model.
INTERNATIONAL JOURNAL OF NETWORK MANAGEMENT
Int. J. Network Mgmt 2005; 15: 151–162
Published online 28 February 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nem.554
1. Introduction
The Internet is becoming an increasingly important component of modern day communication and information exchange. Yet, the applications which use the Internet in general do not have much information about the underlying path(s) from the source to the destination, far less the characteristics of the paths. The layering concept has isolated applications from the network related information with the result that network applications such as FTP, WWW, Mirroring etc. are currently operated with little or no knowledge about the routes and their characteristics [1,2]. It is clear that these applications could operate more efficiently if the routes and their characteristics are known and/or are made available to the concerned application [3].
To attain these goals, there are two separate issues to be dealt with. First, a better way to quantify network performance should be defined so that the information is suitable for the users or applications. The second question is how to estimate, or predict, the future network performance parameters based on past observations. This is added to the question of how to make them available to users and/or applications. The IP Performance Metric working group (IETF-IPPM-WG) [4,5] is working on developing a set of metrics that will characterize quality, performance, and reliability of Internet data delivery services (networks). Several tools [6–9] exist to measure different parameters of network performance. However, estimation or prediction of network performance has been a challenging issue because of the inherent uncertainty in its behavior, influenced by several factors [1] as follows.
Dynamic Behavior. The network resources as well
as the utilities are dynamic, thus the network char-
acteristics will have a dynamic component too. For
example, a link may be down for a short while and
this will show up as a very low data transfer rate,
which is not the true general characteristic of the
network.
Burstiness. The network traffic traversing the network does not maintain a consistent pattern. At times, when a session of an application starts, the traffic suddenly rises. On the other hand, during the inter-session period, network resources remain idle. For these reasons, sudden impulses are common phenomena in network performance (e.g. throughput, delay, latency, traffic size etc).
Human Factor. Network behavior is greatly influenced by human working hours. The human working hours may be the official working hours (excluding the holidays) and the time span when network usage is relatively cheap. In our previous works [10,11], we have shown this periodic behavior. RFC 3432 [12] also deals with such characteristics. In a proper prediction model, all these features should be taken into consideration.
The existing literature, as will be outlined in the
next section, describes works carried out from
different perspectives. Despite differences in the
addressed domains, these works have one point
in common. Most of them aim at developing
complex, and to some extent complete, mathema-
tical models to predict future semi-static network
trafc. These may work well in providing the man-
agement with necessary information regarding
networking requirements. However, they fail to
predict the dynamic network behavior in the
immediate future.
In this paper, we address network performance
prediction as an engineering problem. The basic
contribution of this paper is to predict network
performance (e.g., throughput), dynamically for
the immediate future. Therefore, rather than con-
centrating on a complex mathematical model,
we develop a relatively simple heuristic-based
approach that yields a reasonably accurate esti-
mate of network throughput in the near future
with moderately low computational requirement.
Our proposal also considers the practical im-
plications of prediction. Therefore, instead of
following the conventional approach to predict
one single value, we predict a range within which
the network performance may lie. This range
is bounded by our two newly proposed indices,
namely, Optimistic Network Performance Index
(ONPI) and Robust Network Performance Index
(RNPI). ONPI corresponds to the best expected
network performance. RNPI, on the other hand,
corresponds to the region of lowest expected
network performance. Most of the time, network
performance is expected to lie between ONPI and
RNPI. This approach has practical implications for
various applications.
We examine our model through a one-year-long
data transaction between several pairs of net-
works. It is found that our model performs quite
well in predicting network throughput dynami-
cally. Besides forecasting network throughput, this
model can also be a candidate for prediction of
other network performance parameters such as
round trip delay etc.
The rest of this paper is organized as follows. In
Section 2, we provide a brief outline of related
works on network performance prediction. We
then present our proposed model for prediction of
network performance in Section 3. We present the
experiments and results along with evaluation of
our model in Section 4. Finally we make conclu-
sions in Section 5.
2. Related Works
With the development and deployment of measurement tools [13–15], prediction of network performance based on network history has started gaining pace [16]. In particular, network throughput prediction has attracted the most attention among researchers [17].
One particular trend that is being observed
is the development of complex mathematical
models for network performance prediction.
These models, in general, make forecasts of semi-
static network throughput off-line. For example, in
Reference 18, the seasonal form of the Auto
Regressive Integrated Moving Average (ARIMA)
model is employed to perform long-term predic-
tions (two or more years into the future).
Again, in Reference 19, a simple linear extrapolation algorithm is used to predict hotspots (sudden increases of traffic) which are frequently seen around large world-wide events such as the Olympic games or the World Cup.
Some prediction models based on neural networks have also been proposed. However, in general, they focus on application-specific performance prediction. In Reference 20, single and multiple frame video throughput prediction using neural network models has been proposed.
It has been observed [10,11] that network past history may be an excellent source for estimating network behavior in the future. However, except for presenting a gross measurement, that is, a normal statistical average over different time slices of the 24 hour day and inferring that obtainable network performance has some relation with the period at which it operates, no further analysis has been made. Since mere statistical averages are prone to error induced by non-representative data (that is, outliers), this information is not enough for our purposes.
These previous efforts have predicted the probable value of a future performance parameter as a single curve, hence neglecting the intrinsic bursty nature of the traffic. Also, they predict network performance off-line, ignoring the dynamic change in the network operating environment. Predicting a single curve with a view to claiming that network throughput will attain that value in the future does not have significant meaning from the practical application point of view. Also, off-line information may prove to be of little use to applications, if at all.
However, in our approach presented in this
paper, we propose a range for expected network
performance (throughput) as a representative per-
formance parameter. Again, our model makes a
dynamic prediction of network throughput, taking
immediate past network throughput information
to forecast the immediate future. These two
characteristics make our approach a practical engi-
neering solution for meeting the requirements of
various applications.
3. Our Estimation Model
Our estimation model [21] follows the observations made in Reference 10. As discussed in Section 2, the authors show that network past history may be an excellent source for estimating network behavior in the future. However, except for presenting a gross measurement, that is, a normal statistical average over different time slices of the 24 hour day and inferring that network throughput has some relation with the period at which it operates, no further analysis has been made. Since mere statistical averages are prone to error induced by non-representative data (the outliers), this information is not enough for deployment purposes [22]. This is revealed clearly in Figure 1. Here, we plot the average and median of network throughput between two networks (RIEC-net and goo-net) at different times of day computed over a month (March 1998). This clearly indicates that in at least 50% of the cases, the average overestimated the network throughput.
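The effect of outliers on the average can be illustrated with a few lines of Python. This is only a sketch: the sample values below are invented for illustration and are not taken from the RIEC-net/goo-net trace.

```python
from statistics import mean, median

# Hypothetical throughput samples [kbps] for one time slice.
# Two short bursts (900 and 1200) act as outliers.
samples = [95, 100, 102, 98, 105, 97, 900, 101, 99, 1200]

avg = mean(samples)
med = median(samples)

# Share of samples that fall below the average: when this share is
# large, the average overestimates what is typically observed.
below_avg = sum(1 for x in samples if x < avg) / len(samples)

print(f"average = {avg:.1f} kbps, median = {med:.1f} kbps")
print(f"{below_avg:.0%} of the samples lie below the average")
```

With these made-up numbers the two bursts pull the average far above the bulk of the samples, while the median stays with the typical values.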
In this paper we therefore propose a model for
the prediction of network performance that con-
siders median as the base statistics. We then use
standard statistical tools like percentiles, quartiles,
SIQR (Semi Inter Quartile Range) and their
aggregates. Since we propose an engineering
approach to network performance prediction
based on network measurements, rather than a
complex mathematical framework, our model
introduces some new heuristic-based operators
similar to the basic Genetic Algorithm (GA) oper-
ators. These new operators and associated actions
will be referred to by the same name as their GA
counterparts in this text.
As stated before, the essential requirement of a measurement-based dynamic network performance prediction mechanism is to have practical significance. Internet traffic has a continuously varying pattern. Therefore, predicting a single value to represent network throughput in the immediate future is almost meaningless. This fact leads us to propose a range-based prediction model. The idea is to predict a range within which we expect the network to operate. At the two ends of this range lie the Optimistic Network Performance Index (ONPI) and the Robust Network Performance Index (RNPI), which we describe later. The reason for using the term performance index is that, in addition to predicting network throughput, this model can be used for other network performance parameters like round trip delay etc. In this paper, however, we focus on network throughput only, and the term performance index corresponds to throughput in a network.
Our model essentially depends on a continuous measurement of network throughput. We consider both the historical network traffic information as well as the immediate past network traffic information to forecast a meaningful range for network throughput in the immediate future. This type of prediction can enhance the performance of various applications over the Internet quite significantly.
[Figure 1: line plot of network throughput [kbps] (vertical axis, 60 to 260) against hours of day in GMT (horizontal axis, 00:00 to 00:00); two curves, Median and Average.]
Figure 1. Comparison of average and median of network traffic
3.1. Definitions and Assumptions
We are using median network performance indices as our base statistics. This gives us an estimate of the central tendency of the observed data. At the same time, it helps us to remove the outliers from active consideration. Once we have chosen the measure of central tendency, we need some way to characterize the dispersion. There are many such measures, for example the range, variance/standard deviation, mean absolute deviation, semi-interquartile range etc. [22]. Among them, the semi-interquartile range or SIQR is highly resistant to outliers. The same is true for percentile ranges. We have therefore used percentile ranges in our mathematical framework.
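The resistance of these dispersion measures to outliers can be checked with a short sketch. The data are hypothetical, and the helper names `siqr` and `percentile_range` are introduced here only for illustration; the interpolation rule is the "inclusive" method of Python's `statistics.quantiles`, which is an assumption and not necessarily the rule used by the authors.

```python
from statistics import pstdev, quantiles

def siqr(data):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return (q3 - q1) / 2

def percentile_range(data, low=10, high=90):
    """Distance between two percentiles (10th and 90th by default)."""
    # cuts[k - 1] is the k-th percentile
    cuts = quantiles(data, n=100, method="inclusive")
    return cuts[high - 1] - cuts[low - 1]

# Hypothetical throughput samples [kbps]; 'bursty' swaps one value for a spike.
clean = list(range(91, 111))      # 91, 92, ..., 110
bursty = clean[:-1] + [5000]

for name, data in (("clean", clean), ("bursty", bursty)):
    print(f"{name:>6}: stdev={pstdev(data):8.1f}  "
          f"SIQR={siqr(data):5.2f}  P10-P90={percentile_range(data):5.1f}")
```

In this example a single spike inflates the standard deviation by orders of magnitude, while the SIQR and the 10 to 90 percentile range are unchanged. Note that a percentile range only keeps this property when the share of outliers stays below the trimmed tail, here 10% of the samples.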
Figure 2 shows the medians of network throughput on different working days of a week at different times of each day. From this figure we easily observe that at a particular time of each day, network throughput follows similar patterns for different working days of a week. Therefore, as in Reference 10, we also infer that the operating period is one of the parameters governing network performance. We account for this time dependence by dividing the day into discrete time slices, $\Delta t$. The length of these time slices may vary depending on the requirements. We may also consider variable time slices $\Delta t$ for some applications. For ease of representation, from now on, we shall refer to different time slices $\Delta t$ as $t$, $t + 1$, $t + 2$, and so on. We can have many observations $l$, $l - 1$, $l - 2$, etc. in the past for the same set of time slices. Similarly, we can have many observations $l + 1$, $l + 2$, $l + 3$ etc. for the present and also the future for the same set of time slices.
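One way to map raw log records onto time slices and observations is sketched below. The record layout (timestamp, throughput), the 60-minute slice length and the choice of one calendar day per observation are assumptions made for illustration; the paper leaves the slice length open.

```python
from collections import defaultdict
from datetime import datetime

SLICE_MINUTES = 60  # assumed length of one time slice (Delta t)

def slice_index(ts, slice_minutes=SLICE_MINUTES):
    """Index t of the time slice of the day that contains timestamp ts."""
    return (ts.hour * 60 + ts.minute) // slice_minutes

def group_by_slice(records, slice_minutes=SLICE_MINUTES):
    """Map (observation day, slice index t) -> list of throughput samples."""
    groups = defaultdict(list)
    for ts, kbps in records:
        groups[(ts.date(), slice_index(ts, slice_minutes))].append(kbps)
    return dict(groups)

# Hypothetical log records: (timestamp, throughput in kbps).
records = [
    (datetime(1998, 3, 2, 9, 5), 110.0),
    (datetime(1998, 3, 2, 9, 40), 130.0),
    (datetime(1998, 3, 2, 10, 15), 90.0),
    (datetime(1998, 3, 3, 9, 20), 120.0),
]
groups = group_by_slice(records)
for (day, t), values in sorted(groups.items()):
    print(day, "slice", t, values)
```

Each key of `groups` then identifies one time slice in one observation, which is the unit over which the statistics of the model are computed.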
In this model we consider two types of time
dependence. One is the historical dependence of
network throughput for a particular time slice t.
The other one is the dependence on the current
state of network throughput.
In our mathematical model, we have taken the first type of dependence into account through the historical factor $\mathbf{h}_{t,l+1}$. The idea is that at time slice $t$ in observation $l + 1$, network throughput should show a behavior close to some derivation from several past observed behaviors of the network throughput at the same time slice, $t$.
[Figure 2: line plot of network throughput [kbps] (vertical axis, 60 to 240) against hours of day (horizontal axis, 00:00 to 00:00); one curve per working day: Mon, Tue, Wed, Thu, Fri.]
Figure 2. Medians of network traffic at different times on different days of a week
The second category of time dependence stands for the effect of currently obtained network throughput. The idea is, if at time slices $t - 1$, $t - 2$, $t - 3$ etc. in observation $l + 1$ we have some amount of network throughput, the probable throughput at time $t$ may be partly characterized by those observations. This is a sort of rolling over of network performance from one time slice to another. Therefore, we call the effect the roll over factor. In our current approach, the roll over factor has been considered to be determined by a roll over function. The roll over function itself is considered to be dependent on the characteristic function, $g_t(l + 1)$, and the roll over ratio, $r_{t,l+1}$. In this model, we have assumed a simple characteristic function considering dependence on only one time slice $t - 1$ at observation $l + 1$ as follows:
$$
g_t(l + 1) =
\begin{cases}
0 & \text{if there is no dependence} \\
1 & \text{if there is some dependence}
\end{cases}
$$
Generally, there is some dependence between two different time slices. However, we have noted that at certain points in time, there is no relation between the performance index at one time slice and that at the next.
It is also observed that at times the network performance shows some abrupt changes. We have introduced a dynamic cross over operator, described later in the paper, to adjust to the abrupt change in network behavior.
In our model, instead of estimating some single value to be the expected network performance, we have introduced a new concept of predicting a range within which we expect the network throughput to lie. At the two ends of this range lie the ONPI and the RNPI. If the network operates at nearly the highest efficiency, the throughput would be near ONPI. Of course, the fraction of time when throughput can be this high would be small. On the other hand, if the network operates nearly at its normal efficiency, i.e. at least at an efficiency which is expected in most of the cases, then the network throughput would be near RNPI. Most of the time the network is expected to operate near this value. We may consider ONPI and RNPI to be analogous to the first and third quartiles or 10 and 90-percentiles, respectively, and so on. ONPI and RNPI vary with time and load on the network. Therefore, if we need to know what is the least we should expect the network throughput to be at an instant of time, we should consider RNPI as our index. However, if some applications are too sensitive to network bandwidth availability, ONPI should be used as the predicted network throughput.
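As an illustration of how an application might consume the two indices, the sketch below turns a predicted (ONPI, RNPI) pair into a range of expected transfer times. The function name `transfer_time_range` and the numbers are hypothetical; the paper does not prescribe this use.

```python
def transfer_time_range(size_kbit, onpi_kbps, rnpi_kbps):
    """Return (best_case_s, conservative_s) for transferring size_kbit.

    ONPI bounds the throughput from above, RNPI from below, so the
    conservative (longer) estimate is the one based on RNPI.
    """
    if not 0 < rnpi_kbps <= onpi_kbps:
        raise ValueError("expected 0 < RNPI <= ONPI")
    best_case = size_kbit / onpi_kbps
    conservative = size_kbit / rnpi_kbps
    return best_case, conservative

# Hypothetical prediction for the next time slice.
best, worst = transfer_time_range(size_kbit=8000, onpi_kbps=200, rnpi_kbps=80)
print(f"expected transfer time: {best:.0f} s to {worst:.0f} s")
```

An application that must not miss a deadline would plan with the RNPI-based estimate, while one that only wants to know what is achievable under good conditions would look at the ONPI-based one.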
3.2. Mathematical Modeling
Using the mathematical framework and assumptions described in the previous subsection, we employ techniques having some resemblance to those of a Genetic Algorithm to predict our performance indices.
In our model, we consider the statistical parameters of interest to be genes of a chromosome. We may look upon a chromosome, $\mathbf{c}_{t,i}$, corresponding to the statistical parameters of interest at time slice $t$ in observation $i$, to be a vector as follows:
$$
\mathbf{c}_{t,i} = (m_{t,i},\ s_{ot,i},\ s_{rt,i},\ n_{t,i})
$$
Here,
$m_{t,i}$ is the median of the performance index at time $t$ in observation $i$
$s_{ot,i}$ is the optimistic performance index at time $t$ in observation $i$
$s_{rt,i}$ is the robust performance index at time $t$ in observation $i$
$n_{t,i}$ is the number of accesses at time $t$ in observation $i$
$s_{ot,i}$ and $s_{rt,i}$ may correspond to the 10 and 90-percentiles or the first and third quartiles. In this paper, we consider the former pair, i.e., the percentiles. The number of accesses, $n_{t,i}$, indicates the representativeness of the data in consideration. A higher value of $n_{t,i}$ assures a higher weight for the corresponding historical performance data.
Once we have our chromosomes defined, we can describe the alive population as a matrix:
$$
\mathbf{P}_t =
\begin{bmatrix}
m_{t,1} & s_{ot,1} & s_{rt,1} & n_{t,1} \\
m_{t,2} & s_{ot,2} & s_{rt,2} & n_{t,2} \\
\vdots & \vdots & \vdots & \vdots \\
m_{t,l} & s_{ot,l} & s_{rt,l} & n_{t,l}
\end{bmatrix}
$$
Therefore, at time $t$, we have $l$ chromosomes in the alive population. That is, each row $i$ of $\mathbf{P}_t$ corresponds to an observation $i$ for the time slice $t$.
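A chromosome can be built from the raw throughput samples of one time slice in one observation as sketched below. The class and function names are hypothetical. The sketch assumes, in line with the comparison described in Section 4.1, that the percentiles are counted on data sorted in descending order, so the optimistic index is the value exceeded by only 10% of the samples and the robust index is the value exceeded by 90% of them.

```python
from statistics import median, quantiles
from typing import NamedTuple, Sequence

class Chromosome(NamedTuple):
    m: float    # median of the performance index
    s_o: float  # optimistic index (exceeded by about 10% of samples)
    s_r: float  # robust index (exceeded by about 90% of samples)
    n: int      # number of accesses

def make_chromosome(samples: Sequence[float]) -> Chromosome:
    """Summarize the samples of one time slice in one observation."""
    # deciles[0] is the 10th and deciles[8] the 90th ascending percentile.
    deciles = quantiles(samples, n=10, method="inclusive")
    return Chromosome(m=median(samples), s_o=deciles[8],
                      s_r=deciles[0], n=len(samples))

# Hypothetical samples [kbps]; one chromosome per observation forms P_t.
population = [make_chromosome(list(range(91, 111))),
              make_chromosome([80, 85, 90, 95, 100, 140])]
for row in population:
    print(row)
```

The list `population` plays the role of the alive population matrix for one time slice, with one row per observation.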
The ages of different chromosomes differ. We are considering a model of fixed alive population. Then, in order to accommodate places for newly born offspring, the old ones have to die. Here, a new offspring, for example $\mathbf{c}_{t,l+1}$, corresponds to a new observation $l + 1$ for time slice $t$. The chromosome that dies due to the inclusion of a new offspring normally goes beyond our active consideration.
In order to obtain the prediction of network performance for observation $l + 1$ at time slice $t$, we normally employ a subset of all these observations or chromosomes. Therefore, we will use a sub-matrix of $\mathbf{P}_t$ to be our learning set or learning window. The learning window, with $w$ rows, can be defined as:
$$
\mathbf{W}_t =
\begin{bmatrix}
m_{t,l-w+1} & s_{ot,l-w+1} & s_{rt,l-w+1} & n_{t,l-w+1} \\
m_{t,l-w+2} & s_{ot,l-w+2} & s_{rt,l-w+2} & n_{t,l-w+2} \\
\vdots & \vdots & \vdots & \vdots \\
m_{t,l} & s_{ot,l} & s_{rt,l} & n_{t,l}
\end{bmatrix}
$$
In order to account for the first type of time dependence, we use a strategy that the nearest past bears the closest resemblance to the immediate future. As such, while generating the statistics indicating historical time dependence, we assign highest preference to the chromosome corresponding to the nearest past for a fixed time slot. In doing so, we use the following normalized weight vector $\mathbf{p}$:
$$
\mathbf{p} = (p_{l-w+1},\ p_{l-w+2},\ \ldots,\ p_l)
$$
We then compute the historical time dependence, $\mathbf{h}_{t,l+1}$, as the product of the normalized weight vector $\mathbf{p}$ and the learning set, $\mathbf{W}_t$:
$$
\mathbf{h}_{t,l+1} = \mathbf{p} \cdot \mathbf{W}_t
$$
The second type of time dependence, i.e. the roll over factor, $\mathbf{r}_{t,l+1}$, may be computed with the aid of the roll over function, which itself depends on several parameters. These are: the roll over ratio, $r_{t,l+1}$, defined as:
$$
r_{t,l+1} = \frac{\sum_{i=l-w+1}^{l} s_{ot,i} * p_i - \sum_{i=l-w+1}^{l} s_{rt,i} * p_i}{\sum_{i=l-w+1}^{l} s_{o(t-1),i} * p_i - \sum_{i=l-w+1}^{l} s_{r(t-1),i} * p_i}
$$
the characteristic function $g_t(l + 1)$ and the chromosome $\mathbf{c}_{t-1,l+1}$.
The vector roll over factor, $\mathbf{r}_{t,l+1}$, can then be defined as
$$
\mathbf{r}_{t,l+1} = r_{t,l+1} * g_t(l + 1) * \mathbf{c}_{t-1,l+1}
$$
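The historical factor, the roll over ratio and the roll over factor can be sketched in a few functions. The weight vector is only required to be normalized and to favor the nearest past; the linear rank weights used in `normalized_weights` are an assumption made for illustration, as are all function names and sample values.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float, float]  # (m, s_o, s_r, n)

def normalized_weights(w: int) -> List[float]:
    """Assumed weighting: linear ranks 1..w (oldest..newest), summing to 1."""
    total = w * (w + 1) / 2
    return [k / total for k in range(1, w + 1)]

def historical_factor(p: Sequence[float], window: Sequence[Row]) -> Row:
    """h = p . W_t: weighted sum of the rows of the learning window."""
    return tuple(sum(pi * row[j] for pi, row in zip(p, window))
                 for j in range(4))

def spread(p: Sequence[float], window: Sequence[Row]) -> float:
    """sum_i s_o,i * p_i - sum_i s_r,i * p_i for one time slice."""
    return (sum(pi * row[1] for pi, row in zip(p, window))
            - sum(pi * row[2] for pi, row in zip(p, window)))

def roll_over_ratio(p, window_t, window_prev) -> float:
    """Weighted spread at slice t divided by weighted spread at slice t - 1."""
    return spread(p, window_t) / spread(p, window_prev)

def roll_over_factor(ratio: float, g: int, c_prev: Row) -> Row:
    """r = ratio * g * c_{t-1,l+1}; g is 0 (no dependence) or 1."""
    return tuple(ratio * g * x for x in c_prev)

# Hypothetical learning windows (w = 3) for slices t and t - 1.
p = normalized_weights(3)
W_t = [(100.0, 120.0, 80.0, 10.0), (110.0, 130.0, 90.0, 12.0), (120.0, 150.0, 90.0, 8.0)]
W_prev = [(50.0, 60.0, 40.0, 9.0), (55.0, 65.0, 45.0, 11.0), (60.0, 75.0, 45.0, 10.0)]
h = historical_factor(p, W_t)
ratio = roll_over_ratio(p, W_t, W_prev)
r = roll_over_factor(ratio, 1, (58.0, 70.0, 46.0, 14.0))
print("h =", h, " ratio =", ratio, " r =", r)
```

Any other normalized, increasing weight sequence (for example geometric weights) could be substituted without changing the rest of the computation.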
The roll over factor, however, influences our prediction model depending on its importance, determined by the following parameter:
$$
n_{A,l+1} = n_{t-1,l+1}
$$
which will later be normalized for actual prediction. Here $n_{t-1,l+1}$ corresponds to the chromosome $\mathbf{c}_{t-1,l+1}$. The implication of taking this number of accesses into account is that the roll over factor would be more meaningful if there were a greater number of accesses and less meaningful if there were fewer accesses.
The other factor involved in this process is
$$
n_{A',l+1} = \operatorname{median}\{\, n_{t,i} : 1 \le i \le l \,\}
$$
We then define the roll over weight as follows:
$$
a_{t,l+1} = \frac{n_{A,l+1}}{n_{A,l+1} + n_{A',l+1}}
$$
Mutation. Once equipped with the above tools, we describe prediction as a mutation of all genes, in this case in the chromosomes of the learning set, $\mathbf{W}_t$, and $\mathbf{c}_{t-1,l+1}$. That is, all chromosomes concerned in this case change to some extent and thus we have the prediction vector:
$$
\mathbf{v}_{t,l+1} = (1 - a_{t,l+1}) * \mathbf{h}_{t,l+1} + a_{t,l+1} * \mathbf{r}_{t,l+1}
$$
It is worth noting that we have employed a somewhat new type of mutation to have an estimate for prediction. In the ideal case, the prediction vector $\mathbf{v}_{t,l+1}$ should be close to the vector corresponding to the chromosome for actual observed performance, i.e. $\mathbf{c}_{t,l+1}$.
Birth and death. Using mutation, we have an estimate or prediction of how the network is expected to behave. Then, as we observe the actual performance, we have a new chromosome $\mathbf{c}_{t,l+1}$. As mentioned before, similar to the natural process of life and death, the oldest chromosome dies in order to make room for the new chromosome in the alive set. The implication is to emphasize the latest set of network performance information for use in the future. The matrix corresponding to the alive population set after one such birth–death process is as follows:
$$
\mathbf{P}_t =
\begin{bmatrix}
m_{t,2} & s_{ot,2} & s_{rt,2} & n_{t,2} \\
m_{t,3} & s_{ot,3} & s_{rt,3} & n_{t,3} \\
\vdots & \vdots & \vdots & \vdots \\
m_{t,l+1} & s_{ot,l+1} & s_{rt,l+1} & n_{t,l+1}
\end{bmatrix}
$$
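The roll over weight and the mutation step amount to a convex combination of the historical factor and the roll over factor, which can be sketched as follows. Function names and numbers are hypothetical, and the guard for the case of no accesses at all is an assumption added so that the sketch does not divide by zero.

```python
from statistics import median

def roll_over_weight(n_prev_slice, n_history):
    """a = n_A / (n_A + n_A'), with n_A = n_{t-1,l+1} (accesses in the
    previous slice of the current observation) and n_A' = median of the
    historical access counts n_{t,i} for slice t."""
    n_a = n_prev_slice
    n_a_prime = median(n_history)
    if n_a + n_a_prime == 0:   # assumed fallback: no data, rely on history
        return 0.0
    return n_a / (n_a + n_a_prime)

def predict(h, r, a):
    """v = (1 - a) * h + a * r, applied gene by gene."""
    return tuple((1 - a) * hj + a * rj for hj, rj in zip(h, r))

# Hypothetical inputs: many accesses in slice t - 1, few in the history.
a = roll_over_weight(30, [10, 12, 8])
v = predict(h=(100.0, 120.0, 80.0, 10.0), r=(120.0, 160.0, 100.0, 30.0), a=a)
print("a =", a, " v =", v)
```

With 30 accesses in the previous slice against a historical median of 10, the weight is 0.75, so the prediction leans towards the roll over factor; with few recent accesses it would lean towards the historical factor instead.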
The learning set after a birth and death process would also change accordingly as:
$$
\mathbf{W}_t =
\begin{bmatrix}
m_{t,l-w+2} & s_{ot,l-w+2} & s_{rt,l-w+2} & n_{t,l-w+2} \\
m_{t,l-w+3} & s_{ot,l-w+3} & s_{rt,l-w+3} & n_{t,l-w+3} \\
\vdots & \vdots & \vdots & \vdots \\
m_{t,l+1} & s_{ot,l+1} & s_{rt,l+1} & n_{t,l+1}
\end{bmatrix}
$$
Cross-over. Sometimes network performance changes abruptly. Then it is likely that the prediction model would not be able to provide a good estimate of the throughput. Mathematically, we shall call an estimate not so good if the difference between the actually observed and the predicted chromosomes:
$$
d_{t,l+1} = \mathbf{c}_{t,l+1} - \mathbf{v}_{t,l+1}
$$
exceeds some threshold $T$, i.e.,
$$
d_{t,l+1} > T
$$
In such cases, if $d_{t,l+1}$ remains high for a considerably long time, a cross over would take place. The idea is to replicate all the genes of the latest chromosome in the oldest ones in the learning set, excluding the one corresponding to the number of accesses. Then the system is tested to determine whether $d_{t,l+1}$ is below the threshold value. If not, the process is repeated. However, the number of such iterations is limited by the window size. The value of $T$ could be some fixed percentage of $\mathbf{v}_{t,l+1}$.
The matrix for the learning set after one birth and death and one cross over would be something like:
$$
\mathbf{W}_t =
\begin{bmatrix}
m_{t,l-w+3} & s_{ot,l-w+3} & s_{rt,l-w+3} & n_{t,l-w+3} \\
\vdots & \vdots & \vdots & \vdots \\
m_{t,l+1} & s_{ot,l+1} & s_{rt,l+1} & n_{t,l+1} \\
m_{t,l+1} & s_{ot,l+1} & s_{rt,l+1} & n_{t,l-w+2}
\end{bmatrix}
$$
4. Experiments and Evaluation
4.1. Experimental Set-up
Since we have proposed an engineering approach rather than a complex mathematical one, experimental verification is essential for its validation. We therefore tested our model with long-term data between several networks. Our model performed equally well in these cases to predict network throughput dynamically. In the following text, we therefore focus on only one such pair. This particular pair consists of inbound traffic to RIEC-net from goo-net. RIEC-net (riec.tohoku.ac.jp) is an academic network of Tohoku University Research Institute of Electrical Communication. Goo-net (goo.ne.jp) is a popular commercial network in Japan providing a variety of services including free web mail accounts, important news, advertisements etc.
The data traffic log for downstream traffic between RIEC-net and goo-net for a period of one year was analyzed. To maintain conformance with observations and characterization efforts in our previous works, we used the same data set as in References 1, 21 and 23 for evaluation purposes. We considered some portion of the network trace data as the learning set and tried to make predictions, using our model, for the remainder of it. This is done with a view to imitating real-time dynamic prediction of network traffic. We then compare the degree to which the predictions and the actual values are in agreement. We also analyze the effects of varying learning set sizes. Furthermore, the behavior of the prediction model facing some abrupt changes in network performance is also analyzed.
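The replay described above, together with the birth and death and cross over updates of Section 3.2, can be sketched as a walk-forward loop over a trace. This is a simplified illustration and not the authors' implementation: the predictor uses only a rank-weighted historical factor (the roll over factor is omitted), the threshold is taken as a percentage of the prediction, and the `patience` parameter, which decides how long the error must stay high before a cross over, is a hypothetical name and value.

```python
from collections import deque

def predict_from_history(window):
    """Rank-weighted average of the window rows (newest row weighs most)."""
    w = len(window)
    total = w * (w + 1) / 2
    return tuple(sum((k + 1) / total * row[j] for k, row in enumerate(window))
                 for j in range(4))

def deviates(actual, predicted, pct):
    """True if m, s_o or s_r is off by more than pct per cent of the prediction."""
    return any(abs(a - v) > pct / 100 * abs(v)
               for a, v in zip(actual[:3], predicted[:3]))

def cross_over(window, k=1):
    """Copy the genes (m, s_o, s_r) of the newest row into the k oldest rows,
    leaving each row's own access count n untouched."""
    newest = window[-1]
    for i in range(min(k, len(window) - 1)):
        window[i] = newest[:3] + (window[i][3],)

def replay(trace, w, pct=10.0, patience=2):
    """Walk forward through the trace; return (predicted, actual) pairs."""
    window = deque(trace[:w], maxlen=w)   # learning window W_t
    misses, results = 0, []
    for actual in trace[w:]:
        predicted = predict_from_history(window)
        results.append((predicted, actual))
        window.append(actual)             # birth; maxlen drops the oldest row (death)
        misses = misses + 1 if deviates(actual, predicted, pct) else 0
        if misses >= patience:            # error stayed high: cross over
            cross_over(window, k=misses - patience + 1)
    return results

# Hypothetical trace for one time slice: an abrupt jump after four observations.
low = (100.0, 120.0, 80.0, 10)
high = (300.0, 360.0, 240.0, 10)
results = replay([low] * 4 + [high] * 4, w=3)
for predicted, actual in results:
    print(f"predicted median {predicted[0]:6.1f}   actual median {actual[0]:6.1f}")
```

In this toy trace the cross over overwrites the last pre-jump row once the error has persisted for two observations, so the next prediction already matches the new level instead of being dragged down by stale history.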
As described earlier, no existing work predicts network throughput dynamically as ours does. At the same time, we propose a range bounded by two performance indices rather than predicting one single value. This is a new and practical way of addressing the problem. Therefore validation and performance comparison of our approach can be made solely with actual data, which we present in the following subsections. We compare our predicted ONPI with the 10th percentile and RNPI with the 90th percentile of the network throughput data sorted in descending order.
4.2. Accuracy of Prediction
We first consider how accurately our prediction model could perform. In presenting our results, we naturally emphasize the characteristics of our model for the period of the day when the demand for network bandwidth is most crucial. A correct prediction for this period is likely to be more important than for any other time. Also, we show the accuracy for a particular day in Figure 3. Other days follow similar convergence characteristics.
Figure 3 shows our predicted RNPI and ONPI network throughput for a particular learning set size of 8 with broken lines, and the corresponding actual values with solid lines. We can easily notice that both ONPI and RNPI predictions by our model fit quite well with the actual values. In all cases, except for some abrupt changes in network traffic, both ONPI and RNPI match within 10% of the actual value. Therefore we can infer that under normal network behavior, our model works well enough.
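The "within 10% of the actual value" criterion can be made concrete with a small helper that reports the share of predictions meeting a given tolerance. The helper names and the (predicted, actual) pairs below are invented for illustration and are not results from the paper.

```python
def relative_error(predicted, actual):
    """Absolute prediction error as a fraction of the actual value."""
    return abs(predicted - actual) / abs(actual)

def fraction_within(pairs, tolerance=0.10):
    """Share of (predicted, actual) pairs whose relative error is within tolerance."""
    hits = sum(1 for p, a in pairs if relative_error(p, a) <= tolerance)
    return hits / len(pairs)

# Hypothetical (predicted, actual) RNPI values in kbps for four time slices.
pairs = [(21000, 20000), (30500, 30000), (26000, 32000), (40000, 41000)]
print(f"{fraction_within(pairs):.0%} of predictions are within 10% of the actual value")
```

The same helper can be applied separately to the ONPI and RNPI series to compare how closely each index tracks its observed counterpart.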
One interesting point to note is that, in the case of ONPI, there is a larger difference between what we expected to see and what we actually observed. However, in the case of RNPI, the prediction and the observation are quite close. This is important because at RNPI, we want a robust estimate of the performance index. The significance is that the reliability of the model increases with the increase in requirement.
4.3. Effect of Learning Set Size
In our experiments, we have observed that the size of the learning set often influences the accuracy of the prediction. More interestingly, although a reasonably large learning set size is essential for optimum prediction, a much larger learning set size may not always be something we should opt for.
In Figure 4 we observe that in comparison with
a reasonable learning set size, a learning set with
higher cardinality (here l = 12) performs worse in
predicting both the ONPI and RNPI. It should still
be noted that the downgrading effect caused by
too large a learning set is more prominent for ONPI
than for RNPI. This is good for the users of this
model as the error in the predicted RNPI is still
near the range of acceptability.
4.4. Effect of Abrupt Changes
Any model that learns from experience cannot be
expected to predict well when impulse behavior
occurs. It is therefore natural that the prediction
accuracy of our model suffers when network
performance changes abruptly. Figure 5 shows one
such example. One interesting point to note is that
here, too, the predicted ONPI differs to a great
extent from the actual behavior, whereas the
predicted RNPI does not differ much from the
actual value. This once again reaffirms that our
model works well where it is needed.
However, the model shows much better
characteristics in some other occurrences of abrupt
behavior (Figure 6), when the abrupt change is
learned quickly.
Figure 3. Accuracy of prediction
(Plot: network throughput [kbps] against hours of day, 13 to 19; predicted and actual ONPI and RNPI.)
Figure 4. Effect of learning set size
(Plot: network throughput [kbps] against hours of day, 13 to 19; predicted and actual ONPI and RNPI.)
Figure 5. Erroneous prediction in case of abrupt change
(Plot: network throughput [kbps] against hours of day, 13 to 19; predicted and actual ONPI and RNPI.)
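The different sensitivity of an upper and a lower index to an abrupt change can be illustrated with a toy example. This is only a sketch under loudly stated assumptions: the index definitions used here (optimistic = largest sample of the hour, robust = smallest sample of the hour) and the predictor (mean of the last l hourly indices) are hypothetical stand-ins, not the ONPI and RNPI formulas of this paper, and the hourly samples are invented for illustration.

```python
from statistics import mean

# Invented hourly throughput samples [kbps]; hour 17 contains a burst.
samples = {
    13: [52000, 61000, 58000, 70000],
    14: [50000, 63000, 57000, 69000],
    15: [51000, 60000, 59000, 71000],
    16: [53000, 62000, 56000, 70000],
    17: [52000, 64000, 58000, 310000],  # abrupt surge in one sample
    18: [51000, 61000, 57000, 72000],
}


def actual_indices(values):
    """Stand-in indices (an assumption): optimistic = max, robust = min."""
    return max(values), min(values)


def predict_indices(history, l):
    """Predict both indices as the mean of the last l observed indices."""
    recent = history[-l:]
    return mean(o for o, _ in recent), mean(r for _, r in recent)


l = 3  # learning set size, in hours
hours = sorted(samples)
history = [actual_indices(samples[h]) for h in hours]

report = {}  # hour -> (optimistic error, robust error)
for i in range(l, len(hours)):
    pred_o, pred_r = predict_indices(history[:i], l)
    act_o, act_r = history[i]
    report[hours[i]] = (abs(pred_o - act_o), abs(pred_r - act_r))
    print(f"hour {hours[i]}: optimistic error = {report[hours[i]][0]:9.1f}, "
          f"robust error = {report[hours[i]][1]:7.1f} kbps")
```

In this toy setting the burst at hour 17 produces a large error in the upper index only, and it still distorts the upper prediction for hour 18 because the burst remains in the learning set. A smaller l would forget the burst sooner, at the cost of noisier predictions in calm periods.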
5. Conclusions
Network performance prediction based on
network measurement traces appears to be a
daunting challenge to network researchers.
Several works in different contexts have addressed
it in the past. These contributions, however, mostly
focus on devising complex mathematical
formulations to predict largely pseudo-static network
behavior in the future. In this paper, we address
network throughput prediction as an engineering
problem. The main contribution of this paper is the
dynamic prediction of network throughput for the
immediate future. Our proposal also considers the
practical implications of prediction. Therefore,
instead of following the conventional approach of
predicting a single value, we predict a range within
which network performance may lie. This range is
bounded by our two newly proposed indices,
namely the Optimistic Network Performance Index
(ONPI) and the Robust Network Performance Index
(RNPI). This approach has practical implications
for various applications. We examine our model
using one year of data transactions between
several pairs of networks. We find that our
model performs quite well in predicting network
performance dynamically. Besides forecasting
network throughput, this model is also a
candidate for predicting other network
performance parameters such as round trip delay
and unidirectional delay.
Figure 6. Abrupt change not affecting the prediction considerably
(Plot: network throughput [kbps] against hours of day, 13 to 19; predicted and actual ONPI and RNPI.)
If you wish to order reprints for this or any
other articles in the International Journal of
Network Management, please see the Special
Reprint instructions inside the front cover.
