Enrolment: 19BCE523
Subject: Computer Networks
Subject Code: 2CS502
Topic: End to End Routing Behaviour
in the Internet
Introduction:
Routing algorithms have been researched extensively, but little work has examined
how routing actually behaves in the Internet at large scale. The main purpose of
this report is to answer three questions. What kinds of routing failures and
pathologies occur? Do routes change frequently, or do they remain stable for long
periods? Are routes symmetric, i.e. is the route from A to B the same as the route
from B to A? To answer these questions, we consider a large sample of Internet
routes spread over different locations. More than 40,000 routes were analysed
with the help of 37 Internet sites. We assume that this sample is representative
of routing in the Internet in general. We also analyse how routes change over
time.
We make a clear distinction between routing protocols and routing behaviour. A
routing protocol is a procedure for disseminating routing information within a
network and for using that information to forward traffic. Routing behaviour
describes how routing algorithms perform in practice. The difference matters
because routing protocols are studied frequently, whereas routing behaviour is
rarely the subject of research, and the literature reflects this. Most of the
available research on routing behaviour is based only on simulation. The few
measurement studies that exist are almost always qualitative, or were carried
out on small networks rather than large ones. One exception is Chinoy, who
studied routing behaviour at large scale.
Chinoy found wide variation in the dynamics of routing information. Some routers
sent routing updates from time to time even though no connectivity information
had changed. Most of the changes in routing information occurred at the edges of
the network, while the backbone hardly changed. During outages some networks
were unreachable, for periods ranging from a few minutes to a few hours. Most
networks were nearly quiescent, but a few were not.
We use the term virtual path to denote the network-level abstraction of a direct
link between two Internet hosts. For example, if host A wishes to establish a
network-level connection to host B, we write the virtual path from A to B as
A=>B.
At any point in time, a virtual path is realised at the network layer by a single
route, which is simply the sequence of routers along which packets are sent. The
route behind a virtual path may change from time to time, or it may remain quite
stable.
Thus the question raised by Chinoy's work is: given two hosts, how does the
virtual path between them behave? This is the principal question this study
tries to answer.
Routing in internet:
For the purpose of routing, the Internet is divided into a disjoint set of
autonomous systems. Originally, an autonomous system was a collection of routers
and hosts unified by running a single IGP (Interior Gateway Protocol). Over time
the idea has evolved to be essentially equivalent to an administrative domain,
in which the hosts and routers are unified by a single administrative authority
and a set of IGPs. Routing between autonomous systems provides the highest level
of Internet interconnection. All the significant autonomous systems use BGP,
which is currently in its 4th version. BGP allows arbitrary interconnection
topologies between autonomous systems and provides a mechanism for preventing
routing loops between them. Whether BGP will scale to a very large Internet
depends on the stability of inter-autonomous-system routing. The phenomenon in
which routes between autonomous systems change frequently is known as flapping.
When routes flap, BGP spends a great deal of time updating its routing tables
and propagating the routing changes.
Autonomous systems are large entities that can have significant internal
instabilities, so it is important to note that stable inter-autonomous-system
routing does not imply stable end-to-end routing.
Methodology:
This section explains the procedure used for our study.
Experimental apparatus:
We recruited a number of Internet sites to run our experiments; the sites are
listed in a table in the next section. Each site ran an NPD (Network Probe
Daemon), a program that provides several network measurement services. A control
program, “npd_control”, ran on our local workstation and contacted the NPDs from
time to time, asking each one to use traceroute to measure the route to another
NPD site.
In the first set of measurements, termed D1, each virtual path between two NPD
sites was measured with a mean interval of 1-2 days. In the second set, termed
D2, 60% of the measurements were made with a mean inter-measurement interval of
2 hours and 40% with a mean interval of 2.75 days. The D1 interval was chosen so
that each NPD made a traceroute measurement every two hours on average. As more
sites were added to the experiment, the rate at which any particular pair of NPD
sites was measured decreased, in order to maintain the average load of one
measurement per two hours. This eventually led to the 1-2 day mean interval.
After examining the D1 data, we realised that such a large sampling interval
would prevent us from answering many questions about routing stability. Learning
from this, for D2 we made measurements between pairs of NPD sites in bursts,
with a mean interval of two hours between the measurements in each burst.
Because we also needed data to assess routing stability over long periods, we
continued to make lower-frequency measurements between pairs of sites as well.
We arranged the measurements so that 50% came in bursts and 50% were widely
spaced. In addition, we had traceroute measurements obtained from a TCP study
conducted using the NPD framework. These were also made two hours apart on
average, so including them shifted the mix to 60% in bursts and 40% widely
spaced.
We also paired the bulk of the D2 measurements: we first measured the virtual
path A=>B and then immediately measured the virtual path B=>A.
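A paired measurement lets the two directions of a virtual path be compared
directly. The following is a minimal sketch, assuming each traceroute has already
been reduced to a list of router names. The function name and the hop names are
hypothetical and are not taken from the study. A pair is treated as symmetric
when the B=>A route visits the same routers as the A=>B route in reverse order.

```python
def is_symmetric(route_ab, route_ba):
    """Return True if the B=>A route is the A=>B route in reverse order."""
    return list(route_ab) == list(reversed(route_ba))


# Hypothetical router names, for illustration only.
forward = ["r1", "r2", "r3"]
print(is_symmetric(forward, ["r3", "r2", "r1"]))  # True
print(is_symmetric(forward, ["r3", "r4", "r1"]))  # False
```

In practice a router may report a different interface address in each direction,
so a real comparison would probably need to be made at a coarser granularity
than individual addresses.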
Exponential Sampling:
As stated in the introduction, 37 Internet hosts participated in this routing
study. This is a very small fraction of the estimated 6.6 million Internet hosts
(an estimate from the latter half of 1995). On that basis alone, the behaviour
we observe cannot be claimed to represent the Internet as a whole. The hosts
also belong to only 34 different stub networks, again a very small fraction of
the roughly 50,000 stub networks. Nevertheless, we argue that the 37 hosts are
plausibly representative, because together they include a non-negligible
fraction of the autonomous systems that make up the Internet.
Confidence Interval:
Throughout our study we often want to assign a confidence interval to a
probability estimated from our data. An estimate is not of much use unless we
also have an idea of its possible error. We therefore associate an interval with
each estimate, the interval being a range of values that, with high confidence,
includes the true probability p.
We use 95% confidence intervals throughout, corresponding to c=0.95 and
c’≈0.553.
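The interval formula used in the study is not reproduced in this report, so the
following is only an illustration using a different, standard technique: the
Wilson score interval for a proportion. The function name and the counts are
assumptions made for the example, and the result should not be read as the
study's own interval.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion k/n.

    z=1.96 is the standard normal quantile for 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = k / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: an event seen in 10 of 1,000 observations.
low, high = wilson_interval(10, 1000)
print(f"estimate {10 / 1000:.3f}, 95% interval [{low:.4f}, {high:.4f}]")
```

The interval narrows as the number of observations grows, which is why a large
sample of routes matters when estimating how often rare pathologies occur.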
Failures in measurement:
In the two experiments, about 5-8% of the traceroute measurements failed because
npd_control was unable to contact the remote NPD. Each such failure means losing
a chance to observe a possible lack of connectivity, which biases our results
towards underestimating Internet connectivity failures.
Routing Pathologies:
First, we examine routing pathologies, meaning routes that show either clearly
inferior performance or out-and-out broken behaviour.
Routing Loops:
Here we discuss the pathology of routing loops. We distinguish three types. In a
forwarding loop, packets forwarded by a router eventually return to that same
router. In an information loop, a router acts on connectivity information that
it in fact originated itself. In a traceroute loop, a traceroute measurement
reports the same sequence of routers multiple times.
Routing algorithms are normally designed to avoid forwarding loops. A loop can
still form when network connectivity changes and the change is not immediately
propagated to all routers. Since a forwarding loop represents a connectivity
failure, it is important that such loops resolve themselves as quickly as
possible.
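A traceroute loop shows up as a sequence of routers that repeats in the measured
hops. The following is a minimal sketch of such a check, assuming each
traceroute is a list of router names with unanswered hops already removed. The
function name and the repeat threshold are assumptions for illustration, not the
study's actual criteria. It only reports a loop that is still present when the
trace ends.

```python
def find_traceroute_loop(hops, min_repeats=3):
    """Return the repeating router cycle if the trace ends in a loop, else None.

    A loop is reported when the final hops consist of the same sequence of
    two or more distinct routers repeated at least `min_repeats` times.
    """
    n = len(hops)
    for size in range(2, n // min_repeats + 1):
        cycle = hops[n - size:]
        tail = hops[n - size * min_repeats:]
        if len(set(cycle)) == size and tail == cycle * min_repeats:
            return cycle
    return None


# Hypothetical router names, for illustration only.
looping = ["a", "b", "c", "d", "c", "d", "c", "d"]
print(find_traceroute_loop(looping))               # ['c', 'd']
print(find_traceroute_loop(["a", "b", "c", "d"]))  # None
```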
For the purpose of our analysis, any traceroute showing a loop that is still
unresolved when the traceroute ends is considered a persistent loop. In our
study, 10 traceroutes in D1 showed persistent loops.
Similarly, there were 50 persistent loops in D2. An upper bound on how long a
loop persisted can be obtained by looking for neighbouring measurements between
the same hosts that do not show the loop. Sometimes the neighbouring
measurements also show the loop, which allows us to assign a lower bound as
well. The table below shows the persistent routing loops in D2.
Erroneous Routing:
One example of erroneous routing was found in D1, in which the packets took the
wrong path. It involved the connix=>ucl route, where a trans-Atlantic route went
to Rehovot, Israel instead of London. No such erroneous routing was found in D2.
This shows that one cannot make assumptions about where one's packets will
travel in the Internet.
In 10 of the D1 traces, routing connectivity reported earlier in the traceroute
was later lost or altered, which indicated a routing failure. A few of these
failures were due to outages. During an outage, the intermediate routers were
updating their view of the current topology, which led them to drop packets,
presumably because they did not know how to forward them. Recovery times varied
widely: some failures resolved in about 100 msec, while others took about a
minute. Slow recovery creates problems for applications that require real-time
support.
Fluttering:
Apart from the traceroute failures caused by persistent routing loops and
erroneous routing, 125 traceroutes from D1 and 617 traceroutes from D2 did not
reach the expected destination. We classify these cases as infrastructure
failures, in which a route stops working in the middle of the network.
Summary of pathologies:
The above table gives a summary of pathologies.
Definitions of stability:
Routing stability can be defined in two different ways. The first is: “given
that a route r is observed now, what is the probability that the same route r
will be observed again in the future?” This notion of stability is referred to
as prevalence.
The second is: “given that a route r is observed at time t, how long will that
route remain the same?” This notion of stability is known as persistence.
Persistence affects how effectively routing state can be managed.
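Prevalence can be estimated from repeated measurements of one virtual path as
the fraction of observations that show a given route. The following is a minimal
sketch for the most frequently observed route, assuming each observation is a
sequence of router names. The function name and the sample data are hypothetical
and are not taken from the study.

```python
from collections import Counter


def dominant_route_prevalence(routes):
    """Return (most frequent route, fraction of observations showing it)."""
    if not routes:
        raise ValueError("need at least one observed route")
    counts = Counter(tuple(route) for route in routes)
    route, k = counts.most_common(1)[0]
    return route, k / len(routes)


# Hypothetical observations of a single virtual path.
observed = [("r1", "r2", "r3")] * 3 + [("r1", "r4", "r3")]
route, prevalence = dominant_route_prevalence(observed)
print(route, prevalence)  # ('r1', 'r2', 'r3') 0.75
```

Persistence cannot be read off in the same way, because it also depends on when
each observation was made, not only on how often a route appears.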
Our stability analysis is confined to the D2 measurements, since they were made
at a wide range of intervals. This lets us resolve ambiguities in persistence
and assess stability over many time scales. From the roughly 35,000 D2
measurements we omitted those showing pathologies and those with missing
traceroute hops, which left 31,709 measurements.
We assess the differences in $\hat{\pi}_{\mathrm{dom}}$ between the sites in our
study.
The more difficult task is to determine the persistence of routes: how long are
they likely to last before changing?
We first need to determine whether routing changes on short time scales, so that
we can correctly interpret measurements of persistence. If routes do not change
over short time spans, then measurements made a short time apart that observe
the same route can be taken to mean the route did not change in between. The
same reasoning can then be used to assess whether routes change on medium time
scales.
We found that, except at a few sites, route changes do not occur in less than
one hour. We therefore next assess measurements made an hour or less apart, to
determine whether the same holds for medium-scale routing persistence. For
measurements made less than an hour apart, let Phrsrc s and Phrdst s be the
analogs of P10src s and P10dst s.
Summary:
This report is based on an analysis of more than 40,000 Internet routes measured
between various Internet sites. The research characterises pathological routing
conditions, routing symmetry and routing stability.
A recurring theme in our research is that there is a great deal of variation. We
have seen time and again that different sites, or groups of sites, exhibit
distinct routing characteristics. The differences among sites are significant
enough that we have not found any typical Internet site, nor any typical
Internet path. Nevertheless, the range of our findings gives good insight into
the breadth of routing behaviour and into how routing works from an end-point
view.