
COL334: Computer Networks (CSE, IITD, Semester-I-2020-21) Assignment-2

By: Pratik Pranav - 2018CS10368

1. Use wireshark to grab all packets on your wireless interface, while visiting the website www.cse.iitd.ac.in
from your browser. Do an ipconfig /flushdns before you do this activity to clear your local DNS cache.
And also clear your browser cache. Report the following:
(a) Apply a “dns” filter on the packet trace, and see if you can find DNS queries and responses for
www.cse.iitd.ac.in. What DNS server was used? How long did it take for the DNS request-response
to complete?

Solution: DNS packets are easily visible in the Wireshark trace, and the DNS server can be read off the
standard query response for www.cse.iitd.ac.in. The request-response time is the difference between the
response and query timestamps (see the sketch below).
DNS Server: 2409:4064:4db0:4fc5::3b
Time Taken: 11.943916147 s (response) - 11.939556312 s (query) = 0.004359835 s (≈ 4.4 ms)
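
As a small illustration of the same measurement (not part of the original capture), the sketch below times
a DNS lookup for the same hostname using only the Python standard library; note that getaddrinfo may be
answered from the OS cache, so the number is only comparable after flushing the cache as in the question.

import socket
import time

host = "www.cse.iitd.ac.in"

start = time.monotonic()
# getaddrinfo sends the DNS query to the system's configured resolver
addrs = socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
elapsed = time.monotonic() - start

# print the first resolved address and the query-to-response time in ms
print(f"Resolved {host} to {addrs[0][4][0]} in {elapsed * 1000:.2f} ms")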

(b) Apply an “http” filter on the packet trace and report the approximate number of HTTP requests
that were generated to download all the objects on the home-page. What can you tell from this
observation about how web-pages are structured, and how browsers render complex pages with
multiple images and files?

Solution:
• Approximately 23 HTTP (GET) requests were generated to download all the objects on the home-page
(see the sketch below for one way to count them from the capture).
• The objects arrive as raw bytes on the wire and are then interpreted by the browser as HTML, CSS,
JavaScript or images as needed.
• The HTML and CSS objects appear first in the transfer, which suggests they are rendered first (or
effectively simultaneously) to build and style the page skeleton.
• The JavaScript files come next and are attached to the already-built HTML/CSS page to make it
more interactive.
• Finally, the images are received by the browser and added to the rendered page.
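
As a rough cross-check of the GET count, here is a sketch that scans a saved capture for plain-text HTTP
GET requests using scapy; the capture file name is hypothetical and scapy is assumed to be installed.

from scapy.all import rdpcap, TCP, Raw

packets = rdpcap("cse_iitd_home.pcap")   # hypothetical capture file

gets = 0
for pkt in packets:
    # plain HTTP rides on TCP; an unencrypted request starts with the method name
    if pkt.haslayer(TCP) and pkt.haslayer(Raw) and bytes(pkt[Raw].load).startswith(b"GET "):
        gets += 1

print("HTTP GET requests observed:", gets)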

(c) Apply a filter such as “((ip.src==192.168.1.3 && ip.dst==10.7.174.111) || (ip.src==10.7.174.111
&& ip.dst==192.168.1.3)) && tcp”. As would be self-explanatory, this will filter for TCP packets
moving between your browser and the web-server. Recall that the source and destination IP ad-
dresses are a part of the network layer header, which is also called the IP layer since IP (Internet
Protocol) is the most common network layer protocol in use. Find the number of TCP connec-
tions that were opened between your browser and the web-server. Recall that a TCP connection is
identified by the 4-tuple (source IP, destination IP, source port, destination port).

Solution:
• Around 1000 TCP packets were observed flowing between the two hosts.
• The number of TCP connections can be found by counting the SYNs sent from my machine: add
tcp.flags.syn == 1 to the filter above, and then tcp.flags.ack == 0 to exclude the SYN/ACKs that the
web-server sends back when accepting those connections.
• With this filter applied, 6 TCP connections are observed (the sketch below reproduces the count by
unique 4-tuples).
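
The same count can be reproduced programmatically; the sketch below (scapy assumed installed, capture
file name hypothetical, addresses taken from the filter quoted above) collects the unique 4-tuples of
client-initiated SYNs.

from scapy.all import rdpcap, IP, TCP

packets = rdpcap("cse_iitd_home.pcap")          # hypothetical capture file
client, server = "192.168.1.3", "10.7.174.111"

connections = set()
for pkt in packets:
    if pkt.haslayer(IP) and pkt.haslayer(TCP):
        ip, tcp = pkt[IP], pkt[TCP]
        syn = bool(tcp.flags & 0x02)
        ack = bool(tcp.flags & 0x10)
        # SYN set and ACK clear = connection opened by the browser, not a SYN/ACK from the server
        if syn and not ack and ip.src == client and ip.dst == server:
            connections.add((ip.src, ip.dst, tcp.sport, tcp.dport))

print("TCP connections opened:", len(connections))   # expected: 6 for this trace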

(d) In the previous part, do you find that several content objects are fetched over the same TCP
connection?


Solution: Yes, several content objects are fetched over the same TCP connection:
• HTTP is an application-layer protocol that runs over an underlying transport-layer protocol, most
commonly TCP, and that is the case here.
• Every item transferred over HTTP, as seen in the Wireshark trace, is in turn carried over one of these
TCP connections.
• The items transferred over HTTP include the HTML, CSS, JavaScript and images needed to render
the final webpage, and several of them share the same connection rather than each opening a new one.

(e) Notice that before an HTTP message is sent on a new TCP connection, a 3-way handshake is
first performed to establish the TCP connection. The client sends a SYN message to the server,
the server replies with a SYN-ACK message, and the client then sends an ACK. You will find that
several TCP connections were opened between your browser and the web-server. How much time
does it take for this handshake, before the connection can be used to send/receive data? Given this
latency, what kind of optimizations do you think the browser might want to follow to minimize the
overall page-load time?

Solution: Looking at the first SYN, SYN/ACK, ACK exchange (the arithmetic is repeated in the sketch below):

• Time differences:
1. Timestamp at which the first SYN was sent: 4.529099825 s
2. Timestamp at which the first SYN/ACK was received: 4.642857252 s
3. Timestamp at which the first ACK was sent: 4.642976792 s
4. Timestamp of the first data packet: 4.756410129 s
5. Time taken by the handshake: ≈ 114 ms
6. Time from the first SYN to the first data packet: ≈ 227 ms
• Optimizations browsers use to minimize the overall page-load time:
1. Most importantly, browsers cache content from the web page, which greatly reduces the time
needed to load it on subsequent visits.
2. Browsers also reuse connections: several objects are fetched over the same persistent TCP
connection (with pipelined or parallel requests), so the handshake latency is not paid again for
every object.
3. Peer-to-peer resource sharing is another approach used to optimize web-page delivery nowadays:
a client that has already received some objects can share them with peers it is connected to,
which reduces the page-load time.
4. Browsers also reduce image-transfer latency by requesting compressed content, and optimize
dependencies by loading only the necessary plug-ins, removing multiple tracking scripts and
keeping cookie sizes small.
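
The handshake figures above are simple timestamp differences; a minimal sketch with the values copied
from this trace:

t_syn    = 4.529099825   # first SYN sent
t_synack = 4.642857252   # first SYN/ACK received
t_ack    = 4.642976792   # first ACK sent
t_data   = 4.756410129   # first data packet

print(f"3-way handshake time : {(t_ack - t_syn) * 1000:.0f} ms")    # ~114 ms
print(f"SYN to first data    : {(t_data - t_syn) * 1000:.0f} ms")   # ~227 ms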

(f) Report the total time taken for download of the entire webpage, measured as the difference between
the time at which the first DNS request was sent and the time at which the last content object was
received. This is called the page load time. Another useful metric that browsers try to optimize is
called the above-the-fold time, which is the time taken to first download objects that are sufficient to
render the part of the web-page that's visible on the screen, i.e. above the fold.


Solution: (measured for m.cricbuzz.com)
1. Page load time:
• Timestamp at which the DNS query for m.cricbuzz.com was sent: 2.54 s
• Timestamp of the last content packet before the Keep-Alive packets: 43.44 s
• Time taken to download the entire webpage: 43.44 s - 2.54 s = 40.90 s
2. Above-the-fold time:
• Timestamp at which the DNS query for m.cricbuzz.com was sent: 2.54 s
• Timestamp at which the first TCP "ACKed unseen segment" was received: 8.75 s
• Above-the-fold time: 8.75 s - 2.54 s = 6.21 s

(g) Try doing a trace for www.indianexpress.com and filter for “http”. What do you find, is there
any HTTP traffic? Browse through the entire trace without any filters, are you able to see the
contents of any HTML and Javascript files being transferred? Why is that, while you were able to
do it easily earlier for www.cse.iitd.ac.in?

Solution:
1. Findings:
• No HTTP packets are observed in the Wireshark trace.
2. Observations after browsing through the trace without filters:
• I am not able to see the contents of any HTML or JavaScript files being transferred either.
• The packets sent by indianexpress.com appear to be encrypted: many packets early in the trace
are marked "Encrypted Alert", and the highest-layer protocol visible is TLS.
• This encryption is why the HTML and JavaScript contents are not visible. TLS (Transport Layer
Security) provides privacy and data integrity between communicating applications, so Wireshark
only sees opaque TLS records, whereas the earlier www.cse.iitd.ac.in traffic was plain HTTP and
its contents were readable. (A small check of the TLS session is sketched below.)
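
To confirm that the traffic is TLS-protected, the small sketch below (standard library only) opens a TLS
connection to the same site and prints the negotiated version and cipher; everything exchanged after this
handshake is encrypted, which is exactly what Wireshark shows as opaque TLS records.

import socket
import ssl

host = "www.indianexpress.com"
ctx = ssl.create_default_context()

with socket.create_connection((host, 443), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        # the HTTP request and all HTML/JS/CSS responses travel inside this encrypted session
        print("TLS version :", tls.version())
        print("Cipher suite:", tls.cipher()[0])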

2. Now open the Chrome browser and go to Developer Tools. Open the Network tab. Go to
www.indianexpress.com. Report the following:
(a) Why are you able to see the different content objects in the browser, which you were earlier not
able to see through Wireshark?

Solution:
• As mentioned in the earlier answer, the data packets sent by indianexpress.com are encrypted, so a
third-party tool like Wireshark, which sits outside the TLS session, cannot read them.
• Chrome, however, is the endpoint of the TLS connection: it decrypts the data before rendering the
web-page, which is why the Developer Tools can show all the content objects in the clear.

(b) How many content objects were downloaded to render the home-page of www.indianexpress.com?
You will see that many of these objects are not from the indianexpress.com domain. Where are
they from? What do you think is the purpose of these objects?


Solution:
1. In total, 503 content objects were downloaded to render the homepage of www.indianexpress.com.
2. Yes, many of these objects are not from indianexpress.com. They mostly come from:
• googlesyndication.com: a Google domain used to store ad content and other related resources for
Google AdSense and DoubleClick; it does not itself use client-side tracking methods.
• DoubleClick: an advertising business owned by Google that earns money from online advertisers
and publishers.
• Taboola: a private advertising company.
• Outbrain: another advertising company.
3. Purpose of these objects:
• They are used to serve personalised advertisements to the user.
• These ad services are also used to collect data about the user.

(c) Look at the timing information for any object. Try finding one of the larger objects and look at
the Timing sub-tab for this object to see the breakup of the time taken to download the object.
What average throughput was observed during the content download period, ie. when the content
was actually being downloaded, not counting other latencies like DNS lookup or TCP connection
establishment delay?
Solution:
• With the cache enabled: the largest object downloaded was about 114 KB with a content-download
time of about 264 ms, giving a throughput of 114 × 1000 / 264 ≈ 431 kB/s.
• With the cache disabled: the largest object was 191 KB (the computation uses 87.3 KB, presumably
the bytes actually transferred) with a content-download time of about 331 ms, giving a throughput of
87.3 × 1000 / 331 ≈ 263 kB/s.
• The second figure should be closer to the actual network throughput; the first looks better because of
the caching done by the browser. (The arithmetic is repeated in the sketch below.)

(d) Now do the same thing for www.nytimes.com and compare the total amount of content downloaded
to render the NY Times home-page, with the content downloaded to render the Indian Express
home-page. What does this tell you about creating websites?

Solution:
• Indian Express has around 550 content objects amounting to about 5 MB of data, while NY Times
has around 400 content objects amounting to about 14 MB of data.
• Most of the content objects are ad plugins and advertisements, though there is also a fair share of
HTML, CSS and JavaScript code, which forms the backbone of the page.
• Hence, creating a website requires handling a large number of content objects at once. These websites
also earn revenue mainly through advertisements, which only increases the number of content objects
needed to render a page.

(e) Do you agree that from a user-experience point of view, since web-pages are constituted of many
small objects and which could be hosted on multiple domains, factors like the roundtrip delay and
optimizations by the browser to pipeline downloads of multiple objects, are more important than
the network throughput that is obtained? Explain your answer.


Solution: Yes. Since web-pages are made up of many small objects, often hosted on multiple domains,
these factors matter more than the raw network throughput:
• Each small object costs at least one round trip, so reducing the round-trip delay and pipelining the
downloads of multiple objects cuts the overall delay far more than extra bandwidth would.
• Hosting on multiple domains also helps when an error occurs at one domain: a single failure does not
bring the whole web-page down.
• It also improves the maintainability of web pages, since the components are distributed over multiple
domains.

(f) Chrome allows you to test websites by emulating different network conditions. Look at the Throt-
tling dropdown in the Network tab and change the network to Fast 3G, Slow 3G, etc. You can
also build your own custom network profile by specifying the mean downlink throughput, uplink
throughput, and latency. Experiment with different values, such as the following:
(technology, downlink throughput in kbps, uplink throughput in kbps, latency in ms)
Regular 2G, 250, 50, 300
Good 2G, 450, 150, 150
Regular 3G, 750, 250, 100
Good 3G, 1000, 750, 40
Regular 4G, 4000, 3000, 20
DSL, 2000, 1000, 5
How do you think Chrome is able to emulate such different networks?
Chrome also allows you to emulate different devices which is particularly useful when the computa-
tion capabilities of the device may begin to affect the user experience. Why do you think the device
computation capabilities can have this effect?

Solution:
1. Chrome emulates these different networks by throttling the downlink throughput, uplink throughput
and latency according to the specified profile. Conceptually it does so by delaying when response data
is released from an internal queue to the page, which increases the observed queuing delay and caps
the effective rate (a toy version of this idea is sketched below).
2. Each device has a different computation capability depending on its RAM, CPU, storage, etc. When
designing a website we must keep in mind that it should provide a similar user experience across
different devices; if a device is too slow to process the content in time, requests may even time out.
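
As an illustration of the throttling idea (this is a toy model, not Chrome's actual implementation), the
sketch below paces delivered bytes at a capped rate and adds a fixed latency, using the "Regular 3G" row
from the table above:

import time

class ThrottledLink:
    """Toy emulation of a network profile: fixed latency plus a capped downlink rate."""

    def __init__(self, downlink_kbps: float, latency_ms: float):
        self.bytes_per_sec = downlink_kbps * 1000 / 8
        self.latency = latency_ms / 1000

    def deliver(self, payload: bytes) -> bytes:
        time.sleep(self.latency)                        # propagation / queuing delay
        time.sleep(len(payload) / self.bytes_per_sec)   # serialization at the capped rate
        return payload

link = ThrottledLink(downlink_kbps=750, latency_ms=100)   # "Regular 3G"
start = time.monotonic()
link.deliver(b"x" * 75_000)                               # a ~75 kB object
print(f"Delivered in {time.monotonic() - start:.2f} s")   # ~0.9 s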

(g) Go to www.indianexpress.com and export the Chrome trace as a HAR file. Open this file using a
text editor to understand the information it contains; look up on the web if needed to understand the
structure of HAR files. Examine the requests going to ad networks like DoubleClick or analytics
services like Google Analytics and report what third-party domains you see being accessed.
What user-specific information seems to be sent to these websites? What kind of information
do you think these third-party domains are requesting to be saved as cookies locally? Go to your
Chrome privacy settings, then to the cookie settings. Do you have third party cookies blocked?


Solution:
1. Third-party domains which I see being accessed are (the sketch below shows one way to extract them
from the HAR file):
• securepubads.g.doubleclick.net
• imasdk.googleapis.com
• googleads.g.doubleclick.net
• pagead2.googlesyndication.com
• www.google-analytics.com
2. User-specific information which I think is being sent to these websites:
• Events, such as which content or news article is being shared, and which specific functions or
options we click, including ads, pop-ups, etc.
• The number of users active on the website or on a particular section of it.
• In the case of e-commerce sites, user-specific information such as product impressions, product
clicks, viewing product details, adding a product to a shopping cart, initiating the checkout
process, transactions, refunds, etc., which I think is maintained by Google Analytics.
3. Information which I think the third-party domains use their cookies for:
• Tracking the user, e.g. which specific types of ads a particular user is interested in.
• Providing the website owner certain services such as live chat.
4. No, I do not currently have third-party cookies blocked in a normal browser window; however, they
are blocked when I use Incognito mode.
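
The third-party domains listed above can be pulled out of the exported HAR file with a few lines of
standard-library Python; the file name below is hypothetical.

import json
from collections import Counter
from urllib.parse import urlparse

with open("indianexpress.har", encoding="utf-8") as f:
    har = json.load(f)

# count requests per hostname, keeping only hosts outside the first-party domain
domains = Counter()
for entry in har["log"]["entries"]:
    host = urlparse(entry["request"]["url"]).hostname or ""
    if not host.endswith("indianexpress.com"):
        domains[host] += 1

for host, count in domains.most_common(10):
    print(f"{count:4d}  {host}")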

3. Let’s go back to Wireshark now. Disconnect from the network, release the DHCP address by invoking
ipconfig /release, then start Wireshark, and connect back to the network.

(a) Filter for DHCP packets and describe what seems to be the protocol through which DHCP is
operating. Draw a transaction diagram of the DHCP messages that you see being sent and received
by your device. Report what underlying transport layer protocol is being used.


Solution: DHCP packets can be filtered by looking for UDP packets on port 68, which in Wireshark is
done with udp.port==68.
1. DHCP operates as a simple broadcast request/response exchange (DISCOVER, OFFER, REQUEST,
ACK) carried over UDP.
2. Transaction diagram: [figure omitted: client DHCPDISCOVER → server DHCPOFFER → client
DHCPREQUEST → server DHCPACK]
3. Description of the transaction diagram:
• First, the DHCP client broadcasts a DHCPDISCOVER message (a request message with the
DHCPDISCOVER option), using UDP port 68 on the client side.
• The servers respond to the DHCPDISCOVER with DHCPOFFER messages, in which they offer
an IP address and can also offer a lease duration.
• The client then chooses one of the offers and sends a DHCPREQUEST message to the selected
server.
• The server replies with a DHCPACK message, which creates the binding between the client's
physical (MAC) address and its IP address.
4. UDP is the underlying transport-layer protocol on which DHCP operates (see the sketch below).
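
A small sketch of the same exchange extracted from a capture with scapy (assumed installed; the file
name is hypothetical), printing the DHCP message types in order:

from scapy.all import rdpcap, DHCP

MESSAGE_TYPES = {1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 5: "ACK"}

for pkt in rdpcap("dhcp.pcap"):                 # hypothetical capture of the reconnect
    if pkt.haslayer(DHCP):
        for opt in pkt[DHCP].options:           # options are (name, value) tuples
            if isinstance(opt, tuple) and opt[0] == "message-type":
                print(float(pkt.time), MESSAGE_TYPES.get(opt[1], opt[1]))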

(b) Run a traceroute to www.google.com and similarly report for DNS messages, of what messages are
sent and received by your device, and the underlying transport layer protocol.

Solution:
1. DNS activity observed during the traceroute:
• Traceroute maps out the nodes along the path from my computer to the web server, one hop at
a time, and records each of the “hops” on the screen.
• The IP address of each router on the path is learnt from the ICMP reply it sends back.
• My machine then sends reverse-DNS queries for those addresses to the DNS server and receives
responses: each query asks for the name (reverse of the IP address).in-addr.arpa, so that traceroute
can display hostnames instead of raw addresses (see the sketch below).
2. The underlying transport-layer protocol for these DNS messages is UDP.
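
The reverse-lookup names mentioned above can be built and resolved with the standard library; a minimal
sketch (the hop address is an arbitrary example, not one from my trace):

import socket

hop_ip = "8.8.8.8"   # example hop address, purely illustrative

# traceroute asks for "<reversed ip>.in-addr.arpa" to turn the address into a name
ptr_name = ".".join(reversed(hop_ip.split("."))) + ".in-addr.arpa"
print("PTR query name:", ptr_name)                  # 8.8.8.8.in-addr.arpa

try:
    hostname, _, _ = socket.gethostbyaddr(hop_ip)   # issues the reverse DNS lookup
    print("Resolved to  :", hostname)
except socket.herror:
    print("No PTR record for this address")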

(c) Now filter for ICMP messages and similarly report what messages traceroute seems to be sending.


Solution: ICMP messages that traceroute is involved with:
• ICMP is a protocol used mostly by routers and hosts to report errors and other problems.
• Each of the replies coming back from the routers uses ICMP.
• The replies seen are type 11, code 0 (Time-to-Live Exceeded in Transit) from the intermediate routers,
and type 3, code 3 (Destination Unreachable: Port Unreachable), which for a UDP-based traceroute
is typically sent by the final host once the probe reaches it.
• These two were the only ICMP message types received by my machine while tracerouting
www.google.com; other ICMP types such as Source Quench, Redirect and Parameter Problem messages
were not observed. (A minimal traceroute illustrating these replies is sketched below.)
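
A minimal UDP-probe traceroute with scapy (assumed installed; needs administrator/root privileges)
makes these two ICMP reply types visible. The operating-system traceroute may instead use ICMP Echo
probes, so this is only a sketch of the mechanism:

from scapy.all import IP, UDP, ICMP, sr1

dest = "www.google.com"

for ttl in range(1, 16):
    # one UDP probe per TTL; routers answer with ICMP Time-Exceeded, the destination
    # with ICMP Destination Unreachable (Port Unreachable)
    reply = sr1(IP(dst=dest, ttl=ttl) / UDP(dport=33434), timeout=2, verbose=0)
    if reply is None:
        print(ttl, "*")
    elif reply.haslayer(ICMP):
        icmp = reply[ICMP]
        print(ttl, reply.src, f"ICMP type={icmp.type} code={icmp.code}")
        if icmp.type == 3:       # reached the destination
            break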

(d) Run different streaming applications while capturing their packets, like when watching a video on
youtube, or talking on Skype, or on Zoom. Report what underlying protocols seem to be in use for
the data streams.

Solution:
• UDP appears to be the underlying protocol for the data streams when using video-conferencing and
streaming applications such as Zoom and YouTube (a byte-count check from the capture is sketched
below).
• UDP is a good choice here because it is lightweight and fast to process: there is no connection setup
or retransmission machinery before data can flow.
• Although UDP lacks reliability and there is always some packet loss, occasional losses in a video
stream do not affect the experience much, while the low latency UDP provides matters a lot.
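
One simple way to check this from a capture of a streaming session is to compare how many bytes each
transport protocol carries; a sketch with scapy (the capture file name is hypothetical):

from scapy.all import rdpcap, TCP, UDP

tcp_bytes = udp_bytes = 0
for pkt in rdpcap("streaming_session.pcap"):    # hypothetical Zoom/YouTube capture
    if pkt.haslayer(UDP):
        udp_bytes += len(pkt)
    elif pkt.haslayer(TCP):
        tcp_bytes += len(pkt)

print(f"UDP bytes: {udp_bytes / 1e6:.1f} MB")
print(f"TCP bytes: {tcp_bytes / 1e6:.1f} MB")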
