Analyzing the BitTorrent community
Ville Saalo
Aalto University School of Science
BitTorrent is a popular peer-to-peer file sharing protocol. The key idea behind the protocol was that when all downloaders also upload, everyone downloads faster. It has later been shown that this claim does not necessarily hold. This paper presents some measurements on the current BitTorrent networks using client programs that have different concepts of fairness. We measure download speeds of two different files and also compare different torrent distribution sites, briefly touching on their usability as well. We find that testing in real BitTorrent swarms is difficult, but our tests seem to support the previous observations.

KEYWORDS: BitTorrent, tracker, swarm, distribution, P2P, distributed systems, selfishness, fairness
BitTorrent is a peer-to-peer file sharing protocol as well as the first program that used the protocol. The protocol operates so that instead of linearly downloading each file, the files are divided into several pieces and downloaded piece by piece, effectively in a random order. Any peer that has an entire file, that is, all the pieces that belong to a given file, is called a seed. The peers who do not have all the pieces are called leechers. A peer can download any piece from either any seed or any other leecher that has the piece. The set of peers seeding and leeching a given file is called a swarm.

A file is initially made available by creating a small torrent file that contains a cryptographic hash of each piece of the file and the addresses of one or more tracker services, i.e. the services that keep track of the peers belonging to the swarm. Initially, when a file is made available, the only peer in the swarm is called the initial seed. The tracker then allows peers to find each other. In all other senses BitTorrent is a decentralized service with no global coordinator, and even the tracker can be replaced with a distributed hash table (DHT) based implementation. In addition, peers may learn about other peers currently in the swarm by communicating using the peer exchange protocol (PEX).

The original BitTorrent client used a tit-for-tat (TFT) heuristic for choosing which peers to upload to. Basically, this means that a peer sorts its neighbours in descending order of download rate and shares its upload bandwidth equally with the five or so neighbours that provide the fastest download speed. This subset of neighbours is then called the active set. Adding and removing peers to and from the active set is called unchoking and choking, respectively. At regular intervals a client performs optimistic unchoking to random peers in order to find new peers with possibly higher upload rates than the current ones.

To decide which pieces to download, BitTorrent uses a rarest-first policy. It means that the client always requests those pieces of which the fewest copies exist in the swarm. This guarantees a rather even availability for all pieces. However, Bharambe et al. note that the uploader should choose to upload first those pieces that it has uploaded the fewest times. This improves the diversity of pieces and improves uplink utilization even further.

Today, the most popular BitTorrent clients are Xunlei, µTorrent and Azureus, which together have a market share of almost 80 %.

The rest of the paper is organized as follows. Section 2 discusses a few different BitTorrent clients. Section 3 presents some statistical data on a few different torrent distribution sites and Section 4 presents download speed measurements with different files and clients. Section 5 discusses the findings of this paper further and highlights some other research papers. Finally, Section 6 concludes the paper.
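The peer selection and piece selection rules described in this introduction can be illustrated with a minimal Python sketch. It is not taken from any real client: the function names, the input dictionaries and the default of five slots are assumptions made for illustration, and a real client would re-run the selection on a timer and break ties among equally rare pieces at random.

```python
import random
from collections import Counter


def choose_active_set(download_rates, slots=5, rng=random):
    """Tit-for-tat sketch: keep the `slots` peers that give us the best
    download rate, plus one randomly chosen other peer as the optimistic
    unchoke. `download_rates` maps a peer id to a measured rate."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    active = ranked[:slots]
    others = ranked[slots:]
    optimistic = rng.choice(others) if others else None
    return active, optimistic


def equal_shares(active, upload_capacity):
    """Split the upload capacity equally among the peers in the active set."""
    if not active:
        return {}
    share = upload_capacity / len(active)
    return {peer: share for peer in active}


def rarest_first(missing, peer_pieces):
    """Order the pieces we still miss so that the piece held by the fewest
    known peers comes first. Pieces nobody offers are left out.
    `peer_pieces` maps a peer id to the set of piece indices it has."""
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces)
    available = [p for p in missing if counts[p] > 0]
    # Ties are broken by piece index here only to keep the sketch deterministic.
    return sorted(available, key=lambda p: (counts[p], p))
```

For example, with the hypothetical rates `{"a": 10, "b": 50, "c": 30, "d": 5}` and two slots, the active set is `["b", "c"]`, each receiving half of the upload capacity, and either `"a"` or `"d"` is picked as the optimistic unchoke.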
The BitTorrent protocol is similar to TCP in the sense that many different implementations exist but they are still fully interoperable. In BitTorrent the differences are related to the choking, unchoking and reciprocation logic.
is a research client that changes the TFT heuristic a little: instead of sharing the upload bandwidth equally, it shares it in proportion to the ratio of download speed to upload speed, choosing to upload to those peers from whom it gets the best download speeds for the lowest upload speeds.

The problem with this type of heuristic, however, is that calculating the current download speed is difficult. First, it takes a long time to accurately estimate the download speeds from other peers. Second, a peer's current download speed is not a reliable way to predict its future contribution, allowing
to exploit the system.

A free-rider is a client that contributes little to no upload bandwidth but consumes a lot of download bandwidth. A strategic peer, on the other hand, usually contributes the same amount of upload bandwidth as everyone else but tries to allocate it cleverly to boost its own download speed. Unfortunately, this may hurt the performance of non-strategic peers, especially if the strategic peer is also selfish and tries to minimize its overall upload speed.
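To make the contrast with equal sharing concrete, the following Python sketch splits the upload capacity in proportion to each peer's download-to-upload ratio. It is only one reading of the description above and not the algorithm of any particular client; the function name and the input format (a measured download rate and the upload rate currently given to each peer) are assumptions made for illustration.

```python
def ratio_proportional_shares(peers, upload_capacity):
    """Split `upload_capacity` among peers in proportion to the ratio
    download_rate / upload_rate.

    `peers` maps a peer id to a tuple (download_rate, upload_rate), where
    download_rate is what we receive from the peer and upload_rate is what
    we currently send to it. Both are hypothetical measured values.
    """
    ratios = {}
    for peer, (download_rate, upload_rate) in peers.items():
        if upload_rate <= 0:
            raise ValueError("upload_rate must be positive for " + str(peer))
        ratios[peer] = download_rate / upload_rate

    total = sum(ratios.values())
    if total == 0:
        # Nobody is sending us anything, so nobody earns a share.
        return {peer: 0.0 for peer in peers}
    return {peer: upload_capacity * r / total for peer, r in ratios.items()}
```

With the hypothetical input `{"a": (100, 10), "b": (50, 10), "c": (50, 50)}` and a capacity of 160, the ratios are 10, 5 and 1, so peer `a` receives most of the bandwidth while peer `c`, which is expensive to keep, receives the least. A peer that sends nothing, such as a free-rider, has a ratio of zero and receives no share. The sketch also shows why the estimation problem matters: every share depends directly on the measured download rates.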