Published by: Akshay Kandul on Feb 18, 2013
Copyright:Attribution Non-commercial

VISVESVARAIAH TECHNOLOGICAL UNIVERSITY BELGAUM

DHARWAD – 580 002

A seminar report on BITTORRENT PROTOCOL

Submitted by Rajani .B. Paraddi 2SD06CS071 8th semester


Dept of CSE

VISVESVARAIAH TECHNOLOGICAL UNIVERSITY BELGAUM

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

CERTIFICATE

Certified that the seminar work entitled “BITTORRENT PROTOCOL” is a bonafide work presented by Rajani.B.Paraddi bearing USN 2SD06CS071 in partial fulfillment for the award of the degree of Bachelor of Engineering in Computer Science Engineering of the Vishveshwaraiah Technological University, Belgaum during the year 2009-10. The seminar report has been approved as it satisfies the academic requirements with respect to seminar work presented for the Bachelor of Engineering Degree.

Staff In Charge H.O.D CSE Name: Rajani .B. Paraddi USN: 2SD06CS071

Index

1. Introduction
   1.1. Overview
   1.2. History
2. BitTorrent and Other approaches
   2.1. Other P2P Methods
   2.2. Typical HTTP File Transfer
   2.3. The DAP method
   2.4. The BitTorrent Approach
3. Working of BitTorrent
4. Terminology
5. Architecture of BitTorrent
   5.1. Metainfo File
   5.2. Tracker
   5.3. Peers
   5.4. Data
   5.5. Bittorrent Clients
6. Vulnerabilities of BitTorrent
   6.1. Attacks on bittorrent
   6.2. Solutions
7. Conclusion
8. References

1. Introduction[1]

1.1 Overview

BitTorrent is a peer-to-peer file sharing protocol used to distribute large amounts of data. It is one of the most common protocols for transferring large files, and this is its main usage. It makes the transfer of such files easier by implementing a different approach. A user can obtain multiple files simultaneously without any considerable loss of transfer rate. It is said to be a lot better than the conventional file transfer methods because of the different principle that this protocol follows. It also evens out the way a file is shared by allowing a user not just to obtain it but also to share it with others. This is what has made a big difference between this and the conventional file transfer methods. It makes a user share the file he is obtaining so that the other users who are trying to obtain the same file find it easier, and in turn these users also become involved in the file sharing process. Thus the larger the number of users, the greater the demand and the more easily a file can be transferred between them.

The BitTorrent protocol has been built on a technology which makes it possible to distribute large amounts of data without the need for a high capacity server and expensive bandwidth. This is the most striking feature of this file transfer protocol. The transfer of files never depends on a single source which is supposed to hold the original copy of the file; instead the load is distributed across a number of such sources. Here not just the sources are responsible for the file transfer, but the clients or users who want to obtain the file are also involved in this process. This distributes the load evenly across the users and makes the main source partially free from this process, which reduces the network traffic imposed on it. Because of this, BitTorrent has become one of the most popular file transfer mechanisms in today’s world. Though the mechanism itself is not as simple as an ordinary file transfer protocol, it has gained its popularity because of the sharing policy that it imposes on its users.

1.2 History

BitTorrent was created by a programmer named Bram Cohen. After inventing this new technology he said, "I decided I finally wanted to work on a project that people would actually use, would actually work and would actually be fun". Before this was invented, there were other techniques for file sharing but they were not utilizing the bandwidth effectively.

The bandwidth had become a bottleneck in such methods. Most of the users could simply download the files without needing to upload. This again put a lot of network load on the original sources and on a small number of users, and led to inefficient usage of the bandwidth of the remaining users. The main intention behind Cohen’s invention was to make maximum use of the bandwidth of all the users involved in the sharing of files. By doing so, every person who wants to download a file has to contribute towards the uploading process also. This novel concept of Cohen gave birth to a new peer to peer file sharing protocol called BitTorrent. Cohen invented this protocol in April 2001. The first usable version of BitTorrent appeared in October 2002, but the system needed a lot of fine-tuning. BitTorrent really started to take off in early 2003.

2. BitTorrent and Other approaches[3]

2.1 Other P2P Methods

The most common method by which files are transferred on the Internet is the client-server model. A central server sends the entire file to each client that requests it; this is how both HTTP and FTP work. The clients only speak to the server, and never to each other. The main advantages of this method are that it's simple to set up, and the files are usually always available since the servers tend to be dedicated to the task of serving, and are always on and connected to the Internet. However, this model has a significant problem with files that are large or very popular, or both. Namely, it takes a great deal of bandwidth and server resources to distribute such a file, since the server must transmit the entire file to each client. Perhaps you may have tried to download a demo of a new game just released, or CD images of a new Linux distribution, and found that all the servers report "too many users," or there is a long queue that you have to wait through. The concept of mirrors partially addresses this shortcoming by distributing the load across multiple servers. But it requires a lot of coordination and effort to set up an efficient network of mirrors, and it's usually only feasible for the busiest of sites.

2.2 A Typical HTTP File Transfer

The most common type of file transfer is through an HTTP server. In this method, an HTTP server listens to the client’s requests and serves them. Here the client can only depend
on the lone server that is providing the file. The overall download scheme is limited by the limitations of that server. This kind of file transfer is also subject to a single point of failure: if the server crashes, the whole download process will cease. The file being served is available as one single piece, which means that if the download process stops abruptly in the middle, the whole file has to be downloaded again. A single server can handle many such clients and serve the requested file simultaneously to all the clients. The BitTorrent protocol has overcome all these shortcomings seen in this type of transfer and is thus more robust, due to which it is chosen by many people over this traditional method of file transfer.

Fig 2.1: HTTP/FTP File Transfer

2.3 The DAP method

Download Accelerator Plus (DAP) is a popular download accelerator. DAP's key features include the ability to accelerate downloading of files in FTP and HTTP protocols, to pause and resume downloads, and to recover from dropped internet connections. DAP immediately senses when a user begins downloading a file and identifies available mirror sites that host the requested file. As soon as it is triggered, DAP's client side optimization begins to determine, in real time, which mirror sites offer the fastest response for the specific user's location. On the Internet the same file is often hosted on numerous mirror sites, such as at universities and on ISP servers. The file is downloaded in several segments simultaneously through multiple connections from the most responsive server(s) and reassembled at the user's PC. This results in better utilization of the user's
available bandwidth. This ensures that each available mirror server is utilized to serve the users that most benefit. This in turn effects an efficient balancing of the load among available servers across the entire World Wide Web, and reduces download times for users while allowing them to receive maximum benefit from their available bandwidth. DAP's resume functionality and the ability to continue downloading even when one of the participating connections has dropped also provide users with a more reliable download experience.

2.4 The BitTorrent Approach

BitTorrent has revolutionized the way files are shared between people. It does not require a user to download a file completely from a single server. Instead a file can be downloaded from many users who are themselves downloading the same file. A user who has the complete file, called the seed, initiates the download by transferring pieces of the file to the users. Once a user has a considerable number of such pieces of a file, he too can start sharing them with other users who are yet to receive those pieces. This concept enables a client not to depend on a server completely, and it also reduces the overall load on the server.

Fig 2.2 : BitTorrent File Transfer

In BitTorrent, the data to be shared is divided into many equal-sized portions called pieces. Each piece is further sub-divided into equal-sized sub-pieces called blocks. All clients interested in sharing this data are grouped into a swarm, each of which is managed by a central entity called the tracker. Each client independently obtains a file, called a torrent, that contains the location of the tracker along with a hash of each piece. Clients keep each other updated on the status of their download. Clients download blocks from other (randomly chosen) clients who claim they have the corresponding data. Accordingly, clients also send data that they have
previously downloaded to other clients. Once a client receives all the blocks for a given piece, he can verify the hash of that piece against the provided hash in the torrent. Thus once a client has downloaded and verified all pieces, he can be confident that he has the complete data.

Both BitTorrent and DAP download files from multiple sources. Also the files are divided into pieces in both approaches. But BitTorrent has many features that DAP doesn’t. The file transfer in DAP takes place through the traditional HTTP or FTP protocol, which means that the transfer rate will always be limited by the servers' bandwidth. If these servers are flooded with requests then they break down and the transaction will terminate. This is not the case in BitTorrent since the whole process does not depend on servers alone. In BitTorrent the users participate actively in sharing files along with servers. The load is distributed across the network between peers and servers. This needs a dedicated server called a tracker to handle the peers connected in the network. This is the uniqueness of this protocol, which has made it the most popular one. This makes BitTorrent far better than its competitors like DAP and others.

3. Working of BitTorrent[4]

As previously explained, BitTorrent’s design makes it extremely efficient in the sharing of large data files among interested peers. BitTorrent is based on a “tit for tat” reciprocity agreement between users that ultimately results in Pareto efficiency. Pareto efficiency is an important economic concept that maximizes resource allocation among peers to their mutual advantage. Cohen’s vision of peers simultaneously helping each other by uploading and downloading has been realized by the BitTorrent system. BitTorrent scales well and is a superior method for transferring and disseminating files between interested peers while limiting free riding (peers who download but do not upload) between those same peers.

The protocol shares data through what are known as torrents. For a torrent to be alive or active it must have several key components to function. These components include a tracker server, a .torrent file, a web server where the .torrent file is stored and a complete copy of the file being exchanged. Each of these components is described in the following paragraphs. The file being exchanged is the essence of the torrent and a complete copy is
referred to as a seed. A seed is a peer in the BitTorrent network willing to share a file with other peers in the network.

Fig 3.1 : A Typical BitTorrent System

Peers lacking the file and seeking it from seeds are called leechers. While seeds only upload to leechers, leechers may both download from seeds and upload to other leechers. Together seeds and leechers engaged in file transfer are referred to as a swarm. A swarm is coordinated by a tracker server serving the particular torrent and interested peers find the tracker via metadata known as a .torrent file.

.torrent files contain key pieces of data to function correctly including file length, assigned name, hashing information about the file and the URL of the tracker coordinating the torrent activity. Torrent files can be created using a program such as MakeTorrent, another open source tool available under the free software model. The role of .torrent files is to provide the metadata that allows the protocol to function. These .torrent files can be viewed as surrogates for the files being shared.

The first step in the BitTorrent exchange occurs when a peer downloads a .torrent file from a server. Since BitTorrent has no built in search functionality, .torrent files are usually located via HTTP through search engines or trackers. When a .torrent file is opened by the peer’s client software, the peer then connects to the tracker server responsible for coordinating activity for that specific torrent. The tracker and client communicate by a protocol layered on top of HTTP and the tracker’s key role is to coordinate peers seeking the same file, for Cohen envisioned “The tracker’s responsibilities are strictly limited to helping peers find each other”. In reality the tracker’s role is a bit more complex as many trackers collect data about peers engaged in a swarm. BitTorrent’s protocol is designed so leeching peers seek each other out for data transfer in a process known as “optimistic unchoking”.

Leechers and seeds are coordinated by the tracker server and the peers periodically update the tracker on their status, allowing the tracker to have a global view of the system. The data monitored by the tracker can include peer IP addresses, amount of data uploaded/downloaded for specific peers, data transfer rates among peers, the percentage of the total file downloaded, length of time connected to the tracker, and the ratio of sharing among peers. Usually a tracker coordinates multiple torrents and the most popular trackers are busy coordinating thousands of swarms simultaneously.

It should be noted that .torrent files are not the actual file being shared; rather, .torrent files are the metadata information which allows trackers and peers to coordinate their activities. As previously mentioned, the complete file is actually stored on peer seed nodes and not the tracker server. Since .torrent files are small and require little space to store, one server can easily host thousands of .torrent files without prohibitive server or bandwidth requirements.

4. Terminology

These are the common terms that one would come across while making a typical BitTorrent file transfer.

Torrent : This refers to the small metadata file you receive from the web server (the one that ends in .torrent). Metadata here means that the file contains information about the data you want to download, not the data itself.
Peer : A peer is another computer on the internet that you connect to and transfer data. Generally a peer does not have the complete file.
Leeches : They are similar to peers in that they won’t have the complete file. But the main difference between the two is that a leech will not upload once the file is downloaded.
Seed : A computer that has a complete copy of a certain torrent. Once a client downloads a file completely, he can continue to upload the file, which is called seeding. This is a good practice in the BitTorrent world since it allows other users to have the file easily.
Reseed : When there are zero seeds for a given torrent, then eventually all the peers will get stuck with an incomplete file, since no one in the swarm has the
missing pieces. When this happens, a seed must connect to the swarm so that those missing pieces can be transferred. This is called reseeding.
Swarm : The group of machines that are collectively connected for a particular file.
Tracker : A server on the Internet that acts to coordinate the action of BitTorrent clients. The clients are in constant touch with this server to know about the peers in the swarm.
Share ratio : This is the ratio of the amount of a file uploaded to the amount downloaded. A ratio of 1 means that one has uploaded the same amount of a file that has been downloaded.
Distributed copies : Sometimes the peers in a swarm will collectively have a complete file. Such copies are called distributed copies.
Choked : It is a state of an uploader where he does not want to send anything on his link. In such cases, the connection is said to be choked.
Interested : This is the state of a downloader which suggests that the other end has some pieces that the downloader wants. Then the downloader is said to be interested in the other end.
Snubbed : If the client has not received anything after a certain period, it marks a connection as snubbed, in that the peer on the other end has chosen not to send in a while.
Optimistic unchoking : Periodically, the client shakes up the list of uploaders and tries sending on different connections that were previously choked, choking the connections it was just using. This is called optimistic unchoking.

5. Architecture of BitTorrent

The BitTorrent protocol can be split into the following five main components:
Metainfo File - A file which contains all details necessary for the protocol to operate.
Tracker - A server which helps to manage the BitTorrent protocol.
Peers - Users exchanging data via the BitTorrent protocol.
Data - The files being transferred across the protocol.
Client - The program which sits on a peer's computer and implements the protocol.

Peers use TCP (Transmission Control Protocol) to communicate and send data. This protocol is preferable over other protocols such as UDP (User Datagram Protocol) because TCP guarantees reliable and in-order delivery of data from sender to receiver. UDP cannot give such guarantees, and data can become scrambled, or lost altogether. The tracker allows peers to query which peers have what data, and allows them to begin communication. Peers communicate with the tracker in plain text via HTTP (Hypertext Transfer Protocol).

The following diagram illustrates how peers interact with each other, and also communicate with a central tracker.

Fig 5.1 : Architecture of a BitTorrent System

5.1 Metainfo File [2]

When someone wants to publish data using the BitTorrent protocol, they must create a metainfo file. This file is specific to the data they are publishing, and contains all the information about a torrent, such as the data to be included, and the address of the tracker to connect to. A tracker is a server which 'manages' a torrent, and is discussed in the next section. The file is given a '.torrent' extension, and the data is extracted from the file by a BitTorrent client. This is a program which runs on the user's computer, and implements the BitTorrent protocol. Every metainfo file must contain the following information (or 'keys'):

• info: A dictionary which describes the file(s) of the torrent, either for the single file, or the directory structure for more files. Hashes for every data piece, in SHA 1 format, are stored here.
• announce: The announce URL of the tracker as a string

The following are optional keys which can also be used:
• announce-list: Used to list backup trackers
• creation date: The creation time of the torrent by way of UNIX time stamp (integer seconds since 1-Jan-1970 00:00:00 UTC)
• comment: Any comments by the author
• created by: Name and Version of programme used to create the metainfo file

These keys are structured in the metainfo file as follows:

    {'info': {'piece length': 131072,
              'length': 38190848L,
              'name': 'Cory_Doctorow_Microsoft_Research_DRM_talk.mp3',
              'pieces': '\xcb\xfaz\r\x9b\xe1\x9a\xe1\x83\x91~\xed@\.....',
             }
     'announce': 'http://tracker.var.cc:6969/announce',
     'creation date': 1089749086L
    }

Instead of transmitting the keys in plain text format, the keys contained in the metainfo file are encoded before they are sent. Encoding is done using a BitTorrent specific method known as 'bencoding'.

5.1.1 Bencoding:

Bencoding is used by BitTorrent to send loosely structured data between the BitTorrent client and a tracker. Bencoding supports byte strings, integers, lists and dictionaries. Bencoding uses the beginning delimiters 'i' / 'l' / 'd' for integers, lists and dictionaries respectively. Ending delimiters are always 'e'. Delimiters are not used for byte strings.

Bencoding Structure:
• Byte Strings : <string length in base ten ASCII> : <string data>
• Integers: i<base ten ASCII>e
• Lists: l<bencoded values>e
• Dictionaries: d<bencoded string><bencoded element>e
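The four bencoding rules can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular client: the function names `bencode` and `bdecode` are hypothetical, and sorting dictionary keys is the usual convention rather than something stated in this report.

```python
def bencode(value):
    """Encode an int, str/bytes, list or dict using the bencoding rules."""
    if isinstance(value, bool):
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        # Byte strings carry a length prefix instead of delimiters.
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Assumption: keys are written in sorted order, as clients usually do.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def bdecode(data, pos=0):
    """Decode one value starting at pos; return (value, next_position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            val, pos = bdecode(data, pos)
            result[key] = val
        return result, pos + 1
    # Otherwise it must be a byte string: <length>:<data>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length
```

Decoded strings come back as `bytes`, since bencoding itself says nothing about text encodings.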

Minus integers are allowed, but prefixing the number with a zero is not permitted. However '0' is allowed.

Examples of bencoding:
4:spam // represents the string "spam"
i3e // represents the integer 3
l4:spam4:eggse // represents the list of two strings: ["spam","eggs"]
d4:spaml1:a1:bee // represents the dictionary {"spam" => ["a" , "b"] }

5.1.2 Metainfo File Distribution :

Because all information which is needed for the torrent is included in a single file, this file can easily be distributed via other protocols, and as the file is replicated, the number of peers can increase very quickly. The most popular method of distribution is using a public indexing site which hosts the metainfo files. A seed will upload the file, and then others can download a copy of the file over the HTTP protocol and participate in the torrent.

5.2 Tracker[2]

A tracker is used to manage users participating in a torrent (known as peers). It stores statistics about the torrent, but its main role is to allow peers to 'find each other' and start communication, i.e. to find peers with the data they require. Peers know nothing of each other until a response is received from the tracker. Whenever a peer contacts the tracker, it reports which pieces of a file it has. That way, when another peer queries the tracker, it can provide a random list of peers who are participating in the torrent, and have the required piece.

A tracker is an HTTP/HTTPS service and typically works on port 6969. The address of the tracker managing a torrent is specified in the metainfo file; a single tracker can manage multiple torrents. Multiple trackers can also be specified, as backups, which are handled by the BitTorrent client running on the user's computer. BitTorrent clients communicate with the tracker using HTTP GET requests, which is a standard CGI method. This consists of appending a "?" to the URL, and separating parameters with a "&". The parameters accepted by the tracker are:
• info_hash: 20-byte SHA1 hash of the info key from the metainfo file.
• peer_id: 20-byte string used as a unique ID for the client.
• port: The port number the client is listening on.
• uploaded: The total amount uploaded since the client sent the 'started' event to the tracker in base ten ASCII.
• downloaded: The total amount downloaded since the client sent the 'started' event to the tracker in base ten ASCII.
• left: The number of bytes the client still has to download, in base ten ASCII.
• compact: Indicates that the client accepts compacted responses. The peer list can then be replaced by a string of 6 bytes per peer. The first 4 bytes are the host, and the last 2 bytes are the port.
• event: If specified, must be one of the following: started, stopped, completed.
• ip: (optional) The IP address of the client machine, in dotted format.
• numwant: (optional) The number of peers the client wishes to receive from the tracker.
• key: (optional) Allows a client to identify itself if their IP address changes.
• trackerid: (optional) If previous announce contained a tracker id, it should be set here.

Fig 5.2 : Tracker

The tracker then responds with a "text/plain" document with the following keys:

• failure message: If present, then no other keys are included. The value is a human readable error message as to why the request failed.
• warning message: Similar to failure message, but response still gets processed.
• interval: The number of seconds a client should wait between sending regular requests to the tracker.
• min interval: Minimum announce interval.
• tracker id: A string that the client should send back with its next announce.
• complete: Number of peers with the complete file.
• incomplete: Number of non-seeding peers (leechers)
• peers: A list of dictionaries including: peer id, IP and ports of all the peers.

5.2.1 Scraping

Scraping is the process of querying the state of a given torrent (or all torrents) that the tracker is managing. The result is known as a "scrape page". To get the scrape, you must start with the announce URL, find the last '/' and if the text immediately following the '/' is 'announce', then this can be substituted for 'scrape' to find the scrape page.

Examples:
Announce URL | Scrape URL
http://example.com/announce | http://example.com/scrape
http://example.com/a/announce | http://example.com/a/scrape
http://example.com/announce.php | http://example.com/scrape.php

The tracker then responds with a "text/plain" document with the following bencoded keys:
• files: A dictionary containing one key pair for each torrent. Each key is made up of a 20-byte binary hash value. The value of that key is then a nested dictionary with the following keys:
• complete: number of peers with the entire file (seeds)
• downloaded: total number of times the entire file has been downloaded.
• incomplete: the number of active downloaders (leechers)
• name: (optional) the torrent name
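Two of the tracker conventions described in this section lend themselves to a short sketch: deriving the scrape URL from the announce URL, and unpacking a compact peer list of 6 bytes per peer. The function names below are hypothetical, and reading the port in network (big-endian) byte order is an assumption taken from common practice rather than from this report.

```python
import ipaddress
import struct


def scrape_url(announce_url):
    """Return the scrape URL, or None when the convention does not apply."""
    head, slash, tail = announce_url.rpartition("/")
    # The text right after the last '/' must begin with 'announce'.
    if not slash or not tail.startswith("announce"):
        return None
    return head + "/scrape" + tail[len("announce"):]


def parse_compact_peers(blob):
    """Split a compact peer string into (host, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), 6):
        # 4 bytes of IPv4 address, then 2 bytes of port (assumed big-endian).
        host, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((str(ipaddress.IPv4Address(host)), port))
    return peers
```

Returning `None` for an announce URL that does not follow the convention leaves it to the caller to decide whether the tracker simply does not support scraping.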

5.3 Peers[4]

Peers are other users participating in a torrent, and have the partial file, or the complete file (known as a seed). Pieces are requested from peers, but are not guaranteed to be sent, depending on the status of the peer. BitTorrent uses TCP (Transmission Control Protocol) ports 6881-6889 to send messages and data between peers, and unlike other protocols, does not use UDP (User Datagram Protocol).

5.3.1 Piece Selection

Peers continuously queue up the pieces for download which they require. Therefore the tracker is constantly replying to the peer with a list of peers who have the requested pieces. Which piece is requested depends upon the BitTorrent client. There are three stages of piece selection, which change depending on which stage of completion a peer is at.

5.3.2 Random First Piece

When downloading first begins, as the peer has nothing to upload, a piece is selected at random to get the download started. Random pieces are then chosen until the first piece is completed and checked. Once this happens, the 'rarest first' strategy begins.

5.3.3 Rarest First

When a peer selects which piece to download next, the rarest piece will be chosen from the current swarm, i.e. the piece held by the lowest number of peers. This means that the most common pieces are left until later, and focus goes to replication of rarer pieces. At the beginning of a torrent, there will be only one seed with the complete file. There would be a possible bottleneck if multiple downloaders were trying to access the same piece. Rarest first avoids this because different peers have different pieces. As more peers connect, rarest first will take some load off of the tracker, as peers begin to download from one another. Eventually the original seed will disappear from a torrent. This could be because of cost reasons, or most commonly because of bandwidth issues. Losing a seed runs the risk of pieces being lost if no current downloaders have them. Rarest first works to prevent the loss of pieces by replicating the pieces most at risk as quickly as possible. If the original seed goes before at least one other peer has the complete file, then no one will reach completion, unless a seed re-connects.
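The random-first-piece and rarest-first stages can be illustrated with a small selection function. This is only a sketch under stated assumptions: `pick_next_piece` and its arguments are hypothetical names, each peer's holdings are modelled as a plain set of piece indices, and ties between equally rare pieces are broken at random.

```python
import random
from collections import Counter


def pick_next_piece(have, peer_pieces, rng=random):
    """Choose the next piece index to request, or None if nothing is needed.

    have        -- set of piece indices this peer already holds
    peer_pieces -- dict mapping a peer name to the set of indices it holds
    """
    if not have:
        # Random first piece: nothing to upload yet, so any piece will do.
        available = sorted(set().union(*peer_pieces.values()))
        return rng.choice(available) if available else None
    # Rarest first: count how many peers hold each piece we still need.
    counts = Counter(index
                     for pieces in peer_pieces.values()
                     for index in pieces
                     if index not in have)
    if not counts:
        return None
    fewest = min(counts.values())
    rarest = sorted(index for index, n in counts.items() if n == fewest)
    return rng.choice(rarest)
```

For example, with peers holding `{0, 1, 2}`, `{0, 1}` and `{0}`, a downloader that already has piece 0 would request piece 2, because only one peer holds it.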

5.3.4 Endgame Mode

When a download nears completion and the peer is waiting for a piece from a peer with slow transfer rates, completion may be delayed. To prevent this, the remaining sub-pieces are requested from all peers in the current swarm.

5.3.5 Peer Distribution

The role of the tracker ends once peers have 'found each other'. From then on, communication is done directly between peers, and the tracker is not involved. The set of peers a BitTorrent client is in communication with is known as a swarm. To maintain the integrity of the data which has been downloaded, a peer does not report that they have a piece until they have performed a hash check with the one contained in the metainfo file. Peers will continue to download data from all available peers that they can, i.e. peers that possess the required pieces. Peers can block others from downloading data if necessary. This is known as choking.

5.3.6 Choking[2]

When a peer receives a request for a piece from another peer, it can opt to refuse to transmit that piece. If this happens, the peer is said to be choked. This can be done for different reasons, but the most common is that by default, a client will only maintain a default number of simultaneous uploads (max_uploads). All further requests to the client will be marked as choked. Usually the default for max_uploads is 4.

Fig 5.3 : Choking by a peer
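The hash check mentioned under Peer Distribution can be sketched as follows. It assumes, as in the published protocol, that the 'pieces' value of the metainfo file is the concatenation of one 20-byte SHA-1 digest per piece; the helper names are hypothetical.

```python
import hashlib

DIGEST_SIZE = 20  # length of a SHA-1 digest in bytes


def split_pieces(data, piece_length):
    """Cut data into fixed-size pieces; the final piece may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def build_pieces_field(data, piece_length):
    """Concatenate the SHA-1 digest of every piece, as stored under 'pieces'."""
    return b"".join(hashlib.sha1(piece).digest()
                    for piece in split_pieces(data, piece_length))


def verify_piece(index, piece, pieces_field):
    """Return True only if the piece matches the digest recorded for index."""
    expected = pieces_field[index * DIGEST_SIZE:(index + 1) * DIGEST_SIZE]
    return len(expected) == DIGEST_SIZE and hashlib.sha1(piece).digest() == expected
```

A client following this sketch would announce a piece to other peers only after `verify_piece` returns True for it.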

The peer will then remain choked until an unchoke message is sent. Another example of when a peer is choked would be when downloading from a seed, and the seed requires no pieces.

To ensure fairness between peers, there is a system in place which rotates which peers are downloading. This is known as optimistic unchoking.

5.3.7 Optimistic Unchoking[2]

To ensure that connections with the best data transfer rates are not the only ones favoured, each peer has a reserved 'optimistic unchoke' which is left unchoked regardless of the current transfer rate. The peer which is assigned to this is rotated every 30 seconds. This is enough time for the upload / download rates to reach maximum capacity. The peers then cooperate using the tit for tat strategy, where the downloader responds in one period with the same action the uploader used in the last period.

5.3.8 Communication Between Peers

Peers which are exchanging data are in constant communication. Connections are symmetrical, and therefore messages can be exchanged in both directions. These messages are made up of a handshake, followed by a never-ending stream of length-prefixed messages.

5.3.9 Handshaking[2]

Handshaking is performed as follows:
1. The handshake starts with character 19 (base 10) followed by the string 'BitTorrent protocol'.
2. A 20 byte SHA1 hash of the bencoded info value from the metainfo is then sent. If this does not match between peers the connection is closed.
3. A 20 byte peer id is sent which is then used in tracker requests and included in peer requests. If the peer id does not match the one expected, the connection is closed.

5.3.10 Message Stream[2]

This constant stream of messages allows all peers in the swarm to send data, and control interactions with other peers. After handshaking, connections start out as choked, and not interested, by default. A peer will be 'interested' in data if there is a peer which has the required pieces. If the peer which has this data is not choked, then data will be transferred.

| ID | Message | Structure | Additional Information |
|---|---|---|---|
| 0 | choke | <len=0001><id=0> | Fixed length, no payload. This enables a peer to block another peer's request for data. |
| 1 | unchoke | <len=0001><id=1> | Fixed length, no payload. Unblock peer, and if they are still interested in the data, upload will begin. |
| 2 | interested | <len=0001><id=2> | Fixed length, no payload. A user is interested if a peer has the data they require. |
| 3 | not interested | <len=0001><id=3> | Fixed length, no payload. The peer does not have any data required. |
| 4 | have | <len=0005><id=4><piece index> | Fixed length. Payload is the zero-based index of the piece. Details the pieces that peer currently has. |
| 5 | bitfield | <len=0001+X><id=5><bitfield> | Variable length. X is the length of bitfield. Sent immediately after handshaking. Optional, and only sent if client has pieces. Payload represents pieces that have been successfully downloaded. |
| 6 | request | <len=0013><id=6><index><begin><length> | Fixed length, used to request a block. The payload contains integer values specifying the index, begin location and length. |
| 7 | piece | <len=0009+X><id=7><index><begin><block> | Sent in response to request messages. X is the length of the block. The payload contains integer values specifying the index and begin location, followed by the block itself. |
| 8 | cancel | <len=0013><id=8><index><begin><length> | Fixed length, used to cancel block requests. Payload is the same as 'request'. Typically used during 'end game' mode. |
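The handshake of section 5.3.9 and the length-prefixed messages tabulated above can be sketched in a few lines. This is an illustrative sketch, not a full client. The function names are hypothetical, and the eight reserved zero bytes in the handshake come from the public BitTorrent specification rather than from this report. Lengths and integer fields are assumed to be 4-byte big-endian values, as in that specification.

```python
import struct

PSTR = b"BitTorrent protocol"  # 19 bytes, announced by the leading length byte


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Build the handshake: <19><pstr><8 reserved bytes><info_hash><peer_id>."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must each be 20 bytes")
    reserved = b"\x00" * 8  # assumption: no protocol extensions are advertised
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def encode_message(msg_id: int, payload: bytes = b"") -> bytes:
    """Encode <len><id><payload>; len counts the id byte plus the payload."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def encode_request(index: int, begin: int, length: int) -> bytes:
    """Encode a 'request' message (id 6) with three integer fields."""
    return encode_message(6, struct.pack(">III", index, begin, length))


def decode_message(data: bytes):
    """Decode one message; returns (msg_id, payload), or (None, b'') for len=0."""
    (length,) = struct.unpack(">I", data[:4])
    if length == 0:
        return None, b""  # a zero-length message carries no id (keep-alive)
    return data[4], data[5:4 + length]


if __name__ == "__main__":
    handshake = build_handshake(b"\x11" * 20, b"-XX0001-abcdefghijkl")
    print(len(handshake), "byte handshake")
    print(encode_message(0).hex())               # choke
    print(decode_message(encode_request(3, 0, 16384)))
```

The peer id and info hash used in the demo are placeholder values chosen only to have the right length.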

5.4 Data
BitTorrent is very versatile, and can be used to transfer a single file or multiple files of any type, contained within any number of directories. File sizes can vary hugely, from kilobytes to hundreds of gigabytes.

5.4.1 Piece Size
Data is split into smaller pieces which are sent between peers using the BitTorrent protocol. These pieces are of a fixed size, which enables the tracker to keep tabs on who has which pieces of data. This also breaks the file into verifiable pieces: each piece can then be assigned a hash code, which can be checked by the downloader for data integrity. These hashes are stored as part of the 'metainfo file'.

The size of the pieces remains constant throughout all files in the torrent except for the final piece, which is irregular. The piece size a torrent is allocated depends on the amount of data. Piece sizes which are too large will cause inefficiency when downloading (larger risk of data corruption in larger pieces due to fewer integrity checks), whereas if the piece sizes are too small, more hash checks will need to be run. As the number of pieces increases, more hash codes need to be stored in the metainfo file. Therefore, as a rule of thumb, pieces should be selected so that the metainfo file is no larger than 50-75 KB. The main reason for this is to limit the amount of hosting storage and bandwidth needed by indexing servers.

The most common piece sizes are 256 KB, 512 KB and 1 MB. The number of pieces is therefore: total length / piece size, rounded up. For example, a 1.4 MB file could be split into the following pieces. This shows 5 * 256 KB pieces, and a final piece of 120 KB.

Fig 5.4 : Pieces of a file

5.5 BitTorrent Clients
A BitTorrent client is an executable program which implements the BitTorrent protocol. It runs together with the operating system on a user's machine, and handles interactions with the tracker and peers.
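Returning to the piece arithmetic of section 5.4.1, the split into fixed-size pieces and the per-piece hash check can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, 1 KB is taken as 1024 bytes, and SHA-1 is assumed as the piece hash, as in the public BitTorrent specification.

```python
import hashlib

KB = 1024  # assumption: 1 KB = 1024 bytes


def piece_lengths(total_length: int, piece_size: int) -> list:
    """Return the length of every piece; only the last one may be shorter."""
    if total_length <= 0 or piece_size <= 0:
        raise ValueError("total_length and piece_size must be positive")
    count = -(-total_length // piece_size)  # integer ceiling division
    last = total_length - (count - 1) * piece_size
    return [piece_size] * (count - 1) + [last]


def piece_hashes(data: bytes, piece_size: int) -> list:
    """Compute one SHA-1 digest per piece, as stored in the metainfo file."""
    return [
        hashlib.sha1(data[start:start + piece_size]).digest()
        for start in range(0, len(data), piece_size)
    ]


def verify_piece(piece: bytes, expected_hash: bytes) -> bool:
    """A downloaded piece is accepted only if its hash matches the metainfo."""
    return hashlib.sha1(piece).digest() == expected_hash


if __name__ == "__main__":
    lengths = piece_lengths(1400 * KB, 256 * KB)
    print([n // KB for n in lengths])  # piece sizes in KB
    # Each SHA-1 digest adds 20 bytes to the metainfo file.
    print(len(lengths) * 20, "bytes of piece hashes")
```

For the report's example of a 1400 KB file with 256 KB pieces, this yields five full pieces and a final 120 KB piece. Since every piece contributes 20 bytes of hash data, halving the piece size roughly doubles the hash portion of the metainfo file, which is the trade-off behind the 50-75 KB rule of thumb.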

The client sits on the operating system and is responsible for controlling the reading / writing of files, opening sockets etc. A metainfo file must be opened by the client to start partaking in a torrent. Once the file is read, the necessary data is extracted, and a socket must be opened to contact the tracker. BitTorrent clients use TCP ports 6881-6999. To find an available port, the client will start at the lowest port, and work upwards until it finds one it can use. This means the client will only use one port, and opening another BitTorrent client will use another port. A client can handle multiple torrents running concurrently.

6. Vulnerabilities of BitTorrent

6.1 Attacks on BitTorrent
As we have seen so far, BitTorrent is one of the most favoured file transfer protocols in today's world. But it has been exposed to various attacks in the recent past due to vulnerabilities that are being exploited by the hacker community. Here are some of the attacks that are commonly seen.

6.1.1 Pollution attack
1. The peers receive the peer list from the tracker.
2. One peer contacts the attacker for a chunk of the file.
3. The attacker sends back a false chunk.
4. This false chunk will fail its hash and will be discarded.
5. The attacker requests all chunks from the swarm and wastes their upload bandwidth.

6.1.2 DDOS attack
DDOS stands for Distributed Denial of Service. This attack is possible because the BitTorrent tracker has no mechanism for validating peers. This means there is no way to trace the culprit in this kind of attack. Attacks of this nature are also possible because of the modifications that can be made to the client software.
1. The attacker downloads a large number of torrent files from a web server.
2. The attacker parses the torrent files with a modified BitTorrent client and spoofs his IP address and port number with the victim's as he announces he is joining the swarm.
3. As the tracker receives requests for a list of participating peers from other clients, it sends the victim's IP and port number.

4. The peers then attempt to connect to the victim to try and download a chunk of the file.

6.1.3 Bandwidth Shaping
Many ISPs don't encourage the use of BitTorrent by their users. This is because BitTorrent is usually used to transfer large files, due to which the traffic over the ISPs increases to a large extent. To avoid such exploding traffic on their servers, many ISPs have started to block the traffic caused by BitTorrent. This can be done by sniffing the packets that pass through and detecting whether they follow the BitTorrent protocol. ISPs make use of filters to find such packets and block them from passing through their servers.

6.2 Solutions
Here are a few solutions to the attacks that were discussed above.

6.2.1 Pollution attack
The peers which perform such attacks are identified by tracing their IPs. Then, such IPs are blacklisted to avoid further communication with them. This is done by using software like PeerGuardian or MoBlock, which download the list of blacklisted IPs from the internet. These blacklisted IPs are blocked by denying them connections with other peers.

6.2.2 DDOS attack
The main solution to this kind of attack is to have clients parse the response from the tracker. In the case where a host (tracker) does not respond to a peer's request with a valid BitTorrent protocol message, it should be inferred that this host is not running BitTorrent. The peer should then exclude that address from its tracker list, or set a high retry interval for that specific tracker. Another fix would be for web sites hosting torrents to check and report whether all trackers are active, or even remove the non-responding trackers from the tracker list in the torrent. Another measure could be to restrict the size of the tracker list to reduce the effectiveness of such an attack.

6.2.3 Bandwidth Shaping
There are broadly two approaches followed to counter this type of restriction. The first method is to encrypt the packets sent by means of the BitTorrent protocol. By doing this, the filters that sniff packets will not be able to detect that such packets belong to the BitTorrent protocol. This means that the filters are fooled by the encrypted packets and thus packets can sneak through such filters.

Another approach is to make use of tunnels. Tunnels are dedicated paths where the filters are avoided by using VPN software which connects to unfiltered networks. This results in successfully bypassing the filters, so the packets can be transmitted across networks.

7. Conclusion
BitTorrent pioneered mesh-based file distribution that effectively utilizes all the uplinks of participating nodes. Most follow-on research used similar distributed and randomized algorithms for peer and piece selection, but with different emphasis or twists. This work takes a different approach to the mesh-based file distribution problem by considering it as a scheduling problem, and strives to derive an optimal schedule that could minimize the total elapsed time. BitTorrent's application in this information sharing age is almost priceless. However, it is still not perfected as it is still prone to malicious attacks and acts of misuse. Moreover, the lifespan of each torrent is still not satisfactory, which means that a file remains available for distribution only for a limited period of time. Thus, further analysis and a more thorough study of the protocol will enable one to discover more ways to improve it.

8. References
1. Information on BitTorrent Protocol, en.wikipedia.org/wiki/BitTorrent_(protocol)
2. BitTorrent Specifications, http://wiki.theory.org/BitTorrentSpecification
3. Other Information, http://www.dessent.net/btfaq/#compare
4. Cohen, Bram (2003), Incentives Build Robustness in BitTorrent, May 22 2003, http://www.bitconjurer.org/BitTorrent/bittorrentecon.pdf
