
EUROPEAN SOCIAL FUND Invest in people!

Sectoral Operational Programme Human Resources Development 2007-2013 Project POSDRU/6/1.5/S/19 Competitive training of doctoral students in priority fields of the knowledge-based society

UNIVERSITY POLITEHNICA OF BUCHAREST

Faculty of Automatic Control and Computers, Computer Science Department

Senate Decision No. 211 of 15.09.2011

DOCTORAL THESIS

Îmbunătățiri ale comunicației în sistemele Peer-to-Peer folosind rețele de acoperire
Overlay Communication Improvements in Peer-to-Peer Systems

Autor: Ing. George Milescu

DOCTORAL COMMITTEE
President: Prof. Dr. Ing. Dumitru Popescu, UPB
Doctoral supervisor: Prof. Dr. Ing. Nicolae Țăpuș, UPB
Reviewer: Prof. Dr. Ing. Valentin Cristea, UPB
Reviewer: Prof. Dr. Ing. Victor Patriciu, ATM
Reviewer: Prof. Dr. Ing. Ion Smeureanu, ASE

Bucharest 2011

University POLITEHNICA of Bucharest


Faculty of Automatic Control and Computers Computer Science Department

Overlay Communication Improvements in Peer-to-Peer Systems

Scientific Adviser: Prof. Dr. Ing. Nicolae Țăpuș

Author: Ing. George Milescu

Bucharest 2011

To my parents. Thank you for all your love and support.

Abstract

Overlay networks are the foundation on which P2P protocols are built. This thesis analyzes the structure and functions of the overlay. The BitTorrent protocol is presented as the file sharing protocol with the largest share of Internet traffic. Key areas of the BitTorrent protocol are identified where inter-node communication can be improved. The thesis proposes novel models and solutions for enhancing the efficiency of BitTorrent data transfer. A novel BitTorrent protocol extension is introduced that uses a series of proxy nodes to distribute available bandwidth among peers (thus increasing transfer performance) and to offer enhanced privacy protection. A proxy discovery protocol is proposed that disseminates the relays available in the network. An overlay behavior model is introduced, with three components evaluating the performance, efficiency and reliability of a file-sharing P2P system. To control P2P experiments, an evaluation testbed is introduced that reproduces realistic network conditions within a computer cluster. Lightweight virtualization solutions are used to create clusters with up to 350 machines. The results are validated in a series of experiments and technical trials. A cluster of 350 machines is used to recreate complex scenarios. A trial using 20,000 clients over a period of two months places the proposed solutions in real network environments.


Acknowledgements

At the beginning, when I first walked on the path of academic research, the future looked unclear and full of obstacles. I was lucky to collaborate with a lot of people whose positive influence helped me complete this journey. I want to thank everyone for supporting me in reaching the goals of my PhD activities and writing this thesis.

First I want to thank my PhD supervisor, Prof. Nicolae Țăpuș, for his guidance and support. The clear view he often offered me and his useful suggestions are invaluable. I've learned a lot from him during these years.

I want to address special thanks to Prof. Johan Pouwelse, from the Delft University of Technology, for his never-ending energy, his continuous flow of research ideas and the coordination of research activities within the P2P-Next WP4 group. With a high level of experience, Boudewijn Schoon and Arno Bakker helped me with guidance and useful advice throughout our collaboration. I want to thank them and all the Tribler Group from the Delft University of Technology for the opportunity to work with such skilled researchers and engineers. I would like to thank Alexandru Iosup, Mihai Capota and Andrei Pruteanu, from the Delft University of Technology, for the enthusiasm, passion and high quality research they inspired me with.

I want to thank two very close collaborators, Razvan Rughiniș and Razvan Deaconescu. Razvan Rughiniș followed my PhD path and pushed me to stay on track by focusing on important elements. Razvan Deaconescu brought pragmatism to my research and had an encouraging presence at all moments. The Systems Group from UPB offered me the possibility to work with, and learn from, some of the most enthusiastic people I know. Thank you for the energy you have to drive the world forward.

My close friends and collaborators, Andreea Urzica and Mircea Bardac, offered me their unconditional support in the last years. I want to thank Mircea for his extraordinary technical and engineering skills and the variety of cute things he keeps discovering that enrich our research. I give my thanks to Andreea for her analytical approach and curious spirit, which shed light on my research. I also want to thank my colleagues and fellow researchers Eliana Tîrșa, Mugurel Andreica and Florin Pop, with whom I've collaborated on many occasions, for their support and excellent opportunities for exchange of ideas.

My parents, Maria and Mironel, made one of the largest contributions to the development of my professional career. Their huge constant support, love and encouragement gave me the energy to push forward every time. It is impossible to measure all the help I received from them in the last years because there is no scale large enough. I love you.


Contents

1 Introduction
    1.1 Advantages of P2P in the Era of Cloud Computing
    1.2 Resources in P2P Systems
    1.3 Problem Statement
    1.4 Thesis Outline

2 Peer-to-Peer Systems
    2.1 The Evolution of P2P Systems
        2.1.1 First P2P Generations
        2.1.2 Current Status of P2P in the Internet
    2.2 BitTorrent
        2.2.1 Concepts
        2.2.2 Client - Tracker Communication in BitTorrent
        2.2.3 BitTorrent Swarms
        2.2.4 Beyond BitTorrent - Tribler
        2.2.5 Peer Discovery in BitTorrent
        2.2.6 Content Discovery in BitTorrent
        2.2.7 Rich Metadata Overlay in Tribler
    2.3 Reputation in BitTorrent Communities
        2.3.1 Public and Private Trackers
        2.3.2 Decentralized Reputation Mechanisms
        2.3.3 BarterCast Overlay in Tribler
    2.4 UDP Based P2P Protocols
    2.5 Commercially Deployed Architectures
        2.5.1 Skype
        2.5.2 Spotify
    2.6 Social Components in P2P Systems
    2.7 Peer-to-Peer Applications
        2.7.1 File Sharing in Peer-to-Peer Systems
        2.7.2 File Sharing Service Characteristics
    2.8 Conclusions

3 A New Model for Overlay Behavior in Peer-to-Peer Systems
    3.1 Modeling Peer-to-Peer Systems
    3.2 A Proposed Model for Evaluating Overlay Behavior in P2P Systems
        3.2.1 Introduction
        3.2.2 Parameters Influencing the Swarm Behavior
        3.2.3 Parameter Collecting Approach
        3.2.4 Details on Collected Parameters
        3.2.5 Expressing Swarm Characteristics by Parameters
        3.2.6 Modeling Overlay Performance
        3.2.7 Modeling Overlay Reliability
        3.2.8 Modeling Overlay Efficiency
    3.3 Experimental Validation of the Proposed Model
        3.3.1 Experimental Runs
        3.3.2 Result Analysis
    3.4 Conclusions

4 Improving Overlay Communication in P2P Systems
    4.1 Resource Availability in P2P Systems
        4.1.1 Relevant Resources for Peer-to-Peer Systems
        4.1.2 Bandwidth Availability in File Sharing Applications
        4.1.3 BitTorrent Monetization of Bandwidth
    4.2 Privacy Challenges in P2P Systems
        4.2.1 Information Shared with the Community
        4.2.2 Breaking BitTorrent Privacy
    4.3 A Proposed Explicit Relay Architecture for the BitTorrent Protocol
        4.3.1 Existing Relay Architectures
        4.3.2 A Proposed 1-Hop Proxy Architecture
        4.3.3 A Proposed Proxy Discovery Mechanism
        4.3.4 A Proposed Multihop Data Relay Architecture
    4.4 Designing an Implicit P2P Relay Architecture
        4.4.1 Design Goals
        4.4.2 Design Elements of an Active Cache Architecture for Swift
    4.5 Conclusions

5 Evaluating P2P Overlay Improvements in Realistic Scenarios
    5.1 Characteristics of P2P Scenarios
    5.2 Virtualized Computer Clusters
        5.2.1 Virtualization Solutions
        5.2.2 Network Emulation
        5.2.3 Scalability of Virtualization Solutions
    5.3 Deploying Realistic Scenarios
        5.3.1 Design Goals
        5.3.2 Related Work
        5.3.3 Infrastructure Overview
        5.3.4 Infrastructure Design
        5.3.5 Infrastructure Implementation
        5.3.6 Introducing Churn in P2P Scenarios
        5.3.7 Bandwidth Limitation in P2P Scenarios
        5.3.8 Simulating Connection Dropouts
    5.4 Running Experiments
    5.5 Use Case
    5.6 Conclusions

6 Improvement Evaluation for Overlay Communication Protocols
    6.1 Explicit Relay Architecture Implementation
    6.2 Analysis of Architecture Performance
        6.2.1 Small Scale Analysis
        6.2.2 Large Scale Analysis
    6.3 Large Scale Technical Trials
        6.3.1 Trial Description
        6.3.2 Trial Analysis
    6.4 Conclusions

7 Conclusions
    7.1 Overview of the Thesis
    7.2 Summary of Contributions
    7.3 Future Work

Publications

A Campaign Configuration File

B Scenario Configuration File

Bibliography

List of Figures

1.1 Overview of main thesis components
2.1 Internet host count evolution
2.2 BitTorrent Tracker Role
2.3 The Buddycast protocol stack
2.4 Seeders and leechers in a swarm sharing a Fedora 11 DVD image
4.1 Average day (subscribers and traffic) - Europe, fixed access
4.2 Average day (subscribers and traffic) - North America, fixed access
4.3 Proxy Service high level overview
4.4 BitTorrent and Overlay communication in a Proxy Service scenario
4.5 Proxy Service protocol details
4.6 Proxy Service with a malicious node in the swarm
4.7 Proxy Service with a malicious node in the proxy layer
4.8 The Proxy Service with a multi-hop proxy layer
4.9 Cache overlay on top of the Swift protocol
5.1 Infrastructure design overview
5.2 Detailed scenario_setup components
5.3 Detailed scenario_clean components
5.4 Time recovery with respect to dropout interval
5.5 Scenario output: download speed evolution
6.1 Proxy Service implementation architecture
6.2 Large scale analysis results
6.3 The daily distribution of total number of reporting IP addresses
6.4 The distribution of clients participating in the trial per country
6.5 The correlation of the download speed for the first 25% and last 25% of the download
6.6 The correlation of the download speed for the first 50% and last 50% of the download
6.7 The evolution of the average number of nodes discovered by each client for the first 120 minutes of the trial
6.8 The average number of nodes discovered daily
6.9 1-CDF of nodes discovered daily
7.1 Overview of main thesis components

List of Tables

3.1 Description of experimental validation runs
3.2 Result analysis for the 5 experiment runs
4.1 Proxy Service protocol messages
4.2 Proxy Service protocol payloads
4.3 Possible solutions for the cache bitfield content
5.1 Recovery timeout for different scenarios
6.1 proxyservice_status settings
6.2 doe_mode settings
6.3 proxyservice_role settings
6.4 Preliminary performance evaluation
6.5 Large scale performance analysis - regular BitTorrent
6.6 Large scale analysis performance analysis - 5 proxies
6.7 Large scale analysis performance analysis - 10 proxies
6.8 Large scale analysis performance analysis - 15 proxies

Chapter 1 Introduction
The last 20 years have seen the birth and expansion of the Internet from a small network of academic and government institutions to a global network spanning borders, cultures and homes. With ever-increasing network bandwidth, file and data transfer is the Internet service responsible for the largest share of traffic on the Internet backbone. HTTP and Peer-to-Peer systems are nowadays the main bandwidth consumers in the Internet, with video content as the most common type of traffic going through the Internet links [ipoque, 2009]. Peer-to-Peer systems have emerged as the most suitable solution to capitalize on the huge unexploited network bandwidth available on the Internet. Since the inception of Napster in the late 90s, Peer-to-Peer systems have evolved into a variety of solutions and applications that continuously raise the interest of institutions (be they academic or commercial) across the world. The most eloquent Peer-to-Peer success story is the BitTorrent protocol, currently responsible for the largest share of Internet Peer-to-Peer traffic [ipoque, 2009]. With simple yet highly effective features such as optimistic unchoking, tit-for-tat and rarest-piece-first, the BitTorrent protocol is one of the best suited solutions for large data distribution. Recent research has focused on integrating features such as social networking, reputation management and video streaming as core features or as overlays on top of the protocol.
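As an illustration of the rarest-piece-first policy mentioned above, the following minimal sketch picks the piece that the fewest known peers hold. The function name, the representation of peer bitfields as sets of piece indices, and the deterministic tie-break are assumptions made for this example; they are not taken from the thesis or from any particular BitTorrent client, which typically breaks ties at random.

```python
from collections import Counter


def rarest_first(needed, peer_pieces):
    """Pick the needed piece that the fewest peers hold.

    needed      -- set of piece indices the local client still lacks
    peer_pieces -- dict mapping a peer id to the set of pieces it holds
    Returns a piece index, or None if no peer holds a needed piece.
    (Hypothetical helper; ties go to the lowest index for repeatability.)
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        # Count only pieces that are both offered and still needed.
        availability.update(pieces & needed)
    if not availability:
        return None
    return min(availability, key=lambda piece: (availability[piece], piece))


if __name__ == "__main__":
    peers = {"A": {0, 1}, "B": {0, 2}, "C": {0, 1}}
    print(rarest_first({0, 1, 2}, peers))
```

Requesting scarce pieces first keeps rare data from disappearing when the few peers holding it leave the swarm.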

1.1 Advantages of P2P in the Era of Cloud Computing

In the era of cloud computing the main goal is to transfer the application load to a limited set of dedicated physical or virtual computers. The user's machine becomes a simple terminal that provides a human interface for the features supported by the application. Cloud computing presents a set of challenges that have to be taken into account when a distributed application is designed (for example, single points of failure and scalability).


Peer-to-Peer applications have been present in the distributed systems area since the beginning of the Internet. In the last decade, the usage of Peer-to-Peer became prominent with the rise of file sharing applications. The advantages of Peer-to-Peer are simple and straightforward. First, there is no single point of failure (hardware, software or network failure). All hardware resources are provided by the clients using the application. As the size of the system increases, the probability that all the clients malfunction simultaneously decreases rapidly. Second, the application load is distributed automatically among the network peers. There is no single provider for the application services, as all the peers can function simultaneously as clients and servers. Third, the scalability of peer-to-peer applications is built into the application design. As each node joins the system, it donates its resources to be used as part of the application backend. The amount of available resources is proportional to the number of clients that are part of the system.
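The first advantage can be made concrete with a small calculation. The sketch below assumes that nodes fail independently and with the same probability; this assumption is introduced here only for illustration and is not stated in the thesis. Under it, the probability that every node is down at once is the per-node failure probability raised to the number of nodes.

```python
def all_fail_probability(p_fail, n_nodes):
    """Probability that all n_nodes are down at the same time.

    Assumes independent failures with identical probability p_fail,
    a simplification used only for this illustration.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return p_fail ** n_nodes


if __name__ == "__main__":
    # Even with unreliable nodes, total failure becomes unlikely quickly.
    for n in (1, 5, 10, 50):
        print(n, all_fail_probability(0.1, n))
```

Real failures are often correlated (for example, a shared ISP outage), so the actual decrease is slower than this idealized geometric decay.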

1.2 Resources in P2P Systems

P2P systems are commonly designed to use commodity hardware offered by the participating nodes. In the case of public file sharing systems, the commodity hardware consists of the users' machines and provides three types of resources: CPU cycles, bandwidth and disk storage. The CPU cycles allocated to the P2P system are required to execute the client application and to manage network and disk usage. Disk storage is one of the key resources in file sharing applications and decentralized distributed database applications, allowing each node to store data that is made available to the other nodes in the system via the application-specific protocols. Bandwidth is the key resource required by the system. The quality of the application and the user experience depend directly on the amount of bandwidth a node can access. The distribution of bandwidth between the participating nodes is more rigid than the distribution of CPU or storage resources. P2P file sharing applications do not redistribute the available, unused bandwidth between nodes, which limits the total amount of bandwidth the system offers.

1.3 Problem Statement

The central goal of this thesis is to analyze the overlay network created in a Peer-to-Peer system and, by improving the overlay, to enhance the inter-node communication. This goal is divided into multiple areas of research, each with a specific objective.


The bandwidth available to the nodes of a Peer-to-Peer system is not uniformly distributed. A goal of the thesis is to create a mechanism that adapts to a node's requirements and redistributes available bandwidth from nodes that are not using it.

User privacy has become a real concern over the last decade. If any action performed in a P2P system can be traced to a specific node, the user's behavior can easily be identified and reproduced. The thesis targets the introduction of a solution for plausible deniability, offering users a protective layer to conceal their actions.

Any change in a Peer-to-Peer application protocol affects the application performance and, in the case of the BitTorrent protocol, the state of the swarm. One of the thesis goals is to propose a model for evaluating the swarm behavior, based on a set of measurable metrics.

Testing a distributed application presents a set of challenges, given the limited resources available to reproduce a large set of scenarios. A testbed infrastructure is required to evaluate the proposed improvements, reproducing realistic network conditions in a computer cluster.

1.4 Thesis Outline

This section presents an overview of the thesis, summarizing the contents of each chapter. The thesis is structured in 7 chapters, and its main components are presented in Figure 1.1. The central goal, to improve communication within P2P systems through changes at the overlay level, is supported by a set of principal contributions, detailed in their respective chapters.

Figure 1.1: Overview of main thesis components

Chapter 1 introduces the context of the thesis, presents a set of problems related to the Peer-to-Peer domain and details the goals of the thesis. The outline of the thesis is presented, summarizing the content of each chapter.

In Chapter 2 an overview of Peer-to-Peer systems is presented. The chapter includes a brief history of the different P2P applications and introduces the BitTorrent protocol. Currently the main Peer-to-Peer platform, BitTorrent is a file sharing protocol with proven scalability and robustness. Three areas connected to BitTorrent are presented, classifying the overlay protocols into peer discovery, content discovery and reputation mechanisms. The Tribler project is presented as a deployed P2P client used in academic research as a playground for designing, implementing and testing decentralized P2P technologies. Tribler aims at offering a serverless, completely self-organized technology for delivering Video on Demand to end-users. In the peer discovery area, DHT, PEX and BuddyCast are presented. Content discovery includes the concept of a BitTorrent indexer, as well as the Vuze and Tribler solutions for content search integrated in the user interface. Public and private trackers are presented in the context of reputation, and BarterCast is analyzed as a decentralized reputation mechanism.

Chapter 3 proposes the analysis of the system behavior and the quality of its services by three major metrics: performance, reliability and efficiency. In file-sharing, performance translates mainly, but not only, into throughput. Although the service performance depends on the resources a client has, the overall metric captures the system's health. Reliability may be defined in multiple ways; in general it means the capacity of a system to perform according to its design, to resist failures and to maintain its parameters over a defined period of time. In file-sharing, reliability translates into the capacity of a system to maintain its performance (mainly throughput) in a variety of conditions. For Peer-to-Peer systems, where peer dynamics are high, reliability offers an image of the stability of the system. Efficiency is a metric useful to the company or entity offering a service. It compares the available resources with the resources required to maintain the performance of the services. In Peer-to-Peer file-sharing systems, initial resources are duplicated on the nodes participating in the swarm. Because of the automatic data replication, the overall available resources might easily exceed the required resources, keeping redundant nodes in the network. As P2P systems are dynamic, it is difficult to estimate the resources required for a definite amount of time in the future. However, a snapshot of the current efficiency provides feedback for future scenarios. The validity of the information provided by the proposed metrics is tested in 5 different scenarios, each reproducing a real-life use case for a file-sharing service.


Chapter 4 presents a proposed extension for the BitTorrent protocol that uses an overlay of proxy nodes to intermediate the BitTorrent connections between peers. The chapter analyzes the types of resources required by P2P systems in general, and file sharing systems in particular. It presents the availability of these resources and the efficiency of their use. Bandwidth and disk storage are identified as important resources for BitTorrent. Bandwidth, the more important of the two, is perceived as a fixed commodity and is not currently being reallocated to areas with high demand. The proposed BitTorrent extension is detailed at the protocol level. It allows the nodes participating in the same system to transfer bandwidth between each other, taking the first steps toward a bandwidth-as-a-currency market. A proxy discovery mechanism on top of the Buddycast protocol is introduced, taking advantage of its epidemic design. As Buddycast maintains an overlay between active Tribler peers, the same peers are the best candidates to use as proxy nodes if the Proxy service is enabled.

Chapter 5 presents a new approach to building automated infrastructures for Peer-to-Peer testing. The proposed infrastructure is tested on top of a thin virtualization layer allowing easy deployment of experimental scenarios involving Peer-to-Peer clients. The main design goals for the infrastructure are providing an extensive tool for managing both clients and log files, using a common interface for accessing remote systems, offering support for bandwidth control and allowing the user to introduce churn in the environment. The infrastructure uses a hierarchical set of script and configuration files and has been deployed for a variety of Peer-to-Peer experiments. The main advantage of the proposed infrastructure when compared to other solutions is automation coupled with easy deployment. The use of a single commanding station, shell scripts, SSH and rsync allows the user to rapidly deploy a given scenario. The use of OpenVZ virtualization allows consolidation: a small number of hardware nodes are used to create a complete virtualized framework capable of running 100 sandboxed BitTorrent clients. LXC is later used to create a similar virtualized setup with 350 nodes. With the use of Linux-specific networking tools, the user may define bandwidth limitation and network topology characteristics in order to simulate realistic scenarios.

In Chapter 6 details are presented on the implementation of the Proxy overlay on top of the Tribler client. Two analysis campaigns are executed to test the proposed proxy mechanism in a series of different contexts by measuring the improvements it brings compared to the regular BitTorrent protocol. The first analysis places the proxy mechanism in an environment without any bandwidth constraints. The second analysis is used to evaluate the impact of the proxy layer in swarms with different seeder/leecher ratios. A large technical trial is executed using approximately 20,000 Internet clients. The trial focuses on observing the behavior of the proxy core in real-life networking conditions and on evaluating the proxy discovery mechanism.

Chapter 7, the last chapter of the thesis, presents the conclusions of this work and includes a list of thesis contributions. The chapter also presents future research directions.

Chapter 2 Peer-to-Peer Systems


Peer-to-Peer systems appeared as a natural evolution of the distributed systems area. Respecting the definition of a distributed system (a collection of autonomous computers that communicate through a computer network and interact with each other in order to achieve a common goal), Peer-to-Peer systems became widespread as a result of the Internet Boom in the 1990s [ISC, 2010]. Distributed computing and the more recent cloud computing are the main development lines for running services on multiple computers in a controlled environment, where one or multiple organizations manage all the computers that are part of the system. Peer-to-Peer systems are used in the design of Internet-scale applications. Such systems can scale to millions of nodes across the Internet without relying on dedicated servers. Having self-organizing capabilities, P2P applications adapt to a dynamic peer population while keeping the provided services operational.

Peer-to-Peer systems came to public attention when file-sharing applications were developed. In P2P file-sharing models, decentralization plays a key role both in the distribution and in the availability of the content. File-sharing is not the only widely used application of P2P systems. Skype Internet telephony proved the cost-effectiveness of distributing the load to end-clients. Also, Adobe introduced P2P support on the Flash platform [Kaufman, 2009] to allow live media transmissions and to reduce the load on its servers.

Two trends emerged in the evolution of P2P systems: structured architectures (for example the P2P systems described in [Istin et al., 2010], [Visan et al., 2010] or [Ghit et al., 2010]) and unstructured architectures. The structured solutions maintain a strict set of communication rules within the P2P system, controlling the overlay topology. Unstructured systems do not constrain the overlay topology. Subsequently, the growing interest in improving these systems to offer better performance, security and robustness led to the development of many research areas. P2P systems are expected to become a general-purpose layer upon which a wide range of applications can be built (content delivery, social networking, etc.).


2.1 The Evolution of P2P Systems

The basic design element of Peer-to-Peer systems, an architecture where all nodes offer and use services as equal communication partners, has evolved continuously over the last decades. This section presents the main stages of evolution for P2P designs and describes the current status of P2P systems in the Internet.

2.1.1 First P2P Generations



The first Peer-to-Peer system composed of computer nodes was designed at the very beginnings of the Internet: ARPANET [Lukasik, 2010]. Created in 1969 by the research groups at MIT (Massachusetts Institute of Technology) and DARPA (Defense Advanced Research Projects Agency), it connected the top US universities using a packet-switching technology. Although a more appropriate terminology would be Host-to-Host, the basic communication paradigms of ARPANET were the same as the paradigms used in a P2P system. In the following decades inter-host communication developed, and the main TCP/IP based protocols were designed: e-mail, FTP, WWW.

Almost eight years after the WWW became publicly available, in 1999 a new method for publishing files was introduced by Napster. As the first generation of Peer-to-Peer applications, it had a major drawback in its design: it was a hybrid system, with a central server that coordinated the whole network. The central server was used to maintain lists of connected peers and the content they provided, while actual transactions were conducted directly between peers. Two years after it was released, Napster was shut down by a court order for copyright infringement.

Napster's short life span coincided with the boom of Internet hosts [ISC, 2010]. As Figure 2.1 shows, after 1999 the number of Internet-connected hosts increased at an exponential rate, bringing more and more end-users into the network. The Internet Boom was supported by the wide deployment of dial-up connections and the development of core optical networks (as described in [Rughiniș et al., 2011]).

Napster's influence extended over the media market. In the pre-Napster era, most multimedia content was delivered on a physical medium, such as a CD. Napster opened this market to digital formats that were transferred directly from the Internet, allowing portals like iTunes to reach an important market in only a few years.
As Napster was facing legal problems, the second-generation protocols appeared: Gnutella and Kazaa adopted a different architecture, in which the role of a central server was diminished, distributing the searches and transfers directly to the peers. Gnutella was introduced in 2000, being the first completely decentralized file sharing network. In the Gnutella network all the peers are considered equal, and therefore the network has no central point of failure.

eDonkey2000 was a peer-to-peer file sharing application, developed in 2000, using the Multisource File Transfer Protocol. The original client stopped being maintained in 2005, but the eDonkey network is still available through other clients such as eMule or Shareaza. In 2001 Kazaa was released. It uses the FastTrack network protocol, which assigns more traffic to supernodes to increase routing efficiency. In the same year, LimeWire became available. LimeWire uses the Gnutella network as well as the BitTorrent protocol. BitTorrent was designed and released in 2001 and had a major growth, both in the number of clients implementing it and in the percentage of Internet traffic.

Direct Connect was also released in 2001. Direct Connect, like Napster, is a hybrid system. The DC clients connect to a central hub and can download files directly from one another. The main difference compared to Napster is that the system uses multiple central hubs and the clients can choose the hub they use.

Figure 2.1: Internet host count evolution (number of hosts, up to 600,000,000, plotted by date from 1981 to 2009)

2.1.2

Current Status of P2P in the Internet

Currently P2P is one of the key factors of the complex Internet ecosystem. It is responsible for a large percentage of the global traffic and has good prospects for offloading traffic from central servers.

Peer-to-Peer Traffic in the Internet

The popularity of P2P applications is growing and the impact of P2P traffic is a controversial topic. Rigorous Internet measurements at a global scale are difficult to implement, but common estimations say that a large amount of the Internet traffic is caused by P2P applications. In [Corp, 2008], Sandvine Corp, one of the largest ISPs in the USA, shows that P2P traffic is about 50% of their traffic. The main issue behind these values is that a small percentage of the network users can overload the capacity of the network. Since most users are charged by ISPs on a flat rate or based on the time spent online, the cost of P2P traffic usage is paid only by ISPs.

Another important issue of P2P applications is that broadband networks were not designed for this type of traffic. P2P traffic is symmetric, each peer giving back to the system the content it downloaded. On the other hand, most broadband network implementations are asymmetric, with the downstream channels having five to six times more bandwidth than the upstream channels. Thus, the ISPs' networks are additionally overloaded by upstream data. Protocols such as libswift [1] try to create a more network-friendly approach that takes into account the Internet connection capabilities of each node.

P2P in Adobe Flash

P2P applications can also have a positive influence on Internet traffic. The basic content distribution model used by the Flash Platform was a client-server one: one client pushes the information to a Flash Media Server using RTMP (Real Time Messaging Protocol) connections, and the server redistributes it. Scaling this approach requires multiple servers to be used, increasing the costs of the system.

Flash Player 10.0 (released in October 2008) introduced support for RTMFP (Real Time Media Flow Protocol), allowing point-to-point connections between clients. The Flash Media Server is still used as a rendezvous point, but Flash players can redistribute data directly to a small number of other clients. For the Flash 10.0 applications, peer-to-peer was mainly seen as a point-to-point solution for publishing audio and video streams locally, while other clients could connect to the first one and receive data directly from it.

In Flash Player 10.1 (beta, released in October 2009 [Kaufman, 2009]) the Peer-to-Peer support has been improved. The clients can join and participate in a self-organizing peer-to-peer network and use it for direct routing, object replication, posting or application-level multicast, and can use native IP multicast together with application-level multicast. This scales the publishing mechanism to millions of users.

The additional features of RTMFP included in Flash Player 10.1 open a wide range of possibilities for Flash-based Peer-to-Peer applications. From DHT databases to virtual conferences and Groove-like sharing solutions, the new wave of applications can be integrated in the browser and will enrich the online user experience. All these applications reduce the load and traffic on the server side, making the architectures more robust.

[1] http://libswift.org/

P2P in the Browser

The development trend of both network infrastructures and distributed systems led to the migration of the most important types of applications to a cloud-based support. Documents, pictures and files in general can now be stored online and accessed through the browser. The next step is for the P2P paradigm to be integrated in the design of browser-based applications. As Section 2.1.2 mentioned, Adobe Flash introduced support for P2P designs.

Another initiative, called SwarmPlayer [Swarmplayer, 2011a], aims to deliver video content using a BitTorrent-powered engine. Thus it combines the performance level of a CDN with the scalability of a P2P system. The advantage of the SwarmPlayer is that webpages with popular high-quality videos are no longer very expensive to operate and difficult to manage. By combining for the first time the HTML5 <video> tag with BitTorrent streaming technology, the SwarmPlayer makes video distribution easy. The SwarmPlayer software is developed by the P2P-Next consortium, an EU-funded project exploring the future of television, in close cooperation with the Wikimedia Foundation. Wikipedia.org enabled the SwarmPlayer and BitTorrent swarming for all its video content in September 2010 [Swarmplayer, 2011b].

2.2

BitTorrent

Currently, the main P2P file-sharing protocols are BitTorrent, Kazaa, Direct Connect and Gnutella. Gnutella was the first decentralized file sharing network. In 2007 [Bangeman, 2008], it was the most popular file sharing network on the Internet, covering more than 40% of the P2P file-sharing market, with BitTorrent having close to 30%. In 2009, BitTorrent became [ipoque, 2009] the most popular file sharing protocol. Depending on the region, it is responsible for 45-78% of all P2P traffic, and 27-55% of all Internet traffic.


2.2.1

Concepts

In BitTorrent, in order to share data (a single file or a group of files), a peer must first create a small meta-file called a .torrent (a file with the .torrent extension). This file contains metadata about the files that are to be shared and the addresses of one or more trackers (central nodes in a BitTorrent network that maintain lists of available peers). Peers that want to download the shared data must first obtain a .torrent file for it, using other file-distribution mechanisms (most frequently HTTP-based). The client connects to the tracker(s) specified in the .torrent file and receives from the tracker a list of other peers that share the pieces of the file. The tracker information is refreshed periodically. Figure 2.2 presents the role of a tracker in a BitTorrent network.
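As an illustration of how a client reads the tracker address from a .torrent file, the following is a minimal sketch of a decoder for bencoding, the serialization used by .torrent files. It assumes the standard `announce` key and the optional `announce-list` key; the sample metadata is hand-written for the example (it is not a real torrent and omits the piece hashes), and the function names are hypothetical.

```python
from typing import Any, List, Tuple


def bdecode(data: bytes, pos: int = 0) -> Tuple[Any, int]:
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                          # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                          # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                          # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                        # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at offset {pos}")


def tracker_urls(meta: dict) -> List[str]:
    """Collect tracker addresses from 'announce' and, if present, 'announce-list'."""
    urls: List[str] = []
    if b"announce" in meta:
        urls.append(meta[b"announce"].decode("utf-8"))
    for tier in meta.get(b"announce-list", []):
        for url in tier:
            text = url.decode("utf-8")
            if text not in urls:
                urls.append(text)
    return urls


# Hypothetical, hand-written metadata (not a real torrent; piece hashes omitted).
SAMPLE = (b"d8:announce31:http://tracker.example/announce"
          b"4:infod6:lengthi1024e4:name8:demo.bin12:piece lengthi256eee")

meta, _ = bdecode(SAMPLE)
print(tracker_urls(meta))
```

A real client would read the bytes from the .torrent file on disk and would also validate the structure of the `info` dictionary, which this sketch does not attempt.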

Figure 2.2: BitTorrent tracker role. A new client connects to the tracker and receives from it a list of available peers that it can later directly contact

As the tracker is a single point of failure, various solutions for a trackerless system (also called decentralized tracking) were implemented, every peer acting as a tracker. There are two main solutions for a trackerless system: DHT and PEX. DHT (Distributed Hash Table) stores the information about available peers in a distributed database. PEX (Peer EXchange) allows peers to exchange information about the swarm directly, without making a query to a tracker or to the DHT, enhancing the speed and reducing the load on the tracker. Currently, all three peer-discovery mechanisms (tracker, DHT, PEX) can coexist, bootstrapping each other and allowing BitTorrent clients to be a feasible solution for demanding problems such as live streaming.

Since all peer management is done per shared file (or per torrent), the peer groups (called swarms) are independent from one another. If, for example, a client shares 5 files, then it will be part of 5 independent swarms.


Peers that have a complete copy of the file are called seeders. The peer that provides the initial copy of the content is called the initial seeder. All the peers that are downloading a file (and do not have a complete copy of the content) are called leechers. In BitTorrent, leechers also upload back to the swarm, a behavior enforced by a mechanism called tit-for-tat. However, they are not considered full-leechers.

The BitTorrent protocol specifies that the pieces of a file must be downloaded in a "rarest-first" sequence, ensuring high availability. This feature allows BitTorrent to offer redundancy and resistance to "flash crowds". In [Bardac et al., 2011b] methods are introduced for evaluating client resource utilization for different BitTorrent downloading strategies, with emphasis on the resource needs of the video-streaming strategies. Memory and CPU utilization are analyzed for both rarest-first and sequential piece downloading strategies.

Another BitTorrent-specific mechanism is tit-for-tat. During a download session, if the client shares already downloaded pieces of data, then it will have a higher download speed. If it only downloads without giving back, the download speed will be reduced, as other peers will limit its access to shared resources. Although this mechanism ensures fairness, it has the drawback that a node needs some time to receive sufficient data before it becomes a good uploader.
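The rarest-first strategy described above can be sketched as follows. This is a simplified illustration, not the selection logic of any particular client: it assumes the client already knows which pieces each connected peer holds, and it breaks ties randomly. The function and parameter names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Optional, Set


def pick_rarest_piece(have: Set[int],
                      peer_pieces: Dict[str, Set[int]],
                      rng: Optional[random.Random] = None) -> Optional[int]:
    """Return the index of a missing piece that is held by the fewest peers."""
    rng = rng or random.Random()
    availability: Counter = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces - have)    # count only pieces still needed
    if not availability:
        return None                           # no connected peer offers a missing piece
    rarest = min(availability.values())
    candidates = sorted(p for p, n in availability.items() if n == rarest)
    return rng.choice(candidates)             # random tie-break among equally rare pieces


# Piece 1 is held by three peers, pieces 2 and 3 by one peer each.
swarm_view = {"peer-a": {0, 1, 2}, "peer-b": {0, 1}, "peer-c": {1, 3}}
print(pick_rarest_piece({0}, swarm_view))
```

Requesting the least replicated pieces first raises their replication level in the swarm, which is why the strategy improves availability when the initial seeder leaves.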

2.2.2

Client - Tracker Communication in BitTorrent

As specified in Section 2.2, the tracker is a central peer-management server in BitTorrent. Each client must exchange a number of messages with the tracker before being able to download the shared content. The steps a client takes when using the services of a BitTorrent system are the following:

- The client obtains a .torrent file and reads from it the address of the tracker.

- The client contacts the tracker and announces its presence. The tracker sends back to the client a list of peers.

- The client connects to each of the peers it received and asks them for pieces of the downloaded file.

- From time to time the client contacts the tracker and sends it a progress report. The client can receive more peers from the tracker. The time interval between two tracker updates can be configured in the client, but most frequently it is specified by the tracker in its response.

- If DHT or PEX are available, the client can find more peers directly from the peers it is connected to.

- After the download is complete, the client announces to the tracker that it has a full copy of the file and that it will become a seeder.

- If the client is closed, it can send a message to the tracker announcing its disconnection.

- If the client returns at a later moment, it contacts the tracker to announce its presence and the fact that it has a full copy of the files, and then it waits to be contacted by other peers.

Multiple trackers can be specified in the .torrent file. If this is the case, then all the trackers will be contacted in a round-robin approach. Without PEX and DHT, the tracker has a complete image of the swarm. When DHT or PEX are used, peers find out about each other without involving the tracker. Thus, the tracker cannot hold all the status information about the swarm.
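The announce steps listed above can be sketched as follows, assuming an HTTP tracker that accepts the standard announce query parameters and returns peers in the compact format (6 bytes per IPv4 peer). The tracker URL, peer ID and helper names are hypothetical, and no network request is made: the sketch only builds the request URL and decodes a hand-made peer list.

```python
import socket
import struct
from typing import List, Optional, Tuple
from urllib.parse import urlencode


def build_announce_url(tracker_url: str, info_hash: bytes, peer_id: bytes,
                       port: int, uploaded: int, downloaded: int, left: int,
                       event: Optional[str] = None) -> str:
    """Build the tracker announce request for one torrent."""
    params = {
        "info_hash": info_hash,      # 20-byte SHA-1 of the bencoded info dictionary
        "peer_id": peer_id,          # 20-byte identifier chosen by the client
        "port": port,                # port on which the client accepts peer connections
        "uploaded": uploaded,        # progress report: bytes uploaded so far
        "downloaded": downloaded,    # progress report: bytes downloaded so far
        "left": left,                # bytes still missing (0 for a seeder)
        "compact": 1,                # ask for the compact peer list
    }
    if event:                        # "started", "completed" or "stopped"
        params["event"] = event
    # Simplification: assumes tracker_url carries no query string of its own.
    return tracker_url + "?" + urlencode(params)


def parse_compact_peers(blob: bytes) -> List[Tuple[str, int]]:
    """Decode a compact peer list: 4 bytes IPv4 address + 2 bytes port per peer."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack("!H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers


url = build_announce_url("http://tracker.example/announce", b"\x12" * 20,
                         b"-XX0001-abcdefghijkl", 6881, 0, 0, 1024, event="started")
print(url)
```

In this sketch the first contact would use `event="started"`, the end of the download `event="completed"`, and a clean shutdown `event="stopped"`; periodic progress reports omit the event.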

2.2.3

BitTorrent Swarms

As discussed in Section 2.2, a swarm is a group of clients, seeders and leechers, who are sharing the same file(s) and are associated with the same .torrent file. Given the possible size and the dynamics of such a group, the number of connections established between the peers of a swarm is limited. The known total number of seeders and leechers in a swarm is limited to the tracker's view. This view may not entirely reflect reality, as some peers might have disconnected (being no longer active).

In [Saroiu et al., 2002b] the behavior of peers was analyzed inside the Gnutella and Napster file-sharing systems, showing that there is significant heterogeneity in peers' bandwidth, availability and transfer rates. Although Gnutella and Napster work on different architectures from BitTorrent, the user distribution is very much alike. Thus, in BitTorrent, not only do the nodes' hardware resources differ widely between peers, but the operating systems and the network capabilities also vary across the groups. This requires the protocols to be adaptive, robust and able to handle malicious users without affecting the rest of the swarm.

2.2.4

Beyond BitTorrent - Tribler

BitTorrent proved to be one of the most successful technologies of the last decade. However, the design of the protocol is not enough to transform it into a long-term reliable solution. As [Pouwelse et al., 2008b] mentions, file-sharing systems focus mainly on technical elements and are unable to use the power of human communities. Such a system, for example, could not solve the problem of freeriding [Adar and Huberman, 2000] by taking advantage of the fact that people tend not to steal (bandwidth) from a social group they belong to.

One of the most interesting BitTorrent-related projects is Tribler [Pouwelse et al., 2008b]. Tribler is a deployed P2P client used as an academic research playground for designing, implementing and testing decentralized P2P technologies. Tribler aims at offering a serverless, completely self-organized technology for delivering Video on Demand to end-users.


The Tribler approach is based on building a complex overlay on top of BitTorrent that integrates the main functionalities: content discovery, discovery of peers with similar tastes, reputation mechanisms and social awareness. Part of these mechanisms will be detailed in the following sections, together with the challenges that need to be faced in each area.

2.2.5

Peer Discovery in BitTorrent

Section 2.2.1 presented the main components of a BitTorrent system. As the tracker is a single point of failure for peer management, various solutions for a trackerless system (called decentralized tracking) were researched and implemented. In a trackerless scenario every peer acts as a tracker, informing its neighbors about other peers from the network. There are two main solutions for a trackerless system: DHT and PEX. DHT (Distributed Hash Table) stores the information about available peers in a distributed database. PEX (Peer EXchange) allows peers to exchange information about the swarm directly, without making a query to a tracker or to the DHT, enhancing the speed and reducing the load on the tracker. Currently, all three peer-discovery mechanisms (tracker, DHT, PEX) can coexist, bootstrapping each other and allowing BitTorrent clients to be a feasible solution for demanding problems such as live streaming.

A problem related to the discovery of new peers is the classification of the peers based on application-specific criteria. The paper [Bardac et al., 2011a] presents a generic approach for context-aware entity classification, with emphasis on the integration and use of contextual information in Peer-to-Peer systems. The designed peer classification engine isolates high-latency update processes in order to minimize the latencies of the lookup queries. By using a key-value data store with support for sorted sets, high-complexity context-classifying functions can be executed asynchronously without impacting the lookup queries. The performance of the system is evaluated through experimental and complexity analysis, identifying directions for improving and scaling the peer classification engine.

The Tribler client introduced a third approach. As part of the complex Tribler Overlay design, peers search for and build lists of similar buddies and use these lists to maintain the Overlay connections.

DHT Overlay in BitTorrent

A distributed hash table (DHT) is a decentralized distributed system that provides a hash-based lookup service. A DHT stores (key, value) pairs, and any participating node can efficiently retrieve the value associated with a given key. Key-value mappings are distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node churn.

DHTs form a generic infrastructure that can be used to build more complex services, such as distributed file systems or P2P file sharing systems. Notable distributed networks that use DHTs include BitTorrent's distributed tracker [Loewenstern, 2008], the Kad network and the Coral Content Distribution Network. In [Dinger and Waldhorst, 2009] a thorough analysis of BitTorrent DHT bootstrapping capabilities is made.

As presented in [Loewenstern, 2008], BitTorrent uses a DHT for storing peer contact information for "trackerless" torrents. The protocol is based on Kademlia [Maymounkov and Mazières, 2002] and is implemented over UDP. In DHT terminology, a "peer" is a client/server listening on a TCP port that implements the BitTorrent protocol. A "node" is a client/server listening on a UDP port implementing the distributed hash table protocol. The DHT is composed of nodes and stores the location of peers. BitTorrent clients include a DHT node, which is used to contact other nodes in the DHT to get the location of peers to download from using the BitTorrent protocol.

Each node has a globally unique identifier known as the node ID. A distance metric is used to compare two node IDs, or a node ID and an infohash, for closeness. Nodes must maintain a routing table containing the contact information for a small number of other nodes. Nodes know about many other nodes in the DHT that have IDs that are "close" to their own, but have only a handful of contacts with IDs that are very far away from their own.

When a node wants to find peers for a torrent, it uses the distance metric to compare the infohash of the torrent with the IDs of the nodes in its own routing table. It then contacts the nodes it knows about with IDs closest to the infohash and asks them for the contact information of peers currently downloading the torrent. If a contacted node knows about peers for the torrent, the peer contact information is returned with the response. Otherwise, the contacted node must respond with the contact information of the nodes in its routing table that are closest to the infohash of the torrent. The original node iteratively queries nodes that are closer to the target infohash until it cannot find any closer nodes. After the search is exhausted, the client inserts its own peer contact information onto the responding nodes with IDs closest to the infohash of the torrent.

Not all learned nodes are equal. Some are "good" and some are not. Many nodes using the DHT are able to send queries and receive responses, but are not able to respond to queries from other nodes. It is important that each node's routing table contains only known good nodes. A good node is a node that has responded to one of our queries within the last 15 minutes. A node is also good if it has ever responded to one of our queries and has sent us a query within the last 15 minutes. After 15 minutes of inactivity, a node becomes questionable. Nodes become bad when they fail to respond to multiple queries in a row. Nodes that we know are good are given priority over nodes with unknown status.

The BitTorrent protocol has been extended to exchange node UDP port numbers between peers that are introduced by a tracker. In this way, clients can get their routing tables seeded automatically through the download of regular torrents. Newly installed clients that attempt to download a trackerless torrent on the first try will not have any nodes in their routing table and will need the contacts included in the torrent file.
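The closeness comparison and the good/questionable/bad rules described above can be sketched as follows. The sketch assumes the XOR distance metric of Kademlia, on which the BitTorrent DHT is based, and 160-bit identifiers. The class and function names are hypothetical, and the values `k = 8` and `MAX_FAILURES = 2` are assumptions made only for the illustration (the text says just "multiple queries in a row").

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

FRESH_SECONDS = 15 * 60      # activity window from the text (15 minutes)
MAX_FAILURES = 2             # assumption: threshold for "multiple queries in a row"


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two IDs, read as an unsigned integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(node_ids: Iterable[bytes], target: bytes, k: int = 8) -> List[bytes]:
    """Return the k known node IDs closest to the target (for example, an infohash)."""
    return sorted(node_ids, key=lambda nid: xor_distance(nid, target))[:k]


@dataclass
class NodeRecord:
    node_id: bytes
    last_response: Optional[float] = None   # when it last answered one of our queries
    last_query: Optional[float] = None      # when it last sent us a query
    failures: int = 0                       # consecutive unanswered queries

    def status(self, now: float) -> str:
        if self.failures >= MAX_FAILURES:
            return "bad"
        if self.last_response is None:
            return "unknown"                # has never answered us
        if now - self.last_response <= FRESH_SECONDS:
            return "good"                   # answered within the last 15 minutes
        if self.last_query is not None and now - self.last_query <= FRESH_SECONDS:
            return "good"                   # answered before and queried us recently
        return "questionable"               # 15 minutes of inactivity


ids = [n.to_bytes(20, "big") for n in (1, 4, 6, 12)]
target = (5).to_bytes(20, "big")
print([int.from_bytes(n, "big") for n in closest_nodes(ids, target, k=2)])
```

An iterative lookup would repeat `closest_nodes` over the contacts returned by each queried node until no closer node is discovered, which is the stopping condition described in the text.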

BitTorrent PEX Overlay

Different from DHT, Peer EXchange (PEX) [wiki.theory.org, 2011] is a BitTorrent extension protocol that allows peers in a swarm to exchange lists of active peers directly with each other. PEX was first implemented in Azureus to reduce the load on trackers, and later it was adopted by most BitTorrent clients.

Currently there is no official standard for the PEX protocol, and the documentation is not well structured. Multiple versions of PEX protocols are implemented by different clients. The best description of PEX is available from [wiki.theory.org, 2011]. In [Wu et al., 2010] an analysis of the PEX mechanism is made using PlanetLab nodes.

There are two common extension protocols, AZMP (implemented by the Azureus client) and LTEP (implemented by the uTorrent client). The common peer exchange mechanism on AZMP is AZ_PEX, and on LTEP it is ut_pex. Both types of peer exchange send messages containing a group of "added" peers and a group of "removed" peers, though not necessarily in the same list. It has been agreed between the Azureus and uTorrent developers that any clients which implement either of these peer exchange mechanisms should try to obey the limits described below when sending PEX messages:

- There should be no more than 50 added peers and 50 removed peers sent in a PEX message (so 100 in total).

- A peer exchange message should not be sent more frequently than once a minute.

Some clients may choose to enforce these limits and drop connections which do not obey them. However, there should be some tolerance for clients that send a PEX message just under a minute after the last PEX message.
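The two limits above can be sketched on the sending side as follows. This is an illustration only: the class and field names are hypothetical, peers are represented as (address, port) tuples, and the returned dictionary is not the wire format of AZ_PEX or ut_pex.

```python
from typing import Dict, List, Optional, Set, Tuple

Peer = Tuple[str, int]

MAX_ADDED = 50        # limit on added peers per message (from the text)
MAX_REMOVED = 50      # limit on removed peers per message (from the text)
MIN_INTERVAL = 60.0   # at most one PEX message per minute (from the text)


class PexSender:
    """Tracks what one neighbour has been told and rate-limits PEX messages."""

    def __init__(self) -> None:
        self.last_sent: Optional[float] = None
        self.known: Set[Peer] = set()     # peers already announced to this neighbour

    def build_message(self, current_peers: Set[Peer],
                      now: float) -> Optional[Dict[str, List[Peer]]]:
        if self.last_sent is not None and now - self.last_sent < MIN_INTERVAL:
            return None                   # too soon since the previous message
        added = sorted(current_peers - self.known)[:MAX_ADDED]
        removed = sorted(self.known - current_peers)[:MAX_REMOVED]
        if not added and not removed:
            return None                   # nothing new to report
        self.known.update(added)
        self.known.difference_update(removed)
        self.last_sent = now
        return {"added": added, "removed": removed}


sender = PexSender()
swarm = {("10.0.0.%d" % i, 6881) for i in range(70)}
first = sender.build_message(swarm, now=0.0)
print(len(first["added"]), len(first["removed"]))
```

Peers that do not fit in one message are not lost: they remain outside `known` and are sent in the next message, once the one-minute interval has passed.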

Tribler Overlay - BuddyCast

BuddyCast [Pouwelse et al., 2008a] is one of the first large-scale Internet-deployed epidemic protocols. It is the basic protocol that powers the Tribler Overlay, allowing the peers to maintain active overlay connections and to exchange overlay-specific information (Figure 2.3).

Figure 2.3: The Buddycast protocol forms the base of the Tribler Overlay stack

Epidemic protocols have three key problems that need to be solved. The first problem is peer selection. The second problem is selecting the gossip information. The third problem is deciding which information received through gossiping is to be stored permanently and which is to be dropped quickly.

As described in [Pouwelse et al., 2008a], the main features of the BuddyCast protocol are peer discovery, content discovery, and semantic overlay maintenance. Every BuddyCast message contains the top 50 content preferences of the sending peer, information about 10 other peers which have a similar taste (called taste buddies), and information about 10 random peers. The BuddyCast message has a last seen field that indicates when a particular peer was last seen online. Information about discovered content and peers, and their preferences contained in incoming messages, is stored in a local Tribler database called the MegaCache. The taste buddies form the important semantic overlay of BuddyCast.

Peers send a BuddyCast message periodically every 15 seconds, or in response to a received BuddyCast message (so in effect, BuddyCast is an epidemic exchange protocol). To prevent BuddyCast from contacting the same peers too often, a peer is contacted at most once per cycle of 4 hours. A connection candidate cache is used to store the 100 freshest discovered peers that have not yet been contacted in the current cycle. A connection is made to either the freshest peer or the most similar peer in this cache, subject to the random peer selection probability p. Currently p = 0.5.

In order to exclude unconnectable peers from BuddyCast messages, a mechanism was created to detect connectivity problems. Using a DialBackMessage, peers can detect their own connectivity and report it in outgoing messages, signaling to others not to include them in their BuddyCast messages.

In order to exclude offline peers from BuddyCast messages, BuddyCast maintains a live overlay. For this purpose, every peer maintains open TCP connections to 10 taste buddies and to 10 random peers, thus continuously verifying their online status. This means that when a peer establishes a connection to one of the (connectable) peers in a BuddyCast message it has just received, in effect it opens a connection based on firsthand information about the target peer being online. Connections in the live overlay are dropped when a peer stops being connectable, when a peer was included as a random peer for a number of times, or when a peer with a superior similarity is encountered.
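The connection candidate cache described above can be sketched as follows. This is not Tribler's implementation. It assumes that with probability p the freshest candidate is chosen and otherwise the most similar one, and it models the 4-hour cycle as a per-peer window; both are interpretations of the description. All names are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional

CYCLE_SECONDS = 4 * 60 * 60   # a peer is contacted at most once per 4-hour cycle
CACHE_SIZE = 100              # the cache keeps the 100 freshest candidates
P_RANDOM = 0.5                # random peer selection probability p from the text


@dataclass
class Candidate:
    peer_id: str
    last_seen: float          # value of the "last seen" field
    similarity: float         # taste similarity with the local peer


class CandidateCache:
    def __init__(self, rng=None) -> None:
        self.rng = rng or random.Random()
        self.candidates: Dict[str, Candidate] = {}
        self.contacted_at: Dict[str, float] = {}

    def offer(self, cand: Candidate, now: float) -> None:
        """Remember a discovered peer unless it was contacted in the current cycle."""
        last = self.contacted_at.get(cand.peer_id)
        if last is not None and now - last < CYCLE_SECONDS:
            return
        self.candidates[cand.peer_id] = cand
        if len(self.candidates) > CACHE_SIZE:       # evict the stalest entry
            stalest = min(self.candidates.values(), key=lambda c: c.last_seen)
            del self.candidates[stalest.peer_id]

    def pick(self, now: float) -> Optional[Candidate]:
        """Choose the next peer to connect to and mark it as contacted."""
        if not self.candidates:
            return None
        pool = list(self.candidates.values())
        if self.rng.random() < P_RANDOM:
            chosen = max(pool, key=lambda c: c.last_seen)    # freshest peer
        else:
            chosen = max(pool, key=lambda c: c.similarity)   # most similar peer
        del self.candidates[chosen.peer_id]
        self.contacted_at[chosen.peer_id] = now
        return chosen


cache = CandidateCache()
cache.offer(Candidate("peer-a", last_seen=100.0, similarity=0.9), now=300.0)
cache.offer(Candidate("peer-b", last_seen=200.0, similarity=0.1), now=300.0)
print(cache.pick(now=300.0).peer_id)
```

Mixing the two choices balances exploration (fresh peers, which are likely to be online) against exploitation (similar peers, which are likely to hold relevant content).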

2.2.6

Content Discovery in BitTorrent

Section 2.2.1 presented the main components of a BitTorrent system. In BitTorrent, in order to share data (a single file or a group of files), a peer must first create a small meta-file called a .torrent (a file with the .torrent extension). This file contains metadata about the files that are to be shared and the addresses of one or more trackers (central nodes in a BitTorrent network that maintain lists of available peers). Peers that want to download the shared data must first obtain a .torrent file for it, using other file-distribution mechanisms. There are two possible solutions for this problem: a centralized content distribution center, generally based on HTTP, and a decentralized mechanism that uses a Peer-to-Peer overlay.

Centralized Content Discovery

The Internet contains thousands of BitTorrent trackers, most of which provide extensive websites around them. In addition to the actual tracker server, such a website (called a BitTorrent indexer) indexes and stores .torrent files that users can download via HTTP. Most of the BitTorrent indexers offer classification mechanisms for the .torrent files they index. These mechanisms include, apart from the type of the shared content (video material, audio file, software, etc.), data about the swarm (number of seeders and leechers, as seen from the tracker), the size of the shared content or the number of times it was downloaded. The tracker-associated indexes can be classified in two categories:

- BitTorrent indexers. As mentioned above, an indexer stores the .torrent files and offers them to the users for download via HTTP.

- BitTorrent directories. As opposed to the indexers, directories do not store any .torrent file. Instead, a directory contains a collection of links to .torrent files stored on other BitTorrent indexers. Associated with each .torrent, a BitTorrent directory presents all the pieces of information that the indexer provides.

The users can perform keyword searches on the collection of an indexer or a directory and retrieve the .torrent files requested.

BitTorrent Indexer Aggregation in Vuze

Some of the BitTorrent clients integrate the content search functionality in the client user interface. The main goal is to simplify the download process and to enhance the user experience. Vuze [Vuze-Incorporated, 2011] offers a Meta Search option that aggregates search results from the main BitTorrent directories (Btjunkie, Extratorrent, Isohunt). Vuze also offers the ability to add additional sites to its search box functionality. For each torrent, the swarm size (number of leechers and seeders), swarm age and source of the torrent are displayed. While this solution does not perform queries over the Peer-to-Peer network, it integrates the two distinct operations (discovering the content and accessing the content) in a single shell.

Decentralized Torrent Collecting Overlay in Tribler

As mentioned in Section 2.2.5, the Tribler Buddycast protocol [Pouwelse et al., 2008a] plays an important part in the Tribler Overlay protocol stack. BuddyCast has grown from a simple epidemic protocol into the substrate of a complete epidemic protocol stack. The substrate selects the peers to synchronize with and the higher layer protocols do the synchronization of MegaCache differences. The Tribler Overlay includes a TorrentCollecting function that traverses the network in search of the complete metadata associated with a certain SHA1 hash. A list of recently discovered .torrent files is included in a BuddyCast message, enhancing the discovery of rare information. Due to the added dynamic round time, a connectable peer starts collecting 540 torrents per hour [Pouwelse et al., 2008a].

2.2.7

Rich Metadata Overlay in Tribler

Using the torrent collecting mechanism described in Section 2.2.6, every peer can publish .torrent files using BuddyCast. No metadata publishing mechanism is implemented in the BuddyCast protocol. The Tribler concept of Channels [Channels, 2011] offers a decentralized gossip-based communication layer that adds rich metadata to BitTorrent and enables RSS-like subscriptions in P2P.

Channels allow every user to inject metadata and extend BitTorrent swarms with additional information. Every user can associate the published content with his own channel and use the BuddyCast protocol to distribute this list. The metadata (including the content digest) of each torrent is signed by the originating peer, making the protocol more robust.

Any peer can perform a keyword-based search for channels he is interested in. The result is a list of channels matching the keyword, along with the torrents published by them. After a channel is found, the user can subscribe to it, and an RSS feed is used to deliver periodic updates to the user when the channel content changes. No server is needed for creating, spreading, or searching the contents of the channel.

2.3

Reputation in BitTorrent Communities

The BitTorrent protocol specifies a set of rules used by clients that are part of the same swarm to exchange pieces of information. This set of rules is limited to node-to-node communication and does not include solid rewarding mechanisms for the peers.


As BitTorrent is swarm-centered, a client cannot determine the general behavior of any peer it is exchanging information with. The only evaluation that a client has of its peers is based on the communication within a single swarm.

As any Peer-to-Peer system is based on each peer donating its resources to the community, the above two problems need to be solved in order for the BitTorrent ecosystem to be self-sustaining. The straightforward solution is to use a reputation mechanism that rewards users with a good behavior and allows the peers to avoid malicious nodes. Two classes of solutions have been developed for reputation mechanisms: centralized solutions, which are tightly connected to the BitTorrent indexers, and decentralized solutions, which use an overlay to share information about neighboring peers.

2.3.1

Public and Private Trackers

Section 2.2.6 presented the concept of BitTorrent indexers. Such websites are associated with a tracker and store .torrent files that users can download via HTTP. The majority of the trackers have built a community of frequent users around them. From the point of view of the access to the tracker and indexer services, two types of communities exist: public and private communities. In public communities the services are available to all clients. Private communities restrict the access to a membership base and reward users with a good behavior.

Initially only public trackers and their associated public communities existed [Meulpolder et al., 2010]. Nowadays, a large number of private communities exist, having various types of membership and management mechanisms such as sharing-ratio enforcement and injection restrictions. Some of them serve a highly elite set of heavy users, sometimes even equipped with seed boxes that are dedicated to serving content 24 hours a day. For the exclusivist private communities a personal invitation is the only way to gain access to the content; such invitations are hard to get, and even the slightest abuse leads to unconditional banishment.

The main purpose of building a private community is to enforce a proper behavior on all members of the community. A share ratio (the amount of data uploaded in all swarms divided by the amount of data downloaded in all swarms) is used for every user to determine its behavior and thus its reputation. Positive rewarding is used to encourage the users to increase their share ratio. If a user respects the community rules, he will gain access to content ratings, comments, and forums. He will also be able to subscribe to RSS feeds and access all the content available on the indexer. If a user does not contribute back to the community (has a low share ratio), the privileges he has gained will be eliminated gradually, ending with his account being suspended.
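The share ratio defined above, together with a gradual loss of privileges, can be sketched as follows. The thresholds and the names of the access levels are hypothetical and chosen only for the illustration; each private community defines its own policy. Treating a user who has downloaded nothing as having an infinite ratio is also an assumption.

```python
from typing import Dict, Tuple

# Hypothetical thresholds, not taken from any real community.
FULL_ACCESS_RATIO = 1.0
RESTRICTED_RATIO = 0.5


def share_ratio(per_swarm: Dict[str, Tuple[int, int]]) -> float:
    """Total bytes uploaded in all swarms divided by total bytes downloaded."""
    uploaded = sum(up for up, _ in per_swarm.values())
    downloaded = sum(down for _, down in per_swarm.values())
    if downloaded == 0:
        return float("inf")      # assumption: nothing downloaded, nothing owed
    return uploaded / downloaded


def access_level(ratio: float) -> str:
    """Map a share ratio to an access level, removing privileges gradually."""
    if ratio >= FULL_ACCESS_RATIO:
        return "full"            # ratings, comments, forums, RSS, all content
    if ratio >= RESTRICTED_RATIO:
        return "restricted"      # some privileges removed
    return "suspended"           # account suspended


# (uploaded, downloaded) byte counters per swarm for one user.
stats = {"swarm-1": (300, 100), "swarm-2": (100, 300)}
ratio = share_ratio(stats)
print(ratio, access_level(ratio))
```

Because the ratio is computed over all swarms, a user can compensate for heavy downloading in one swarm by seeding for longer in another.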

CHAPTER 2. PEER-TO-PEER SYSTEMS


In [Meulpolder et al., 2010] an extensive measurement study is presented, covering over half a million peers in two public and three private BitTorrent communities. The most important findings are that: (1) in private communities, almost all data is supplied by seeders, rendering the contribution and importance of BitTorrent's tit-for-tat mechanism virtually irrelevant; (2) the download speeds in private communities are 3 to 5 times higher than in public communities; (3) the seeder/leecher ratios in private communities are at least 10 times as large as those in public communities; (4) peers seed for a significantly longer duration in private communities, with more than 43% of the peers seeding longer than 1 day.

2.3.2 Decentralized Reputation Mechanisms

In decentralized reputation mechanisms, peers evaluate the reputations of the system participants and are thus able to identify good service providers. In [Delaviz et al., 2010] the two main properties of a distributed reputation mechanism are identified as its accuracy (how well a peer can approximate objective reputation values when calculating the reputation of other peers) and its coverage (the fraction of peers for which an interested peer is able to compute reputation values). Inaccurate or partial reputation evaluation may lead to misjudgment, poor behavior and, finally, system degradation.

P2P file-sharing systems are characterized by large populations and high turnover. In such a setting, two interacting participants will often have no previous experience with each other, and will be unable to estimate each other's behavior in the system. If choosing among potential interaction partners is important, this limitation is an issue. The fundamental idea behind a distributed reputation mechanism is that individual behavior does not usually change radically over time, and past activity is a good predictor of future actions. Using this idea, a reputation mechanism collects information on the past behavior of the participants in a system and quantifies this information into reputation values. In a distributed reputation mechanism, depending on how the information about peers' behavior is disseminated or how the reputation values are computed, each participant may hold different reputation values for the same participants.
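The two properties can be made concrete with a small sketch. Coverage follows the definition given above directly. For accuracy, the mean absolute error against the objective values is used here only as one possible measure; it is an assumption, not necessarily the measure used in [Delaviz et al., 2010]. All function and variable names are hypothetical.

```python
from typing import Dict, Iterable


def coverage(subjective: Dict[str, float], population: Iterable[str]) -> float:
    """Fraction of the population for which this peer holds a reputation value."""
    peers = set(population)
    if not peers:
        return 0.0
    # Iterating a dict yields its keys, so this intersects with the known peers.
    return len(peers.intersection(subjective)) / len(peers)


def mean_abs_error(subjective: Dict[str, float],
                   objective: Dict[str, float]) -> float:
    """Assumed accuracy measure: mean absolute error over peers known to both."""
    common = [peer for peer in subjective if peer in objective]
    if not common:
        raise ValueError("no peers in common")
    return sum(abs(subjective[p] - objective[p]) for p in common) / len(common)
```

A lower error means a more accurate mechanism, while a higher coverage means that fewer interaction partners remain complete strangers.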

2.3.3 BarterCast Overlay in Tribler

The BarterCast protocol [Meulpolder et al., 2009] is a lightweight Internet-deployed reputation mechanism used by Tribler to select good bartering partners and to prevent free-riding. In BarterCast, peers exchange messages about their upload and download actions, and use the collected information to calculate reputations. From the BarterCast messages it receives, each peer builds a local weighted, directed graph with nodes representing


peers and with edges representing amounts of transferred data. This subjective graph is then used by each peer to calculate the reputation values of other peers by applying the maxflow algorithm to the graph, interpreting the edge weights as flows.

In [Meulpolder et al., 2009] it is noted that in real-life social networks a person has a sense of reputation regarding another person based on two factors: (1) the direct experience, and (2) the information about this person obtained from other people. The information obtained from a particular source is constrained by the reputation of the source. As a result, each person has a personal web of trust in which reputation is subjective and based on incomplete information. To a certain extent, this approach is vulnerable to abuse (e.g., by lying), and can lead to incorrect decisions.

BarterCast applies the same reputation concept to BitTorrent. Following the analogy, direct experience is regarded as the aggregated amount of service a peer has received from another peer in the past. The network can then be represented as a graph where the peers are nodes, and directed edges represent the aggregated amount of service. Peers exchange information with others about their direct experience such that each peer can build its own, local representation of this graph. In order to identify freeriders, peers have to evaluate the upload and download behavior of other peers. The quantitative measure of performed service is therefore the total number of bytes transferred from one peer to another peer in the graph.

The BitTorrent protocol maintains a limited number of simultaneous upload slots (4-7 depending on the implementation). Peers that do not yet have a complete copy of the file (leechers) assign their slots to those peers that currently provide the highest upload rate in return. Peers that have the complete file (seeders) assign their upload slots to those peers that have the highest download rate. Peers that get a slot are called unchoked, while the other peers are choked. There is one extra slot for optimistic unchoking. This slot is assigned via a 30-second round-robin shift over all the interested peers regardless of their upload rate. The protocol therefore creates a tit-for-tat data exchange based on the short-term behavior of a peer (i.e., the bandwidth it provides in return). Due to optimistic unchoking, new peers have a chance to obtain their first pieces of data and bootstrap the process.

When a reputation system such as BarterCast [Meulpolder et al., 2009] is available, a variety of policies can be designed that use the reputation information in the BitTorrent protocol. Two policies have been proposed that reduce the system-wide impact of lazy freeriders. The policies are:

- Peers assign optimistic unchoke slots to the interested peers in order of their reputation (rank policy). A peer cannot get an upload slot while peers with a higher reputation are also interested and not yet served.
- Peers do not assign any upload slots to peers that have a reputation below a certain threshold (ban policy).

The ban policy is superior, providing a much stronger disincentive to be a freerider. Many other policies can be proposed that make a more sophisticated use of the long-term reputation provided by BarterCast.
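The reputation computation and the two policies described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Tribler implementation: the maximum flow is computed with a plain Edmonds-Karp search over the whole subjective graph, the arctan scaling of the flow difference to the interval (-1, 1) is an assumed normalization, and the function names and the ban threshold are hypothetical.

```python
import math
from collections import deque
from typing import Dict, List

# graph[u][v] = amount of data uploaded from peer u to peer v (any fixed unit)
Graph = Dict[str, Dict[str, float]]


def max_flow(graph: Graph, source: str, sink: str) -> float:
    """Edmonds-Karp maximum flow, computed on a residual copy of the graph."""
    if source == sink:
        return 0.0
    residual: Dict[str, Dict[str, float]] = {}
    for u, edges in graph.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})[v] = residual.get(u, {}).get(v, 0.0) + cap
            residual.setdefault(v, {}).setdefault(u, 0.0)  # reverse edge
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual.get(u, {}).items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity on the path, then push flow along it.
        bottleneck = math.inf
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def reputation(graph: Graph, evaluator: str, subject: str) -> float:
    """Subjective reputation of `subject` as seen by `evaluator`, in (-1, 1)."""
    received = max_flow(graph, subject, evaluator)  # service flowing to evaluator
    given = max_flow(graph, evaluator, subject)     # service flowing to subject
    return math.atan(received - given) / (math.pi / 2)  # assumed scaling


def rank_policy(graph: Graph, me: str, interested: List[str]) -> List[str]:
    """Order interested peers by decreasing reputation."""
    return sorted(interested, key=lambda p: reputation(graph, me, p), reverse=True)


def ban_policy(graph: Graph, me: str, interested: List[str],
               threshold: float = -0.5) -> List[str]:
    """Keep only peers whose reputation is not below the threshold."""
    return [p for p in interested if reputation(graph, me, p) >= threshold]
```

Because each peer builds its own graph from the messages it has received, two peers running this computation may obtain different reputation values for the same third peer.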


2.4 UDP Based P2P Protocols

In recent years a new trend has emerged in P2P research. One of the limitations of current P2P protocols (including BitTorrent) is the TCP congestion that occurs on the participating nodes. As P2P protocols are generally aggressive with regard to the generated traffic, mechanisms have been developed to prevent TCP from applying expensive congestion control mechanisms. One such mechanism is LEDBAT [Shalunov, 2011], proposed as an IETF Internet Draft. Despite the potential of such congestion-prevention protocols, their adoption rate was low. The research focus moved towards replacing the entire TCP stack with the lightweight UDP alternative, and two protocols have been proposed: Swift and uTP.

Swift [Grishchenko and Bakker, 2010] is a generic multiparty transport protocol. Its goal is to disseminate content among a swarm of peers, and it can be thought of as a BitTorrent-like mechanism implemented at the transport layer. In Swift, the clients consuming the content participate in the dissemination by forwarding the content to other clients via a mesh-like structure. It is a generic protocol which can run directly on top of UDP, TCP, HTTP or as an RTP profile. Swift is developed by the P2P-Next consortium [P2P-Next, 2011].

In 2009 BitTorrent Inc. proposed a lightweight alternative to the BitTorrent protocol, called uTP [BitTorrent, 2011]. It is a redesign of the BitTorrent protocol using UDP at the transport layer. Even though the uTP protocol is incompatible with the original BitTorrent protocol, it was proposed as a BitTorrent protocol extension [BitTorrent, 2009]. uTP and Swift are not compatible.

2.5 Commercially Deployed Architectures

P2P architectures were adopted by the industry to support large-scale distributed applications. Two such applications, Skype and Spotify, are examples of VoIP and file-sharing services implemented using a P2P design.

2.5.1 Skype

In recent years VoIP telephony has gained extraordinary popularity, with an increasing number of operators offering VoIP-based phone services. Skype is the most remarkable example of this new phenomenon: developed in 2002 by the creators of KaZaa, it recently reached over 170 million users, and it accounts for more than 4.4% of total VoIP traffic [Rossi et al., 2009]. Skype [Skype, 2011] is a software application that allows users to make voice calls over the Internet. Calls within the Skype network are free, while calls to both traditional landline telephones and mobile phones are charged. Unlike other VoIP services, Skype does not run only on servers, but makes use of background processing on computers


running Skype software, creating a P2P infrastructure to exchange signaling information in a distributed fashion, with the dual benefit of making the system both highly scalable and robust. However, Skype uses a proprietary design and has adopted cryptographic mechanisms to discourage traffic decoding and reverse-engineering.

In [Rossi et al., 2009] it was noticed that Skype performs peer discovery and refresh using a large number of single-packet probes. At the same time, the bulk of the signaling traffic is carried by a relatively small number of longer flows, exchanged with more stable contacts. In [Rossi et al., 2009] it was also noted that the main difference between most VoIP services and Skype is that the latter operates on a P2P model, except for user authentication, which is performed under a classical client-server architecture by means of public key mechanisms. After the user (and the client) has been authenticated, all further signaling is performed on the P2P network, so that Skype's user information (e.g., contact lists, status, and preferences) is entirely decentralized and distributed among nodes. This allows the service to scale very readily, thereby avoiding a centralized (and expensive) infrastructure.

Peers in the P2P architecture can be either normal nodes or SuperNodes. The latter are selected among peers with large computational power and good connectivity (considering bandwidth, uptime and absence of firewalls). They take part in a decentralized information distribution system that is based on a DHT.

2.5.2 Spotify

Spotify [Kreitz and Niemela, 2010] is an online music streaming service that offers access to a large library of music tracks. The streaming is performed using a combination of client-server access and a peer-to-peer protocol. The service currently has a user base of more than 7 million users and has been available in six European countries since 2008. Protocol measurements indicate that the combination of the client-server and peer-to-peer paradigms can be applied to music streaming with good results: only 8.8% of the streamed data comes from Spotify's servers, while the median playback latency is only 265 ms (including cached tracks). The architecture has proven to be robust and scalable as the service's popularity increases.

2.6 Social Components in P2P Systems

Having a mechanism that transfers data from point A to point B is, in most cases, only one of the two important elements that need to be taken into consideration during the application design. The second element is the user that will decide where to use the


mechanism. The user behavior within the associated online community has a large influence on the performance of the protocols.

2.7 Peer-to-Peer Applications

In this thesis, the term user refers to a human person and client refers to the application instance running on behalf of the user. Largely, there is a one-to-one correspondence between users and specific application instances in peer-to-peer systems. In general terms, Peer-to-Peer covers a wide range of systems, from point-to-point connectivity to live application multicast, large data transfers in swarms and distributed data storage (DHT). Attempts have been made to create network file systems on top of peer-to-peer infrastructures [Bardac et al., 2009b]. The main service implemented over P2P is data transfer, better known as file-sharing.

2.7.1 File Sharing in Peer-to-Peer Systems

One of the most common applications in P2P is file-sharing. In a typical P2P file-sharing application, a user has a number of files to share with others. Using the local client, all the shared files are registered and later classified by title, date, format, or size. At a later time, other Internet users can search all the available files using specific queries sent to a central server or directly to other peers in the network. Users can receive multiple answers to their query and can select the files they want to retrieve. The files can then be downloaded to the local machine.

In 2010, Twitter Inc. announced [Gadea, 2011] the development of an extremely efficient update deployment system for their servers. Turning a 40-minute deployment process into one that lasts just 12 seconds, the new solution, called Murder, is a combination of scripts written in Python and Ruby that make use of the advantages of the BitTorrent protocol. Murder proves the high scalability potential of Peer-to-Peer file-sharing systems in environments different from the ones they were initially designed for.

2.7.2 File Sharing Service Characteristics

Peer-to-Peer file sharing is responsible for a large volume of the current Internet traffic, being the most commonly implemented P2P service. The peer-to-peer design is based on the concept of peers voluntarily providing resources as well as consuming them. As a result, the system must dynamically adapt to maintain service continuity as client peers join and leave the network. Unlike the WWW, where user interest is maintained by document changes, P2P clients have a fetch-at-most-once behavior, their interest being maintained by the


addition of new objects. Object immutability in Peer-to-Peer networks (causing the fetch-at-most-once behavior and having an impact on popularity dynamics) requires incentive mechanisms to prevent free-riding, keep the users connected to the system and increase their attachment to the services. The fetch-at-most-once behavior causes a decrease in hit rate over time and limits the life of the swarms. This phenomenon is called attrition or population turn-over (the population size declining at a more gradual rate than requested bytes). The amount of time a swarm is kept alive is proportional to the number of peers participating in the swarm and to the interest in the distributed content.

Figure 2.4 presents the population turn-over in a real-life scenario taken from the Distro Torrent Experiment¹. This experiment aims at creating a BitTorrent-based distribution infrastructure for Linux CD/DVD distribution images. As an incentive for user participation, real-time graphs showing the swarm status are available on the project's website. The analysis backend is presented in [Bardac et al., 2009a].

Figure 2.4: Seeders and leechers in a swarm used to share a Fedora 11 DVD image, from torrent.cs.pub.ro. In six weeks' time the user interest dropped significantly.

Previous studies [Saroiu et al., 2002a, Adar and Huberman, 2000] have shown that in a Peer-to-Peer system users generally become "greedy", consuming data from the system but providing few resources in return. Also, in [Saroiu et al., 2002b] it is shown that Peer-to-Peer users have poor availability. There is no single metric that could accurately reflect the exact availability in a Peer-to-Peer environment, since any individual peer might be part of the system only for a limited fraction of the traced time period.

One of the problems that affects the usability of P2P file-sharing applications is the problem of free-riders. A free-rider is a peer that uses the file-sharing application to access content from others but does not contribute content to the same degree to the community of peers [Buford et al., 2008]. There are multiple techniques for solving this problem, most of them offering incentives or monitoring user activity.

Another problem specific to P2P file-sharing is peer churn. As opposed to the population turn-over, where the users gradually leave the system and never return, the
¹ http://torrent.cs.pub.ro/


churn represents the periodic join and leave actions that a user takes while still being interested in the content. A peer's content can only be accessed by other peers while it is online. When a peer goes offline, it takes time for other peers to be alerted of the status change. Meanwhile, content queries may go unanswered and time out [Buford et al., 2008].

2.8 Conclusions

The current chapter presented an analysis of the evolution and current state of Peer-to-Peer applications. The main concepts associated with the BitTorrent protocol were introduced and three BitTorrent-connected areas were presented, classifying the overlay protocols into peer discovery, content discovery and reputation mechanisms. The Tribler project was presented as a deployed P2P client used in academic research as a playground for designing, implementing and testing decentralized P2P technologies. Tribler aims at offering a serverless, completely self-organized technology for delivering Video on Demand to end-users.

In the peer discovery area, DHT, PEX and BuddyCast were presented. The content discovery area covered the concept of a BitTorrent indexer, as well as the Vuze and Tribler solutions for content search integrated in the user interface. Public and private trackers were presented in the context of reputation, and BarterCast was analyzed as a decentralized reputation mechanism.

The large spectrum of P2P-related areas has gained the interest of researchers in the last decade. Many of the elements introduced in this chapter form the basis on top of which the thesis contributions are built.

Chapter 3 A New Model for Overlay Behavior in Peer-to-Peer Systems


P2P protocols consist of a limited set of messages transmitted between peers. The transmission of the messages creates an overlay interaction that shapes the overall design of the system architecture. The sum of all interactions between system nodes can be used to define the behavior of the overlay and, furthermore, the behavior of the system. This chapter presents a new model for evaluating the overlay characteristics using a generic approach. The presented model can be applied, with small changes, to any P2P system that involves sharing data between nodes grouped in swarms, and it can be adapted to emphasize the particularities of the studied system.

3.1 Modeling Peer-to-Peer Systems

Ever since Peer-to-Peer systems received the widespread attention of the research community, one of the main research goals has been to create a model of the system that would allow the reproduction of system behavior with respect to a specific set of input parameters. The models proposed in the literature can be classified into three distinct groups: node-centric models, connection-centric models and system-centric models.

The first category (node-centric models) is based only on a description of the actions performed by each node or on a description of the node properties. Such models take into consideration the incoming or outgoing connections, the number of overlay neighbors, the bandwidth allocation for each of the neighbors, and so forth.

The second category of models (connection-centric) presents the system from the point of view of the connections (or interactions) between nodes. Different types of interaction are taken into consideration, each of them being associated with two nodes. The


CHAPTER 3. A NEW MODEL FOR OVERLAY BEHAVIOR IN P2P SYSTEMS


connections have an associated bandwidth, can be used for one-way or two-way communication, can be short-lived or long-lived, etc.

The third category of models, the system-centric models, uses elements from the previous categories and tries to present the global system behavior. The main goal of these models is to give an overview of the system evolution rather than of the specific node or connection behavior. The model presented in this chapter is part of this category.

3.2 A Proposed Model for Evaluating Overlay Behavior in P2P Systems

This section presents a proposed system-centric model [Milescu et al., 2010] used to evaluate the state and evolution of BitTorrent swarms from a global point of view. The model design is generic, allowing it to be applied to any P2P system with a minimal set of changes.

Distributed systems, and in particular Peer-to-Peer systems, have a complex behavior. There are many statistical analyses that capture independent properties of the workloads. Yet the overall system view is difficult to estimate from analyzing only one of the swarm properties. A single metric that would evaluate the quality of the services¹ offered by the Peer-to-Peer system (and in particular by a file-sharing system) could be used to:

- Evaluate the overall system performance before and after making various optimizations in the P2P protocols or in the system design
- Evaluate the Quality of Experience that users will have after using the file-sharing services
- Follow the overall system's evolution during a time frame or during a specific scenario
- Take informed decisions to update the system in real time in order to increase the quality of its services

One possible scenario that uses an overall system performance metric is the following. Let us assume we have a content distribution service based on BitTorrent. The system is planned to have 5 initial seeders, enough for the estimated flash-crowd resource requirements. One day after the service is launched, in the post-flash-crowd period, the number of users that start using the service increases rapidly. For various reasons the service performance degrades (because there are not enough seeders, because the average seeder bandwidth is low, or because the seeders only accept a small number of incoming connections). If the overall performance metric shows the
¹ The quality of the services must not be interpreted as QoS. It is an evaluation of the status and performance of the services contracted by the P2P system, not a commitment made by the system.


degradation in quality, then an extra number of seeders could be put in place in order to offer the required resources. The system behavior and the quality of its services can be captured by three major metrics: performance, reliability and efficiency. The proposed model has three components, one for each of the three major metrics.

3.2.1 Introduction

The concept of Quality of Service (QoS) is mainly related to computer networking and packet-switched telecommunication networks, and it refers to a series of mechanisms for resource reservation rather than to the achieved service quality. Generally the resources are reserved for an end-to-end delivery of packets. QoS offers the ability to assign different priorities to different protocols, users, or data flows, and to guarantee a certain level of performance for a data flow. For example, one can guarantee a bit rate, jitter, delay, packet drop rate or bit error rate. QoS becomes important for applications that require a fixed bit rate or are delay sensitive, such as real-time multimedia streaming, voice over IP, online gaming and IP-TV. If the network capacity is sufficient and there is no network congestion, QoS mechanisms are not required.

A network protocol designed to support QoS makes a traffic contract with the software application and reserves capacity in the network nodes. During the data transfer session it monitors the achieved performance level (based on the contracted parameters) and dynamically controls scheduling priorities in the network nodes. Reserved capacity may be released during a tear-down phase. Because of the way it is implemented, QoS is often perceived only as a high level of performance, for example high bit rate or low latency.

As opposed to QoS, a best-effort protocol does not support traffic shaping. It is an alternative to the complex QoS control and update mechanisms and provides high-quality communication by over-provisioning the network capacity so that it covers the expected peak traffic load. Today's Internet is based on a best-effort strategy.

Implementing complete QoS at the application level in the network stack is not possible without support from the network protocols. Service guarantees could be realized at the application level if the network infrastructure of the underlying layers does not change, allowing the application to estimate the available resources. The limitations presented above prevent P2P systems from implementing QoS on a wide scale.

User Expectations

P2P applications transformed the users' experience of using content and communication services from the Internet. Web-based applications, like P2P applications, offer


free services to large numbers of end users. On the other hand, P2P offers a new self-scaling architecture, where increased participation increases the capacity of the system, making the service more "personal" and allowing the users to make their own contribution.

Quality of Experience (QoE) is a subjective measurement of a user's experience with a service. QoE analyzes a service from the point of view of a customer, asking what combination of characteristics would offer the perception that the service works as expected. The user then compares this view with the services actually provided. QoE is related to QoS, but differs from the latter, which attempts to measure the services using an objective method. A service supplier may respect the terms of the contract it has with its clients (offering a high QoS), but the users may not be satisfied (having a low QoE). On the other hand, the users can be satisfied with a service (having a high QoE) while the company is not respecting its contract (having a low QoS).

In [Gummadi et al., 2003] the Kazaa workload was analyzed based on a real-life measurement taken over several months. The results show that Kazaa users are much more patient than Web users, waiting hours or even days for the content to be downloaded.

3.2.2 Parameters Influencing the Swarm Behavior

The overall swarm behavior is influenced by a large number of parameters. Some of the parameters have a small influence on the evolution of the swarm (a variance of these parameters has limited effects on the swarm behavior), while others can cause abrupt changes in the swarm evolution. This section presents the parameters that influence the three components of the proposed model.

Parameters Influencing Performance

The system's performance is the most important evaluation metric used to determine the quality of its services. It measures the end-point characteristics that users will experience (and, from this point of view, it measures how close a system is to its functionality goal). In file-sharing, performance translates mainly, but not only, into throughput. From the user's point of view, it is the answer to the question "How long does it take to get the file I need?". Although the service performance depends on the resources a client has, the overall metric captures the system's health.

In [Gummadi et al., 2003] it was noted that highly available peers are both necessary and sufficient to obtain high transfer rates (and thus high performance). Also, the impact on hit rate depends on which hosts are made more available. Adding an extra hour of availability to the most available host pays a higher hit rate dividend than adding that hour to the least available host. This is because the most available hosts also have more available resources.


Having the same highly available peers for the whole duration of a swarm's life is not feasible in practice (considering most P2P file-sharing systems). The dynamics of a swarm allow it to offer high performance both with a small number of peers, each offering large bandwidth, and with a large number of peers, each having a small bandwidth to offer to the system. In order to obtain the same performance, in the latter case a client has to open a significantly larger number of connections. Performance does not only mean the number of seeders and leechers. It includes:

- number of seeders compared to the number of leechers (more is better)
- number of connections opened by the client (lower is better)
- connection download throughput (more is better)
- time required to download the content (lower is better)

Considering that the BitTorrent file-sharing architecture is content-centric, creating separate groups of users for each available file or set of files, the above-mentioned metrics are easy to associate with a specific swarm.

Parameters Influencing Reliability

In engineering, reliability may be defined in multiple ways, meaning in general the capacity of a system to perform according to its design, to resist failures and to maintain its parameters over a defined period of time. In file-sharing, reliability translates into the capacity of a system to maintain its performance (mainly throughput) in a variety of conditions. For Peer-to-Peer systems, where the peer dynamics are large, reliability offers an image of the stability of the system. From the user's point of view, reliability is the answer to the question "How long can this performance be sustained?".

The nodes in a P2P system may be unreliable (systems that are not professionally managed and that may crash or fail at any time [Kubiatowicz, 2003]). Since failure rate grows linearly with system size, large P2P systems are almost guaranteed to have malfunctioning components. This does not prevent them from implementing reliable services, for example a reliable DHT (presented in [Rieche et al., 2004]). Reliability includes the following elements:

- number of peers that join and leave the swarm (lower rate is better)
- variations of transfer speed during a period of time (lower is better)
- length of a BitTorrent session (longer is better)
- size of the swarm (more is better)

Parameters Influencing Efficiency

Efficiency is a metric useful to the company or entity offering a service. It compares the available resources with the resources required to maintain the performance of the services. In Peer-to-Peer file-sharing systems, initial resources are duplicated on the nodes participating in the swarm. Because of the automatic data replication, the overall available resources might easily exceed the required resources, keeping useless nodes in the network. As P2P systems are dynamic, it is difficult to estimate the required resources for a definite amount of time in the future. However, a capture of the current efficiency will provide feedback for future scenarios. Efficiency includes the following elements:

- overall activity fraction of a client's life (more is better)
- number of seeders and leechers in the swarm (more is better)
- number of BitTorrent connections per peer (more is better)

3.2.3 Parameter Collecting Approach

In order to assess the quality of the services offered by a BitTorrent Peer-to-Peer system, some measurements need to be taken to collect the required data. As detailed in the previous section, the tracker has a detailed (yet possible incomplete) image of the swarm. Some of the required measurements can be taken at the tracker level (like for example the number of peers that join the swarm in a given period of time). Other measurements have to be taken at the client level (like for example the average number of connections a client uses). The chapters 3.2.2, 3.2.2 and 3.2.2 presented a set of metrics that can evaluate the quality of the services offered by a P2P system. These metrics have to be expressed based on measured swarm parameters. The following parameters are sufcient for the required purpose, and will be detailed in the following sections of this chapter:

- peer join rate
- peer leave rate
- number of peers
- number of seeders and leechers
- download speed
- upload speed
- download time
- upload time
- number of connections per peer
- activity fraction
- average session length

Some of the metrics above are connected to each other. Exploring these connections gives an overview of the swarm health. For example, the download time depends on the download speed, the average upload speed and the number of seeders.

3.2.4 Details on Collected Parameters

Peer join rate (JR, [1/s]) is the number of peers that join the swarm in a given time frame. The join rate shows the interest that the content has for new users and must be analyzed together with the leave rate. A high join rate means that the swarm is increasing in size, making it more robust. This value is measured at the tracker.

Peer leave rate (LR, [1/s]) is the number of peers that leave the swarm in a given time frame. As mentioned above, this value must be analyzed together with the join rate. A high leave rate is a sign of low interest in the shared content and of possible stagnation in the swarm's life in the near future. A high leave rate also indicates that the incentive mechanisms are not efficient. If the number of peers participating in the swarm decreases, so does the reliability of the offered services. This value is measured at the tracker.

Number of peers (NP, dimensionless quantity) is the total number of clients registered to the tracker. By itself it only shows the size of the swarm, and it is used as a reference value for other metrics. Generally, a large swarm means more stability, but the analysis must take into account the number of seeders and leechers. This value is an average for a given time frame and is measured at the tracker.

Number of seeders and leechers (NS, NL, dimensionless quantity) is one of the most important pieces of information. A well-seeded torrent (having a seeder/leecher ratio greater than 1) offers quality services and is robust. A small fraction of seeders is the sign of a flash crowd or of poor resource availability. This value is an average for a given time frame and is measured at the tracker.

Download speed (DS, [KB/s]) is an average value for the whole swarm that can be measured at the tracker or at the client. It is only valid for clients that download data. The tracker receives state updates from each client and can calculate the download speed by comparing consecutive update messages from a specific client. A client can also estimate its own download speed, and this value is more accurate.

Upload speed (US, [KB/s]) is also an average value for the whole swarm, and has the same specifications as the download speed.


Download time (DT, [s]) is the average value per swarm of the download time for a leecher, defined as the difference between the start time of the first transaction and the end time of the last transaction. This value is measured at the tracker or at the client. Long download times for small files show a low quality service, while short download times are a sign of swarm health.

Upload time (UT, [s]) is an average value per swarm, measured for a seeder, between the moment the seeder starts the download and the moment it has a share ratio of 1 (the number of uploaded bytes is equal to the number of downloaded bytes). This value can be measured at the tracker or at the client.

Number of connections per peer (NC, dimensionless quantity) is an average value per swarm, for a given time frame. The number of connections can only be measured at the client and has to be analyzed together with the number of peers. A large value for this metric shows that the swarm is getting close to a full mesh, while a low number shows that the swarm has enough resources to satisfy client needs.

Activity fraction (AF, dimensionless quantity) measures the fraction of time a client is transferring content over the client's lifetime or over the duration of a given time frame. It is an average value per swarm. The activity fraction shows whether the client contributed to the overall resources available in the system. A low value for most of the clients shows that the system has more resources than it needs. The activity fraction can be measured at the tracker or at the client; the client measurement is more accurate.

Average session length (SL, [s]), where a session is defined as a continuous period of time during which a client has one or more active transactions, can be measured only at the client. It is an average value per swarm and provides an estimation of the reliability of the underlying network and of the quality of the selected peers.
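As a rough illustration of the tracker-side download speed estimate described above, the sketch below derives a per-client speed from consecutive state updates and averages it over the clients that downloaded data. The `Announce` record and its field names are hypothetical and are not taken from any particular tracker; a real tracker log would have to be mapped onto them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Announce:
    """One state update received by the tracker (hypothetical record layout)."""
    peer_id: str
    timestamp: float      # seconds since the start of the observation
    downloaded_kb: float  # total KB downloaded so far, as reported by the client


def download_speed(prev: Announce, curr: Announce) -> float:
    """Download speed in KB/s between two updates of the same client."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("updates must be in chronological order")
    return (curr.downloaded_kb - prev.downloaded_kb) / elapsed


def swarm_download_speed(log: List[Announce]) -> float:
    """Average DS over all clients that actually downloaded data."""
    by_peer: Dict[str, List[Announce]] = {}
    for entry in sorted(log, key=lambda a: a.timestamp):
        by_peer.setdefault(entry.peer_id, []).append(entry)

    speeds = []
    for updates in by_peer.values():
        if len(updates) < 2:
            continue  # a single update gives no speed information
        first, last = updates[0], updates[-1]
        if last.downloaded_kb > first.downloaded_kb:  # skip pure seeders
            speeds.append(download_speed(first, last))
    return sum(speeds) / len(speeds) if speeds else 0.0
```

Because the tracker only sees periodic updates, this estimate is coarser than a client-side measurement, which matches the remark that the client value is the more accurate one.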

3.2.5 Expressing Swarm Characteristics by Parameters

As stated, some of the metrics above are connected to each other. For example, the download time depends on the torrent size, upload speed, download speed, number of peers and number of seeders:

- when the US increases, the BitTorrent tit-for-tat mechanism is designed in such a way that it increases the download speed, so DT should decrease
- when the DS increases, DT should decrease
- when the percentage of seeders in the swarm increases, the download speed should increase (as there are multiple sources to get data from), so DT should decrease

Considering the above observations, DT can be qualitatively approximated by:

\[
DT \approx \frac{Size}{US} + \frac{Size}{DS} \cdot \frac{NP}{NS} \quad [\mathrm{s}] \tag{3.1}
\]

where Size is the torrent data size.

Formula 3.1 can be rewritten as:

\[
DT = c_{DT} \left( \frac{Size}{US} + \frac{Size}{DS} \cdot \frac{NP}{NS} \right) \quad [\mathrm{s}] \tag{3.2}
\]

where the coefficient cDT is dimensionless, such that the overall unit of measurement of DT is seconds.
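The qualitative behavior of the download time approximation can be checked with a small sketch. It assumes that formula 3.2 reads DT = cDT (Size/US + Size/DS · NP/NS); the function name, the argument names and the default cDT = 1 are illustrative choices, not part of the model.

```python
def estimate_download_time(size_kb: float, us: float, ds: float,
                           np_: float, ns: float, c_dt: float = 1.0) -> float:
    """Qualitative DT estimate in seconds (assumed reading of formula 3.2).

    size_kb : torrent data size in KB
    us, ds  : average upload / download speed in KB/s
    np_, ns : number of peers / number of seeders
    c_dt    : dimensionless scaling coefficient (illustrative default)
    """
    if min(us, ds, ns) <= 0:
        raise ValueError("US, DS and NS must be positive")
    return c_dt * (size_kb / us + (size_kb / ds) * (np_ / ns))
```

With this reading, raising US, raising DS or raising the share of seeders (NS/NP) each lowers the estimate, which is the behavior required by the three observations listed above.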

3.2.6 Modeling Overlay Performance

The performance of a P2P system (as discussed in Section 3.2.2) is dependent on a number of factors:

- when the percentage of seeders increases, the performance of the system should increase as well
- when the percentage of leechers increases, the performance should decrease
- when the NC increases, the performance should increase
- when the DS increases, the performance should increase
- when the DT increases, the performance should decrease

Therefore the performance could be estimated based on the formula:

\[
P = c_P \left( c_{P1} \frac{NS}{NP} + c_{P2} \frac{1}{1 + \frac{NL}{NP}} + c_{P3} \frac{NC}{NP} + c_{P4} \frac{1}{1 + \frac{1}{DS}} + c_{P5} \frac{1}{1 + DT} \right) \tag{3.3}
\]

where cP = 1 / (cP1 + cP2 + cP3 + cP4 + cP5).

The coefficients cP1, cP2, cP3, cP4 and cP5 have a measurement unit that makes the whole fraction dimensionless. After reordering, the formula leads to

\[
P = c_P \left( c_{P1} \frac{NS}{NP} + c_{P2} \frac{NP}{NP + NL} + c_{P3} \frac{NC}{NP} + c_{P4} \frac{DS}{DS + 1} + c_{P5} \frac{1}{DT + 1} \right) \tag{3.4}
\]

where cP = 1 / (cP1 + cP2 + cP3 + cP4 + cP5).

The performance has values in the interval [0, 1), where 0 means bad performance and values close to 1 mean good performance. The cP1, cP2, cP3, cP4 and cP5 coefficients allow the performance model to be adjusted to the particularities of the P2P architecture by placing weights on the different components. If cP = 0.2 and cP1 = cP2 = cP3 = cP4 = cP5 = 1, then the components of the performance model are all equally weighted.
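The reordered performance formula (3.4) translates directly into code. The sketch below is a minimal transcription of it; the function and argument names are illustrative, and the weights default to the equal-weight case described above.

```python
from typing import Sequence


def performance(ns: float, nl: float, np_: float, nc: float,
                ds: float, dt: float,
                weights: Sequence[float] = (1, 1, 1, 1, 1)) -> float:
    """Performance P following formula 3.4.

    ns, nl, np_ : number of seeders, leechers and peers
    nc          : average number of connections per peer
    ds          : average download speed [KB/s]
    dt          : average download time [s]
    weights     : cP1..cP5; cP is derived as 1 / sum(weights)
    """
    c1, c2, c3, c4, c5 = weights
    c_p = 1.0 / sum(weights)
    return c_p * (
        c1 * ns / np_            # share of seeders
        + c2 * np_ / (np_ + nl)  # decreases as leechers dominate
        + c3 * nc / np_          # connections relative to swarm size
        + c4 * ds / (ds + 1)     # saturating download speed term
        + c5 * 1 / (dt + 1)      # shorter downloads score higher
    )
```

With equal weights and the Run 1 values reported in Table 3.2 of Section 3.3.2 (NS = 1, NL = 29, NP = 30, NC = 2.13, DS = 19.6, DT = 170.21), the function evaluates to about 0.314, in line with the P value listed there.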

3.2.7 Modeling Overlay Reliability

Section 3.2.2 discussed the reliability of a system as being dependent on a number of factors:

- when the JR increases, reliability should increase as well
- when the LR increases, the swarm population decreases, so the reliability should also decrease
- when the DS variation (ΔDS) increases, the system becomes unreliable
- when the US variation (ΔUS) increases, the system becomes unreliable
- when the SL decreases, the system's reliability decreases
- when the NP increases, the system's reliability increases as well

The reliability can be approximated (from a qualitative point of view) as:

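\[
R = c_R \left( c_{R1} \frac{1}{1 + \frac{LR}{JR}} + c_{R2} \frac{1}{1 + \Delta DS} + c_{R3} \frac{1}{1 + \Delta US} + c_{R4} \frac{1}{1 + \frac{1}{SL}} + c_{R5} \frac{1}{1 + \frac{1}{NP}} \right) \tag{3.5}
\]

where cR = 1 / (cR1 + cR2 + cR3 + cR4 + cR5).

The coefficients cR1, cR2, cR3, cR4 and cR5 have a measurement unit that makes the whole fraction dimensionless. After reordering, the formula leads to

\[
R = c_R \left( c_{R1} \frac{JR}{JR + LR} + c_{R2} \frac{1}{\Delta DS + 1} + c_{R3} \frac{1}{\Delta US + 1} + c_{R4} \frac{SL}{SL + 1} + c_{R5} \frac{NP}{NP + 1} \right) \tag{3.6}
\]

where cR = 1 / (cR1 + cR2 + cR3 + cR4 + cR5).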

The reliability has values in the interval [0, 1), where 0 means bad reliability and values close to 1 mean good reliability. The cR1, cR2, cR3, cR4 and cR5 coefficients allow the reliability model to be adjusted to the particularities of the P2P architecture by placing weights on the different components. If cR = 0.2 and cR1 = cR2 = cR3 = cR4 = cR5 = 1, then the components of the reliability model are all equally weighted.
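A minimal sketch of the reordered reliability formula (3.6) is given below. The names are illustrative. The formula leaves the first term undefined when no peer joins or leaves (JR = LR = 0); treating it as 0 in that case is an assumption of this sketch, not something stated by the model.

```python
from typing import Sequence


def reliability(jr: float, lr: float, delta_ds: float, delta_us: float,
                sl: float, np_: float,
                weights: Sequence[float] = (1, 1, 1, 1, 1)) -> float:
    """Reliability R following formula 3.6.

    jr, lr             : peer join / leave rate [1/s]
    delta_ds, delta_us : variation of download / upload speed
    sl                 : average session length
    np_                : number of peers
    weights            : cR1..cR5; cR is derived as 1 / sum(weights)
    """
    c1, c2, c3, c4, c5 = weights
    c_r = 1.0 / sum(weights)
    # Assumption: with no churn at all the JR/(JR+LR) term is taken as 0.
    churn_term = jr / (jr + lr) if (jr + lr) > 0 else 0.0
    return c_r * (
        c1 * churn_term
        + c2 / (delta_ds + 1)
        + c3 / (delta_us + 1)
        + c4 * sl / (sl + 1)
        + c5 * np_ / (np_ + 1)
    )
```

The sketch reproduces the listed dependencies: a higher leave rate or larger speed variations lower R, while longer sessions and larger swarms raise it.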

3.2.8 Modeling Overlay Efficiency

Section 3.2.2 discussed the efficiency of the system and presented the following dependencies:

- when the AF increases, so does the efficiency
- when the NS/NL ratio increases, the efficiency of the system decreases
- when the number of connections, compared to the number of peers, increases, so does the efficiency

The efficiency can be approximated (from a qualitative point of view) with the formula:

\[
E = c_E \left( c_{E1} \frac{1}{1 + \frac{1}{AF}} + c_{E2} \frac{1}{1 + \frac{NS}{NL}} + c_{E3} \frac{1}{1 + \frac{NP}{NC}} \right) \tag{3.7}
\]

where cE = 1 / (cE1 + cE2 + cE3).

The coefficients cE1, cE2 and cE3 have a measurement unit that makes the whole fraction dimensionless. After reordering, this leads to

\[
E = c_E \left( c_{E1} \frac{AF}{AF + 1} + c_{E2} \frac{NL}{NL + NS} + c_{E3} \frac{NC}{NC + NP} \right) \tag{3.8}
\]

where cE = 1 / (cE1 + cE2 + cE3).

The efficiency has values in the interval [0, 1), where 0 means bad efficiency and values close to 1 mean good efficiency. The cE1, cE2 and cE3 coefficients allow the efficiency model to be adjusted to the particularities of the P2P architecture by placing weights on the different components. If cE = 1/3, cE1 = 1, cE2 = 1 and cE3 = 1, then the components of the efficiency model are all equally weighted.
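The reordered efficiency formula (3.8) can be transcribed in the same way. The sketch below uses illustrative names and defaults to the equal-weight case.

```python
from typing import Sequence


def efficiency(af: float, ns: float, nl: float, nc: float, np_: float,
               weights: Sequence[float] = (1, 1, 1)) -> float:
    """Efficiency E following formula 3.8.

    af      : average activity fraction
    ns, nl  : number of seeders / leechers
    nc      : average number of connections per peer
    np_     : number of peers
    weights : cE1..cE3; cE is derived as 1 / sum(weights)
    """
    c1, c2, c3 = weights
    c_e = 1.0 / sum(weights)
    return c_e * (
        c1 * af / (af + 1)      # busier clients, higher efficiency
        + c2 * nl / (nl + ns)   # idle seeders lower the efficiency
        + c3 * nc / (nc + np_)  # connections relative to swarm size
    )
```

With the Run 3 values reported in Table 3.2 of Section 3.3.2 (29 seeders serving a single leecher, AF = 0.12, NC = 0.23, NP = 30), the function evaluates to about 0.049, reflecting the large surplus of idle resources in that scenario.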

3.3 Experimental Validation of the Proposed Model

The model proposed in Section 3.2 was validated experimentally, using a series of specially designed scenarios. The metrics described in Section 3.2.5 have to be evaluated based on data collected both at the tracker level and at the client level. Usually a tracker has built-in logging mechanisms, and the log files are verbose enough to perform such an analysis. For the clients' log messages, things are different: there are many BitTorrent clients and each of them has different log types, if any. In order to have a coherent data set, a simulation was needed, in which the tracker and the clients were placed in various scenarios. The simulated swarms comprise 30 peers with different upload/download speeds. Each swarm shared a 15 MB randomly generated data file. A configuration file was used to set the peer parameters (address, port, upload/download limits). The total time of an experiment involving a single swarm was about 9-15 minutes. For all three model components (performance, reliability and efficiency), all weight coefficients were considered to be numerically equal to 1.

3.3.1 Experimental Runs

The three proposed metrics (performance, reliability and efficiency) were tested in 5 experiment runs. The runs were designed to cover the frequent usage scenarios, from the one where the swarm is heavily seeded to the one where there is only one seeder. Each run is described in a configuration file. The 5 test runs are presented in Table 3.1. Each test started at time = 0 and ended when all the leechers finished downloading the data.

Table 3.1: Description of experimental validation runs

Run no. | Description
--------|------------
1       | 1 seeder, 29 leechers; all peers start at time = 0; it simulates a worst-case scenario
2       | at time = 0: 1 seeder, 4 leechers; at time = 1 minute: 10 more leechers join the swarm; at time = 2 minutes: 10 more leechers join the swarm; at time = 3 minutes: 5 more leechers join the swarm; the scenario simulates a mid-intense crowd
3       | 29 seeders, 1 leecher; all peers start at time = 0; it simulates a best-case scenario
4       | at time = 0: 3 seeders, 5 leechers; at time = 1 minute: 3 more leechers join the swarm; at time = 2 minutes: 3 initial leechers and 2 other leechers leave the swarm; at time = 3 minutes: 1 seeder leaves the swarm (one of the initial seeders); the scenario simulates a dynamic swarm
5       | at time = 0: 10 seeders, 10 leechers; at time = 1 minute: 5 more leechers join the swarm; at time = 2 minutes: 5 initial seeders leave the swarm; at time = 3 minutes: 2 new seeders join the swarm; the scenario simulates a dynamic swarm with a mid-intense crowd
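Since the join and leave events of each run are fixed by design, the join and leave rates can be derived from the run description and the run duration. The sketch below illustrates this for Run 4, using the 774 s duration reported for that run in Section 3.3.2. The `SwarmEvent` layout is hypothetical and does not reproduce the configuration file format used in the experiments; not counting the peers present at time = 0 as joins is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class SwarmEvent:
    """Peers joining or leaving at a given minute of a run (hypothetical layout)."""
    minute: int
    joins: int = 0
    leaves: int = 0


def churn_rates(events: Iterable[SwarmEvent], duration_s: float) -> Tuple[float, float]:
    """Return (JR, LR) in peers per second, averaged over the whole run.

    Assumption: peers already present at time = 0 are not counted as joins.
    """
    events = list(events)
    joins = sum(e.joins for e in events if e.minute > 0)
    leaves = sum(e.leaves for e in events)
    return joins / duration_s, leaves / duration_s


# Run 4: 3 leechers join, then 5 leechers leave, then 1 seeder leaves.
run4 = [SwarmEvent(1, joins=3), SwarmEvent(2, leaves=5), SwarmEvent(3, leaves=1)]
jr, lr = churn_rates(run4, duration_s=774)
```

Under these assumptions the computed rates are approximately 0.0039 joins per second and 0.0078 leaves per second for Run 4.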

3.3.2 Result Analysis

For each run, the values for the metrics specified in Section 3.2.5 were calculated based on their definitions. All weight coefficients were considered to be numerically equal to 1. The peer join rate, peer leave rate, number of peers, and number of seeders and leechers are known from the test design and do not need to be calculated. The rest of the values are calculated for each client and then averaged over all the clients. In the end, the values for performance, reliability and efficiency are calculated based on the proposed model. The results are presented in Table 3.2.

Table 3.2: Result analysis for the 5 experiment runs

Metric   | Run 1   | Run 2   | Run 3   | Run 4   | Run 5
---------|---------|---------|---------|---------|--------
Duration | 856s    | 463s    | 686s    | 774s    | 516s
JR       | 0       | 0.05399 | 0       | 0.00387 | 0.01356
LR       | 0       | 0       | 0       | 0.00775 | 0.00968
NP       | 30      | 30      | 30      | 11      | 27
NS       | 1       | 1       | 29      | 3       | 12
NL       | 29      | 29      | 1       | 8       | 15
DS       | 19.6    | 45.36   | 25.27   | 28.83   | 32.4
US       | 19.04   | 43.6    | 0.92    | 18.32   | 20.97
DT       | 170.21  | 167.1   | 289     | 296.33  | 161.2
UT       | 93      | 125.29  | 0       | 0       | 104
NC       | 2.13    | 4.56    | 0.23    | 1.23    | 2.38
AF       | 0.51    | 0.98    | 0.12    | 0.77    | 0.74
SL       | 0.3     | 0.3     | 0.3     | 0.3     | 0.3
P        | 0.31402 | 0.33564 | 0.58149 | 0.38667 | 0.43033
R        | 0.25939 | 0.44850 | 0.35148 | 0.31315 | 0.37080
E        | 0.45690 | 0.53119 | 0.04936 | 0.42096 | 0.35395

For the performance values, the highest performance is that of run number three (the best-case scenario with 29 seeders and 1 leecher). Although the value is not in the upper value range, the qualitative information of the proposed metric is in accordance with the testing scenario. The worst performance is that of the first test (the worst-case scenario with 1 seeder and 29 leechers). Again, this shows that the proposed metric correctly evaluates the quality of the services provided by the system. The maximum reliability occurs in test number two, where peers join the network for the whole duration of the test. The minimum reliability is associated with the first test, where 29 leechers try to download data from only one seeder. This metric also correctly evaluates the reliability of the file-sharing service provided by the network. The efficiency has a minimum for the third test, where 29 seeders offer the file and only 1 leecher downloads it.

Overall, the qualitative information offered by the three metrics reflects the testing scenarios. As the highest performance is only 0.58 out of a maximum of 1, the coefficients could be adjusted to distribute the numerical values over a larger numerical range.

3.4 Conclusions

The current chapter proposed the analysis of the system behavior and of the quality of its services through three major components: performance, reliability and efficiency. In file sharing, performance translates mainly, but not only, into throughput. Although the service performance depends on the resources a client has, the overall metric captures the system's health. Reliability may be defined in multiple ways, meaning in general the capacity of a system to perform according to its design, to resist failures and to maintain its parameters over a defined period of time. In file sharing, reliability translates into the capacity of a system to maintain its performance (mainly throughput) in a variety of conditions. For Peer-to-Peer systems, where peer dynamics are high, reliability offers an image of the stability of the system. Efficiency is a metric useful to the company or entity offering a service. It compares the available resources with the resources required to maintain the performance of the services. In Peer-to-Peer file-sharing systems, initial resources are duplicated on the nodes participating in the swarm. Because of this automatic data replication, the overall available resources might easily exceed the required resources, keeping redundant nodes in the network. As P2P systems are dynamic, it is difficult to estimate the required resources for a definite amount of time in the future, but a snapshot of the current efficiency provides feedback for future scenarios. The validity of the information provided by the proposed metrics was tested in 5 different scenarios, each reproducing a real-life use case for a file-sharing service. Overall, the qualitative information offered by the three metrics reflects the testing scenarios.

Chapter 4 Improving Overlay Communication in P2P Systems


One of the key signatures of the P2P protocols is the overlay created between the participating nodes. The design of the overlay, its structure and the distribution of connections depend on the design and purpose of the P2P protocol. The overlay has a direct impact on the performance offered by the P2P architecture and, by introducing changes in the overlay structure, the performance of the system can be increased. This chapter presents two architectures, focused on the overlay patterns of the P2P systems, designed to enhance the transfer performance and user privacy over the BitTorrent and Swift protocols.

4.1 Resource Availability in P2P Systems

The basic principles of the presented architectures rely on redistributing available resources between requesting nodes. This section presents an overview of the resources existing in P2P systems in general, with an emphasis on the BitTorrent architecture.

4.1.1 Relevant Resources for Peer-to-Peer Systems

The spread of Peer-to-Peer architectures in the last two decades introduced a new type of application that adapts to the infrastructure on top of which it runs and scales to millions of nodes across the Internet. Commonly, P2P systems are designed to access commodity hardware offered by the participating nodes. In the case of public file sharing systems, the commodity hardware is formed by the users' machines and is based on three types of resources: CPU cycles, bandwidth and disk storage.


The CPU cycles allocated to the P2P system are required to execute the client application and to manage network and disk usage. Apart from this direct purpose, CPU cycles can also be used as a primary resource in systems that are focused on processing large amounts of data. Such workloads are similar to the ones executed by the nodes participating in the SETI@home project [SETI@home, 2011]. The SETI@home project was introduced in 1999 and proposed doing radio SETI using a virtual supercomputer composed of large numbers of Internet-connected computers, as detailed in [Korpela et al., 2001] and [David et al., 2002]. The software executed by the nodes participating in the SETI@home project detects idle times and uses the CPU resources only when the machine is not used for other purposes. However, users do not usually limit the amount of CPU resources used by the system.

Disk storage is one of the key resources in file sharing applications and decentralized distributed database applications. Each node stores an amount of data that is made available to the other nodes in the system via the application-specific protocols. The amount of data stored by each node is usually not limited directly. File sharing systems require a user-specific action to be taken for the disk space to be used. Decentralized distributed databases (with the best-known implementation being DHT [DHT, 2011]) use a collection of nodes to store information and retrieve it based on queries. Each of the nodes in the system stores a small part of the database, with overlaps usually being applied for redundancy.

In Peer-to-Peer file sharing, bandwidth is the key resource required by the system. The quality of the application and the user experience depend directly on the amount of bandwidth that a node can access. The distribution of bandwidth between the participating nodes is more rigid than the distribution of CPU or storage resources. P2P file sharing applications do not redistribute the available, unused bandwidth between nodes, limiting the total amount of bandwidth the system offers.

4.1.2 Bandwidth Availability in File Sharing Applications

Bandwidth, the most important resource in file sharing systems, is perceived as a fixed commodity. In most cases this resource is used inefficiently, with users possessing more bandwidth than they usually use. Although there are no studies focused on how much of the acquired bandwidth an average user actually uses, a qualitative overview can be obtained by analyzing the data published by Sandvine Incorporated, a company that provides network equipment for major ISPs in Europe and North America. The two reports [Sandvine-Incorporated, 2011a] and [Sandvine-Incorporated, 2011b], published in spring 2011, offer insights on traffic distribution and peaks for the two continents.

Figure 4.1 shows that subscribers in Europe generally disconnect from the broadband networks overnight, as the subscriber trough bottoms out at 5 am at only 33% of the evening peak. The number of active subscribers increases rapidly from 6 am through 10 am before leveling off until mid-afternoon. From 3 pm to 8 pm there is a steady rise, and the number of active subscribers is within 5% of the peak for roughly 3 hours. The traffic curve closely matches the active subscriber curve, but generally lags by 1-2 hours. Aggregate traffic and active subscribers both peak at 9 pm, although the traffic peak period is condensed to slightly more than 2 hours.

Figure 4.1: Average day (subscribers and traffic) - Europe, fixed access. Source: [Sandvine-Incorporated, 2011a]

The information presented in the report can be used to approximate the average available bandwidth in periods of low activity. The traffic varies between 23-100%. The number of subscribers varies between 33-100%, with the minimum value occurring at the same time as the traffic minimum value. By dividing the two intervals we can deduce that the average traffic per user varies between 69-100%, leaving almost 30% of the bandwidth unused during periods with less intensive network activity.

Figure 4.2 shows that subscribers in North America generally leave their Internet connections active overnight (the lowest number of subscribers connected is still more than 66% of the peak), but their traffic demands fall off dramatically after peaking at 9 pm. The subscriber peak is reached at 8 pm, one hour ahead of the traffic peak. The subscriber curve is within 5% of its maximum value for just under 6 hours, while the traffic peak is more compact and is within 5% of the peak for only 2.5 hours.

The approach applied to the European data set can also be applied to the North America graph. The traffic varies between 30-100%. The number of subscribers varies between 68-100%, with the minimum value occurring at the same time as the traffic minimum value. By dividing the two intervals we can deduce that the average traffic per user varies between 44-100%, leaving almost 55% of the bandwidth unused during periods with less intensive network activity. Although this analysis is only approximate, it is clear that a significant amount of bandwidth is unused; this bandwidth is currently tied to the subscriber and is not reallocated to areas with high demand.
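The division of the two intervals can be made explicit with a short worked example. The sketch below only restates the arithmetic from the text; the function name is illustrative, and the percentages are the minimum traffic and subscriber levels quoted above, relative to their respective peaks.

```python
def per_user_traffic_floor(traffic_min_pct: float, subscribers_min_pct: float) -> float:
    """Average traffic per active subscriber at the daily minimum,
    expressed as a percentage of the per-subscriber traffic at the peak."""
    return 100.0 * traffic_min_pct / subscribers_min_pct


europe = per_user_traffic_floor(23, 33)         # roughly 69.7
north_america = per_user_traffic_floor(30, 68)  # roughly 44.1

europe_unused = 100.0 - europe                  # roughly 30.3
north_america_unused = 100.0 - north_america    # roughly 55.9
```

The estimate assumes that the traffic minimum and the subscriber minimum coincide in time, as stated in the text; otherwise the ratio would mix values from different hours of the day.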

Figure 4.2: Average day (subscribers and traffic) - North America, fixed access. Source: [Sandvine-Incorporated, 2011b]

4.1.3 BitTorrent Monetization of Bandwidth

The BitTorrent protocol [Cohen, 2008] organizes the process of file sharing in groups of nodes, called swarms, that share a single piece of content. Within the swarm, nodes transfer subdivisions of the shared data, called chunks and pieces. When client A connects to client B to request pieces, it is first placed in a state called choked, in which the connection is kept alive but no requests are accepted. If node B decides that it can allocate resources for node A, it unchokes the connection and node A starts sending requests. At any point in time node B can choke the connection again, stopping node A from requesting new pieces. Node A is initially unchoked in two cases: when it offers pieces node B is interested in, or when it wins a lottery and receives an optimistic unchoke from node B. On an unchoked connection there is only a single piece being actively requested. After a piece is retrieved, a new one is requested, so the number of piece requests is proportional to the available bandwidth between the two nodes. In conclusion, in BitTorrent terms, the used bandwidth is directly reflected in the number of active piece requests a node has.
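The choke and unchoke behavior described above can be summarized as a small state machine. The sketch below is a simplified illustration of that description only: the class and method names are hypothetical, it does not reproduce the wire protocol or any client implementation, and dropping the pending request on a choke is an assumption of the sketch.

```python
from typing import Optional


class PeerConnection:
    """Node B's view of a connection opened by node A (simplified sketch)."""

    def __init__(self) -> None:
        self.choked = True                         # every connection starts choked
        self.active_request: Optional[int] = None  # one piece requested at a time

    def unchoke(self) -> None:
        """B decides to allocate resources to A (regular or optimistic unchoke)."""
        self.choked = False

    def choke(self) -> None:
        """B stops serving A; assumption: a pending request is dropped."""
        self.choked = True
        self.active_request = None

    def request(self, piece_index: int) -> bool:
        """A asks for a piece; accepted only when unchoked and idle."""
        if self.choked or self.active_request is not None:
            return False
        self.active_request = piece_index
        return True

    def piece_received(self) -> Optional[int]:
        """The active piece finished; a new request may follow."""
        done = self.active_request
        self.active_request = None
        return done
```

Since each unchoked connection carries one active piece request in this description, counting the connections with a non-empty `active_request` gives the number of active piece requests that the text uses as a proxy for the used bandwidth.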

4.2 Privacy Challenges in P2P Systems

An important element of P2P systems is the level of privacy the participating nodes receive from the system. As each node shares its resources with the system, the information regarding the amount and type of offered resources could be available to any other participating node. Some P2P designs are built around the concept of protecting a user's privacy. However, BitTorrent has no such mechanisms and exposes the content shared by each node to the swarm.

4.2.1 Information Shared with the Community

When a client joins a P2P system, it implicitly shares an amount of information with the rest of the nodes. Depending on the protocol used by the P2P system, the information can be sufficient to identify the user, to create a profile of the user's behavior or to record all the actions the user took within the system.

The BitTorrent system is organized around the shared pieces of content. Content discovery and peer discovery are separated from the P2P architecture, being usually placed on a dedicated server. Thus, the information that the clients share with the rest of the community is limited to the content currently being downloaded or seeded. BitTorrent nodes share their presence with the tracker, specifying their IP address, port number, shared content and completion percentage. The tracker disseminates this information to the other nodes, at their request.

When a BitTorrent connection is opened, the nodes exchange bitfields, detailing what chunks of the shared data they have stored locally. The bitfields are later used to determine which pieces are requested from peers and which are the rarest pieces in the swarm, in order to prioritize their retrieval. As the nodes go through the process of downloading the data, they update the tracker with their current progress. Also, after a chunk is completed, all connected nodes are notified of the updated bitfield.
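The use of bitfields for prioritizing the rarest pieces can be illustrated with a short sketch. Bitfields are modeled here as plain lists of booleans, and the function name and the tie-breaking rule (lowest index first) are assumptions made for the example; real clients typically pick randomly among equally rare pieces.

```python
from collections import Counter
from typing import List, Optional, Sequence


def rarest_missing_piece(own: Sequence[bool],
                         peers: List[Sequence[bool]]) -> Optional[int]:
    """Index of the missing piece held by the fewest connected peers.

    own   : local bitfield (True = piece stored locally)
    peers : bitfields received from the connected peers
    Returns None when nothing useful can be requested.
    """
    availability: Counter = Counter()
    for bitfield in peers:
        for index, has_piece in enumerate(bitfield):
            if has_piece:
                availability[index] += 1

    # Only pieces we lack and that at least one peer can provide.
    candidates = [i for i, have in enumerate(own)
                  if not have and availability[i] > 0]
    if not candidates:
        return None
    # Assumption: ties are broken by the lowest piece index.
    return min(candidates, key=lambda i: (availability[i], i))
```

The same per-piece availability counts that drive this selection are visible to every connected peer, which is why the bitfield exchange also matters for the privacy discussion in this section.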

4.2.2 Breaking BitTorrent Privacy

A malicious node, part of a BitTorrent swarm, has access to all the pieces of information detailed in Section 4.2.1. By using a combination of these pieces of information it can create a detailed image of:

- the peers in the swarm, by scraping the tracker (frequently requesting new peers from the tracker) or exploiting PEX [wiki.theory.org, 2011] to discover new peers
- the peers seeding the data (these peers provide a complete bitfield) and the amount of time the data was seeded by the peers
- the peers downloading the data and the rate at which they retrieve the information (by comparing successive pieces of bitfield information)

Information harvesting becomes faster and more accurate if multiple nodes collude, as the coverage of the swarm peers increases. Another area for privacy violation is at the ISP level. All user connections are handled by the ISP and, with minimal effort, packet inspection filtering can be applied to track all user activity.
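To make the bitfield-based observations listed in this section concrete, the sketch below shows how little is needed to classify a peer as a seeder or to estimate its retrieval rate from two bitfield snapshots. Bitfields are modeled as lists of booleans; the function names, the fixed piece size and the observation interval are assumptions introduced for the example.

```python
from typing import Sequence


def is_seeder(bitfield: Sequence[bool]) -> bool:
    """A peer advertising a complete bitfield holds the whole content."""
    return len(bitfield) > 0 and all(bitfield)


def observed_rate_kbps(before: Sequence[bool], after: Sequence[bool],
                       piece_size_kb: float, interval_s: float) -> float:
    """Retrieval rate in KB/s inferred from two bitfield snapshots.

    Assumption: all pieces have the same size, so the rate is the number of
    newly advertised pieces times the piece size, divided by the interval.
    """
    if len(before) != len(after):
        raise ValueError("snapshots must describe the same torrent")
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    new_pieces = sum(1 for old, new in zip(before, after) if new and not old)
    return new_pieces * piece_size_kb / interval_s
```

Both functions rely only on information that any connected peer receives as part of normal protocol operation, which illustrates why a regular swarm member can build such a profile without any special access.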

It can be easily concluded that nodes participating in BitTorrent swarms have little privacy. This creates potential problems in regions where freedom of speech is frequently disregarded and censorship is applied. Current solutions for enhancing privacy are encrypted BitTorrent connections and VPNs. Connections initiated to other peers can be encrypted, preventing external entities from analyzing the connection content. However, this only minimizes the amount of information that can be tracked; it does not eliminate the privacy violation. VPN solutions allow the user to create a secure communication channel to an Internet entity (usually outside of the user's country) and to access the Internet from that entity's network. Such services offer limited bandwidth and traffic, and usually require a paid subscription.

4.3 A Proposed Explicit Relay Architecture for the BitTorrent Protocol

This section presents the design of a novel relay architecture that enhances the functionality of the BitTorrent protocol, with two goals. The first is to increase the performance in scenarios where BitTorrent, or the network infrastructure, limits the throughput of the data transfer. The second is to offer a layer of privacy that users can benefit from when they want to avoid the trace-back of their actions. The proposed architecture allows both downloading and uploading of data. The novel architecture is presented in two stages. The first stage uses a single layer of proxy nodes to redirect the transferred content between peers. The second stage uses multiple layers of proxy nodes to increase the privacy offered to the user.

4.3.1 Existing Relay Architectures

Previous attempts have been made to create an intermediate layer for relaying BitTorrent content within a swarm. None of these attempts supports uploading of data. The main goal of these solutions was either to offer privacy or to increase the performance. The closest attempt to reach the goals stated in the introduction of this section was made by [Garbacki et al., 2006], where the authors propose an extension of the standard BitTorrent message set. This extension overcomes the enforced fairness in BitTorrent bandwidth sharing that limits the download bandwidth to the available upload bandwidth. Their system, called 2Fast, solves the problem while preserving the fairness of bandwidth sharing. In 2Fast, groups of peers are formed that collaborate in downloading a file on behalf of a single group member, which can thus use its full download bandwidth. A peer in their system can use its currently idle bandwidth to help other peers in their ongoing downloads, and in return gets help during its own downloads.


The goals stated in the introduction of this section are beyond what 2Fast offers: the support for both downloading and uploading of data needs to be included, and multiple layers of relays have to be used to increase the privacy offered to the user. Both problems are solved in the relay architecture proposed in this chapter.

4.3.2 A Proposed 1-Hop Proxy Architecture

The first stage of the proposed relay architecture is the introduction of a single layer of proxy nodes that intermediates the communication between the swarm and the node that requests or offers data.

Design Goals

This section presents a proxy technology, proposed as an extension of the BitTorrent protocol, and designed specifically to integrate with it. The technology is used as an intermediate layer of nodes in the communication between the end-node and the BitTorrent swarm. Although the intermediary proxy nodes need to have an implementation of the proxy protocol (in order to exchange proxy messages with the end-node), the service is designed in such a way that the proxy nodes can connect to any existing BitTorrent swarm. This feature eases the adoption of the new extension and extends its area of applicability. A high-level overview of a Proxy Service usage scenario is presented in Figure 4.3.

Figure 4.3: Proxy Service high-level overview: the Proxy Service offers an intermediate layer between the end-user and the BitTorrent swarm

The Proxy Service aims at offering two key benefits to the end user: performance boosting and privacy enhancement (the term performance in this case is defined mainly as throughput, and high performance results directly in lower download times). For this purpose a single layer of proxy nodes is used.


The desired performance boost can be achieved in two cases. First, if the nodes used as proxies offer, combined, more bandwidth to the end-node than its direct connection to the swarm. Second, if the upload bandwidth of the end-node is limited to a small value, then the usage of the proxy layer will bypass the strict tit-for-tat BitTorrent rule. Also, if the Proxy Service is used to support a regular BitTorrent download (the end-node having both proxy connections and direct connections to the swarm), the performance of the download during the flashcrowd period is increased, as more connections are used to retrieve the same data. At the same time, the data relayed by the proxy nodes is stored locally for a period of time, transforming these nodes into active caches.

Privacy enhancement is targeted at reducing the swarm exposure of the end-node. In a regular BitTorrent session, a node exchanges both actual data and information about the data it stores with other peers in the swarm. If the nature of the information that is downloaded or uploaded is sensitive, the end-user can put itself at risk. By relaying all BitTorrent communication through the Proxy Service, the end-node will be hidden from the swarm, and will pass the responsibility to the proxy nodes. Each proxy helps with only a part of the file data and never stores a full copy of it, and given that the proxy nodes do not download data for private use, the risk that the proxies are exposed to is greatly diminished.

Design Overview

The proposed BitTorrent extension was designed to take advantage of the technologies integrated in the Tribler BitTorrent client [Pouwelse et al., 2008b], mainly the Tribler Overlay. The implementation of the proposed Proxy Service was also made using the same BitTorrent client. As described in Figure 4.3, from the Proxy Service point of view there are three types of nodes:

- Regular BitTorrent clients. These can be instances of any BitTorrent client. Tribler, libtorrent-rasterbar, uTorrent and Transmission have been tested as regular nodes during the development stage of the Proxy Service.
- Proxy nodes. These clients have the Proxy Service enabled and relay data for the requesting end-users.
- End-user nodes, called Doe nodes in the Proxy Architecture. The Doe nodes are interested in contacting the swarm using the services provided by the proxy nodes.

In order for a Proxy node to relay information, it must enable its Proxy capabilities. In addition to regular data transfer, these Proxy nodes offer a new type of service called the Proxy Service. The activation and deactivation of this service (the Proxy Service on/off setting) is valid for the duration of a session, and not per-download. Enabling the Proxy Service is a user decision, and the user can, at any moment, decide to turn it off. Doe nodes decide if the help of a proxy is required. For each download there are three operating modes (called DoeModes) that specify if a transfer progresses as:


- a normal, non-relayed download (DoeMode off)
- a private transfer, in which case the Doe does not contact the swarm directly and relays all its traffic via the proxy nodes (DoeMode private)
- a boosted transfer, in which case the Doe contacts the swarm as well as the proxy nodes (DoeMode speed).

When the Doe is using DoeMode private, it will not use any mechanism to discover the peers that are participating in the swarm (it will not contact the tracker, it will not use PEX). It is the responsibility of the proxy nodes to discover peers and to retrieve data from them. The communication between the proxy nodes and the swarm uses the normal BitTorrent protocol (as shown in Figure 4.4). The communication between the Doe and the proxy nodes uses a novel Proxy Protocol based on Overlay messages.

- The Doe requests help from the proxies, and sends them lists of BitTorrent pieces to download or upload.
- In the case of download, the Proxies send the Doe information about the pieces that are available in the swarm and transfer to it the requested pieces.
- In the case of upload, the Proxies receive the content of the uploaded piece from the Doe and transfer it to the swarm.
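The three DoeModes and their effect on swarm contact can be summarised in a minimal Python sketch. The enum and helper names below are illustrative assumptions and are not taken from the Tribler code base.

```python
from enum import Enum


class DoeMode(Enum):
    """Per-download operating modes named in the text (names are illustrative)."""
    OFF = "off"          # normal, non-relayed download
    PRIVATE = "private"  # all traffic is relayed through proxy nodes
    SPEED = "speed"      # swarm and proxy nodes are used together


def contacts_swarm_directly(mode: DoeMode) -> bool:
    # In private mode the Doe performs no peer discovery (no tracker, no PEX)
    # and opens no direct connections to swarm peers.
    return mode in (DoeMode.OFF, DoeMode.SPEED)


def uses_proxies(mode: DoeMode) -> bool:
    # Proxies take part in both the private and the boosted transfer.
    return mode in (DoeMode.PRIVATE, DoeMode.SPEED)
```

Only the private mode removes the Doe from the swarm entirely; the speed mode adds proxy connections on top of a regular download.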

Figure 4.4: BitTorrent and Overlay communication in a Proxy Service scenario

As mentioned in Section 4.3.2, the developed proxy technology is designed specifically to integrate with the BitTorrent protocol. Most generic proxy technologies are designed and implemented at the TCP level, and their granularity decreases BitTorrent performance. The Proxy Service, working at the BitTorrent level and relaying BitTorrent pieces, has a lower impact on the end-user performance and allows the proxy nodes to connect directly to existing swarms.


The Proxy Service mechanism is designed to be completely controlled by the Doe. The Doe decides both which proxy nodes to use and which pieces to assign to each proxy.

Doe-Proxy Communication Protocol

The communication between the Doe and the proxy nodes is based on overlay messages. The current section presents the complete set of messages. As previously mentioned, the Proxy Service was designed to take advantage of the Tribler technologies, in particular the Tribler Overlay. Tribler uses an internal addressing mechanism to uniquely identify other Tribler peers. The addressing mechanism uses a pair of public/private keys and identifies each client by its public key. This client identifier is named permid, as the identifier is maintained for an unlimited period of time (it does not expire) and is used for all overlay communication. Client permids are stored in and retrieved from the peer database and exchanged by the Overlay BuddyCast protocol.

Proxy Service Protocol Messages

The proposed Proxy Service protocol consists of a set of 12 messages, designed to respect the same formatting rules as the BitTorrent messages. The complete list of messages is the following: RELAY_REQUEST, RELAY_ACCEPTED, STOP_RELAYING, RELAY_DROPPED, DOWNLOAD_PIECE, PIECE_DATA, CANCEL_DOWNLOADING_PIECE, UPLOAD_PIECE, CANCEL_UPLOADING_PIECE, DROPPED_PIECE, PROXY_HAVE, PROXY_UNHAVE. The messages are detailed in Table 4.1 and Table 4.2. Figure 4.5 illustrates the direction (doe -> proxy or proxy -> doe) in which each message is sent.

Initiating a Proxy Connection

When a Doe node wants to contact a Proxy node and ask it to relay pieces, it sends a RELAY_REQUEST message. The payload of this message consists of the infohash of the torrent that the Doe is requesting help for. In response to the RELAY_REQUEST message, a Proxy can send either a RELAY_ACCEPTED message or a RELAY_DROPPED message. Both RELAY_ACCEPTED and RELAY_DROPPED have a payload that contains the infohash of the torrent the Doe asked help for. If a RELAY_ACCEPTED message is sent, the node will be used as a Proxy by the Doe. If a RELAY_DROPPED message is sent, the node will not be used as a Proxy by the Doe.

Table 4.1: Proxy Service protocol messages

| Message | Description |
|---|---|
| RELAY_REQUEST | Request relaying to the proxy node |
| RELAY_ACCEPTED | Accept relaying data for the requesting doe node |
| STOP_RELAYING | Request the proxy node to stop relaying data |
| RELAY_DROPPED | Stop relaying data for the doe node |
| DOWNLOAD_PIECE | Request the proxy node to retrieve the specified piece number |
| PIECE_DATA | Transfer the retrieved piece data to the doe node |
| CANCEL_DOWNLOADING_PIECE | Cancel the retrieval of the specified piece number |
| UPLOAD_PIECE | Request the proxy node to upload the specified piece using the specified piece data |
| CANCEL_UPLOADING_PIECE | Cancel the upload of the specified piece number |
| DROPPED_PIECE | Can not retrieve the requested piece number |
| PROXY_HAVE | Inform the doe node about the available pieces in the swarm |
| PROXY_UNHAVE | Inform the doe node about the pieces that became unavailable in the swarm |

Breaking a Proxy Connection

When a Doe node decides to stop using the services of a proxy, it sends a STOP_RELAYING message, which has as its payload the infohash of the torrent the proxy was used for. In response to a STOP_RELAYING message, the proxy sends a RELAY_DROPPED message, having as its payload the infohash of the torrent the Doe asked help for.
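The set-up and tear-down exchange can be illustrated with a small Doe-side bookkeeping sketch. The class and method names are hypothetical; only the message names come from the protocol description, and the transport that actually sends the messages is left out.

```python
class DoeRelayState:
    """Tracks, per torrent infohash, which proxies currently relay for the Doe."""

    def __init__(self):
        self.pending = {}  # infohash -> permids asked, no answer received yet
        self.active = {}   # infohash -> permids that sent RELAY_ACCEPTED

    def request_relay(self, infohash, permid):
        # Corresponds to sending RELAY_REQUEST(infohash) to the proxy.
        self.pending.setdefault(infohash, set()).add(permid)

    def on_message(self, name, infohash, permid):
        pending = self.pending.setdefault(infohash, set())
        active = self.active.setdefault(infohash, set())
        if name == "RELAY_ACCEPTED" and permid in pending:
            pending.discard(permid)
            active.add(permid)
        elif name == "RELAY_DROPPED":
            # Sent either as a refusal or as the reply to STOP_RELAYING.
            pending.discard(permid)
            active.discard(permid)

    def proxies_for(self, infohash):
        return sorted(self.active.get(infohash, set()))
```

In this sketch an unsolicited RELAY_ACCEPTED is ignored, which keeps the choice of proxies under the control of the Doe.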

Asking for Pieces

In the Proxy Service, the Doe decides what pieces are assigned to each of the used proxies. Pieces are requested with a DOWNLOAD_PIECE message, sent by the Doe. This message has a payload that consists of the infohash of the torrent the proxy is relaying and the piece number that the proxy will download. In response to this message, the proxy will contact the swarm and will download the requested piece. When the piece has arrived, it will send a PIECE_DATA message to the Doe, containing the infohash of the torrent the proxy is relaying, the piece number and the actual piece data.

Table 4.2: Proxy Service protocol payloads

| Message | Payload |
|---|---|
| RELAY_REQUEST | torrent_infohash |
| RELAY_ACCEPTED | torrent_infohash |
| STOP_RELAYING | torrent_infohash |
| RELAY_DROPPED | torrent_infohash |
| DOWNLOAD_PIECE | torrent_infohash + bencode(piece_number) |
| PIECE_DATA | torrent_infohash + bencode(piece_number) + bencode(piece_data) |
| CANCEL_DOWNLOADING_PIECE | torrent_infohash + bencode(piece_number) |
| UPLOAD_PIECE | torrent_infohash + bencode(piece_number) + bencode(piece_data) |
| CANCEL_UPLOADING_PIECE | torrent_infohash + bencode(piece_number) |
| DROPPED_PIECE | torrent_infohash + bencode(piece_number) |
| PROXY_HAVE | torrent_infohash + bencode(haves_bitstring) |
| PROXY_UNHAVE | torrent_infohash + bencode(haves_bitstring) |
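The DOWNLOAD_PIECE and PIECE_DATA payload layouts of Table 4.2 can be sketched as follows. This is a minimal illustration that assumes the infohash is the raw 20-byte SHA-1 digest and covers only the two bencode types needed here; the message identifier and any length prefix are not specified in this section and are therefore omitted. All function names are hypothetical.

```python
INFOHASH_LEN = 20  # assumption: raw SHA-1 infohash


def bencode_int(value: int) -> bytes:
    # Bencoded integer: 'i' <base-10 digits> 'e'
    return b"i" + str(value).encode("ascii") + b"e"


def bencode_bytes(data: bytes) -> bytes:
    # Bencoded byte string: <length> ':' <raw bytes>
    return str(len(data)).encode("ascii") + b":" + data


def _check_infohash(infohash: bytes) -> None:
    if len(infohash) != INFOHASH_LEN:
        raise ValueError("infohash must be 20 bytes")


def encode_download_piece(infohash: bytes, piece_number: int) -> bytes:
    _check_infohash(infohash)
    return infohash + bencode_int(piece_number)


def encode_piece_data(infohash: bytes, piece_number: int, piece: bytes) -> bytes:
    _check_infohash(infohash)
    return infohash + bencode_int(piece_number) + bencode_bytes(piece)


def decode_piece_data(payload: bytes):
    infohash, rest = payload[:INFOHASH_LEN], payload[INFOHASH_LEN:]
    if not rest.startswith(b"i"):
        raise ValueError("expected a bencoded piece number")
    end = rest.index(b"e")              # end of the integer
    piece_number = int(rest[1:end])
    rest = rest[end + 1:]
    colon = rest.index(b":")            # end of the length prefix
    length = int(rest[:colon])
    piece = rest[colon + 1:colon + 1 + length]
    if len(piece) != length:
        raise ValueError("truncated piece data")
    return infohash, piece_number, piece
```

Because the infohash has a fixed length and the bencoded fields are self-delimiting, the three payload parts can be concatenated without extra separators.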

Piece Availability Information

The Doe node requests specific pieces to be downloaded by the proxies. As it has no direct contact with the swarm (if the DoeMode is set to private), the proxy nodes have to inform it about the availability of pieces. The proxy nodes send regular PROXY_HAVE messages to the Doe, each message having a payload that consists of the infohash of the torrent and a bencoded bitfield. After it receives a message of this type, the Doe uses the bitfield information to make the piece picking decision. The Tribler BitTorrent core is used for piece picking, and thus the same algorithm is used for the Proxy Service piece picking as for the BitTorrent protocol.

Figure 4.5: Proxy Service protocol details
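How the Doe might turn PROXY_HAVE bitfields into per-proxy piece assignments can be sketched as below. The sketch assumes the haves bitstring follows the standard BitTorrent bitfield layout (the high bit of the first byte is piece 0). The least-loaded assignment rule is a simplified stand-in chosen for illustration; it is not the Tribler piece picker that the Proxy Service actually reuses.

```python
def pieces_in_bitfield(bitfield: bytes, num_pieces: int) -> set:
    # BitTorrent bitfields are big-endian: the high bit of byte 0 is piece 0.
    have = set()
    for index in range(num_pieces):
        byte, bit = divmod(index, 8)
        if byte < len(bitfield) and bitfield[byte] & (0x80 >> bit):
            have.add(index)
    return have


def assign_pieces(wanted, proxy_haves):
    """Map each wanted piece to the least-loaded proxy that advertises it.

    proxy_haves: dict of proxy id -> set of piece indices from PROXY_HAVE.
    Pieces that no proxy advertises are left unassigned.
    """
    load = {proxy: 0 for proxy in proxy_haves}
    assignment = {}
    for piece in sorted(wanted):
        candidates = [p for p, have in proxy_haves.items() if piece in have]
        if not candidates:
            continue
        # Ties are broken by proxy id to keep the result deterministic.
        chosen = min(candidates, key=lambda p: (load[p], p))
        assignment[piece] = chosen
        load[chosen] += 1
    return assignment
```

Each entry of the resulting mapping would translate into one DOWNLOAD_PIECE message sent to the chosen proxy.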

Analysis of Architecture Privacy

Section 4.2.2 presented the main areas where the privacy of a peer could be breached in BitTorrent. If the proposed Proxy Service is used to support a private download (DoeMode private), then the actions of the Doe node are placed under a plausible deniability cloak. If a malicious peer wants to investigate the actions of a target doe, it can place itself in two positions in the architecture: within the BitTorrent swarm or as a proxy node.


Figure 4.6: A Proxy Service scenario with a malicious node (in red) part of the swarm

If the malicious peer is part of the swarm (Figure 4.6), then it will perceive all BitTorrent data communication as originating at the proxy nodes. The malicious node can not associate the proxy nodes with each other (as helping the same doe). However, the actions of the proxy nodes can be directly identified and recorded. Even so, there is no certainty whether the pieces transferred by the proxies are transferred at the request of a doe or for their own use. Therefore, within the swarm the malicious node can not distinguish the usage of the Proxy Service from the usage of regular BitTorrent.

Figure 4.7: A Proxy Service scenario with a malicious node (in red) part of the proxy layer

If the malicious peer is part of the proxy layer (Figure 4.7), then it will be able to identify all the actions that a doe node requests from it. However, it will not be able to identify other proxy nodes helping the same doe. In this scenario, to prevent the malicious node from identifying the doe, multiple layers of proxy nodes are required. This improvement will be discussed in section 4.3.4.


4.3.3 A Proposed Proxy Discovery Mechanism

The BitTorrent protocol uses three methods for discovering peers: a centralized solution (tracker) and two distributed solutions (DHT and PEX). The tracker is a single point of failure and, for many years, researchers have avoided using it as the main peer discovery solution in novel architectures. The proposed Proxy Service solution needs to offer a mechanism for automatically discovering nodes that are capable of relaying and have the relay service turned on. An epidemic protocol was preferred, allowing the mechanism to function as long as there are at least two nodes present in the Tribler Overlay.

Proposed Epidemic Proxy Discovery

The Proxy discovery mechanism has to be distributed and to provide sufficient Proxy nodes for the Doe to use. Taking advantage of the existing Tribler Buddycast protocol, the solution was to design the discovery service on top of Buddycast, using its epidemic design. As Buddycast maintains an overlay between active Tribler peers, the same peers are the best candidates to use as Proxy nodes, if the Proxy Service is enabled. For each node advertised by the Buddycast protocol an extra field of information was added. This field is a 32-bit integer that stores the last known status of the Proxy Service for that node (Proxy Service activated or deactivated). This information is stored in the Tribler peer database and disseminated to all Buddycast partners. When the Doe node needs to use a Proxy, it runs a query on the local peer database and selects the discovered Proxy nodes. The performance and the efficiency of the discovery mechanism are identical to the ones offered by Buddycast. If the Tribler overlay is properly maintained and has sufficient coverage, then the Proxy discovery mechanism will provide accurate results.
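The local lookup step can be sketched with an in-memory SQLite database. The table layout, the column names and the 0/1 encoding of the status field are assumptions made for this illustration; they do not reproduce the actual Tribler peer database schema.

```python
import sqlite3

PROXY_OFF, PROXY_ON = 0, 1  # hypothetical encoding of the 32-bit status field


def make_peer_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE peer (permid TEXT PRIMARY KEY, "
        "last_seen INTEGER, proxy_status INTEGER)"
    )
    return db


def on_buddycast_entry(db, permid, last_seen, proxy_status):
    # Keep only the most recent status heard for each peer.
    row = db.execute(
        "SELECT last_seen FROM peer WHERE permid = ?", (permid,)
    ).fetchone()
    if row is None or last_seen >= row[0]:
        db.execute(
            "INSERT OR REPLACE INTO peer VALUES (?, ?, ?)",
            (permid, last_seen, proxy_status),
        )


def discover_proxies(db, limit):
    # The Doe queries its local database; no network round trip is needed.
    rows = db.execute(
        "SELECT permid FROM peer WHERE proxy_status = ? "
        "ORDER BY last_seen DESC, permid LIMIT ?",
        (PROXY_ON, limit),
    )
    return [permid for (permid,) in rows]
```

Ordering by the time a peer was last heard from is one plausible way to prefer proxies whose advertised status is least likely to be stale.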

4.3.4 A Proposed Multihop Data Relay Architecture

The architecture described in section 4.3.2 meets the requirements for the increase in performance, but because it is using only one layer of proxy nodes, it does not completely protect the doe node by offering plausible deniability. To prevent the malicious node from identifying the doe, multiple layers of proxy nodes are required. This section presents an enhancement of the previous protocol by placing a random number of relay nodes between the doe and the swarm. The number of relays can not be determined by any participant in the data transfer (seeder, proxy or doe), offering the required level of protection. Preserving the privacy of the doe node comes at the cost of reducing the transfer performance.

Multi-Hop Proxy Architecture

Introducing multiple layers of proxy nodes between the doe and the swarm (as shown in Figure 4.8) requires two main design decisions:

- what protocol to use in the communication between the proxy nodes
- how to determine the length of the proxy chain for each request

One of the main goals of this architecture is to protect the privacy of the doe node. Therefore, the doe node must not be distinguishable from the intermediary proxy nodes. Within the inter-proxy communication, the proxy node requesting data can be mistaken for a simple doe node that transfers information for personal use. It is therefore natural to use the same message set as the Proxy Service protocol for the inter-proxy communication. This design consideration also permits the doe-specific communication to be hidden within the inter-proxy traffic.

Figure 4.8: The Proxy Service with a multi-hop proxy layer

The length of the proxy chain (how many proxy nodes are used to relay data between the doe and the swarm) has to be established for every requested piece. None of the participating peers (seeder-proxy-doe in the case of downloaded data or leecher-proxy-doe in the case of uploaded data) must be able to determine the length of this chain. If there is no centralized entity that decides the length of the proxy chain, and the decision is instead distributed among the participating proxies, the decision cannot be traced back and the length of the chain remains unknown. The complete solution for this problem is the following: the doe node sends a request to a proxy node (P1), using the Proxy Service protocol messages. The proxy P1 decides, with probability P, whether it contacts the swarm directly or sends the request to another proxy node, P2. The process repeats at the P2 node, where the same probability P is used to determine if the swarm is contacted directly or if the proxy chain is continued. If the requested transfer is not completed within a predetermined period of time, the doe node cancels the request it made to the proxy P1. This mechanism prevents infinite loops from being created.
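The distributed chain-length decision can be illustrated with a short simulation. The text does not state which of the two outcomes has probability P, so the sketch assumes that a proxy forwards the request to another proxy with probability `forward_probability` and contacts the swarm otherwise. The `max_hops` parameter is a hypothetical stand-in for the Doe-side timeout.

```python
import random


def relay_chain_length(forward_probability, rng, max_hops=None):
    """Number of proxies a request passes through before the swarm is contacted.

    Returns None when the chain exceeds max_hops, which models the Doe
    cancelling a request that did not complete in time.
    """
    if not 0.0 <= forward_probability < 1.0:
        raise ValueError("forward_probability must be in [0, 1)")
    hops = 1  # the first proxy, P1, always takes part
    while rng.random() < forward_probability:
        hops += 1  # the current proxy hands the request to another proxy
        if max_hops is not None and hops > max_hops:
            return None
    return hops


def mean_chain_length(forward_probability, trials=20000, seed=1):
    rng = random.Random(seed)
    total = sum(relay_chain_length(forward_probability, rng) for _ in range(trials))
    return total / trials
```

Under this reading the chain length is geometrically distributed with mean 1 / (1 - P), so a forwarding probability of 0.5 gives two proxies per request on average. No single participant can derive the length of a particular chain from its own decision.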

Analysis of Architecture Performance

The 1-Hop proxy architecture presented in section 4.3.2 aimed at increasing the transfer performance and ensuring a plausible deniability cloak for the doe node. Introducing multiple layers of relays between the doe and the swarm introduces a performance loss at multiple levels. First, each proxy node introduces a delay required for processing the Proxy Service protocol messages and searching for new relays. The delay introduced by processing the messages can be considered constant within a narrow interval. However, the delay required to initiate the communication with a new proxy relay is considerable and has a negative influence on the final performance. Second, as the length of the proxy chain is determined at each relay and an infinite loop could occur, the doe node has to send a cancel request message after a timeout has passed. If the timeout is reached frequently, then the performance penalty increases.

Analysis of Architecture Privacy

Section 4.3.2 presented an analysis of the privacy protection for the 1-Hop architecture. Adding an extra set of relaying nodes does not affect the previous conclusion regarding the malicious peer that is part of the swarm (Figure 4.6). When the malicious peer is part of the proxy layers (Figure 4.7), with the multi-hop proxy relay, it will not be able to tell whether a Proxy Service protocol message comes from a doe node or from another proxy node. The inter-proxy communication and the proxy-doe communication use the same set of messages. The malicious node will be able to identify the actions performed by other proxy/doe nodes, but will not be able to differentiate between the personal actions and the doe-requested actions.

4.4 Designing an Implicit P2P Relay Architecture

Section 4.3 presented a proposed relay architecture, designed to be integrated with BitTorrent, that offers increased performance and privacy to the end-user. This architecture uses an explicit set of messages to communicate with the proxy nodes and requires multiple layers of proxy nodes for complete privacy. The introduction of multiple layers of nodes impacts the performance. This section presents a series of steps taken towards the design of an implicit P2P relay architecture. The use of implicit relaying comes as a natural evolution of the previous solution. Some of the problems caused in the Proxy Service protocol by the direct communication with the proxy nodes are solved by hiding this type of communication within the P2P base protocol messages. As implicit relaying can not be integrated in the BitTorrent protocol without major changes to BitTorrent, Swift was chosen for the underlying P2P architecture.


As the proxy nodes can not be differentiated from the regular nodes, the term cache is better suited for describing the relay function offered by peers.

4.4.1 Design Goals

The proposed implicit relay architecture has two main targets: to function as an active, self-adaptive cache mechanism and to offer privacy to the users of the system that are sharing content. The first goal is achieved by allowing each node in the system to take the role of a cache and relay data for other nodes, at their request. After a cache node downloads the requested piece, it will store it locally and serve it for subsequent requests. This model is very similar to the Content-Centric Networking concept of Van Jacobson ([Jacobson et al., 2009a], [Jacobson et al., 2009b]). A node can, simultaneously, become a cache for more than one swarm. The cache-role related decisions are decentralized and based on the node's specific interest. Localized algorithms are used for the decision of a node to become a cache, to determine the swarms and the amount of investment in each of them, and to decide the number of pieces that are stored locally (cache size). The second goal is achieved by designing the system such that the regular nodes from a swarm and the cache nodes joining the swarm can not be identified separately. Considering the scenario presented in Figure 4.9:

- the seeders from the swarm would not be able to differentiate the case when a cache connects to them to retrieve data from the case when a leecher connects for the same purpose
- the nodes wanting to download data (leechers) would not be able to differentiate the case when they connect to a cache to retrieve data via it from the case when they connect directly to a seeder from the swarm
- the cache nodes would not be able to differentiate the case when a request comes from a node wanting to download data (a leecher) from the case when another cache node connects for the same purpose
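The first design goal, a node that stores relayed pieces and serves them for subsequent requests, can be sketched as a small piece cache. The least-recently-used eviction rule and all names are assumptions made for this illustration; the text leaves the cache size and replacement decisions to localized algorithms.

```python
from collections import OrderedDict


class PieceCache:
    def __init__(self, capacity, fetch_from_swarm):
        self.capacity = capacity        # maximum number of stored pieces
        self.fetch = fetch_from_swarm   # callable: (swarm_id, piece) -> bytes
        self.store = OrderedDict()      # (swarm_id, piece) -> data, oldest first

    def get(self, swarm_id, piece):
        key = (swarm_id, piece)
        if key in self.store:
            self.store.move_to_end(key)      # mark as recently used
            return self.store[key]
        data = self.fetch(swarm_id, piece)   # cache miss: relay the request
        self.store[key] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict the least recently used piece
        return data
```

Keying the store by swarm and piece number lets a single node act as a cache for more than one swarm at the same time.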

4.4.2 Design Elements of an Active Cache Architecture for Swift

This section discusses some of the strategies and policies required for designing an implicit relay architecture on top of the Swift protocol. These challenges have to be addressed before the implicit communication protocol is designed.

Prevent the Identification of Cache Nodes

User privacy is achieved by designing the system such that the regular nodes from a swarm and the cache nodes joining the swarm can not be identified separately: the personification-uncertainty-principle. This approach requires two different elements:

- use the same protocol in the communication with cache nodes and regular nodes
- implement the behavior of the cache node similar to the behavior of a regular node

Figure 4.9: Cache overlay on top of the Swift protocol

As previously mentioned, all communication between nodes uses Swift. This ensures that the cache nodes can not be identified based on the specific connections opened with them. The behavior of the cache nodes needs to mimic the behavior of regular nodes. Preferably they do not merely act the same, but are almost identical: they use the same algorithms and share the same code base. If the cache nodes behave in a specific, identifiable manner, they can be separated as a group from the rest of the swarm, exposing the nodes directly sharing data. The main areas included in the node behavior are:

- the bitfield information sent to the swarm partners
- the evolution of the reported bitfield information
- the order of the pieces requested from the swarm (rarest-first, in-order, etc.)
- the response time for the advertised pieces (the time elapsed between sending a piece request to a cache node and starting to retrieve the piece data)

Avoid Cache-Loops

Following the design considerations in section 4.4, the system should be designed such that the regular nodes from a swarm and the cache nodes joining the swarm can not be identified separately. As a consequence, when a cache node connects to the swarm to download data, it can not determine if it connects to a seeder or to another cache node. If the cache nodes implement a bitfield advertising policy that includes the advertisement of pieces not stored locally (the cache nodes lie about the pieces they have), it is possible that cache loop requests appear in the swarm. Such a loop request is highlighted in Figure 4.9 with the red arrows. Consider the following example: all cache nodes advertise with bitfield messages that they have all the content available. A leecher requests piece number 42 from one of the cache nodes. As the cache node does not have the piece locally, it will connect to the swarm to retrieve it. From the swarm it chooses another cache and requests piece number 42 from it. This process repeats 4 times, creating the loop highlighted in Figure 4.9 with the red arrows. The cache loops can prevent the leechers from retrieving the content or introduce an additional overhead to the system. There are four proposed solutions for avoiding cache-loops.

I: Timeout-based loop avoidance

A cancel-piece message has to be sent by the leecher if the data was not retrieved within N milliseconds after the request. The cancel-piece message needs to propagate in the cache-loop faster than the initial request, to prevent zombie requests from existing in the system. The timeout message is a good failsafe for any requests sent in the system, and needs to be included in any of the chosen strategies.
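The leecher-side bookkeeping for this timeout can be sketched as follows. The class name, the millisecond clock passed in by the caller and the polling style are assumptions made for illustration; the sketch only decides which requests are due for a cancel-piece message and does not send anything.

```python
class PendingRequests:
    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms   # the N milliseconds from the text
        self.deadlines = {}            # piece_number -> absolute deadline in ms

    def sent(self, piece_number, now_ms):
        self.deadlines[piece_number] = now_ms + self.timeout_ms

    def received(self, piece_number):
        # The data arrived in time, so no cancel is needed.
        self.deadlines.pop(piece_number, None)

    def expired(self, now_ms):
        """Return the pieces whose requests must be cancelled, and forget them."""
        late = sorted(p for p, d in self.deadlines.items() if now_ms >= d)
        for piece_number in late:
            del self.deadlines[piece_number]
        return late
```

Passing the current time explicitly keeps the logic deterministic and easy to test.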

II: Tagged requests

Together with the message requesting a piece, the leecher sends a random 32-bit number. If a node receives the same (piece number, random number) request pair a second time, it will not forward the request to another node, preventing a loop from being formed. This solution has the disadvantage of allowing the node to identify two cache nodes (privacy leakage):

- the node to which it initially sent the (piece number, random number) pair
- the node from which it received the (piece number, random number) pair
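The tagged-request check can be sketched as below. The `CacheNode` class and the ring-shaped forwarding in `route` are hypothetical and serve only to show how a repeated (piece number, random number) pair stops a loop.

```python
import random


class CacheNode:
    def __init__(self, name):
        self.name = name
        self.seen = set()  # (piece_number, tag) pairs already handled

    def should_forward(self, piece_number, tag):
        """Return False when this exact tagged request was seen before."""
        key = (piece_number, tag)
        if key in self.seen:
            return False  # the request has looped back to this node
        self.seen.add(key)
        return True


def new_tag(rng=random):
    # Random 32-bit number chosen by the leecher for each request.
    return rng.getrandbits(32)


def route(nodes, piece_number, tag):
    """Forward around a ring of caches until one refuses; return visited names."""
    visited = []
    index = 0
    while nodes[index].should_forward(piece_number, tag):
        visited.append(nodes[index].name)
        index = (index + 1) % len(nodes)
    return visited
```

In a ring of four caches the request visits each node once and is dropped when it returns to the first one.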

III: Split horizon

Split horizon is a method used in routing protocols to prevent routing loops. If nodes A and B exchange pieces of information, after node A sends a piece of information to node B, node B will not include in its messages to A the information it received from A.


The split horizon method can be adapted to the context of the current architecture. When a cache node sends a bitfield message, it eliminates from the bitfield the pieces its partner advertised. This process of adapting the bitfield to the link partner takes place locally, for each of the links. The split horizon implementation needs to take into account the privacy constraints. If all cache nodes implement the strategy as described above, then any node from the swarm can easily identify them. A hybrid approach is preferred, where the nodes either:

- implement the split horizon bitfield partially, reducing the probability of creating loops but not eliminating it
- implement the split horizon at a different level in the architecture: the bitfield messages are sent unfiltered, but the cache will not respond to partner requests for pieces that have been advertised by the same partner.
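The strict split horizon rule and the two hybrid variants can be sketched on sets of piece indices. The function names and the `keep_fraction` parameter are illustrative assumptions; the text does not specify how partial filtering is to be tuned.

```python
import random


def split_horizon_bitfield(own: set, partner_advertised: set) -> set:
    # Strict variant: never advertise to a partner what it advertised to us.
    return own - partner_advertised


def partial_split_horizon(own, partner_advertised, keep_fraction, rng):
    """First hybrid variant: filter only part of the overlapping pieces.

    Each overlapping piece stays in the advertised bitfield with
    probability keep_fraction, which lowers the loop probability
    without producing a fully recognisable filtered bitfield.
    """
    overlap = sorted(own & partner_advertised)
    kept = {piece for piece in overlap if rng.random() < keep_fraction}
    return (own - partner_advertised) | kept


def should_serve(piece, partner_advertised):
    # Second hybrid variant: bitfields go out unfiltered, but requests for
    # pieces that the same partner advertised are not answered.
    return piece not in partner_advertised
```

With `keep_fraction` set to 0.0 the partial variant reduces to strict split horizon, and with 1.0 it advertises the unfiltered bitfield.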

IV: Reduce the bitfield coverage

Reducing the bitfield coverage, so as to minimize the probability of a cache miss event, reduces the probability that a loop is formed.

Bitfield Advertising for Cache Nodes

One of the most important strategies concerns the information that cache nodes include in the bitfield messages they send. Note that it is possible to omit bitfields if desired for simplicity. This strategy is connected to the identification of the cache nodes, to the avoidance of cache loops and to the policy for cache investment. This strategy has two components: what content is advertised by the bitfield at a specific moment in time, and how the advertised bitfield evolves in time. There are a number of possible solutions to this problem. None of them satisfies all requirements, therefore a compromise needs to be chosen between them. The possible solutions for the content of the bitfield are presented in Table 4.3:

Policy for Proxy Investment

As proxies relay content requested by other nodes, there are two key elements defining the proxy behavior: swarm selection and piece prefetching. The nodes select the swarms they join based on the swarm popularity and state. The seeder/leecher ratio, swarm age or interest in the keywords associated with the swarm could be used as inputs for the algorithm responsible for this decision. After a proxy joins a swarm it can either wait for piece requests, retrieve the pieces and deliver them to the requesting node, or select a number of pieces, download them and advertise them back to the swarm as available.

Table 4.3: Possible solutions for the cache bitfield content

Bitfield content Advantages Loop prevention Fast response time to requests Average response time to requests Disadvantages

Locally stored pieces

Requires investment Decreases the chance for the cache to be used Lower response time to requests Lower response time to requests Increases probability for the cache to be identified

An aggregation of the pieces advertised by the swarm peers Split-horizon applied to the bitfield advertised by the partner A semi-randomly generated bitfield, including part of the rarest pieces An adaptive bitfield, that evolves from no pieces to all pieces based on swarm evolution

Loop prevention

Increases the probability for the cache to be used

Lower response time to requests

Decreases probability for the cache to be identified

Lower response time to requests Loop creation Loop creation Lower response time to requests Increases probability for the cache to be identified

All pieces

The cache will have a higher probability to be used

The second approach (prefetching) presents the advantages of having a collection of pieces to deliver instantly to requesting nodes, of having a bootstrap for later piece retrievals, and of maintaining active connections to some peers in the swarm.

4.5 Conclusions

The current chapter presented a proposed extension for the BitTorrent protocol that uses an overlay of proxy nodes to intermediate the BitTorrent connections between peers. The chapter analyzed the types of resources required by P2P systems in general, and File Sharing systems in particular. It presented the resource availability and efficiency of use. Bandwidth and disk storage were identified as important resources for BitTorrent.


Bandwidth, the more important of the two resources, is perceived as a fixed commodity and is not currently being reallocated to areas with high demand. P2P systems also raise privacy issues. The nodes share with the system a series of sensitive pieces of information. Thus, malicious peers can identify users, create a profile of their behavior or record all the actions that users took within the system. The proposed BitTorrent extension is detailed at the protocol level. It allows the nodes participating in the same system to transfer bandwidth between each other, creating the first steps towards a bandwidth-as-a-currency market. An enhanced version of the extension is presented, allowing increased privacy for the nodes using the protocol. The Proxy discovery mechanism was implemented on top of the Buddycast protocol, taking advantage of its epidemic design. As Buddycast maintains an overlay between active Tribler peers, the same peers are the best candidates to use as Proxy nodes, if the Proxy Service is enabled.

Chapter 5 Evaluating P2P Overlay Improvements in Realistic Scenarios


From an academic and research perspective, the BitTorrent protocol has been the target of numerous investigations and measurements aiming to improve its performance, user satisfaction and functionality. These investigations and measurements have generally relied on two solutions:

• the use of instrumented clients/trackers/peers in real-life environments and the collection of the information they output;

• the use of network emulators or simulators for controlled environments and the collection of all required data.

The first solution has the advantage of the relevance of live scenarios, but allows limited control for deploying experiments and reproducing the results. The second solution allows complete control of the environment, but requires careful setup for deploying a realistic scenario and implies limited scalability with respect to the number of simulated clients.

5.1

Characteristics of P2P Scenarios

When a Peer-to-Peer application is tested on an emulator or in a controlled environment, the scenarios used follow one of two paths: they either emphasize a specific component of the application or they reproduce a real-life context. Both types of scenarios use the same elements to reproduce the required context: the number of nodes participating in the system, the number and types of created connections, and the lifetime of the nodes. The numerical values used for these elements split the scenarios into two categories: synthetic scenarios and replays of previously recorded traces. The second category is preferred, but the available traces may not fit the desired use cases.


CHAPTER 5. EVALUATING P2P OVERLAYS IN REALISTIC SCENARIOS


5.2

Virtualized Computer Clusters

One of the key elements of reproducing real-life scenarios is the infrastructure on top of which the clients are executed. Running a few clients can easily be done using a few computers. If the size of the reproduced swarms increases, then the number of required physical computers increases rapidly. Using a virtualization solution can reduce the number of physical machines by up to a factor of 10. This section presents a few solutions used to create clusters with hundreds of nodes using a small number of physical machines.

5.2.1

Virtualization Solutions

Three types of virtualization are among the most commonly used. Depending on the complexity of the virtualized environment and on the resources that must be allocated to it, virtualization solutions can be classified into full virtualization, paravirtualization and operating system-level virtualization.

In full virtualization, the virtual machine simulates enough hardware to allow an unmodified guest operating system to run in isolation. This type of virtualization requires the most hardware resources for each guest operating system, but has the benefit of requiring no changes to the guest operating system. Examples of full virtualization include VMware Workstation and VirtualBox.

In paravirtualization, the virtual machine does not directly simulate the hardware, but instead offers a special API that can only be used by a modified guest OS. The virtualized environment is less expensive in terms of consumed hardware resources, but requires changes to the guest operating system kernel. Examples of paravirtualization include Xen.

In operating system-level virtualization, the guest OS is virtualized within the host operating system, enabling multiple isolated and secure virtualized guests to run on a single host OS. The guest environments share the same kernel with the host system. Processes running in a given guest environment view it as a stand-alone system. At the same time, the host operating system has direct access to all processes running in any guest. Examples of operating system-level virtualization include OpenVZ and LXC.

The last type of virtualization is the cheapest of the three, as the guest is roughly a group of processes with network and storage attached to them. For simulating large numbers of peers, OS-level virtualization is the preferred solution.


5.2.2

Network Emulation

The network emulation used in virtualized environments gives the guest operating system access to the network connection of the host. The network access can generally be configured in one of two ways.

The first solution is to offer the guest OS direct access to the network connection of the host. The guest and the host machines appear to be part of the same Local Area Network from the point of view of other hosts connected to that network. This solution is generally implemented by creating a virtual bridge within the host OS and connecting to that bridge both the network interface of the host OS and the network interface of the guest OS.

The second solution is to route the traffic of the guest OS through the host OS. This solution uses a similar bridge in the host OS, but the interface connected to it is not the external network interface of the host but a second, virtual interface. Packets are switched from the guest OS network interface to the host OS virtual network interface, then routed to the host OS physical interface.

5.2.3

Scalability of Virtualization Solutions

The scalability of a virtualization solution (how many virtual instances can be executed on the same host OS without a major performance penalty) depends on the type of virtualization. The solution that requires the least hardware resources (operating system-level virtualization) has the highest potential for scalability. Previous studies [Bardac et al., 2010] have shown that up to 100 LXC virtual containers can be executed on an average hardware configuration. Regarding network usage, if all the guest operating systems access the Internet simultaneously, then the throughput available to each of them is greatly diminished.
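As a rough illustration of the scaling argument above, the following sketch estimates how many physical hosts a swarm needs and how much throughput each guest receives when all guests share the uplink of their host. The figure of 100 containers per host is the one cited in the text; the swarm size of 350, the 1 Gbit/s uplink and the equal-share model are assumptions chosen only for the example.

```python
import math


def hosts_needed(swarm_size: int, containers_per_host: int) -> int:
    """Number of physical hosts required to run swarm_size containers."""
    if swarm_size < 0 or containers_per_host <= 0:
        raise ValueError("swarm_size must be >= 0 and containers_per_host > 0")
    return math.ceil(swarm_size / containers_per_host)


def per_guest_throughput_mbit(uplink_mbit: float, active_guests: int) -> float:
    """Equal-share estimate of the throughput available to each guest.

    Assumes that all guests transmit at the same time and share the uplink
    fairly, which simplifies real TCP behavior.
    """
    if active_guests <= 0:
        raise ValueError("active_guests must be positive")
    return uplink_mbit / active_guests


if __name__ == "__main__":
    print(hosts_needed(350, 100))                # hosts for a 350-node swarm
    print(per_guest_throughput_mbit(1000, 100))  # Mbit/s per guest
```

Under these assumptions, each guest on a fully loaded host sees only a small fraction of the physical link, which is the effect the paragraph above warns about.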

5.3

Deploying Realistic Scenarios

Reproducing real-life environments is preferred over simulation when the scenarios are easy to design. This section presents a novel infrastructure for managing all activities connected to reproducing a realistic scenario, from the management of the clients (start and stop) to the analysis and plotting of the obtained data.

5.3.1

Design Goals

The use of network simulators for creating controlled environments has been an easy solution for achieving BitTorrent measurements. However, real BitTorrent clients behave differently from simulators and the network protocol stack has an important influence on the outcome of a scenario. Considering the decreasing cost of hardware and the improvements in virtualization solutions, running network emulations with hundreds of nodes, each having a complete instance of an operating system¹, is an achievable objective. Rao et al. [Rao et al., 2010] concluded that results gathered from BitTorrent experiments performed on clusters are realistic and reproducible.

The proposed infrastructure for controlling peer-to-peer clients ([Milescu et al., 2011]) aims at providing an extensible and adaptable tool for experiment setup, execution and analysis. It has four primary goals, allowing it to be used in a large variety of scenarios.

The first goal is to provide an extensive tool for managing both clients and log files during experiments. Running scenarios that include a large number of clients (up to a few hundred) requires a control mechanism for starting, monitoring and stopping clients in a short time frame. Most of the scenarios result in a collection of log files, with at least one log file per client or per machine. Given the large number of remote machines, collecting and analyzing these log files has to be automated.

The second goal is to use a common interface for accessing remote systems. The nodes on which clients run may consist of various Linux or Unix distributions and, most likely, the machines are not administered by the user running the scenarios. The nodes can also be real or virtual machines. A common access interface to this heterogeneous node infrastructure is needed, and the interface must not require administrative privileges for accessing the remote nodes.

The third goal is to offer support for bandwidth control. Cluster computers are generally connected with 1 Gbit/s or faster network connections. These types of connections are not common for end users. In order to make the experiments realistic, the infrastructure needs to offer a mechanism for controlling the amount of bandwidth each client can use. Having the bandwidth control integrated in the infrastructure offers the advantages of fine-tuning the scenarios and recreating a wide range of network environments.

The last goal is to allow the user to introduce churn in the environment. Starting and interconnecting P2P clients is only the first step towards reproducing a real-life scenario. Two of the elements that characterize real swarms are churn and population turnover. Both translate into clients joining and leaving the network at different time intervals. Controlling the periods when each client is connected to the network gives the user the freedom to create a variety of scenarios, from a controlled flash crowd to a swarm close to extinction.

¹ From the point of view of the network measurements, the virtualization containers (like OpenVZ) can be considered a complete instance of an operating system.


5.3.2

Related Work

Current research on Peer-to-Peer systems and protocols relies on carefully crafted experiments and network simulators. A survey of the use of Peer-to-Peer simulations was undertaken by Naicken et al. [Naicken et al., 2007]. The authors surveyed papers and collected information regarding the use of simulators for Peer-to-Peer systems. Five criteria were used to evaluate the simulators: simulation architecture, usability, scalability, statistics and underlying network simulation. A large number of custom simulators were found to have been deployed, the main cause being assumed to be the lack of proper statistics output. The authors criticize the use of NS-2 as a simulator for Peer-to-Peer systems and call for discussion to help build a consensus on a common platform for Peer-to-Peer research.

One of the best places for deploying network experiments, also heavily used by Peer-to-Peer researchers, is PlanetLab [PlanetLab, 2011]. With more than 1000 nodes and 500 sites spread all over the world and good documentation, PlanetLab offers a suitable environment for Peer-to-Peer experiments. As user nodes are virtualized through the use of Linux-VServer, experimenters have complete control over their system and its resources. The user may deploy a given set of tests or use PlanetLab as an underlying layer for a testing infrastructure (such as the one presented in this chapter) and be able to deploy a realistic environment for various scenarios.

NS-2 [ns 2, 2011] is one of the most popular network simulators. Thorough documentation, continuous development over the past two decades and a rich set of features have established NS-2 as a prime candidate for network experiments. However, as Naicken et al. [Naicken et al., 2007] conclude, NS-2 is particularly useful for detailed modeling of the lower network layers, a characteristic that is of little interest to Peer-to-Peer researchers, though it has often been used in Peer-to-Peer experiments.
We consider PlanetLab [PlanetLab, 2011] and NS-2 [ns 2, 2011] to be located at opposite poles with respect to the purpose of Peer-to-Peer experiments. PlanetLab and virtualized environments allow the deployment of realistic scenarios and the collection of valuable realistic information, but lack scalability. On the other hand, NS-2 and network/P2P simulators allow the simulation of large numbers of nodes (even millions), while failing to provide accurate data about client behavior and detailed statistics. We consider that, given the nature of the BitTorrent protocol as a solution for content distribution, realistic (or even real) environments are appropriate for experiments regarding BitTorrent swarms.

Dinh et al. [Dinh et al., 2008] used a custom network simulator (dSim) for large-scale distributed simulations of P2P systems. The authors were able to simulate approximately 2 million nodes for Chord and 1 million nodes for Pastry. Similar work has been presented by Sioutas et al. [Sioutas et al., 2009]. Video streaming in Peer-to-Peer networks has been simulated, as described by Bracciale et al. [Bracciale et al., 2007], using a custom simulator dubbed OPSS.

With respect to BitTorrent simulators and closer to the purpose of this chapter, Pouwelse et al. [Pouwelse et al., 2005] undertook a large BitTorrent measurement study spanning several months on real BitTorrent swarms (provided by the Suprnova tracker). Data was collected through HTML and BitTorrent, (ab)using scripts, from the central tracker and BitTorrent clients. A similar approach was employed by Iosup et al. [Iosup et al., 2005]. The authors designed and implemented MultiProbe, a framework for large-scale P2P file sharing measurements on the BitTorrent protocol. MultiProbe was deployed in real swarms/environments, collected status information from BitTorrent peers and subjected it to analysis and dissemination.

Our testing infrastructure is deployed on a hardware experimental setup (similar to a local PlanetLab) presented in an earlier paper [?]. Instrumented BitTorrent clients, logging facilities and the OpenVZ lightweight virtualization solution are the basic blocks on top of which the software testing infrastructure was developed and used.

5.3.3

Infrastructure Overview

From a design point of view, the infrastructure uses four concepts:

• A campaign consists of a series of experiments, each experiment being independent of the others and having a specific type of data processing associated with it. The difference between a campaign and an experiment is that results from an experiment may be plotted on a single graph, while results from a campaign need a deeper analysis. Multiple experiments may be included in a campaign. If an experiment needs to be run multiple times (to obtain significant results), it can be included multiple times in the same campaign.

• A scenario corresponds to a single experiment. It is associated with a specific type of data processing and its results are generally presented on a single graph.

• A node is one of the infrastructure machines. It can be a virtual or a physical machine. The user running the experiments needs to have access to the nodes both for experiment deployment and execution and for bandwidth control.

• A client is a single instance of a peer. The infrastructure is designed to run a single client on each node, in order to reproduce the real-life execution context for P2P clients.

Campaigns and scenarios each use configuration files that include a complete specification of the experiments. The campaign configuration file specifies the scenarios included in the campaign. The scenario configuration file includes all nodes that are part of the infrastructure used to execute the experiment; for each node, the configuration file defines the access parameters, the client type, the churn and the bandwidth limitations. Annex A presents an example of a campaign configuration file and Annex B presents an example of a scenario configuration file.
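The relationship between campaigns, scenarios, nodes and clients can be sketched as a small data model. This is an illustration only: every field name, host name and file name below is an assumption made for the sketch, and the actual file formats are the ones given in Annex A and Annex B.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class NodeSpec:
    """One node of a scenario; exactly one client runs on it."""
    host: str          # access parameter (hypothetical field name)
    user: str          # access parameter (hypothetical field name)
    client_type: str   # which P2P client is started on the node
    # Churn: (start, stop) intervals, in seconds, when the client is online.
    online_intervals: List[Tuple[int, int]] = field(default_factory=list)
    download_limit_kbps: Optional[int] = None  # None means unlimited
    upload_limit_kbps: Optional[int] = None


@dataclass
class ScenarioSpec:
    """A single experiment with its own data processing script."""
    name: str
    processing_script: str
    nodes: List[NodeSpec] = field(default_factory=list)


@dataclass
class CampaignSpec:
    """An ordered series of scenarios; a scenario may be listed repeatedly."""
    name: str
    scenarios: List[ScenarioSpec] = field(default_factory=list)

    def client_count(self) -> int:
        """Total number of clients started over the whole campaign."""
        return sum(len(scenario.nodes) for scenario in self.scenarios)


# Placeholder values, not taken from the thesis annexes.
seeder = NodeSpec("node1.example.org", "p2p", "hrktorrent", [(0, 600)])
leecher = NodeSpec("node2.example.org", "p2p", "transmission",
                   [(0, 120), (300, 600)], download_limit_kbps=512)
scenario = ScenarioSpec("one-seeder", "plot_speed.R", [seeder, leecher])
# Listing the same scenario twice repeats the experiment.
campaign = CampaignSpec("demo", [scenario, scenario])
```

Keeping churn and bandwidth limits on the node record mirrors the description above, where each node entry carries its own access parameters, client type, churn and limits.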


5.3.4

Infrastructure Design

The architecture of the proposed infrastructure is modular, allowing the reuse of many of its components. The central element of the architecture is the execution of a scenario and the retrieval of the scenario's outcome.

Architecture Overview

The local machine is used to control the infrastructure. It stores the infrastructure scripts, configuration files and campaign output. It may also store code or executable files for P2P clients. The infrastructure scripts copy the required files from the local machine to the remote nodes, set up the environments and start the clients. After the experiment ends, the results (log files) are brought back from the remote nodes to the local machine.

The testing infrastructure uses a modular architecture. Some of the modules are generic (such as the module that parses the configuration files); other modules are node- or client-specific (for example, the module that parses the log files obtained from a client). From a different point of view, some of the modules are executed on the local machine, others on the remote host. The infrastructure architecture is depicted in Figure 5.1.

The run_campaign component reads the campaign configuration file and executes each of the specified scenarios. After a scenario is executed, its results are processed and the next scenario is run. At the end of the campaign, the campaign results may be published as a web page for preliminary analysis. run_scenario, the central point of the infrastructure, is responsible for managing all activities related to the execution of an experiment. Its specific components are detailed in the following section.
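The control flow of run_campaign described above can be sketched as follows. The sketch is written in Python for brevity, whereas the infrastructure itself consists of Bash scripts; the callables passed in as run_scenario and process_results are stand-ins for the real components, so nothing here should be read as the actual implementation.

```python
from typing import Callable, Dict, List


def run_campaign(
    scenarios: List[str],
    run_scenario: Callable[[str], str],
    process_results: Callable[[str], dict],
) -> Dict[int, dict]:
    """Run scenarios sequentially, processing each result before the next.

    scenarios: scenario names in campaign order (repeats are allowed).
    run_scenario: executes one scenario and returns the path of its results.
    process_results: turns a results path into a summary.
    Returns the summaries keyed by position, since names may repeat.
    """
    summaries: Dict[int, dict] = {}
    for index, name in enumerate(scenarios):
        results_path = run_scenario(name)
        summaries[index] = process_results(results_path)
    return summaries


# Stand-in components used only to exercise the loop.
calls: List[str] = []


def fake_run(name: str) -> str:
    calls.append("run:" + name)
    return "Results/" + name


def fake_process(path: str) -> dict:
    calls.append("process:" + path)
    return {"path": path}


summary = run_campaign(["s1", "s2", "s1"], fake_run, fake_process)
```

Passing the two stages in as parameters keeps the loop independent of any particular client or processing script, which matches the split between generic and client-specific modules.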

Architecture Details

Figure 5.1 presents the architecture overview. The central point of the infrastructure is run_scenario, the component responsible for executing a scenario. This section details its components and explains the mechanisms it uses to deploy and execute scenarios.

After the scenario configuration file is parsed, each of the nodes is prepared for the experiment by scenario_setup. This component is detailed in Figure 5.2. The first step is to synchronize the local infrastructure scripts with the remote host. The synchronization phase cleans up the remote host and ensures that consecutive scenarios do not influence each other.

A local node-specific configuration file, including parameters related to that node, is created for each of the nodes specified in the scenario configuration file. The node-specific configuration file is then copied to the remote host. This file is used for communication between the locally executed and the remotely executed components.


Figure 5.1: Infrastructure design overview. Component names are written with _ between words. Actions that are not directly included in a component are placed between [ ].

The pre-run component prepares remote hosts for the experiment. This component parses the node-specific configuration file and applies the settings required for the scenario. Currently the pre-run component also handles bandwidth limitations.

Figure 5.2: Detailed scenario_setup components. Component names are written with _ between words. Actions that are not directly included in a component are placed between [ ].

The schedule_client component schedules client executions on the remote host. Based on the node-specific configuration files stored on the remote host, schedule_client starts and stops the client to simulate the specified churn. The client lives until the scenario_wait component detects completion of the experiment, after which it may be stopped. The client is not stopped immediately, as the infrastructure waits for all the clients to complete the experiment before stopping them.

After all clients complete the experiment, each node is cleaned up by the scenario_clean component, as presented in Figure 5.3. This component stops the client and retrieves the remote log files. A post-run component is then executed, reverting all settings applied by pre-run to ensure that consecutive scenarios do not influence each other. Currently post-run also disables bandwidth limitations. In the end, the remote node-specific configuration file is deleted and the local infrastructure scripts are synchronized to the remote host to remove any temporary files.

Figure 5.3: Detailed scenario_clean components. Component names are written with _ between words. Actions that are not directly included in a component are placed between [ ].

Information from clients is stored in log files. The last stage of the scenario execution, scenario_parse, translates the client-specific log format into a unified format used by the processing stage. Log files are used to analyze the evolution of various client parameters during each scenario by storing periodic status information, such as download speed, number of connections or ratio. Specialized log files could also be created, if the clients are instrumented, to gather detailed periodic information (for example, instant per-peer download and upload speeds).

Specially designed R scripts are invoked in the post-processing phase. Using the information stored in the unified log file format as input, the R scripts output graphical representations of the evolution of client parameters such as download speed.
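As an illustration of the translation step performed by scenario_parse, the sketch below converts lines of a made-up client log into unified records and renders them as a table. Both the input line format and the field names are hypothetical; real clients use their own formats, which is why this component is client-specific.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class StatusRecord(NamedTuple):
    """Unified periodic status sample (hypothetical field set)."""
    timestamp: int        # seconds since the start of the experiment
    download_kbps: float
    connections: int


# Hypothetical client log line: "[12] dl=340.5KB/s peers=7"
_LINE = re.compile(r"^\[(\d+)\] dl=([\d.]+)KB/s peers=(\d+)$")


def parse_client_log(lines: Iterable[str]) -> Iterator[StatusRecord]:
    """Yield one unified record per recognized line, skipping the rest."""
    for line in lines:
        match = _LINE.match(line.strip())
        if match is None:
            continue  # not a periodic status line
        yield StatusRecord(
            timestamp=int(match.group(1)),
            download_kbps=float(match.group(2)),
            connections=int(match.group(3)),
        )


def to_table(records: Iterable[StatusRecord]) -> str:
    """Render records as a whitespace-separated table with a header row."""
    rows = ["timestamp download_kbps connections"]
    rows += [f"{r.timestamp} {r.download_kbps} {r.connections}" for r in records]
    return "\n".join(rows)


sample = ["client started", "[0] dl=0.0KB/s peers=0", "[5] dl=340.5KB/s peers=7"]
table = to_table(parse_client_log(sample))
```

A whitespace-separated table with a header row is one format that R can read directly with read.table(header = TRUE); whether the infrastructure uses exactly this layout is not stated in the text.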


5.3.5

Infrastructure Implementation

The proposed infrastructure was implemented in the Bash scripting language, making it easy to run on any Linux operating system.

Node and Client Specific Components

Some of the components presented in Section 5.3.4 are node- or client-specific and are detailed in this section. The first specific component is pre-run. Currently its main task is to configure the bandwidth limitations on the remote host. Three solutions have been tested for the current implementation (they are detailed in Section 5.3.7):

• controlling bandwidth at the operating system level

• controlling bandwidth at the process level

• controlling bandwidth in the P2P client

pre-run is both client- and node-specific. Some clients do not offer bandwidth control, while the bandwidth used by some virtual machines cannot be limited at the operating system level.

The script used to start P2P clients is client-specific. This script also prepares the running environment prior to starting the client. An infrastructure requirement is that BitTorrent clients provide a command-line interface (CLI) that runs on top of a Linux system.

After a client starts, the scenario_wait component monitors it to detect the completion of the experiment. The detection phase depends both on the goal of the experiment and on the type of client used. Given the generic architecture, the infrastructure may be used for multiple types of experiments, targeting download performance, epidemic protocol measurements, user behavioral patterns, etc. The requirement is for the infrastructure to detect the completion of the experiment based on the messages the client logs while it runs. As each client has a different log format and specific experiments require special log messages, scenario_wait is adapted to user needs.

After all clients have completed the experiment, the scenario_clean component stops them and cleans up the remote host. The script used to stop the client is paired with the script used to start it, and is client-specific. The post-run component is used for the clean-up phase. As the counterpart of pre-run, it has to revert the applied settings before the experiment ends; currently, this stage includes deleting the bandwidth limitations. post-run is node-specific.

The last client-specific component is scenario_parse. Each client uses a particular log format that should be transparent to the results processing stage. As mentioned before, a translation is required from the client-specific log format to a unified format used by the processing stage.
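The completion detection performed by scenario_wait, described above, can be sketched as a scan of the client log for a marker message. The marker strings and the polling approach below are assumptions made for illustration; each client and each experiment type would need its own markers.

```python
import time
from typing import Callable, Iterable, List, Sequence


def experiment_completed(log_lines: Iterable[str], markers: Sequence[str]) -> bool:
    """Return True if any log line contains one of the completion markers."""
    return any(marker in line for line in log_lines for marker in markers)


def wait_for_completion(
    read_log: Callable[[], Iterable[str]],
    markers: Sequence[str],
    poll_seconds: float = 1.0,
    timeout_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the log until a marker shows up or the timeout expires."""
    waited = 0.0
    while waited <= timeout_seconds:
        if experiment_completed(read_log(), markers):
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False


# Simulated log that grows by one line on every poll.
_pending: List[str] = ["connecting", "progress 50%", "download complete"]
_seen: List[str] = []


def growing_log() -> List[str]:
    if _pending:
        _seen.append(_pending.pop(0))
    return list(_seen)


# The sleep function is replaced so the example finishes immediately.
done = wait_for_completion(growing_log, ["download complete"],
                           sleep=lambda _seconds: None)
```

Keeping the markers as a parameter reflects the point made above: the detection logic stays the same while the messages it looks for change with the client and the goal of the experiment.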

Employed Technologies

The current implementation of the testing infrastructure is based on shell (Bash) scripts. Supported on any Linux operating system and requiring no additional software, shell scripts provide an ideal environment for easy deployment and use. Shell scripting also allows easy integration with existing BitTorrent clients (as long as the clients provide a command-line interface) and access to a flexible set of tools for parsing client output logs and automating tasks.

Within the testing infrastructure, the SSH protocol is used as a common interface for accessing remote systems. SSH is widely used as a remote-access method, being one of the most popular services available on Linux machines. Besides remote command execution, the SSH protocol allows easy file transfer via SCP.

Although file transfer is available via SCP, the rsync protocol offers better performance, as it synchronizes folders between different hosts and transfers only the information that was updated. In the testing infrastructure, rsync is used to synchronize scripts between the local machine and the remote systems.

Currently, statistical analysis in the testing infrastructure is achieved through the R language. A powerful tool for processing large amounts of data, R can also do graphical post-processing. One of the most important features of R is that it can be scripted, which enables the automation of the processing stage.
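The two remote-access operations mentioned above, synchronizing scripts and running a remote command, map onto standard rsync and ssh invocations. The sketch below only builds the argument lists; the host name, user, paths and the start_client.sh script name are placeholders, and the option choice (-a, -z, --delete) is an assumption about what a synchronization step that also cleans the remote folder might use.

```python
import shlex
from typing import List


def rsync_command(local_dir: str, user: str, host: str, remote_dir: str) -> List[str]:
    """Mirror local_dir onto the remote host over SSH.

    -a preserves file attributes, -z compresses data in transit and
    --delete removes remote files that no longer exist locally.
    """
    source = local_dir.rstrip("/") + "/"  # trailing slash: copy the contents
    target = f"{user}@{host}:{remote_dir}"
    return ["rsync", "-az", "--delete", "-e", "ssh", source, target]


def ssh_command(user: str, host: str, remote_argv: List[str]) -> List[str]:
    """Run a command on the remote host, quoting each argument safely."""
    remote = " ".join(shlex.quote(arg) for arg in remote_argv)
    return ["ssh", f"{user}@{host}", remote]


# Placeholder host, user and script name (not taken from the thesis).
sync = rsync_command("ControlScripts", "p2p", "node1.example.org", "ControlScripts/")
start = ssh_command("p2p", "node1.example.org",
                    ["bash", "ControlScripts/start_client.sh", "my file.torrent"])
```

Either list could be handed to subprocess.run(..., check=True). Quoting the remote arguments matters because ssh joins them into a single string that the remote shell parses again.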

Configuration Files

A campaign is a series of scenarios that are to be run sequentially. Each scenario is independent of the others. The campaign configuration file is provided by the user and consists of the names of the scenario configuration files and of the R scripts for processing results. Each scenario file, also provided by the user, describes the peers that will be part of the swarm scenario and their characteristics (client type, bandwidth limitations, churn). All configuration files are stored in the TestSpecs/ folder, as described in Section 5.3.5.

Implementation Organization

As mentioned in Section 5.3.5, the testing infrastructure is based on shell (Bash) scripts. For better organization, a folder hierarchy was created to separate the different components. The top-level hierarchy is:

• ClientWorkingFolders represents the running folder for clients. On the local machine, this folder stores the BitTorrent metafiles (.torrent) and the seeded files. On the remote host, this folder also stores downloaded files and the client logs.

• ControlScripts stores the actual implementation of the testing infrastructure, as shell scripts. Some scripts are run on the command station, while others are executed on remote systems.

• TestSpecs stores campaign and scenario configuration files, as well as the R scripts used to process the experiment results. Configuration files are parsed on the command station for configuring a given experiment.

• Results stores output information gathered on the local system from remote hosts. It contains all client log files, as well as the results of the data processing performed after the experiment completes.

• Utils stores various helper scripts that are not directly invoked by the testing infrastructure. These scripts are used during the infrastructure development and are useful for running experiments. The scripts present in this folder may also be used for testing various parts of the infrastructure.

The Results folder on the control station stores information about all the executed scenarios. In order to differentiate between two executions of the same scenario, the results are stored in separate folders named after the pattern CampaignFileName-YYYY.MM.DD-HH.MM.SS. This ID is unique as long as two scenarios with the same name are not executed at the same time on the same machine (a plausible assumption given the intended use cases for the proposed infrastructure).

With the exception of the scripts used to run a campaign or a scenario or for post-processing, all other scripts are run on the remote systems. The scripts running a campaign or a scenario parse the configuration files on the command station and use SSH to command the scripts on the remote systems. The remote system scripts prepare the node for the experiment and manage the P2P clients (start, monitor, stop).

The scenario_wait infrastructure component causes the command station to wait for all remote clients to complete the experiment. A remote client completes either by reaching a state defined by the scenario (for example, completely downloading the requested file) or when the churn configuration implies a stop action (see Section 5.3.6). Subsequently, the log files from remote clients are retrieved to the command station and parsed. The parsing process results in a unified generic format (consisting of table- and matrix-structured files) that is used as input for statistical analysis.
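The naming pattern for result folders can be reproduced with a standard timestamp format string. The helper name and the example file name below are assumptions; only the pattern CampaignFileName-YYYY.MM.DD-HH.MM.SS comes from the text.

```python
from datetime import datetime
from pathlib import PurePosixPath


def results_folder(campaign_file: str, started: datetime) -> str:
    """Build 'CampaignFileName-YYYY.MM.DD-HH.MM.SS' for one execution."""
    name = PurePosixPath(campaign_file).name  # drop any leading folders
    return f"{name}-{started.strftime('%Y.%m.%d-%H.%M.%S')}"


# Hypothetical campaign file and start time.
folder = results_folder("TestSpecs/demo_campaign", datetime(2011, 9, 15, 8, 5, 3))
```

Because the timestamp has a resolution of one second, two executions with the same name started within the same second would collide, which is the uniqueness caveat noted in the text.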

5.3.6

Introducing Churn in P2P Scenarios

One of the main goals of the proposed infrastructure is to allow the user to introduce churn in the environment by controlling the periods when each client is connected to the network. An array of intervals included in the scenario configuration file specifies the on-off behavior of each client. The schedule_client control script uses the UNIX signals SIGSTOP and SIGCONT to suspend and resume the client processes at the specified moments in time.
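The on-off behavior can be sketched as a translation from online intervals to a timed sequence of suspend and resume signals. The interval representation ((start, stop) pairs in seconds, sorted and non-overlapping) is an assumption made for this sketch, not the format used in the scenario configuration file.

```python
import signal
from typing import List, Tuple

Interval = Tuple[int, int]          # (online_from, online_until), in seconds
Event = Tuple[int, signal.Signals]  # (time offset, signal to send)


def churn_events(online: List[Interval], client_start: int = 0) -> List[Event]:
    """Convert online intervals into suspend/resume events.

    The client is assumed to be started (and running) at client_start. It is
    suspended whenever it should be offline and resumed at the beginning of
    each online interval.
    """
    events: List[Event] = []
    running = True
    previous_stop = None
    for start, stop in sorted(online):
        if start >= stop:
            raise ValueError(f"empty interval: {(start, stop)}")
        if previous_stop is not None and start <= previous_stop:
            raise ValueError("intervals must not overlap or touch")
        if previous_stop is None and start > client_start:
            # Offline before the first interval: suspend right away.
            events.append((client_start, signal.SIGSTOP))
            running = False
        if not running:
            events.append((start, signal.SIGCONT))
        events.append((stop, signal.SIGSTOP))
        running = False
        previous_stop = stop
    return events


plan = churn_events([(0, 120), (300, 600)])
```

An executor would sleep until each offset and then call os.kill(pid, sig) on the client process; that part is left out here so the schedule can be inspected without touching any process.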


5.3.7

Bandwidth Limitation in P2P Scenarios

As mentioned before, three solutions for bandwidth limitation have been analyzed for the current implementation.

The basic solution is to use an operating system-based tool that works closest to the kernel. On Linux, our choice was the popular tc [Routing and HOWTO, 2011] (traffic control) tool, which offers a variety of limitation algorithms. tc is used in the current infrastructure for scenarios employing physical systems. Due to particularities of the OpenVZ implementation, tc cannot currently be used as a bandwidth limiter between containers. We are currently looking for a solution to this issue and also at alternatives such as LXC [LXC, 2011].

In order to bypass the issue of tc not running in OpenVZ containers, we have resorted to client-level limitation (also known as a rate limiter) in current experiments. The hrktorrent and transmission clients have built-in limitation functionality. This brings the benefits of bandwidth limitation to clients running in OpenVZ containers. The approach does have its downsides, as it is less flexible and is process-centric: one cannot limit the total amount of traffic sent by a client (e.g. a combination of P2P and HTTP traffic).

Some clients, such as the P2P-Next project's [P2P-Next, 2011] NextShare, have no built-in rate limiter. Rate limiting may still be enabled through the use of the trickle [trickle, 2011] tool. trickle uses a form of library interposition to hook network-related API calls and limit per-process traffic. It has two drawbacks: it is not actively maintained, and issues arise when using the poll library call; on Linux, epoll support is absent.
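For reference, the two external limiters discussed above are typically driven by command lines such as the ones built below. The interface name, the rates and the token bucket parameters (burst, latency) are placeholder values chosen for illustration. The text does not state which tc queuing discipline the infrastructure uses, so the token bucket filter (tbf) is shown only as one common option.

```python
from typing import List


def tc_limit_command(interface: str, rate_kbit: int) -> List[str]:
    """Limit egress traffic on an interface with a token bucket filter."""
    if rate_kbit <= 0:
        raise ValueError("rate must be positive")
    return ["tc", "qdisc", "add", "dev", interface, "root", "tbf",
            "rate", f"{rate_kbit}kbit", "burst", "32kbit", "latency", "400ms"]


def tc_clear_command(interface: str) -> List[str]:
    """Remove the root queuing discipline, undoing tc_limit_command."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


def trickle_command(client_argv: List[str], down_kbytes: int, up_kbytes: int) -> List[str]:
    """Wrap a client so that trickle shapes its traffic (rates in KB/s).

    -s runs trickle in standalone mode, without the trickled daemon.
    """
    return ["trickle", "-s", "-d", str(down_kbytes), "-u", str(up_kbytes)] + client_argv


# Placeholder interface name and client command line.
limit = tc_limit_command("eth0", 1024)
clear = tc_clear_command("eth0")
wrapped = trickle_command(["client", "--seed", "file.torrent"], 100, 50)
```

The limit and clear commands form a pair in the same way pre-run and post-run do: whatever the first applies, the second removes. Note that tc requires administrative privileges, while trickle runs as an ordinary user.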

5.3.8

Simulating Connection Dropouts

This section analyses the possibility of simulating realistic network dropout behavior in the testing infrastructure. Three possible solutions are presented ([Deaconescu et al., 2011]): terminating client processes, suspending them and disabling the network interface. A series of experiments is run to compare the solutions.

Network Dropouts

One of the main difficulties in simulating network environments is reproducing network unreliability. Given the heterogeneous nature of both end-user computers and ISP equipment and policies, a real-life deployment of Peer-to-Peer systems encounters multiple types of underlying network issues: dropped packets, connection delays and connection dropouts.

Each of the above-mentioned network behaviors may have an influence on application behavior. While connection delays and dropped packets are in most cases handled by TCP functionality, connection dropouts have a direct influence at the application level (clients joining and leaving are a fundamental process of a Peer-to-Peer system [?]). Considering a simple example of a swarm composed of an initial seeder and six leechers, if the seeder periodically leaves and rejoins the swarm, the nodes will require a longer period of time to complete the data transfer.

BitTorrent clients create and maintain reputation traces for the peers with whom data was exchanged. If a peer behaves unreliably, it will not be preferred for opening new connection slots and thus it will experience a decreased level of performance from the rest of the swarm.

Connection dropouts also contribute to peer population churn [Stutzbach and Rejaie, 2006]. As nodes are disconnected from the swarm, the rest of the peer population may see improved or diminished performance, depending on the swarm state. Previous studies have estimated and analyzed the impact of churn on the behavior of Peer-to-Peer protocols [Binzenhöfer and Leibnitz, 2007]. However, in most cases the analysis was based on simulations of the protocols ([Luo et al., 2010], [Katsaros et al., 2009], [Ou et al., 2010]) rather than on real implementations.

Dropout Simulation

From the point of view of the swarm, a client connection dropout is equivalent to the client abruptly leaving the system. In this case neither the BitTorrent connections nor the TCP links are closed gracefully, and swarm peers will experience multiple timeouts before declaring the connections closed. Such behavior can be reproduced using three solutions: stopping and restarting the clients, suspending them, and disabling the network interface.

The clients can be instantly stopped using the POSIX SIGKILL signal. When a process receives this signal, it is immediately terminated and all its open connections are closed by the operating system. In order for the client to be resumed, it has to be restarted.

When a process is suspended, it is placed in a temporary inactive state. The operating system does not close its connections and open files. A process can easily be suspended by sending it the POSIX SIGSTOP signal. To resume the process (and place it in an active state), the POSIX SIGCONT signal has to be sent to it.

These two solutions for separating the client from the swarm differ from the point of view of TCP connection management: the stopped clients have all their connections closed, while the suspended clients can resume their connections if a timeout has not been reached by the moment they are resumed.

The client connection dropout can also be induced by disabling the network interface. Using this approach, the client continues to run without any intervention from the operating system, while its TCP connections will be closed. Disabling the network interface closely reproduces network dropouts occurring in the Internet.
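The two signal based solutions can be sketched as follows. This is a minimal illustration for a POSIX system, not the scripted framework used in the experiments; the function names and the `method` strings are hypothetical, and `cmd` stands for whatever command line starts the BitTorrent client. The third solution (disabling the network interface) requires administrative privileges and is not shown.

```python
import signal
import subprocess
import sys
import time


def start_client(cmd):
    """Start the client process described by the argument list cmd."""
    return subprocess.Popen(cmd)


def simulate_dropout(cmd, proc, pause_s, method):
    """Take the client offline for pause_s seconds; return the live process."""
    if method == "suspend":
        proc.send_signal(signal.SIGSTOP)  # freeze; sockets and files stay open
        time.sleep(pause_s)
        proc.send_signal(signal.SIGCONT)  # resume the very same process
        return proc
    if method == "stop":
        proc.kill()                       # sends SIGKILL on POSIX systems
        proc.wait()                       # reap the terminated process
        time.sleep(pause_s)
        return start_client(cmd)          # a stopped client must be restarted
    raise ValueError("unknown dropout method: %r" % (method,))
```

Note that the "suspend" branch returns the original process, whose connections may still be usable, while the "stop" branch returns a new process that has to rejoin the swarm from scratch.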

Experimental Setup

Practical demonstration and evaluation of the connection dropout features have employed a virtualized infrastructure and a scripted framework running on top of it. The infrastructure allowed us to deploy up to 80 peers, each running in its own virtual machine. All peers have been grouped in pairs to evaluate the proposed solutions for simulating connection dropouts. The thin scripted framework was also responsible for collecting output from all peers (in the form of log files) and for parsing it.

The infrastructure is built on top of several commodity hardware systems in the NCIT cluster of the University POLITEHNICA of Bucharest. In order to deploy a large number of peers, the thin virtualization layer employed OpenVZ [OpenVZ, 2011]. OpenVZ is a lightweight solution that allows rapid creation of virtual machines (also called containers). As an operating system-level virtualization solution, OpenVZ enables the creation of a large number of virtual machines (30 virtual machines are sustainable on a system with 2 GB of RAM). In our infrastructure each container runs a single BitTorrent client instance. All hardware systems used were identical with respect to hardware and software components: 2 GB RAM, 3 GHz dual-core CPU, 300 GB HDD, 1 Gbit NIC, running Debian GNU/Linux 5.0 Lenny.

The deployed experiments used a single OpenVZ container for each peer taking part in a swarm. A virtualized network has been built, allowing direct link layer access between systems: all systems are part of the same network, which allows easy configuration and interaction. A separate hardware system, called the Commander, was used to start and handle scenarios. The Commander uses SSH for communicating with the virtualized environment. Connections between the Commander and the containers are handled through secondary specialized venet interfaces. Having the Commander connections separate from the virtualized network created an easy testing ground for disabling and enabling interfaces: the interfaces employed for dropping connections were those that are part of the virtualized network.

The experiments made use of an updated version of hrktorrent [hrktorrent, 2011], a lightweight application built on top of libtorrent-rasterbar [libtorrent (Rasterbar), 2011]. Previous experiments [Deaconescu et al., 2009] have shown libtorrent-rasterbar outperforming other BitTorrent implementations, which led to its usage in the current experiments. hrktorrent has been updated to make use of the bandwidth limitation facilities provided by libtorrent-rasterbar. The deployed scenarios have enforced a 100 KB/s download limit.

To evaluate the impact of the dropout, clients have been grouped into pairs (one seeder and one leecher), the leecher's downloading capacity being temporarily disabled. Together with the seeder and the leecher, a tracker is started in the same container as the seeder to allow BitTorrent communication between the two nodes. The use of a two-peer swarm reduces the impact of non-deterministic BitTorrent communication inside the swarm. At the same time, this setup emphasizes the activity of the target leecher by clearly targeting one BitTorrent connection. Placing the tested leecher in a larger swarm would make the results less accurate.


The duration of the download has to be large enough to allow the experiments to avoid the BitTorrent-specific start-up behavior. Given the leecher's bandwidth limitation of 100 KB/s, a 100 MB file was chosen to be shared by the seeder. This gives a theoretical duration of about 1000 seconds for each complete download. However, the size of the shared file has no effect on the connection dropout behavior: the leecher's bandwidth will always be saturated at the moment of the dropout simulation.

Recovery time is the interval of inactivity of a given peer and, thus, of a connection between two peers. Through various methods, the connection between peers is disabled and, after a given period of time, re-enabled. Recovery time is the focus of our experiments and may or may not differ significantly from the connection interrupt timeout. The connection behavior of each of these methods provides insight into their suitability for various simulation scenarios.

Each seeder-leecher pair executes the following scenario schedule:

- start the tracker, create the .torrent files
- start the seeder and the leecher
- wait a predetermined amount of time for swarm initialization
- disable the leecher, in accordance with the respective connection dropout solution (suspend the process, terminate it or disable the network interface)
- wait a test-specific amount of time
- enable the leecher
- wait for swarm stabilization

The amounts of time before the leecher is disabled and after it is enabled have been used to ensure swarm stabilization and proper results. At this point, the swarm initialization time is 15 seconds, while the swarm stabilization time is 45 seconds. These time intervals have been chosen empirically, from experimentation and experience. A thorough analysis of the optimum values for the time intervals is left as further work. A leecher-seeder pair is terminated after each scenario and restarted in order to take part in the next one.

A complete test suite for a given pair implies using increasing pause intervals between disabling and enabling the leecher. This is, in every situation, equivalent to an increase in the duration of the connection dropout and forms the basis for the subsequent analysis. The values chosen for the pause intervals form a geometric progression: 4, 8, 16, 32, 64, 128, 256 and 512 seconds. They cover simulated dropouts ranging from a few seconds to close to 10 minutes.
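The timing of one test suite can be laid out with a short sketch. The 15 second initialization time, the 45 second stabilization time and the pause values come from the text above; the function names and the textual action labels are assumptions made for this illustration, and the sketch only computes the schedule without running any client.

```python
INIT_S = 15        # swarm initialization time, from the text
STABILIZE_S = 45   # swarm stabilization time, from the text


def pause_intervals(first=4, last=512, ratio=2):
    """Return the geometric progression of dropout durations, in seconds."""
    values = []
    value = first
    while value <= last:
        values.append(value)
        value *= ratio
    return values


def scenario_timeline(pause_s, init_s=INIT_S, stabilize_s=STABILIZE_S):
    """Return (time offset in seconds, action) pairs for one scenario."""
    t_disable = init_s
    t_enable = t_disable + pause_s
    t_end = t_enable + stabilize_s
    return [
        (0, "start tracker, seeder and leecher"),
        (t_disable, "disable leecher"),
        (t_enable, "enable leecher"),
        (t_end, "end of scenario"),
    ]
```

Summing the scheduled time of the eight scenarios gives 1500 seconds per pair, not counting the time needed to restart the pair between scenarios.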

Scenarios and Results

The virtualized infrastructure and scripted framework have been used as a testing ground for evaluating the proposed dropout equivalent solutions. The evaluation suite has employed virtualized containers to create leecher-seeder pairs. Each pair is used to transfer a 100 MB file from an initial seeder to one leecher. A tracker is also started in the same container as the seeder to mediate communication. The leecher uses a 100 KB/s limitation. Information from both the leecher and the seeder is collected as log files and parsed subsequently.

The availability of a high number of virtualized systems and the small swarm size (two peers, one leecher and one seeder) allow rapid deployment of scenarios and easy repeatability. A single running suite employs 39 swarm pairs, 13 pairs for each proposed solution. Through the scripted interface, each pair sequentially simulates a series of connection dropouts. Each series consists of dropping the connection for 4 seconds, 8 seconds, 16 seconds and so on, up to 512 seconds. A simulated dropout is equivalent to a connection interrupt for the given amount of time.

Information used for analysis has been collected in the form of log files from the BitTorrent clients. SSH, rsync and shell scripts have been glued together in order to collect and parse relevant information. Statistical processing has been performed with R language scripts that compute mean values and standard deviations and generate graphics.

The main goal of the experiments is to measure and compare the recovery time after each connection dropout for each of the three proposed solutions. By employing diverse pause intervals we present similarities and differences between suspending clients, terminating clients and disabling network interfaces in order to simulate connection dropouts.

            ifdown              suspend             stop
pause(s)    mean(s)   rsd(%)    mean(s)   rsd(%)    mean(s)   rsd(%)
8           12.10     10.88     9.80      4.30      12.21     11.61
16          23.84     16.73     17.12     2.33      19.95     9.04
32          47.16     4.50      32.33     8.87      36.36     4.11
64          79.66     13.4      72.16     26.83     67.61     2.45
128         146.37    8.05      142.11    11.56     131.80    1.14
256         288.58    2.15      259.90    1.56      260.26    0.69
512         535.42    0.09      532.42    2.11      515.81    0.33

Table 5.1: Recovery timeout for different scenarios

Table 5.1 summarizes the results of the experiments. The three methods used for simulating connection dropouts are identified by ifdown, suspend and stop. For each method, the mean value of the recovery timeout and the relative standard deviation have been computed.
The column labeled pause gives the pause interval imposed on the given process. The mean column is the mean of the measured recovery times, in seconds, while the rsd column is the relative standard deviation (as a percentage). All three methods offer similar results, with very close recovery time values as the scheduled pause time increases. The suspend and stop values are usually very close to the real pause value, while the ifdown values are usually further from it. Due to their similar results, we consider it safe to assume that the suspend and stop methods may be used interchangeably in order to simulate connection dropouts. As a connection dropout is usually caused by a client process being terminated, the stop method is the most appropriate choice. If available or easier to deploy, the suspend method may be used with similar effects.

Figure 5.4 shows a graphical representation of the recovery time, with respect to the leecher dropout interval, for the solutions involving disabling the interface and terminating the process. We have found the results regarding suspending the process to be inconclusive due to improper swarm initialization time, and they have been left out.
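The comparison between methods can be made concrete by subtracting the scheduled pause from the mean recovery time reported in Table 5.1. The sketch below does this with the published mean values; the helper names are hypothetical. The `rsd_percent` function shows how a relative standard deviation can be computed from raw samples, under the assumption that the sample standard deviation was used (the raw measurements themselves are not reproduced here).

```python
import statistics

# Mean recovery times from Table 5.1, in seconds, indexed like PAUSE_S.
PAUSE_S = [8, 16, 32, 64, 128, 256, 512]
MEAN_RECOVERY_S = {
    "ifdown": [12.10, 23.84, 47.16, 79.66, 146.37, 288.58, 535.42],
    "suspend": [9.80, 17.12, 32.33, 72.16, 142.11, 259.90, 532.42],
    "stop": [12.21, 19.95, 36.36, 67.61, 131.80, 260.26, 515.81],
}


def overhead(method):
    """Recovery time beyond the scheduled pause, per pause value (seconds)."""
    return [m - p for m, p in zip(MEAN_RECOVERY_S[method], PAUSE_S)]


def rsd_percent(samples):
    """Relative standard deviation: 100 * sample std deviation / mean."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)
```

Averaging `overhead(method)` over the seven pause values gives a single number per method, which makes the larger distance of the ifdown method from the scheduled pause easy to see.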
[Figure: "Time Recovery in Simulated Dropouts". x-axis: leecher dropout time (s); y-axis: recovery time (s); series by dropout type: ifdown, sigkill, sigstop.]

Figure 5.4: Time recovery with respect to dropout interval

Result Analysis

The first solution (ifdown) brings down the network interface of the peer in order to simulate the end of the connection. Results have shown that the recovery time, although in the same range, is higher than for the other methods, and we conclude that such a solution would be mostly suitable for simulating connection dropouts due to network failure.

The second solution (suspend) suspends clients during the simulated dropout (using SIGSTOP) and resumes them afterwards (using SIGCONT). Although suspending peers is not a common action, results have shown that this method is similar to the stop method with respect to recovery time. In an environment where suspending peers is easier to achieve than stopping them, such a solution could prove suitable and provide realistic results.

The third solution (stop) consists of stopping (using SIGKILL) and restarting the clients. This solution (although aggressive) is considered to be the best approximation of the realistic behavior of a connection dropout, as peers typically have high dynamics when entering and exiting a swarm.

5.4 Running Experiments

One of the main goals of the testing infrastructure is to relieve the experimenter of the burden of experiment management and monitoring, providing an extensive tool for managing both clients and log files. As much of the experiment as possible should run in the background with little input from the user. With the proposed testing infrastructure, the activity of managing clients, sending commands and collecting information is completely automated, leaving the experimenter with only three tasks to accomplish, in sequence:

1. create the client-specific scripts
2. create the campaign configuration and the scenario configuration files
3. run the campaign startup script

After filling in the required information in the campaign and scenario configuration files, the user running the experiment starts the campaign through a control script that receives, as argument, the name of the campaign configuration file. The script parses the configuration file and creates and manages a swarm for each scenario accordingly. In order to limit the possibility of the user accidentally stopping the campaign control script, it is recommended to detach the running terminal using tools such as screen, nohup or dtach.

After completion of the campaign experiments, all output information and R processed graphics files are stored in the Results/ folder, in a subfolder named after the campaign. This campaign folder contains a set of subfolders, one for each scenario, as shown in the file system tree below. The actual log and graphics files are stored in the per-scenario folders.

.
`-- campaign01-2010.07.27-14.02.24
    |-- err.log
    |-- index.html
    |-- scenario01
    |   |-- err.log
    |   |-- <log files>
    |   `-- <graphics files>
    |-- scenario02
    |   |-- err.log
    |   |-- <log files>
    |   `-- <graphics files>
    `-- scenario03
        |-- err.log
        |-- <log files>
        `-- <graphics files>

Command logging in scripts is enabled to allow error handling in case anything unusual or unexpected occurs. A specialized log file is stored in the campaign folder and in each scenario subfolder.
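Given the folder layout above, a quick check for problems after a campaign run could look like the sketch below. It is not part of the infrastructure; the function name is hypothetical, and the only facts taken from the text are that an err.log file exists in the campaign folder and in each scenario subfolder. Treating an empty err.log as "no error" is an assumption.

```python
from pathlib import Path


def nonempty_error_logs(campaign_dir):
    """Return the err.log files in a campaign folder that contain data."""
    root = Path(campaign_dir)
    # The campaign-level log first, then one log per scenario subfolder.
    candidates = [root / "err.log"] + sorted(root.glob("*/err.log"))
    return [p for p in candidates if p.is_file() and p.stat().st_size > 0]
```

For example, `nonempty_error_logs("Results/campaign01-2010.07.27-14.02.24")` would list the scenarios whose scripts logged something unexpected.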

5.5 Use Case

Figure 5.5 depicts the evolution of peer download speed with respect to download percentage in a 90 peer swarm consisting of 50 seeders and 40 leechers. All peers are limited to 8Mbit/s upload and download speed. A 700MB le was used for content distribution among peers.
[Figure: "Test swarm, 90 peers; 50 Seeders, 40 Leechers; 8Mbit limitation". x-axis: Percent (0 to 100); y-axis: Download speed (Mbit/s) (0 to 8).]

Figure 5.5: Scenario output: download speed evolution


In order to set up and run the scenario and to publish graphical results as seen above, the following steps have been undertaken, as generally described in Section 5.4.

The scenario in Figure 5.5 is part of a campaign consisting of as many as 103 scenarios. Each scenario simulates a particular swarm by varying the number of peers, seeders and leechers, the maximum number of connections per peer, or the upload/download speed. For each scenario, the campaign configuration file specifies the scenario configuration file and the R script used for post-processing.

The scenario configuration file describes all peers in the swarm. In this particular case it defines the use of 90 peers: 50 seeders and 40 leechers. All peers (either seeders or leechers) are limited to 8 Mbit/s upload and download speed. Each peer is described as using the same .torrent file, corresponding to the 700 MB distribution data file. The scenario configuration file also defines the remote system on which each BitTorrent client is going to be run. The BitTorrent client is identified by a unique client string used by the scenario run script.

The R scripts are run when the campaign completes, with a specific R script for each scenario file. The R script parses output information from the clients in the scenario (upload speed, download speed, peer connections) and renders graphical interpretations such as the one in Figure 5.5. Typically, the R script uses the campaign name to identify generated figures. Figures are generated in PNG and EPS format.

Given the large size of the campaign (103 scenarios), the configuration files and R scripts were automatically generated. For this purpose a simple description file and a script have been implemented. An experimenter simply has to fill in a CSV (comma separated values) file and then use the above mentioned script to generate all the configuration files required by the infrastructure.

With the campaign configuration file, scenario configuration files and R scripts generated, the experimenter simply needs to run the campaign script from the ControlScripts folder. After the campaign completes, the campaign run script collects all client output information from the remote systems. It also runs the R script corresponding to each scenario. The R scripts use the collected information as input and render it as graphical images, as seen in Figure 5.5.
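The idea of generating scenario descriptions from a CSV file can be sketched as follows. The actual description file format and generator script are not reproduced in this chapter, so everything here is hypothetical: the column names (`scenario`, `seeders`, `leechers`, `limit_mbit`), the function name and the per-peer dictionary layout are assumptions chosen only to mirror the 90 peer scenario described above.

```python
import csv
import io


def expand_scenarios(csv_text):
    """Expand one CSV row per scenario into a list of per-peer descriptions."""
    scenarios = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        limit = float(row["limit_mbit"])
        roles = (["seeder"] * int(row["seeders"])
                 + ["leecher"] * int(row["leechers"]))
        scenarios[row["scenario"]] = [
            {"id": i, "role": role, "up_mbit": limit, "down_mbit": limit}
            for i, role in enumerate(roles, start=1)
        ]
    return scenarios


# Hypothetical one-line description of the use case scenario.
EXAMPLE_CSV = "scenario,seeders,leechers,limit_mbit\nscenario01,50,40,8\n"
```

Calling `expand_scenarios(EXAMPLE_CSV)` yields a single scenario with 90 peer entries, each of which a real generator would then write out in the scenario configuration file format.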

5.6 Conclusions

This chapter presented a new approach to building an automated infrastructure for Peer-to-Peer testing. The proposed infrastructure is built on top of a thin virtualization layer allowing easy deployment of experimental scenarios involving Peer-to-Peer clients. The main design goals for the infrastructure were providing an extensive tool for managing both clients and log files, using a common interface for accessing remote systems, offering support for bandwidth control and allowing the user to introduce churn in the environment. The infrastructure uses a hierarchical set of configuration files and run scripts and has been deployed for a variety of Peer-to-Peer experiments.

The physical infrastructure is currently hosted in the UPB NCIT cluster and consists of 10 identical hardware systems. A thin OpenVZ virtualization layer allows easy multiplication of base systems. We are able to safely deploy 100 virtualized systems; most scenarios use a virtual environment as a sandbox for running a single BitTorrent client. A second large scale experiment used the infrastructure to deploy scenarios over 250 virtual containers. Tools such as brctl, iptables or tc have been employed for ensuring proper network configuration between virtual environments.

On top of the physical infrastructure, a flexible shell script based software framework is used to set up and manage experimental scenarios. The interface uses a series of text configuration files (campaign files and scenario files), SSH and rsync to set up and start experiments. All configuration and interaction is achieved through a single system, allowing ease of use and centralized management. Client monitoring and logging are enabled and R scripts are used to automatically process the collected data.

The main advantage of the proposed infrastructure when compared to other solutions is automation coupled with easy deployment. The use of a single commanding station, shell scripts, SSH and rsync allows the user to rapidly deploy a given scenario. The use of OpenVZ virtualization allows consolidation: a small number of hardware nodes are used to create a complete virtualized framework capable of running 100 sandboxed BitTorrent clients. With the use of Linux specific networking tools, the user may define bandwidth limitation and network topology characteristics in order to simulate realistic scenarios.

As of this writing the infrastructure has been up and running for one year. Tracker interaction scripts have been added to allow deployment of experiments consisting of multiple trackers. Various BitTorrent clients (hrktorrent, nextshare, swift) have been configured and deployed to provide valuable information regarding performance. Since the initial implementation, new scripts have been added for client monitoring and data processing, proving the flexibility of the infrastructure.

Chapter 6 Improvement Evaluation for Overlay Communication Protocols


Chapter 4 introduced a novel explicit relaying architecture that focuses on the overlay patterns of P2P systems and is designed to enhance transfer performance and user privacy over BitTorrent. This chapter presents implementation details for the explicit relaying protocol and analyses its performance in a number of test scenarios. The performance of the proposed proxy mechanism is compared to the performance of the regular BitTorrent protocol.

6.1 Explicit Relay Architecture Implementation

The proposed explicit architecture was implemented over the Tribler client. Most of the changes made to the client were in its core, with a command-line interface offering the required options for starting the client as a doe or as a proxy. The architecture of the implementation is presented in figure 6.1. Given the nature of the protocol, the ProxyService had to be integrated with most of the key Tribler components: the BitTorrent download core, the overlay, the peer management database.

A new type of downloader was introduced in Tribler, called ProxyDownloader. It manages all the components required to transfer data via the proxy layer: the ProxyPiecePicker (responsible for selecting the pieces requested from the proxy nodes), the Doe (that manages the communication with the set of proxies) and the Proxy (that transfers data to the doe node). The ProxyPiecePicker uses the information sent by the proxy nodes, via the PROXY_HAVE messages, to decide what pieces are available in the swarm and what piece is the next one to be requested from the proxy nodes.

A separate component, called ProxyPeerManager, handles the discovery of proxy nodes. The ProxyPeerManager is integrated with Buddycast and, for each message advertising a new proxy, it stores that information in the peer database. When the Doe component requires a new proxy, it interrogates the ProxyPeerManager and receives the requested data from it.

The updated Tribler core exposes a set of settings to the external modules, for managing the relay properties of a download. The Proxy Service uses three sets of state variables, accessible in the session configuration or in the download configuration.

The proxyservice_status (table 6.1) controls the activation of the proxy service. If this service is not enabled (PROXYSERVICE_OFF), the current node cannot be used as a relay and cannot use other nodes as relays (it cannot assume the role of a doe). This setting is valid for the entire session of a client.

Table 6.1: proxyservice_status settings (available in the session configuration)

proxyservice_status value    Description
PROXYSERVICE_ON              the current node can send RELAY_REQUEST messages; the current node analyses RELAY_REQUEST messages
PROXYSERVICE_OFF             the current node ignores RELAY_REQUEST messages

The doe_mode (table 6.2) specifies, for each download, whether the data is retrieved directly from the swarm (DOE_MODE_OFF), using only proxy nodes (DOE_MODE_PRIVATE), or using a combination of both direct download from the swarm and proxy nodes (DOE_MODE_SPEED). This information is valid for the current download.

Table 6.2: doe_mode settings (available in the download configuration)

doe_mode value       Description
DOE_MODE_OFF         the current download is not using proxy relays
DOE_MODE_PRIVATE     the current download is using only proxy relays to retrieve the content
DOE_MODE_SPEED       the current download is using both proxy relays and direct swarm connections to retrieve the content

The proxyservice_role (table 6.3) specifies, for each download, whether the current node acts as a doe (PROXYSERVICE_ROLE_DOE), whether it acts as a proxy, having started the download at the request of a doe (PROXYSERVICE_ROLE_PROXY), or whether the download is a regular one (PROXYSERVICE_ROLE_NONE).


Figure 6.1: Proxy Service implementation architecture


Table 6.3: proxyservice_role settings (available in the download configuration)

proxyservice_role value      Description
PROXYSERVICE_ROLE_DOE        for the current download a doe role is active; if DOE_MODE_OFF, it means that currently there are no proxies relaying
PROXYSERVICE_ROLE_PROXY      the current download acts as a proxy for a doe node
PROXYSERVICE_ROLE_NONE       the current download is neither a doe nor a proxy
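How the session level and download level settings combine can be illustrated with a small sketch. The constant names are the ones listed in tables 6.1 and 6.2; the enum classes, the `download_sources` function and the decision to reject a doe mode when the proxy service is off are assumptions made for this example and do not reproduce Tribler's actual code.

```python
from enum import Enum


class ProxyServiceStatus(Enum):
    PROXYSERVICE_ON = "on"
    PROXYSERVICE_OFF = "off"


class DoeMode(Enum):
    DOE_MODE_OFF = "off"
    DOE_MODE_PRIVATE = "private"
    DOE_MODE_SPEED = "speed"


def download_sources(status, doe_mode):
    """Return where a download may fetch pieces from: swarm and/or proxies."""
    if doe_mode is DoeMode.DOE_MODE_OFF:
        return {"swarm"}                    # regular BitTorrent download
    if status is ProxyServiceStatus.PROXYSERVICE_OFF:
        # With the service off the node cannot assume the role of a doe.
        raise ValueError("a doe mode requires PROXYSERVICE_ON")
    if doe_mode is DoeMode.DOE_MODE_PRIVATE:
        return {"proxies"}                  # data arrives only through relays
    return {"swarm", "proxies"}             # DOE_MODE_SPEED combines both
```

Raising an error for the inconsistent combination is only one possible policy; an implementation could just as well fall back to a regular download.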

6.2 Analysis of Architecture Performance

This section presents two test campaigns aimed at evaluating the performance obtained by transferring data via the proxy nodes. The regular BitTorrent protocol results are compared to the results obtained when proxy relays are placed between the doe node and the swarm. As each extra relay added on the path between the doe and the swarm adds a delay and may cause a performance drop compared to the direct connection, it is expected that in some scenarios the usage of the Proxy Service will increase performance while in others it will decrease it.

6.2.1 Small Scale Analysis

A preliminary evaluation of the proxy overlay aimed at placing the BitTorrent extension in a small number of comparable scenarios. The effects of the proxy overlay on performance were measured by the variation of the average transfer throughput. The evaluation was performed on 9 cluster computers, using a 1 Gbit Ethernet link between them. The proxy nodes were used as a single layer of proxies. This analysis did not limit the bandwidth for any of the participating nodes. This allowed the two protocols (regular BitTorrent and the proxy mechanism) to fill the network link at their maximum capacity.

The results are presented in Table 6.4. The reference measurements are presented in the first group of lines and represent the use case where there are no proxy relays between the seeder nodes and the Doe nodes (regular BitTorrent protocol). With one seeder and one Doe (1-0-1), the average throughput was 78.57 Mbps. When one Proxy relay was added (1-1-1), the performance dropped to 69.48 Mbps. This drop is explained by the fact that the proxy node had to use its bandwidth for uploading and downloading simultaneously, performing worse than a node that only uploads or only downloads data.

Table 6.4: Preliminary performance evaluation

Seeders    Proxy Nodes    Doe Nodes    Average Throughput (Mbps)
1          0              1            78.57
2          0              1            94.98
1          1              1            69.48
2          1              1            71.88
4          1              1            71.21
1          2              1            76.85
2          2              1            90.99
1          4              1            81.41
4          4              1            88.03

When two Proxy nodes were added (1-2-1) the performance increased to 76.85 Mbps, similar to the control test with no proxy nodes. The two proxy nodes combined, sharing the network load, were able to offer the same level of performance as the direct connection between the Doe and the seeders. In the last scenario four Proxy nodes were placed between the Doe and the swarm. In this case the obtained performance was 81.41 Mbps, better than the control test. The performance increase is based on the difference of complexity between the Proxy Overlay protocol and the BitTorrent protocol. The Doe node sent requests to the Proxies and received from them the pieces directly using a simpler state machine. Also, dividing the same data transfer between multiple TCP streams reduces the delay between two sent/received packets and improves throughput.
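The single-seeder results discussed above can be expressed as a relative change with respect to the 1-0-1 control test. The sketch below uses the throughput values from Table 6.4; the constant and function names are chosen for this example only.

```python
# Average throughput in Mbps from Table 6.4, single-seeder scenarios.
BASELINE_MBPS = 78.57                               # 1-0-1, no proxy
WITH_PROXIES_MBPS = {1: 69.48, 2: 76.85, 4: 81.41}  # 1-1-1, 1-2-1, 1-4-1


def relative_change_percent(value, baseline=BASELINE_MBPS):
    """Throughput change relative to the control test, in percent."""
    return 100.0 * (value - baseline) / baseline


for proxies, mbps in sorted(WITH_PROXIES_MBPS.items()):
    print("%d proxies: %+.2f%%" % (proxies, relative_change_percent(mbps)))
```

The loop prints a negative change for one and two proxies and a positive change for four proxies, matching the drop and the subsequent recovery described in the text.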

6.2.2 Large Scale Analysis

A more detailed performance analysis was made on a larger number of cluster nodes and with a larger number of test cases. The aim is to obtain a clear view of the scenarios where the proxy overlay introduces a performance increase and to have a clear evaluation of the throughput compared to the regular BitTorrent protocol. To execute the large scale analysis, a new virtualized computer cluster was configured. The cluster is formed of 35 hardware machines with the following configuration:

- dual-core CPU with each core running at 3.0 GHz
- 2 GB RAM
- 80 GB HDD
- 1 Gbps Ethernet connection

Each of the 35 hardware machines was configured to execute 10 LXC containers, creating a total of 350 virtual containers. Each container had a network connection limited to 10 Mbps. The packets sent between containers running on different hardware machines were routed without the use of NAT.

The large scale analysis was composed of 36 scenarios, each with at least 250 peers. In each scenario, a 700 MB randomly generated data file was transferred from the seeders to the leechers and the doe (if the scenario included a doe node). The size of the transferred file was chosen such that each download would require approximately 10 minutes to complete, allowing the transfer rate to stabilize. Table 6.5, table 6.6, table 6.7 and table 6.8 present the detailed structure of each scenario. All the proxy nodes were used in a single layer between the doe and the swarm.

Table 6.5: Large scale performance analysis - regular BitTorrent

Id    Seeders    Leechers    Proxies    Doe    Total nodes    Avg. Thr. (Mbps)
1     225        25          0          0      250            6.61
2     200        50          0          0      250            7.83
3     175        75          0          0      250            7.04
4     150        100         0          0      250            5.80
5     125        125         0          0      250            4.76
6     100        150         0          0      250            3.44
7     75         175         0          0      250            2.91
8     50         200         0          0      250            2.54
9     25         225         0          0      250            2.29

Table 6.6: Large scale performance analysis - 5 proxies

Id      Seeders    Leechers    Proxies    Doe    Total nodes    Avg. Thr. (Mbps)
1-p1    225        25          5          1      255            6.68
2-p1    200        50          5          1      255            7.36
3-p1    175        75          5          1      255            5.54
4-p1    150        100         5          1      255            5.65
5-p1    125        125         5          1      255            6.96
6-p1    100        150         5          1      255            5.44
7-p1    75         175         5          1      255            5.08
8-p1    50         200         5          1      255            4.26
9-p1    25         225         5          1      255            2.03
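The choice of a 700 MB file for containers limited to 10 Mbps can be checked with a short calculation. The sketch assumes decimal units (1 MB = 10^6 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead; the function name is chosen for this example.

```python
def transfer_time_s(size_bytes, rate_bits_per_s):
    """Ideal transfer time in seconds at a constant rate, without overhead."""
    return size_bytes * 8.0 / rate_bits_per_s


# 700 MB over a link limited to 10 Mbps.
large_scale_s = transfer_time_s(700e6, 10e6)
print(large_scale_s, large_scale_s / 60.0)  # 560 seconds, a little over 9 minutes
```

The ideal time is a lower bound: since the measured average throughput values in the tables stay below the 10 Mbps limit, actual downloads take longer than this.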

Each scenario required, on average, 1 hour to be completely executed (from the synchronization of the required files to the collection of the output log files). The overall runtime of the analysis was approximately 40 hours.

Table 6.7: Large scale performance analysis - 10 proxies

Id      Seeders    Leechers    Proxies    Doe    Total nodes    Avg. Thr. (Mbps)
1-p2    225        25          10         1      260            6.43
2-p2    200        50          10         1      260            4.49
3-p2    175        75          10         1      260            4.96
4-p2    150        100         10         1      260            5.38
5-p2    125        125         10         1      260            6.20
6-p2    100        150         10         1      260            5.68
7-p2    75         175         10         1      260            5.40
8-p2    50         200         10         1      260            4.24
9-p2    25         225         10         1      260            1.59

The scenarios described in table 6.5 represent the control measurement and use the regular BitTorrent protocol. They reproduce a regular BitTorrent swarm while varying the seeder/leecher ratio. The scenarios from table 6.6 introduce 5 proxy nodes that help the doe retrieve the data. The doe node used only proxy connections to download the file. With a similar design, the scenarios from table 6.7 and table 6.8 introduce 10 and respectively 15 proxy nodes between the doe and the swarm.

For the scenarios in table 6.5, Avg. Thr. represents the average download throughput for all the leechers. The average value includes the initial period of time when the leecher is waiting for connections with other peers. The scenarios presented in table 6.6, table 6.7 and table 6.8 show the average download throughput for the doe node. All values are expressed in Mbps (megabits per second).

The analysis of the results for table 6.5 shows a peak of download speed for scenario number 2. This can be explained by analyzing the steps a seeder follows when it is started. After the client is fully initialized, it checks the hash of the locally stored file against the hash from the .torrent file. This operation requires a large number of disk reads, slowing all seeders and delaying the moment when the leechers start receiving data. For all subsequent scenarios from each group, the file data is read from a cache, so the seeder startup time is shorter. As mentioned above, the average leecher throughput value includes the initial period of time when the leecher is waiting for connections with other peers, so this average value is larger for all scenarios starting with the second one in each group. This behavior does not affect the analysis.
The first scenario of the set described in table 6.5 is compared only with the first scenario of each of the other three sets, all of them being subjected to the same running conditions.

Table 6.8: Large scale performance analysis - 15 proxies

Id    Seeders  Leechers  Proxies  Doe  Total nodes  Avg. Thr. (Mbps)
1-p3  225      25        15       1    265          5.04
2-p3  200      50        15       1    265          4.60
3-p3  175      75        15       1    265          5.07
4-p3  150      100       15       1    265          6.07
5-p3  125      125       15       1    265          6.89
6-p3  100      150       15       1    265          5.30
7-p3  75       175       15       1    265          5.01
8-p3  50       200       15       1    265          4.70
9-p3  25       225       15       1    265          2.96

The results of the large scale analysis are presented in figure 6.2. The seeder/leecher ratio is marked on the horizontal axis below the scenario number.
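The scenario layout in tables 6.5 to 6.8 follows a regular pattern: seeders and leechers always add up to 250, and the seeder count drops by 25 from one scenario to the next. The sketch below regenerates the node counts and the seeder/leecher ratio r shown on the horizontal axis. The function and field names are illustrative and are not taken from the thesis tooling.

```python
def build_scenarios(proxies: int, swarm_size: int = 250, step: int = 25):
    """Regenerate the nine scenarios of one group (one dict per scenario)."""
    scenarios = []
    for index, seeders in enumerate(range(swarm_size - step, 0, -step), start=1):
        leechers = swarm_size - seeders
        scenarios.append({
            "id": index,
            "seeders": seeders,
            "leechers": leechers,
            "proxies": proxies,
            # A doe node is present only in the groups that use proxies.
            "doe": 1 if proxies > 0 else 0,
            # Matches the "Total nodes" column, which in the tables equals
            # seeders + leechers + proxies.
            "total": seeders + leechers + proxies,
            "ratio": round(seeders / leechers, 2),
        })
    return scenarios


for row in build_scenarios(proxies=5):
    print(row["id"], row["seeders"], row["leechers"], row["total"], row["ratio"])
```

Calling `build_scenarios` with 0, 5, 10 and 15 proxies yields the 36 scenario layouts of the analysis.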


[Figure 6.2 plot: average throughput (Mbps) per scenario for the experiment types regular_throughput, p1_throughput, p2_throughput and p3_throughput. Horizontal axis (scenario and seeder/leecher ratio): 1 (r=9.00), 2 (r=4.00), 3 (r=2.33), 4 (r=1.50), 5 (r=1.00), 6 (r=0.67), 7 (r=0.43), 8 (r=0.25), 9 (r=0.11)]

Figure 6.2: Large scale analysis results

For a regular BitTorrent download, the second scenario offers the largest throughput for the leechers. As the seeder/leecher ratio decreases, the download throughput drops to approximately 20% of the available bandwidth (2.29 Mbps out of 10 Mbps in scenario 9). Scenario number 9 reproduces a flashcrowd event, when a large number of leechers join the swarm in a limited period of time. The P2P system does not have enough resources for all joining nodes, offering them only a fraction of their available throughput.

The introduction of the proxy nodes does not bring any benefit in the first two scenarios. The number of seeders in the system is larger than the minimum required (seeder/leecher ratio r=8) for completely filling the bandwidth of the leechers. When the seeder/leecher ratio decreases, the advantages brought by the proxy nodes allow the doe to outperform regular nodes. For r=1.5 the obtained throughput is approximately equal for all four sets of scenarios. When r=0.25, the doe has almost double the throughput of regular nodes. For r=0.11 the advantages of the proxy technology are no longer visible. The number of leechers in the system is too large and adding new peers that download data does not harvest more throughput.

When the scenario groups P1, P2 and P3 are compared, the obtained results are almost equal. In the simulated environment, increasing the number of proxy nodes does not radically increase the obtained throughput. However, this occurs because the proxy nodes that were used had the same available bandwidth as the rest of the peers. Depending on the network conditions, increasing the number of proxy nodes could produce different results.
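The comparison between the doe and the regular leechers can be made explicit by dividing the doe throughput by the regular throughput of the same scenario. The sketch below uses the Avg. Thr. values from tables 6.5 to 6.8. The variable names and the choice of a simple ratio as the comparison metric are assumptions made for illustration.

```python
# Average throughput in Mbps, copied from tables 6.5 to 6.8 (scenarios 1 to 9).
REGULAR = [6.61, 7.83, 7.04, 5.80, 4.76, 3.44, 2.91, 2.54, 2.29]
DOE = {
    "p1": [6.68, 7.36, 5.54, 5.65, 6.96, 5.44, 5.08, 4.26, 2.03],
    "p2": [6.43, 4.49, 4.96, 5.38, 6.20, 5.68, 5.40, 4.24, 1.59],
    "p3": [5.04, 4.60, 5.07, 6.07, 6.89, 5.30, 5.01, 4.70, 2.96],
}


def speedups(doe_values, regular_values):
    """Ratio of doe throughput to regular leecher throughput, per scenario."""
    return [round(d / r, 2) for d, r in zip(doe_values, regular_values)]


for group, values in DOE.items():
    print(group, speedups(values, REGULAR))
```

For scenario 8 (r=0.25) the ratios are 1.68, 1.67 and 1.85 for the 5, 10 and 15 proxy groups, which is the gain summarized above as almost double.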

6.3 Large Scale Technical Trials

Apart from the analysis presented in the previous section, the Proxy Service protocol was tested under real-life conditions, with approximately 20,000 clients executing a test download via the proxy relays. This section describes the trials and analyzes their results.

6.3.1 Trial Description

The proxy technology was placed in real-life conditions and its behavior was tested. For this purpose a test was developed that required the users to download a fixed amount of data using a set of four proxy servers. The test had two main goals: to observe the behavior of the proxy core in real-life networking conditions and to evaluate the proxy discovery mechanism. Each of the clients participating in the trial automatically submitted log messages to a central server. The logs gathered from the clients were processed and a set of graphs was produced to summarize the results for the two goals mentioned above.

The code responsible for executing the test was added to the Tribler 5.3 release code. After installing Tribler 5.3, the test code would automatically be executed only once. Subsequent installations of the same kit do not cause the re-execution of the test.

Approximately 20,000 clients took part in the trial between December 9th 2010 and February 5th 2011. Figure 6.3 shows the daily distribution of the total number of reporting IP addresses for the duration of the test. The graph includes two peaks: the first one corresponds to the announcement of the Tribler release on the http://www.torrentfreak.com web site, and the second one occurred when the release announcement was published on http://www.reddit.com.
[Figure 6.3 plot: total number of T5.3 reporting IPs per day (vertical axis, 0 to 2,000) against date (horizontal axis, 6/12/2010 to 7/02/2011)]

Figure 6.3: The daily distribution of total number of reporting IP addresses

Figure 6.4 presents the number of clients that participated in the trial, per country. Only the countries with more than 100 clients are included in the graph.

6.3.2 Trial Analysis

To evaluate the proxy performance in the real world, a correlation was made between the transfer rate for the first 25% and the last 25% of the test download. Figure 6.5 presents the results. The same correlation was repeated for the first 50% and the last 50% of the test download and is presented in Figure 6.6. The median performance achieved was over 1 Mbit per second and the maximum performance was over 18 Mbit per second.

Figure 6.7 presents the evolution (between 0 and 120 minutes) of the average number of nodes discovered by each client. Time 0 is defined as the moment the first node was discovered via BuddyCast. At T=10 minutes after the client started, more than 90 different clients (and potential proxy relays) had been discovered, on average. Given BuddyCast's aggressive startup discovery policy, approximately 65 peers are discovered in the first minute after the client starts.
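Figures 6.5 and 6.6 plot transfer rates in KBps, while the median and maximum values are quoted in Mbit per second. A small conversion helper relates the two scales. It assumes that KBps means 1000 bytes per second and that 1 Mbit is 10^6 bits. The text does not state which convention the plots use, and the function name is illustrative.

```python
def kbytes_per_s_to_mbit_per_s(rate_kbps: float, bytes_per_kb: int = 1000) -> float:
    """Convert a rate in kilobytes per second to megabits per second."""
    bits_per_second = rate_kbps * bytes_per_kb * 8
    return bits_per_second / 1_000_000


# Under these assumptions, 125 KBps corresponds to 1 Mbit/s and
# 2250 KBps corresponds to 18 Mbit/s.
for rate in (125, 2250, 2500):
    print(rate, "KBps =", kbytes_per_s_to_mbit_per_s(rate), "Mbit/s")
```

With this convention, the reported maximum of over 18 Mbit per second falls near the upper end of the 0 to 2500 KBps range used on the plot axes.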

[Figure 6.4 plot: number of T5.3 recorded IPs per country, with bar labels ranging from 5699 down to 106. Countries on the horizontal axis: US, ES, HU, DE, IT, CA, GB, PL, AU, RU, MX, SE, NL, IN, RO, CN, AR, TR, FR, BR, NZ, AT, CO, CH, SG, BE, CL, IE]

Figure 6.4: The distribution of clients participating in the trial per country
[Figure 6.5 plot: T5.3 proxy transfer rate for the last 25% of the download (KBps, 0 to 2500) against the transfer rate for the first 25% (KBps, 0 to 2500)]

Figure 6.5: The correlation of the download speed for the first 25% and last 25% of the download

[Figure 6.6 plot: T5.3 proxy transfer rate for the last 50% of the download (KBps, 0 to 2500) against the transfer rate for the first 50% (KBps, 0 to 2500)]

Figure 6.6: The correlation of the download speed for the first 50% and last 50% of the download

[Figure 6.7 plot: average cumulative number of T5.3 nodes discovered per IP (0 to 110) against time (0 to 120 minutes)]

Figure 6.7: The evolution of the average number of nodes discovered by each client for the first 120 minutes of the trial

Figure 6.8 presents the average number of nodes discovered daily by each participating client. By comparing this graph with Figure 6.3, it can be observed that the first peak (corresponding to the announcement of the Tribler release on the http://www.torrentfreak.com website) is not present. This is an indication of a large number of one-time users, people who tried the software for a short time and never started it again.
[Figure 6.8 plot: average T5.3 nodes discovered per IP per day (0 to 250) against date (6/12/2010 to 7/02/2011)]

Figure 6.8: The average number of nodes discovered daily

Figure 6.9 presents the 1-CDF function (the complementary cumulative distribution) for the daily discovered nodes. More than 80% of the nodes discovered at least 70 peers daily. More than 50% of the nodes discovered at least 100 peers daily.
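Statements such as "more than 80% of the nodes discovered at least 70 peers daily" are read off a 1-CDF curve as the fraction of reports at or above a threshold. A minimal sketch follows. Strictly, 1-CDF(x) is the fraction of values greater than x; the sketch uses the "at least" form to match the wording of the text. The sample values are invented for illustration and are not trial data.

```python
def fraction_at_least(values, threshold):
    """Fraction of observations greater than or equal to the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v >= threshold) / len(values)


# Hypothetical daily discovery counts, NOT taken from the trial logs.
sample = [40, 75, 90, 100, 120, 150, 200, 300, 65, 110]

print(fraction_at_least(sample, 70))   # share of reports with at least 70 peers
print(fraction_at_least(sample, 100))  # share of reports with at least 100 peers
```

Evaluating the function over a range of thresholds and plotting the results gives a curve of the same kind as the one in Figure 6.9.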

[Figure 6.9 plot: percentage of reports (0% to 100%) against the number of T5.3 nodes discovered daily (0 to 2100)]

Figure 6.9: 1-CDF of nodes discovered daily

6.4 Conclusions

The Proxy overlay was implemented on top of the Tribler client and a command-line interface is offered to start the proxy-enabled clients. The architecture performance analysis was split into a small-scale test and a large scale test. The small-scale test proved that, in an environment with unlimited bandwidth, the proxy architecture could outperform the regular BitTorrent distribution. The large scale test placed the proxy technology in a number of realistic scenarios, showing that for seeder/leecher ratios smaller than r=1.5 the use of proxy nodes increases the throughput of the end node, reaching almost twice the performance of regular BitTorrent.

A large technical trial was executed using approximately 20,000 Internet clients. The trial focused on observing the behavior of the proxy core in real-life networking conditions and on evaluating the proxy discovery mechanism. The trial results showed that the median performance achieved was over 1 Mbit per second and the maximum performance was over 18 Mbit per second. More than 80% of the nodes discovered at least 70 potential proxies daily.

Chapter 7 Conclusions
Peer-to-Peer systems are a class of distributed systems that rely on resources shared by the clients to perform the application tasks. The overlay network created by the nodes is the central component of a P2P architecture. The central goal of this thesis was to improve P2P systems by enhancing the overlay communication between nodes.

The thesis proposes a proxy mechanism used as an intermediate layer between the server and the client nodes. The proxy layer allows bandwidth to be allocated to nodes with high demand, increasing their throughput. Experimental results have shown that the performance of the proposed mechanism is better than the performance of regular BitTorrent in a large number of scenarios. At the same time, the proxy layer can be used to offer the users a shield of plausible deniability, enhancing their privacy.

A testing infrastructure was designed to ease the management of P2P experiments using computer clusters. The infrastructure was used to perform a large scale analysis of the proposed proxy technology using a virtual cluster of 350 nodes.

The results of the thesis were integrated in the P2P-Next EU FP7 project. The P2P-Next integrated project builds a next generation Peer-to-Peer (P2P) content delivery platform that is designed, developed, and applied jointly by a consortium consisting of high-profile academic and industrial players with proven track records in innovation and commercial success.

7.1 Overview of the Thesis

The main components of the thesis are presented in figure 7.1. The central goal, to improve the communication within the P2P systems at the overlay level, is supported by a set of principal contributions, detailed in their respective chapters.

Figure 7.1: Overview of main thesis components

The thesis presented an overview of the Peer-to-Peer systems. A brief history of the different P2P applications was included and the BitTorrent protocol was introduced. Currently the main Peer-to-Peer platform, BitTorrent is a file sharing protocol with proven scalability and robustness. Three BitTorrent connected areas were presented, classifying the overlay protocols into peer discovery, content discovery and reputation mechanisms.

The analysis of the P2P system behavior and the quality of its services was based on three major metrics: performance, reliability and efficiency. Performance translates mainly, but not only, into throughput; reliability translates into the capacity of a system to maintain its performance in a variety of conditions; and efficiency compares the available resources with the resources required to maintain the performance of the services.

The thesis presents a proxy extension for the BitTorrent protocol that uses an overlay of proxy nodes to intermediate the BitTorrent connections between peers. The proposed BitTorrent extension is detailed at the protocol level. It allows the nodes participating in the same system to transfer bandwidth between each other, creating the first steps towards a bandwidth-as-a-currency market.

A Proxy discovery mechanism is implemented on top of the BuddyCast protocol, taking advantage of its epidemic design. As BuddyCast maintains an overlay between active Tribler peers, the same peers are the best candidates to be used as Proxy nodes, if the Proxy service is enabled.

The Proxy overlay is implemented on top of the Tribler client. A large technical trial was executed using approximately 20,000 Internet clients. The trial focused on observing the behavior of the proxy core in real-life networking conditions and on evaluating the proxy discovery mechanism. A small and a large scale analysis were performed using a cluster of computers. The results have shown that the proxy technology can outperform the throughput offered by the regular BitTorrent client.

A new approach is presented for building automated infrastructures for Peer-to-Peer testing. The proposed infrastructure is built on top of a thin virtualization layer, allowing easy deployment of experimental scenarios involving Peer-to-Peer clients. The infrastructure provides an extensive tool for managing both clients and log files, uses a common interface for accessing remote systems, offers support for bandwidth control and allows the user to introduce churn in the environment.

7.2 Summary of Contributions

The main contribution of the thesis is the proxy mechanism, which offers increased throughput and privacy to the users. The proxy layer offers full control to the user, allowing them to enable or disable relaying, to specify the number of proxies used and to tune the system towards performance or towards privacy. The performance of the proxy mechanism exceeds the raw BitTorrent throughput when the seeder/leecher ratio is smaller than r=1.5, reaching almost twice the level of regular BitTorrent performance. Other important, original contributions of the thesis are the following:

- Proposing a model for evaluating the swarm behavior. This model is based on a set of measurable parameters and offers an evaluation of the system's performance, reliability and efficiency. The model is designed such that, with minimal changes, it can be used for any P2P file-sharing system.

- Validating the swarm behavior model experimentally, using a number of specially designed scenarios. The scenarios reproduced a common set of realistic contexts and have shown the model's capacity to offer a realistic projection of the system behavior.

- The classification of P2P model types into node-centric, connection-centric and system-centric, based on the input elements and modeled perspective.

- The design and implementation of an epidemic proxy discovery mechanism. The mechanism is fully decentralized and takes advantage of the technologies included in the Tribler P2P client. The proxy discovery mechanism harvests information about active, connectable peers, offering dependable results.

- The analysis of the required elements for building an implicit relay architecture. Main architectural decisions are discussed and recommendations are made for creating a self-sustained implicit caching mechanism.

- Performing a real-life test of the proxy technology by using approximately 20,000 worldwide clients over a period of 2 months. The test required each client to download data using the novel proxy technology and to upload the resulting log file for analysis.

- Performing a large scale analysis of the proposed proxy technology using a virtual cluster of 350 nodes. The analysis used 36 different scenarios and required more than 40 hours for a full run.

- The usage of real P2P clients for validating the proposed protocols. Compared to simulators, real P2P clients offer the benefit of a full implementation of the P2P application protocol and the use of real transport level protocols (TCP/UDP).

- Designing and implementing a testbed infrastructure for evaluating the proposed P2P overlay improvements. The infrastructure, together with a computer cluster, reproduces realistic network conditions for advanced testing purposes.

- Introducing the management of test-suites into scenarios and campaigns. A scenario is a context with a specific purpose into which the tested application is introduced. The campaign is a set of scenarios, grouping them and allowing them to be executed multiple times within the same test suite.

- Proposing a unified format for log storage such that logs from different BitTorrent P2P clients could be easily compared. The log format uses plain text files and its content is easy to parse for further analysis.

- Introduction of churn simulation within a P2P testbed. Churn is one of the basic elements in the lifetime of any P2P application. The reproduction of churn within a P2P testbed allows the reproduction of complex scenarios.

- Comparison of the effects of different client stopping methods for BitTorrent P2P clients. In a real-life context, multiple elements can cause the P2P clients to leave the system. The reproduction of some of these causes within a controlled environment can increase the level of realism of the simulated scenarios.

- Proposing a new method for detecting scenario completion for a P2P testbed. By generalizing this process of detection, the P2P testbed can be used to test any type of application that saves state messages in log files.

- The classification of P2P technologies based on the purpose of the technology and the services it offers to other components.

- The usage of lightweight virtualization solutions (OpenVZ, LXC) in combination with a P2P testing infrastructure to allow a direct evaluation of P2P protocols.

- The design and implementation of virtualized computer clusters. Two architectures have been deployed and used: a virtual computer cluster with 100 OpenVZ nodes and a virtual computer cluster with 350 LXC nodes.

7.3 Future Work

The P2P overlay improvements proposed by this thesis, as well as the testing infrastructure introduced, offer research directions for further work.

The proposed proxy relay mechanism could be used to bring anonymity to other types of network traffic, such as HTTP or FTP. The BitTorrent clients could work in an architecture similar to Tor, routing the data connections among the nodes of the P2P swarm and concealing the original source of the HTTP or FTP traffic. Such an architecture would benefit from the large number of existing P2P BitTorrent clients and would offer increased performance compared to Tor.

The thesis introduced the proxy relay as a type of service offered to the swarm nodes. Other services could be implemented as well, and could benefit from the developed framework. Such services could include CPU cycles or disk storage. An application could be designed to be executed remotely, on the swarm nodes, with a separate BitTorrent extension responsible for task scheduling and monitoring.

The proposed testing infrastructure was designed by taking into account generic requirements. Currently it is being used to create P2P scenarios. Further work could adapt the infrastructure to any type of application that can be executed in a computer cluster. Also, developing methods for remotely controlling Windows applications and integrating these applications in test scenarios would provide an increased level of realism.

Publications

Articles

- A Distributed File System Model for Shared, Dynamic and Aggregated Data, Mircea Bardac, George Milescu, Razvan Rughinis, International Conference on Control Systems and Computer Science - CSCS17, May 26-29, 2009, Bucharest, Romania, Vol. 1, pag. 45-51, ISSN 2066-4451

- Monitoring a BitTorrent tracker for peer-to-peer system analysis, Mircea Bardac, George Milescu, Razvan Deaconescu, International Symposium on Intelligent Distributed Computing - IDC 2009, October 12-14, 2009, Ayia Napa, Cyprus, Vol. 237, pag. 203-208, ISBN 978-3-642-03213-4 (ISI Indexed)

- A virtualized infrastructure for automated bittorrent performance testing and evaluation, Razvan Deaconescu, George Milescu, Bogdan Aurelian, Razvan Rughinis, Nicolae Tapus, International Journal on Advances in Systems and Measurements, vol. 2 (no. 2 and 3): 236-247, 2009

- Swarm metrics in peer-to-peer systems, George Milescu, Mircea Bardac, Nicolae Tapus, 9th RoEduNet IEEE International Conference, Sibiu, Romania, 2010, pp. 276-281 (ISI Indexed)

- Simulating Connection Dropouts in BitTorrent Environments, Razvan Deaconescu, George Milescu, Nicolae Tapus, IEEE International Conference on Computer as a Tool - EUROCON2011, Lisbon, Portugal, 2011, pages 1-4 (ISI Indexed)

- Versatile Configuration and Deployment of Realistic Peer-to-Peer Scenarios, George Milescu, Razvan Deaconescu, Nicolae Tapus, The Seventh International Conference on Networking and Services - ICNS2011, Venice/Mestre, Italy, 2011, pp. 262-267 (ISI indexed)

- Deploying a High-Performance Context-Aware Peer Classification Engine, Mircea Bardac, George Milescu, Adina Florea, The Seventh International Conference on Networking and Services - ICNS2011, Venice/Mestre, Italy, 2011, pp. 268-273 (ISI indexed)

- Evaluating Resource Utilization of Peer-to-Peer Video Streaming Strategies, Mircea Bardac, George Milescu, Adina Florea, International Conference on Control Systems and Computer Science - CSCS18, Bucharest, Romania, 2011, pp. 848-852, ISSN 2066-4451

- Optimization of Performance Monitoring and Attack Detection in All Optical Networks, Razvan Rughinis, George Milescu, Mircea Bardac, Nicolae Tapus, UPB Buletin Stiintific Seria C, Bucharest, Romania, 2011, 73/1, pp. 3-12, ISSN 1454-234x

Books

- Introducere în sisteme de operare, Printech 2009, ISBN 978-606-521-386-9 (coauthor)

Posters

- Designing and Building a Self-Sustained Bandwidth Market for Improved Privacy and User Experience, Scientific poster session, CATIIS PhD Students Day, UPB, October, 2010

European Project Deliverables

- Deliverable number 4.0.4 - Next-Share Platform in P2P-Next FP7 project - http://www.p2p-next.org

- Deliverable number 4.1.0 in P2P-Next FP7 project - http://www.p2p-next.org

Appendix A

Campaign Configuration File

    # Campaign01
    # Description:
    # * a complete run of all the 7 scenarios
    #
    # ScenarioDescription PlotScript
    scenario01.cfg scenario01.r

Listing A.1: campaign01.cfg
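The campaign file in Listing A.1 pairs a scenario description with a plot script on each non-comment line, with '#' marking comments. A minimal parsing sketch under those assumptions is shown below. The function name and the returned tuple layout are hypothetical and do not describe the actual testbed code.

```python
def parse_campaign(text: str):
    """Return (scenario_file, plot_script) pairs from a campaign file body."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"expected 2 fields, got {len(fields)}: {line!r}")
        entries.append((fields[0], fields[1]))
    return entries


CAMPAIGN01 = """\
# Campaign01
# Description:
# * a complete run of all the 7 scenarios
#
# ScenarioDescription PlotScript
scenario01.cfg scenario01.r
"""

print(parse_campaign(CAMPAIGN01))
```

A campaign runner could then iterate over the returned pairs, executing each scenario and passing its logs to the associated plot script.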

Appendix B

Scenario Configuration File

    # Scenario01
    # Description:
    # * a flashcrowd swarm
    # * 1 seeder
    # * 6 leechers
    # * all peers have the same bandwidth
    #
    # Hostname SSHport User RemoteFolder NetInterface Download(Mbps) DownloadBurst(K) Upload(Mbps) UploadBurst(K) OverallNoOfConnections PreRunScript PostRunScript ClientType TorrentFile Periods
    p2p-next-01 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_seeder Data.torrent (0,-)
    p2p-next-05 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_leecher Data.torrent (0,-)
    p2p-next-06 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_leecher Data.torrent (0,-)
    p2p-next-07 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_leecher Data.torrent (0,-)
    p2p-next-08 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_leecher Data.torrent (0,-)
    p2p-next-09 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_leecher Data.torrent (0,-)
    p2p-next-10 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh tribler_leecher Data.torrent (50,200) (400,700) (800,-)

Listing B.1: scenario01.cfg
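Each non-comment line of Listing B.1 describes one client, and the trailing Periods column lists one or more (start,stop) pairs. The sketch below parses such a line. It assumes that fields are whitespace-separated in the order given by the header comment, that each period is an interval during which the client runs, and that '-' as the stop value means "until the end of the scenario". These interpretations, the units of the period values and all Python names are assumptions, not documented behavior of the testbed.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

FIXED_FIELDS = 14  # columns before Periods, per the header comment in Listing B.1
PERIOD_RE = re.compile(r"^\((\d+),(\d+|-)\)$")


@dataclass
class ClientEntry:
    hostname: str
    ssh_port: int
    user: str
    client_type: str
    torrent_file: str
    periods: List[Tuple[int, Optional[int]]]  # stop is None for an open-ended "-"


def parse_client_line(line: str) -> ClientEntry:
    """Parse one client line of a scenario file (assumed whitespace-separated)."""
    fields = line.split()
    if len(fields) <= FIXED_FIELDS:
        raise ValueError("line has no Periods column")
    periods = []
    for token in fields[FIXED_FIELDS:]:
        match = PERIOD_RE.match(token)
        if match is None:
            raise ValueError(f"malformed period: {token!r}")
        start, stop = match.groups()
        periods.append((int(start), None if stop == "-" else int(stop)))
    return ClientEntry(
        hostname=fields[0],
        ssh_port=int(fields[1]),
        user=fields[2],
        client_type=fields[12],
        torrent_file=fields[13],
        periods=periods,
    )


LINE = ("p2p-next-10 22 p2p /home/p2p eth0 8 100 8 100 0 pre-run.sh post-run.sh "
        "tribler_leecher Data.torrent (50,200) (400,700) (800,-)")
print(parse_client_line(LINE))
```

Read this way, the last leecher in Listing B.1 would join and leave the swarm several times, which is one way churn could be introduced into a scenario.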

Bibliography
[Adar and Huberman, 2000] Adar, E. and Huberman, B. A. (2000). gnutella. Technical report, Xerox PARC. Free riding on

[Bangeman, 2008] Bangeman, E. (2008). http://arstechnica.com/old/content/2008/04/ study-bittorren-sees-big-growth-limewire-still-1-p2p-app.ars, accessed January 2011. [Bardac et al., 2010] Bardac, M., Deaconescu, R., and Florea, A. M. (2010). Scaling Peer-to-Peer testing using Linux Containers. Roedunet International Conference (RoEduNet), 2010 9th, pages 287292. [Bardac et al., 2009a] Bardac, M., Milescu, G., and Deaconescu, R. (2009a). Monitoring a bittorrent tracker for peer-to-peer system analysis. In Intelligent Distributed Computing III, volume 237/2009 of Studies in Computational Intelligence, pages 203 208. Springer Berlin / Heidelberg. [Bardac et al., 2011a] Bardac, M., Milescu, G., and Florea, A. (2011a). Deploying a High-Performance Context-Aware Peer Classication Engine. The Seventh International Conference on Networking and Services - ICNS2011, Venice/Mestre, Italy, pages 268273. [Bardac et al., 2011b] Bardac, M., Milescu, G., and Florea, A. (2011b). Evaluating Resource Utilization of Peer-to-Peer Video Streaming Strategies. International Conference on Control Systems and Computer Science CSCS18, pages 848852. [Bardac et al., 2009b] Bardac, M., Milescu, G., and Rughini , R. (2009b). A Distributed s File System Model for Shared, Dynamic and Aggregated Data. International Conference on Control Systems and Computer Science CSCS17, 1:4551. [Binzenhfer and Leibnitz, 2007] Binzenhfer, A. and Leibnitz, K. (2007). Estimating churn in structured P2P networks. In Proceedings of the 20th international teletrafc conference on Managing trafc performance in converged networks, ITC2007, pages 630641, Berlin. Springer-Verlag. [BitTorrent, 2009] BitTorrent, accessed July 2011. I. (2009). http://bittorrent.org/beps/bep_0029.html,

[BitTorrent, 2011] BitTorrent, I. (2011). http://www.utorrent.com/help/documentation/utp, accessed July 2011. [Bracciale et al., 2007] Bracciale, L., Piccolo, F. L., Salsano, S., and Luzzi, D. (2007). Simulation of peer-to-peer streaming over large-scale networks using opss. In 110

BIBLIOGRAPHY

111

ValueTools 07: Proceedings of the 2nd international conference on Performance evaluation methodologies and tools, pages 110, ICST, Brussels, Belgium, Belgium. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering). [Buford et al., 2008] Buford, J., Yu, H., and Lua, E. K. (2008). P2P Networking and Applications. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. [Channels, 2011] Channels, T. (2011). accessed January 2011. [Cohen, 2008] Cohen, B. (2008). January 2011. http://www.tribler.org/trac/wiki/LivePlaylists,

http://bittorrent.org/beps/bep_0003.html, accessed

[Corp, 2008] Corp, S. (2008). Analysis of trafc demographics in broadband networks. IETF Workshop on Peer-to-Peer Infrastructure (P2Pi). [David et al., 2002] David, P., COBB, J., and KORPELA, E. (2002). SETI@ home: An Experiment in Public-Resource Computing. Communications of the, 45(11). [Deaconescu et al., 2009] Deaconescu, R., Milescu, G., Aurelian, B., Rughinis, R., and Tapus, N. (2009). A virtualized infrastructure for automated bittorrent performance testing and evaluation. International Journal on Advances in Systems and Measurements, vol. 2(no. 2 and 3):236247. [Deaconescu et al., 2011] Deaconescu, R., Milescu, G., and Tapus, N. (2011). , , Simulating Connection Dropouts in BitTorrent Environments. IEEE International Conference on Computer as a Tool - EUROCON2011, Lisbon, Portugal, pages 14. [Delaviz et al., 2010] Delaviz, R., Andrade, N., and Pouwelse, J. A. (2010). Improving Accuracy and Coverage in an Internet-Deployed Reputation Mechanism. IEEE. [DHT, 2011] DHT (2011). http://en.wikipedia.org/wiki/Distributed_hash_table, accessed January 2011. [Dinger and Waldhorst, 2009] Dinger, J. and Waldhorst, O. (2009). Decentralized Bootstrapping of P2P Systems: A Practical View, volume 5550 of Lecture Notes in Computer Science, pages 703715715. Springer Berlin Heidelberg, Berlin, Heidelberg. [Dinh et al., 2008] Dinh, T. T. A., Theodoropoulos, G., and Minson, R. (2008). Evaluating large scale distributed simulation of p2p networks. In DS-RT 08: Proceedings of the 2008 12th IEEE/ACM International Symposium on Distributed Simulation and RealTime Applications, pages 5158, Washington, DC, USA. IEEE Computer Society. [Gadea, 2011] Gadea, L. (2011). http://engineering.twitter.com/2010/07/murder-fastdatacenter-code-deploys.html, accessed January 2011. [Garbacki et al., 2006] Garbacki, P., Iosup, A., Epema, D., and van Steen, M. (2006). 2Fast : Collaborative Downloads in P2P Networks. Sixth IEEE International Conference on Peer-to-Peer Computing (P2P06), pages 2330.

BIBLIOGRAPHY

112

[Ghit et al., 2010] Ghit, B., Pop, F., and Cristea, V. (2010). Epidemic-Style Global Load Monitoring in Large-Scale Overlay Networks. In 2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pages 393398. IEEE. [Grishchenko and Bakker, 2010] Grishchenko, V. and Bakker, A. (2010). The generic multiparty transport protocol (swift), ietf internet draft. http://tools.ietf.org/html/draftgrishchenko-ppsp-swift-03, accessed September 2011. [Gummadi et al., 2003] Gummadi, K. P., Dunn, R. J., Saroiu, S., Gribble, S. D., H. M., and Zahorjan, J. (2003). Measurement, modeling, and analysis of a to-peer le-sharing workload. In SOSP 03: Proceedings of the nineteenth symposium on Operating systems principles, pages 314329, New York, NY, ACM. Levy, peerACM USA.

[hrktorrent, 2011] hrktorrent (2011). http://50hz.ws/hrktorrent/, accessed January 2011. [Iosup et al., 2005] Iosup, A., Garbacki, P., Pouwelse, J. A., and Epema, D. H. (2005). Correlating Topology and Path Characteristics of Overlay Networks and the Internet. [ipoque, 2009] ipoque (2009). ipoque internet studies. http://www.ipoque.com/resources/internet-studies/, accessed January 2011. [ISC, 2010] ISC (2010). Internet host count https://www.isc.org/solutions/survey/history, accessed January 2011. history.

[Istin et al., 2010] Istin, M.-D., Visan, A., Pop, F., and Cristea, V. (2010). SOPSys: Self-Organizing Decentralized Peer-to-Peer System Based on Well Balanced Multi-Way Trees. In 2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pages 369–374. IEEE.

[Jacobson et al., 2009a] Jacobson, V., Smetters, D. K., Briggs, N. H., Plass, M. F., Stewart, P., Thornton, J. D., and Braynard, R. L. (2009a). VoCCN. ACM Press, New York, New York, USA.

[Jacobson et al., 2009b] Jacobson, V., Smetters, D. K., Thornton, J. D., Plass, M. F., Briggs, N. H., and Braynard, R. L. (2009b). Networking named content. In Proceedings of the 5th international conference on Emerging networking experiments and technologies - CoNEXT '09, page 1, New York, New York, USA. ACM Press.

[Katsaros et al., 2009] Katsaros, K., Kemerlis, V., Stais, C., and Xylomenos, G. (2009). A BitTorrent module for the OMNeT++ simulator. IEEE.

[Kaufman, 2009] Kaufman, M. (2009). Peer-to-peer on the flash platform with rtmfp. Technical report, Adobe MAX 2009.

[Korpela et al., 2001] Korpela, E., Werthimer, D., Anderson, D., Cobb, J., and Lebofsky, M. (2001). SETI@home-massively distributed computing for SETI. Computing in Science & Engineering, 3(1):78–83.

[Kreitz and Niemela, 2010] Kreitz, G. and Niemela, F. (2010). Spotify - Large Scale, Low Latency, P2P Music-on-Demand Streaming. In 2010 IEEE Tenth International Conference on Peer-to-Peer Computing (P2P), pages 1–10. IEEE.

[Kubiatowicz, 2003] Kubiatowicz, J. (2003). Extracting guarantees from chaos. Commun. ACM, 46(2):33–38.

[libtorrent (Rasterbar), 2011] libtorrent (Rasterbar) (2011). http://www.rasterbar.com/products/libtorrent/, accessed January 2011.

[Loewenstern, 2008] Loewenstern, A. (2008). Bittorrent protocol specification: Bittorrent enhancement proposal dht protocol. http://bittorrent.org/beps/bep_0005.html, accessed January 2011.

[Lukasik, 2010] Lukasik, S. (2010). Why The ARPANET Was Built. IEEE Annals of the History of Computing, (99):11.

[Luo et al., 2010] Luo, Q., Li, Y., Dong, W., Liu, G., and Mao, R. (2010). A Novel Model and a Simulation Tool for Churn of P2P Network. IEEE.

[LXC, 2011] LXC, L. C. (2011). http://lxc.sourceforge.net/, accessed January 2011.

[Maymounkov and Mazières, 2002] Maymounkov, P. and Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. pages 53–65.

[Meulpolder et al., 2010] Meulpolder, M., D'Acunto, L., Capotă, M., Wojciechowski, M., Pouwelse, J. A., Epema, D. H. J., and Sips, H. J. (2010). Public and private BitTorrent communities: a measurement study. In Proceedings of the 9th international conference on Peer-to-peer systems, IPTPS'10, pages 10–10, Berkeley. USENIX Association.

[Meulpolder et al., 2009] Meulpolder, M., Pouwelse, J., Epema, D., and Sips, H. (2009). BarterCast: A practical approach to prevent lazy freeriding in P2P networks. IEEE.

[Milescu et al., 2010] Milescu, G., Bardac, M., and Tapus, N. (2010). Swarm metrics in peer-to-peer systems. 9th RoEduNet IEEE International Conference, Sibiu, Romania, pages 276–281.

[Milescu et al., 2011] Milescu, G., Deaconescu, R., and Tapus, N. (2011). Versatile Configuration and Deployment of Realistic Peer-to-Peer Scenarios. The Seventh International Conference on Networking and Services - ICNS2011, Venice/Mestre, Italy, pages 262–267.

[Naicken et al., 2007] Naicken, S., Livingston, B., Basu, A., Rodhetbhai, S., Wakeman, I., and Chalmers, D. (2007). The state of peer-to-peer simulators and simulations. SIGCOMM Comput. Commun. Rev., 37(2):95–98.

[ns 2, 2011] ns 2, T. N. S. (2011). http://www.isi.edu/nsnam/ns/, accessed January 2011.

[OpenVZ, 2011] OpenVZ (2011). http://wiki.openvz.org/, accessed January 2011.

[Ou et al., 2010] Ou, Z., Harjula, E., Kassinen, O., and Ylianttila, M. (2010). Performance evaluation of a Kademlia-based communication-oriented P2P system under churn. Computer Networks, 54(5):689–705.

[P2P-Next, 2011] P2P-Next (2011). http://www.p2p-next.org/, accessed January 2011.

[PlanetLab, 2011] PlanetLab (2011). http://www.planet-lab.org/, accessed January 2011.

[Pouwelse et al., 2005] Pouwelse, J., Garbacki, P., Epema, D., and Sips, H. (2005). The Bittorrent P2P File-Sharing System: Measurements and Analysis. Peer-to-Peer Systems IV, pages 205–216.

[Pouwelse et al., 2008a] Pouwelse, J., Yang, J., Meulpolder, M., Epema, D., and Sips, H. (2008a). Buddycast: an operational peer-to-peer epidemic protocol stack. Fourteenth Annual Conference of the Advanced School for Computing and Imaging.

[Pouwelse et al., 2008b] Pouwelse, J. A., Garbacki, P., Wang, J., Bakker, A., Yang, J., Iosup, A., Epema, D. H. J., Reinders, M., van Steen, M. R., and Sips, H. J. (2008b). TRIBLER: a social-based peer-to-peer system: Research Articles. Concurrency and Computation: Practice & Experience, 20(2):127–138.

[Rao et al., 2010] Rao, A., Legout, A., and Dabbous, W. (2010). Can realistic bittorrent experiments be performed on clusters? In Peer-to-Peer Computing (P2P), 2010 IEEE Tenth International Conference on, pages 1–10.

[Rieche et al., 2004] Rieche, S., Wehrle, K., Landsiedel, O., Gotz, S., and Petrak, L. (2004). Reliability of data in structured peer-to-peer systems. In HOT-P2P '04: Proceedings of the 2004 International Workshop on Hot Topics in Peer-to-Peer Systems, pages 108–113, Washington, DC, USA. IEEE Computer Society.

[Rossi et al., 2009] Rossi, D., Mellia, M., and Meo, M. (2009). Understanding Skype signaling. Computer Networks, 53(2):130–140.

[Routing and HOWTO, 2011] Routing, L. A. and HOWTO, T. C. (2011). http://lartc.org/, accessed January 2011.

[Rughinis et al., 2011] Rughinis, R., Milescu, G., Bardac, M., and Tapus, N. (2011). Optimization of Performance Monitoring and Attack Detection in All Optical Networks. UPB Buletin Stiintific Seria C, Bucharest, Romania, 73/1:3–12.

[Sandvine-Incorporated, 2011a] Sandvine-Incorporated (Spring 2011a). Global internet phenomena spotlight, europe, fixed access. Technical report.

[Sandvine-Incorporated, 2011b] Sandvine-Incorporated (Spring 2011b). Global internet phenomena spotlight, north america, fixed access. Technical report.

[Saroiu et al., 2002a] Saroiu, S., Gummadi, K. P., Dunn, R. J., Gribble, S. D., and Levy, H. M. (2002a). An analysis of internet content delivery systems. SIGOPS Oper. Syst. Rev., 36(SI):315–327.

[Saroiu et al., 2002b] Saroiu, S., Gummadi, P. K., and Gribble, S. D. (2002b). A measurement study of peer-to-peer file sharing systems. In Proceedings of Multimedia Computing and Networking (MMCN) 2002.

[SETI@home, 2011] SETI@home (2011). http://setiathome.berkeley.edu/, accessed July 2011.


[Shalunov, 2011] Shalunov, S. (2011). Low extra delay background transport (LEDBAT), IETF Internet draft. http://tools.ietf.org/wg/ledbat/draft-ietf-ledbat-congestion, accessed September 2011.

[Sioutas et al., 2009] Sioutas, S., Papaloukopoulos, G., Sakkopoulos, E., Tsichlas, K., and Manolopoulos, Y. (2009). A novel distributed p2p simulator architecture: Dp2p-sim. In CIKM '09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 2069–2070, New York, NY, USA. ACM.

[Skype, 2011] Skype (2011). http://www.skype.com, accessed January 2011.

[Stutzbach and Rejaie, 2006] Stutzbach, D. and Rejaie, R. (2006). Understanding churn in peer-to-peer networks. In Proceedings of the 6th ACM SIGCOMM conference on Internet measurement, IMC '06, pages 189–202, New York, NY, USA. ACM.

[Swarmplayer, 2011a] Swarmplayer (2011a). http://swarmplayer.p2p-next.org, accessed January 2011.

[Swarmplayer, 2011b] Swarmplayer (2011b). http://techblog.wikimedia.org/2010/09/videolabs-p2p-next-community-cdn-for-video-distribution, accessed January 2011.

[trickle, 2011] trickle (2011). http://monkey.org/ marius/pages/?page=trickle, accessed January 2011.

[Visan et al., 2010] Visan, A., Istin, M., Pop, F., Xhafa, F., and Cristea, V. (2010). Peer Interest-based Discovery for Decentralized Peer-to-Peer Systems. In 2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pages 363–368. IEEE.

[Vuze-Incorporated, 2011] Vuze-Incorporated (2011). http://www.vuze.com/, accessed January 2011.

[wiki.theory.org, 2011] wiki.theory.org (2011). http://wiki.theory.org/BitTorrentPeerExchangeConventions, accessed January 2011.

[Wu et al., 2010] Wu, D., Dhungel, P., Hei, X., Zhang, C., and Ross, K. W. (2010). Understanding Peer Exchange in BitTorrent Systems. IEEE.
