Acknowledgement
Apart from our own efforts, the success of this project depends largely on the
encouragement and guidance of many others. We take this opportunity to
express our gratitude to the people who have been instrumental in its successful
completion.
THANK YOU,
Professor
Head of department
Contents
S.No.  Topic
1.  Acknowledgement
2.  Certificates
3.  Abstract
4.  Motivation
5.
6.  Implementation
7.  Architecture
8.  Working
9.  Functions
10. Further Improvements
11. Implementation Results
12. Conclusion
13. Source Code
14. References
Abstract
Peer-to-peer file sharing over a LAN makes use of the redundant copies of a file
present on the LAN to download it. The client downloads the file in parts from
the systems that possess it, thereby taking advantage of the file being present on several systems.
Unlike the traditional file sharing system, which uses the File Transfer
Protocol (FTP), the peer-to-peer file sharing system uses the Hypertext Transfer
Protocol (HTTP). Using HTTP instead of FTP has several advantages, some of which are:
1) HTTP supports pipelining :- A client can request another file while the
previous file is still being retrieved, so several requests can be outstanding on one
connection at the same time.
2) HTTP includes headers :- HTTP sends headers with every file transfer, whereas FTP has
none. These headers are an overhead for HTTP, and they account for a significant
share of the traffic when small files are sent.
3) FTP requires two connections :- Another disadvantage of FTP compared with HTTP is that it
requires two connections: one for control commands and another
for the actual transfer of the files.
4) Download limit :- Both protocols support downloads, but
downloading files larger than 2 GB over FTP has been known to cause errors.
5) Persistent connection :- HTTP can keep a single connection to the server open
and reuse it for any number of transfers, whereas FTP sets up a new
data connection for each file transfer, which hurts performance because the
connection handshake is repeated every time.
Motivation
The traditional file sharing system uses the File Transfer Protocol (FTP)
to share files between systems. It does not make use of the
redundant copies of a file present on different systems, and a file can only be downloaded as a single
large file. If the download cannot be completed in one go, the entire file has to be
downloaded again from the beginning.
Unlike the traditional file sharing system, a distributed file sharing system
takes advantage of the file being located on different systems by downloading it in parts.
The motivation behind the project was therefore that a file need not be
downloaded as a single unit: it can be downloaded from different systems at the same time as
different parts of the same file. The systems possessing the file each deliver a different part,
and the client assembles these parts into the entire file.
Distributed file sharing leads to better and more efficient
utilisation of bandwidth, and the time required to download a file comes down because the file
is downloaded in parts rather than as a whole. Another advantage is that
if the connection to any system breaks, only the
part that was being downloaded from that system has to be downloaded again at some other time.
There is no need to download the entire file again from all the systems, as there is in the
traditional file sharing system.
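The idea of downloading a file in parts and retrying only a failed part can be sketched as follows. This is a minimal illustration in Python 3 (the project listings use Python 2); the names `part_ranges` and `assemble` are hypothetical, and giving the leftover bytes to the last part is an assumption, not the project's documented scheme.

```python
def part_ranges(file_size, num_parts):
    """Return (start, end) byte offsets, end exclusive, one per part."""
    chunk = file_size // num_parts
    ranges = []
    for index in range(num_parts):
        start = index * chunk
        # The last part also takes the bytes left over by the division.
        end = file_size if index == num_parts - 1 else start + chunk
        ranges.append((start, end))
    return ranges


def assemble(parts):
    """Concatenate downloaded parts (dict: part index -> bytes) in order."""
    return b"".join(parts[index] for index in sorted(parts))


data = b"abcdefghij"                   # stand-in for a shared file
ranges = part_ranges(len(data), 3)     # pretend three peers hold the file
parts = {i: data[s:e] for i, (s, e) in enumerate(ranges)}

# Simulate a dropped connection: only part 1 is lost and fetched again.
del parts[1]
start, end = ranges[1]
parts[1] = data[start:end]

rebuilt = assemble(parts)
```

Only the byte range of the failed part is requested a second time; the other parts stay untouched.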
One of the major motivating factors for the distributed file sharing system
was that similar applications do not exist on the most widely used platform, the PC. Our
implementation is for the Linux platform. We expect greater reach and use of distributed
file sharing on this platform simply because of its immense presence all over the world. We
saw distributed file sharing as a product that can reach out to a lot of people, and this encouraged us
throughout the project.
Peer-to-peer file sharing helps in the distribution of digital media using peer-to-peer
(P2P) network technology. P2P file sharing enables users to access files such as books, music,
movies, and games using a P2P software program that searches for other connected computers on a
P2P network to locate the desired content. The nodes (peers) of such networks are end-user computer
systems that are interconnected via the Internet.
Peer-to-peer file sharing technology has evolved through several design
stages from the early networks like Napster, which popularized the technology, to the later models
like the BitTorrent protocol.
Several factors contributed to the widespread adoption and facilitation of
peer-to-peer file sharing. These included increasing Internet bandwidth, the widespread digitization of
physical media, and the increasing capabilities of residential personal computers. Users were able to
transfer either one or more files from one computer to another across the Internet through various file
transfer systems and other file-sharing networks.
A peer-to-peer system is a collection of peer nodes that act both as servers and as clients:
1) They provide resources to other peers.
2) They consume resources from other peers.
Characteristics :-
1) Freedom of information
2) Freedom of scale
Centralized Topology
The idea of a centralized topology is similar to the traditional client/server model. A centralized
server manages the file and user databases of many peers. Every time the application is
launched, the client informs the server of its current IP address and the names of all the files
it has to share. The centralized server uses this information
to build a database that maps each file to the IP addresses holding it.
The server runs a quick search on its locally maintained database for every query sent to it by the
clients. On a match, a link is established between the two peers. The file to be shared is never placed
on the central server. SETI@home and Napster are examples of this type of topology.
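The file-to-IP mapping kept by the central server can be sketched as below. This is a minimal in-memory illustration in Python 3, not the project's MySQL-backed tracker; the class `CentralIndex` and its method names are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """In-memory stand-in for the central server's file-to-IP database."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, ip, filenames):
        """Called when a peer starts: record every file it shares."""
        for name in filenames:
            self._peers_by_file[name].add(ip)

    def unregister(self, ip):
        """Forget a peer that has left the network."""
        for peers in self._peers_by_file.values():
            peers.discard(ip)

    def search(self, filename):
        """Return the sorted IPs of all peers that hold the file."""
        return sorted(self._peers_by_file.get(filename, ()))


index = CentralIndex()
index.register("10.0.0.1", ["a.mp4", "b.txt"])
index.register("10.0.0.2", ["a.mp4"])
holders = index.search("a.mp4")
```

The server only answers the search; the file itself is then transferred directly between the peers.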
Ring Topology
The disadvantage of a centralized topology is that the central server can
become a bottleneck (when the load becomes heavy) and a single point of failure. A ring topology is
instead composed of a group of systems arranged in the form of a ring to act as a distributed server.
The systems in this group work together to provide efficient load balancing. This topology is generally
used when all the systems are located close together, for example within a single institution where
security is not a concern.
Hierarchical Topology
Hierarchical systems have existed for a very long time. Many everyday
systems and organisations function in a hierarchical manner, and nowadays many Internet
systems do so as well. The best example on the Internet is the
Domain Name System (DNS): authority flows from the root name servers to the servers of the
registered name and so on. This topology is very suitable for systems that require a form of
governance that involves delegation of rights or authority. Certification Authorities (CAs), which
validate the identity of an entity on the Internet, also employ the hierarchical topology. The top-level CA
can grant some of its authoritative rights to lower-level CAs that subscribe to it, so that those CAs can
in turn provide credentials to the entities beneath them.
Decentralized Topology
In a pure P2P system, no centralized servers are present. All peers are equal, which creates a flat, free
network topology. In order to join the network, a peer must first contact a bootstrapping node (a node
that is always online), which gives the joining peer the IP address of one or more existing peers,
officially making it part of the ever-changing network. Each peer, however, only has information
about its neighbours, which are the peers that have a direct edge to
it in the network.
Since there are no servers to manage searches, queries for files are flooded through the network.
Query flooding is not the best solution, as it causes a large amount of overhead traffic in the
network.
An example of an application that uses this model is Gnutella. Details of how it searches and shares
files in a pure P2P network will be discussed in the Gnutella section.
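The cost of query flooding can be made concrete with a small simulation. The sketch below is written in Python 3 and is only an illustration: the function `flood_query`, the four-peer graph and the hop limit (`ttl`) are assumptions chosen for the example, not details taken from Gnutella or from this project.

```python
from collections import deque


def flood_query(neighbours, files, origin, wanted, ttl):
    """Flood a query from `origin`; return (peers holding the file, messages sent)."""
    seen = {origin}
    hits = []
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue
        for nxt in neighbours[peer]:
            messages += 1          # every forwarded copy costs traffic
            if nxt not in seen:    # duplicate copies are dropped on arrival
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits, messages


neighbours = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
files = {"D": {"song.mp3"}}
result = flood_query(neighbours, files, "A", "song.mp3", ttl=2)
```

In this four-peer network a single query already produces eight messages to find one holder, which illustrates the overhead described above.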
Implementation
SQL Instance:
Functions
Show
Query
3. Send filename :- The client responds to the tracker by sending the name of the file.
4. Check filename :- The tracker checks the received filename in the database. The
database lists all the IPs which have the file requested by the client.
5. Send the list :- The database returns the list of IPs that contain that particular file.
6. Parse and make IP.txt :- The tracker parses the list received from the database into a
.txt file.
7. Send IP.txt to the client :- The IP.txt file is sent from the tracker to the client.
8. Pass the list to the threaded client :- The client requests the file from each IP in the
list and downloads the parts in separate threads.
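Steps 4 to 8 above can be sketched end to end as follows. This is a minimal Python 3 illustration under several assumptions: an in-memory `sqlite3` database stands in for the project's MySQL database, the column names `size` and `path` are guesses (only the table `ipname` and the columns `ip` and `name` appear in the listings), and `fetch_part` is a placeholder for the real socket download.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# sqlite3 in memory stands in for the MySQL database used by the project.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ipname (ip TEXT, name TEXT, size INTEGER, path TEXT)")
db.executemany(
    "INSERT INTO ipname VALUES (?, ?, ?, ?)",
    [
        ("192.168.0.2", "mov.mp4", 900, "/share/mov.mp4"),
        ("192.168.0.3", "mov.mp4", 900, "/data/mov.mp4"),
        ("192.168.0.4", "notes.txt", 12, "/share/notes.txt"),
    ],
)


def ip_list_text(filename):
    """Steps 4-6: look the file up and render one 'ip path' line per peer."""
    rows = db.execute(
        "SELECT ip, path FROM ipname WHERE name = ? ORDER BY ip", (filename,)
    ).fetchall()
    return "".join("%s %s\n" % (ip, path) for ip, path in rows)


def plan_downloads(text):
    """Step 8: parse the list and give each peer one part number."""
    peers = [line.split(" ", 1) for line in text.splitlines() if line]
    total = len(peers)
    return [(ip, path, part, total) for part, (ip, path) in enumerate(peers, 1)]


def fetch_part(job):
    """Placeholder for the real socket download of one part."""
    ip, path, part, total = job
    return "part %d/%d of %s from %s" % (part, total, path, ip)


# Step 7 would send the text to the client; here it is passed directly.
jobs = plan_downloads(ip_list_text("mov.mp4"))
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    results = list(pool.map(fetch_part, jobs))
```

Each peer in the list is asked for a different part, so the number of threads equals the number of peers that hold the file.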
Results
Further Improvements
Many firewalls do not allow connections to ports other than 80 or 443
(HTTP and HTTPS). FTP may or may not be allowed, quite apart from the complications of its active and passive (PASV) modes.
Furthermore, HTTP allows for much better partial requests ("only send from
byte 123456 to the end of the file"), conditional requests, caching and content compression (gzip).
HTTP is much easier to use through a proxy.
HTTP is easier to make work over dropped, slow or flaky connections; e.g. there is no
need to (re)establish a login session before (re)initiating a transfer.
On the other hand, HTTP has some drawbacks which limit its use in file
transfer. First of all, HTTP is stateless, so we have to handle authentication and build a trail of "who
did what when" ourselves.
The only difference in speed arises when transferring lots of small files, where HTTP with
pipelining is faster (it reduces round trips, which is especially noticeable on high-latency networks).
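The partial request quoted above ("only send from byte 123456 to the end of the file") corresponds to the HTTP `Range` request header. The sketch below, in Python 3, builds such a header value and works out which bytes a server would send for it. The helper names `range_header` and `resolve_range` are hypothetical, and only the two simplest single-range forms are handled.

```python
def range_header(start, end=None):
    """Build a Range header value; `end` is inclusive, None means end of file."""
    if end is None:
        return "bytes=%d-" % start
    return "bytes=%d-%d" % (start, end)


def resolve_range(value, file_size):
    """Return the (first, last) byte positions to send, both inclusive.

    Handles only the single 'bytes=first-' and 'bytes=first-last' forms.
    """
    unit, _, spec = value.partition("=")
    if unit != "bytes":
        raise ValueError("unsupported range unit: %r" % unit)
    first_text, _, last_text = spec.partition("-")
    first = int(first_text)
    last = int(last_text) if last_text else file_size - 1
    last = min(last, file_size - 1)   # never read past the end of the file
    if first > last:
        raise ValueError("range not satisfiable")
    return first, last


resume_value = range_header(123456)             # resume an interrupted download
resume_span = resolve_range(resume_value, 200000)
```

A client that already holds the first 123456 bytes of a 200000-byte file would therefore ask only for the remaining bytes.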
# 2.1 show - reply with the list of files which the server has
####################################################################
from detail_ip import *
from list_of_file import *
from db_update import *
from sendfile import sendfile
import socket
import os
####################################################################
# common_send - send the size of a file, then the file itself.
# Only the tail of this function survives in the listing; the head
# is restored from the identical send loop in s.py.
def common_send(filename, sock):
    file = open(filename, "rb")
    blocksize = os.path.getsize(filename)
    sock.send(str(blocksize))
    offset = 0
    while True:
        sent = sendfile(sock.fileno(), file.fileno(), offset, blocksize)
        if sent == 0:
            break  # EOF
        offset += sent
    file.close()
####################################################################
sock_obj = socket.socket()
host = socket.gethostname()
port = 8887  # port reserved for the service
sock_obj.bind((host, port))
sock_obj.listen(50)  # wait for client connections
while True:
    sock, addr = sock_obj.accept()  # establish a connection with a client
    testcase = sock.recv(1024)
    testcase = testcase.strip()
    while True:
        if testcase == 'show':
            # generate list of files
            db_file()
            # send complete file (curfile.txt)
            common_send('curfile.txt', sock)
        elif testcase == 'query':
            # receive query string
            fil = sock.recv(65536)
            # generate details of ip
            ip_file(fil)
            # send to client the complete file fil.txt
            common_send(fil + '.txt', sock)
        elif testcase == 'exit':
            # exit from loop
            break
        else:
            sock.send('Enter valid input!!!')
        testcase = sock.recv(1024)
        testcase = testcase.strip()
    # Close the connection
    sock.close()
client.py
##############################################################
# This program acts as a client program which communicates with
# other clients as well as with the tracker. It has different use
# cases to communicate with the tracker and a different mode to
# communicate with other clients so that files can be transferred
# between clients.
# It is also responsible for sending the list of files in the
# current working directory.
# Functions used -
# 1. get_sourcefileapths - walks the directory and generates
#    the list of files in the
#
import socket
import os
from sendfile import sendfile
from thread_download import llel_download
# Function definitions
##############################################################
def get_sourcefileapths(path):
    file_paths = []  # list which will store all of the full file paths
    # Walk the directory tree and record the full path of every file.
    for root, directories, files in os.walk(path):
        for filename in files:
            file_paths.append(os.path.join(root, filename))
    return file_paths  # Self-explanatory.
##############################################################
def common_recv(filename, sock_obj):
    # The sender first sends the file size as a decimal string.
    filesize = int(sock_obj.recv(1024))
    f = open(filename, 'wb')
    totalRecv = 0
    # Keep reading until the announced number of bytes has arrived.
    while totalRecv < filesize:
        data = sock_obj.recv(65536)
        if not data:
            break  # connection closed early
        totalRecv += len(data)
        f.write(data)
    print "Download Complete!"
    f.close()
##############################################################
sock_obj = socket.socket()
host = socket.gethostname()
port = 8887  # port of the tracker service
sock_obj.connect((host, port))
---
" + str(os.path.getsize(f))
files.close()
# use cases
# show list
# query file
# undefined
# exit
testcase = raw_input()
while 1:
    if testcase == 'show':
        sock_obj.send(testcase)
        # receive complete file curfile.txt
        common_recv('currentfile.txt', sock_obj)
        # display the list
        show = open('currentfile.txt', 'r')
        for line in show:
            print line
        show.close()
    elif testcase == 'query':
        sock_obj.send(testcase)
        fil = raw_input('Enter a file name:-\n')
        sock_obj.send(fil)
        os.system('./mkdir ' + fil)
        common_recv(fil + '.txt', sock_obj)
        llel_download(fil + '.txt')
        print "Download complete!!!"
    elif testcase == 'exit':
        sock_obj.send(testcase)
        break
    else:
        sock_obj.send(testcase)
        error = sock_obj.recv(1024)
        print error
    testcase = raw_input()
end_message = sock_obj.recv(65536)
print end_message
sock_obj.close()
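The listings send the file size as a text string and then the file bytes on the same TCP stream, so a single `recv` may return the size and the first bytes of the file together. One common way to avoid this ambiguity is a fixed-width length prefix. The following is a minimal Python 3 sketch of that alternative, not the protocol used by the project; the 8-byte header and the helper names `send_blob`, `recv_exact` and `recv_blob` are assumptions.

```python
import socket
import struct

HEADER = struct.Struct("!Q")   # assumed framing: 8-byte big-endian length


def send_blob(sock, payload):
    """Send the payload length as a fixed-size header, then the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes, however TCP happens to split them."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_blob(sock):
    """Read one length-prefixed payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


# A connected socket pair stands in for the client and the tracker.
left, right = socket.socketpair()
send_blob(left, b"files.txt contents")
send_blob(left, b"")
first = recv_blob(right)
second = recv_blob(right)
left.close()
right.close()
```

Because the receiver always knows how many bytes belong to the current message, two messages sent back to back are separated correctly, and an empty file needs no special case.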
s.py
####################################################################
# This program acts as the tracker, which maintains the list of files
# that the network contains. And provides
# It has two functions
# 1. get_sourcefileapths - walks the directory and generates the
#    list of files in the
#
# 2.1 show - reply with the list of files which the server has
####################################################################
import os
import socket
from sendfile import sendfile
import time
# Function definitions
####################################################################
def get_sourcefileapths(path):
    file_paths = []  # list which will store all of the full file paths
    # Walk the directory tree and record the full path of every file.
    for root, directories, files in os.walk(path):
        for filename in files:
            file_paths.append(os.path.join(root, filename))
    return file_paths  # Self-explanatory.
####################################################################
def main():
    cwd = os.getcwd()
    # generate list of files
    full_file_paths = get_sourcefileapths(cwd)
---
" + str(os.path.getsize(f))
files.close()
    host = socket.gethostname()
    port = 8887
    s = socket.socket()
    s.bind((host, port))
    s.listen(5)
    print "########################################Tracker Started.#######################"
    while 1:
        sock, addr = s.accept()
        print addr
        print "###########################################################################"
        testcase = sock.recv(1024)
        while True:
            if testcase == 'show':
                filename = 'files.txt'
                file = open(filename, "rb")
                blocksize = os.path.getsize(filename)
                sock.send(str(blocksize))
                offset = 0
                while True:
                    sent = sendfile(sock.fileno(), file.fileno(), offset, blocksize)
                    if sent == 0:
                        break
                    offset += sent
                file.close()
                print 'transfer of list complete'
            elif testcase == 'query':
                print 'receive filename'
                fil = sock.recv(65536)
                print fil
                filename = fil
                file = open(filename, "rb")
                blocksize = os.path.getsize(filename)
                print blocksize
                sock.send(str(blocksize))
                time.sleep(.1)
                offset = 0
                while True:
                    sent = sendfile(sock.fileno(), file.fileno(), offset, blocksize)
                    if sent == 0:
                        break
                    offset += sent
                file.close()
                print 'transfer of file complete'
            elif testcase == 'exit' or not testcase:
                # an empty string means the client has closed the connection
                break
            else:
                sock.send('Enter valid input!!!')
            testcase = sock.recv(1024)
        print addr
        print 'disconnected'
        sock.close()
if __name__ == '__main__':
    main()
c.py
####################################################################
# This program acts as a client program which communicates with
# other clients as well as with the tracker. It has different use
# cases to communicate with the tracker and a different mode to
# communicate with other clients so that files can be transferred
# between clients.
# It is also responsible for sending the list of files in the current
# working directory.
# The different use cases are as follows
# 1-show  - ask the tracker for the list of files currently available with it
# 2-query - ask the tracker for a particular file
# 3-exit  - ends the client program
import socket
import os
from sendfile import sendfile
sock_obj = socket.socket()
host = socket.gethostname()
port = 8887
sock_obj.connect((host, port))
testcase = raw_input('Enter the option:-\n')
while 1:
    if testcase == 'show':
        sock_obj.send(testcase)
        # receive complete file curfile.txt
        filename = 'currentfile.txt'
        filesize = int(sock_obj.recv(1024))
        f = open(filename, 'wb')
        totalRecv = 0
        while totalRecv < filesize:
            data = sock_obj.recv(65536)
            if not data:
                break  # tracker closed the connection
            totalRecv += len(data)
            f.write(data)
        print "Download of list Complete!"
        f.close()
        # display the list
        show = open('currentfile.txt', 'r')
        for line in show:
            print line
        show.close()
        os.system('rm currentfile.txt')
    elif testcase == 'query':
        sock_obj.send(testcase)
        fil = raw_input('Enter a file name:-\n')
        sock_obj.send(fil)
        filename = fil
        data = sock_obj.recv(100)
        print data
        filesize = int(data)
        f = open(filename, 'wb')
        totalRecv = 0
        while totalRecv < filesize:
            data = sock_obj.recv(65536)
            if not data:
                break  # tracker closed the connection
            totalRecv += len(data)
            f.write(data)
        f.close()
    elif testcase == 'exit':
        sock_obj.send(testcase)
        break
    else:
        sock_obj.send(testcase)
        error = sock_obj.recv(1024)
        print error
    testcase = raw_input('Enter the option:-\n')
end_message = sock_obj.recv(65536)
print end_message
sock_obj.close()
file_client.py
####################################################################
# this program is used to download a file from a tcp server
# provided the name of the file present on the server
# it uses the filesend library of
import socket
def Main():
    host = '192.168.106.255'
    port = 3338
    s = socket.socket()
    s.connect((host, port))
    data = s.recv(1024)
    #print int(data[0:])
    #s.close()
    filesize = int(data)
    #print filesize
    f = open('new_.mp4', 'wb')
    totalRecv = 0
    while totalRecv < filesize:
        data = s.recv(65536)
        if not data:
            break  # server closed the connection
        totalRecv += len(data)
        f.write(data)
        #print totalRecv
        #print "{0:.2f}".format((totalRecv/float(filesize))*100) + "% Done"
    print "Download Complete!"
    f.close()
    s.close()
if __name__ == '__main__':
    Main()
File_server.py
####################################################################
# this program is used to run as a server and uses sendfile
# to communicate with the client to send a file to it.
# Functions:
# 1-main - it contains the whole body of the program containing the
#   logic and functionality
####################################################################
'''import socket
import threading
import os
'''
'''
def RetrFile(name, sock):
    filename = sock.recv(1024)
    if os.path.isfile(filename):
        sock.send("EXISTS " + str(os.path.getsize(filename)))
        userResponse = sock.recv(1024)
        if userResponse[:2] == 'OK':
            with open(filename, 'rb') as f:
                bytesToSend = f.read(1024)
                sock.send(bytesToSend)
                while bytesToSend != "":
                    bytesToSend = f.read(1024)
                    sock.send(bytesToSend)
    else:
        sock.send("ERR ")
    sock.close()
'''
'''
filename='mov.mp4'
def Main():
    host = '192.168.102.14'
    port = 5000
    s = socket.socket()
    s.bind((host,port))
    s.listen(5)
if __name__ == '__main__':
    Main()
'''
import os
import socket
from sendfile import sendfile
filename = 'files.txt'
file = open(filename, "rb")
blocksize = os.path.getsize(filename)
host = '192.168.137.219'
port = 5001
s = socket.socket()
s.bind((host, port))
s.listen(5)
sock, addr = s.accept()    # wait for the client
sock.send(str(blocksize))  # the client reads the size first
offset = 0
while True:
    sent = sendfile(sock.fileno(), file.fileno(), offset, blocksize)
    if sent == 0:
        break  # EOF
    offset += sent
file.close()
sock.close()
list_of_file.py
####################################################################
# this program is used to return all the files available at the tracker
# basically it queries the central database to look for distinct
# filenames, writes them to a file and sends it to the client using filesend
cur_files=open('curfile.txt','w')
listfile.py
####################################################################
# this program is used to walk a directory and its subdirectories and
# generate the complete file paths.
# it also puts them in a text file so that it can be sent to the
# tracker
# Functions:
# 1.getfilepaths - make a file with all the subfiles present in it.
####################################################################
import os
# Function definitions
####################################################################
def get_sourcefileapths(path):
    file_paths = []  # list which will store all of the full file paths
    # Walk the directory tree and record the full path of every file.
    for root, directories, files in os.walk(path):
        for filename in files:
            file_paths.append(os.path.join(root, filename))
    return file_paths  # Self-explanatory.
####################################################################
# Run the above function and store its results in a variable.
full_file_paths = get_sourcefileapths("/home/bismith/fileshare")
#print full_file_paths
files = open('files.txt', 'w')
for f in full_file_paths:
print >>files,f + "
---
files.close()
"""
for f in full_file_paths:
    if f.endswith(".dat"):
        print f
"""
m_file_client.py
####################################################################
# this is a program to receive files from a multithreaded server
# it takes as input the host path name from the user
# and starts to download files from the server based on the peers list
# Functions:
# 1-main - parses the argument received from the user
# 2-fun  - the actual download client that takes host as input
####################################################################
import socket
import argparse
def fun(host):
    #host = '192.168.106.255'
    port = 5001
    s = socket.socket()
    s.connect((host, port))
    data = s.recv(1024)
    #print int(data[0:])
    #s.close()
    filesize = int(data)
    #print filesize
    f = open(host + '.txt', 'wb')
    totalRecv = 0
    while totalRecv < filesize:
        data = s.recv(65536)
        if not data:
            break  # server closed the connection
        totalRecv += len(data)
        f.write(data)
        #print totalRecv
        #print "{0:.2f}".format((totalRecv/float(filesize))*100) + "% Done"
    print "Download Complete!"
    f.close()
    s.close()
def Main():
    parser = argparse.ArgumentParser()
    parser.add_argument("host", help="The host to connect to")
    args = parser.parse_args()
    fun(args.host)
if __name__ == '__main__':
    Main()
m_file_server.py
####################################################################
# this program is used to download parts of files from different
# peers using a threaded architecture and uses sendfile to send
db_update
####################################################################
# this program is used to connect to mysqldb and update client data
# in the database
# it has different functions
cur = db.cursor()
####################################################################
# function to check if a file with the given ip already exists
####################################################################
def exists(ip1, name1):
    cur.execute("SELECT ip FROM ipname WHERE ip = %s and name=%s",
                (ip1, name1))
    return cur.fetchone() is not None
######################################################
# function defined for the descending FOR loop
######################################################
def my_range(start, end, step):
    while start >= end:
        yield start
        start -= step
####################################################################
# variables that need to be declared for global use
j = 0
size1 = 0
name1 = '0'
ath1 = '0'
####################################################################
# function to parse the txt file and update the database
####################################################################
def add_file(txt):
    global j, size1, name1, ath1
    ip1 = txt.strip()
    f = open(txt + '.txt', "r")  # txt is the name of the file
    for line in f:
        k = len(line)
        # scan backwards for the last space; the size follows it
        for i in my_range(k - 1, 0, 1):
            if line[i] == ' ':
                size1 = line[i + 1:k - 1].strip()
                j = i - 7
                break
        ath1 = line[0:j + 2].strip()
        # scan backwards for the last '/'; the file name follows it
        for i in my_range(j, 0, 1):
            if line[i] == '/':
                name1 = line[i + 1:j + 2].strip()
                break
        if not exists(ip1, name1):
            cur.execute("INSERT INTO ipname VALUES(%s, %s, %s,%s)",
                        (ip1, name1, size1, ath1))
            db.commit()
    f.close()
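The character-by-character scan in `add_file` can be expressed more compactly with string methods. The sketch below is a Python 3 illustration that assumes each line of the listing file has the form "path, whitespace, size"; the exact separator used by the project is not visible in the listing, and the helper name `parse_listing_line` is hypothetical.

```python
import posixpath


def parse_listing_line(line):
    """Split one 'path size' line into (name, size, path)."""
    # The size is whatever follows the last space on the line.
    path, size_text = line.rstrip("\n").rsplit(" ", 1)
    path = path.rstrip()
    # The file name is the last component of the (Linux) path.
    return posixpath.basename(path), int(size_text), path


record = parse_listing_line("/home/user/fileshare/my movie.mp4 1048576\n")
```

Splitting from the right keeps file names that contain spaces intact, because only the final space separates the path from the size.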
dbconnect.py
####################################################################
# this program is used to connect to the database and fetch all the
# results from the database
####################################################################
import MySQLdb
cursor = db.cursor()
detail_ip.py
####################################################################
# this program is used to generate the list of ips which contain the
# file
# this list is sent to the client demanding the file so that it can
# get the file in different parts from the different peers in parallel
# Functions:
# 1.ip_file - it is used to generate the list of ips by querying the
#   database
####################################################################
import MySQLdb
def ip_file(query):
    db = MySQLdb.connect(host="127.0.0.1",  # your host, usually 127.0.0.1
                         user="root",       # your username
                         passwd="9595",     # your password
                         db="distributed")  # name of the database
    cur = db.cursor()
    ################################################################
    # suppose name1 is the name of the file whose details are required
    ################################################################
    f = open(query, 'w')
    f.write(line + '\n')
    f.close()
    cur.close()
temp_client.py
import socket
import os
# Function definitions
####################################################################
def get_sourcefileapths(path):
    file_paths = []  # list which will store all of the full file paths
    # Walk the directory tree and record the full path of every file.
    for root, directories, files in os.walk(path):
        for filename in files:
            file_paths.append(os.path.join(root, filename))
    return file_paths  # Self-explanatory.
####################################################################
sock_obj = socket.socket()
host = socket.gethostname()
port = 12345
sock_obj.connect((host, port))
---
files.close()
sock_obj.close()
thread_download.py
####################################################################
# this program
import thread
import os
import socket
def download(filename, host, path, counter, total):
    port = 8888
    s = socket.socket()
    s.connect((host, port))
    s.send(path)
    s.send(str(counter))
    s.send(str(total))
    data = s.recv(1024)
    filesize = int(data)
    # each part is stored in the directory named after the file
    f = open('./' + filename + '/' + str(counter + 1), 'wb')
    totalRecv = 0
    while totalRecv < filesize:
        data = s.recv(65536)
        if not data:
            break  # peer closed the connection
        totalRecv += len(data)
        f.write(data)
    #print "Download Complete!"
    f.close()
    s.close()
def llel_download(filename):
    file = open(filename, "rb")
    counter = 0
    total = 0
thread_server.py
####################################################################
# this program is used to start separate
import os
import socket
from sendfile import sendfile
host = '192.168.137.219'
port = 8888
s = socket.socket()
s.bind((host, port))
s.listen(5)
while True:
    sock, addr = s.accept()
    filename = sock.recv(65536)
    part = int(sock.recv(65536))
    total = sock.recv(65536)
    partname = str(part + 1)
    # process the part
    os.system('./split -d -n ' + partname + '/' + total + ' ' + filename + '>' + partname)
    # using shell script
    filename = partname
    file = open(filename, "rb")
    blocksize = os.path.getsize(filename)
    sock.send(str(blocksize))
    offset = 0
    while True:
        sent = sendfile(sock.fileno(), file.fileno(), offset, blocksize)
        if sent == 0:
            break  # EOF
        offset += sent
    file.close()
    sock.close()
    # delete made part
    os.system('./rm ' + partname)
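The server above creates each part with an external `split` command and a temporary file. As an alternative sketch, the same part can be read directly from the original file with `seek`, which avoids the temporary file. This is a Python 3 illustration; the function name `read_part` and the rule that the last part receives the leftover bytes are assumptions, not the behaviour of the project's shell script.

```python
import os
import tempfile


def read_part(path, part, total):
    """Return the bytes of chunk `part` (1-based) out of `total` chunks."""
    size = os.path.getsize(path)
    chunk = size // total
    start = (part - 1) * chunk
    # The last chunk also takes the bytes left over by the integer division.
    length = size - start if part == total else chunk
    with open(path, "rb") as handle:
        handle.seek(start)
        return handle.read(length)


# A small temporary file stands in for a shared file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
pieces = [read_part(tmp.name, k, 3) for k in (1, 2, 3)]
os.remove(tmp.name)
```

Concatenating the pieces in order of their part number gives back the original file.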
Conclusion
Using the proposed software and methodology we have seen an improvement
of 6% in client-to-server upload speed and of 3% in server-to-client download speed.
The method also adds the ability to resume downloads. This is a
boon for large files because, instead of starting the download from the beginning, we
download only the missing part and concatenate it with the rest, obtaining the whole file with very little overhead.
Further improvements could be made at the packet layer so that each packet
carries very little overhead, instead of encapsulating the data in HTTP or FTP
packets. We could design packets that have only very basic encryption, because very little integrity
protection is required. Similar minor improvements could bring further gains
in speed as well as reductions in download time.