
Wednesday, September 21, 2005

How to cheat BitTorrent and why nobody does




Introduction

BitTorrent is currently “king” of the popular file-sharing clients. Some reports have claimed that it accounts for the majority of peer-to-peer traffic on the internet, others that Hollywood is doomed [10, 13]. Bram Cohen, the original inventor of BitTorrent, is now in high demand on the invited-talks circuit.

How BitTorrent works

BitTorrent works by having groups of users (called swarms) who are interested in downloading a single specific file (be it an MP3, DVD or executable file) coordinate and cooperate to speed up the process. Cohen describes how this works in some detail in his paper [3]. Here we give enough relevant details to allow us to develop our hypothesis.
To release a file on the BitTorrent network, one needs to create a specific description file (commonly called a “torrent file”) which contains the information clients need to prepare the download and join the swarm. For our purposes, the main information stored in this file is the address of the “tracker”.
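To make the role of the torrent file concrete, here is a minimal sketch of how a client might read the tracker address out of one. It assumes the usual layout in which the torrent file is a bencoded dictionary whose "announce" key holds the tracker URL. The function names bdecode and tracker_url are hypothetical, chosen for illustration, and the decoder skips the validation a real client would perform.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at offset i.

    Returns (value, offset_after_value). Illustrative only: no checks for
    sorted dictionary keys, leading zeros, or trailing data.
    """
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():                        # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at offset {i}")


def tracker_url(torrent_bytes: bytes) -> str:
    """Return the tracker address stored under the 'announce' key."""
    meta, _ = bdecode(torrent_bytes)
    return meta[b"announce"].decode("utf-8")
```

Given the raw bytes of a torrent file, tracker_url returns the address the client would contact to join the swarm. Everything else in the file (file names, piece hashes) is ignored here because it does not matter for the argument.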

Faking Identity in BitTorrent

In the context of BitTorrent, identity is signalled to the tracker and to other peers using a 20-byte string. A unique identity is generated by the client for each swarm it participates in. This identity is used for every interaction with the tracker and is sent to other peers during the handshake at the beginning of each connection. The tracker, when it returns a list of peers, sends the identity of each of them, in addition to the address and port to connect to.
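As an illustration of how little is involved in producing an identity, the sketch below generates a 20-byte identity and places it in a handshake message. The layout used (a length byte, the string "BitTorrent protocol", 8 reserved bytes, a 20-byte info hash, then the 20-byte identity) follows the commonly documented peer wire protocol. The helper names are hypothetical, and drawing the identity from os.urandom is an assumption; real clients often embed a client tag in the first bytes.

```python
import os

PSTR = b"BitTorrent protocol"   # protocol identifier string, 19 bytes
ID_LEN = 20                     # length of both the info hash and the identity


def new_peer_id() -> bytes:
    """Generate a fresh 20-byte identity (assumed here to be purely random)."""
    return os.urandom(ID_LEN)


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Assemble the 68-byte handshake sent at the start of a connection."""
    if len(info_hash) != ID_LEN or len(peer_id) != ID_LEN:
        raise ValueError("info_hash and peer_id must both be 20 bytes")
    reserved = bytes(8)         # eight zero bytes, reserved for extensions
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(msg: bytes):
    """Split a handshake into (info_hash, peer_id); raise if malformed."""
    expected_len = 1 + len(PSTR) + 8 + 2 * ID_LEN
    if len(msg) != expected_len or msg[0] != len(PSTR) or msg[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    return msg[28:48], msg[48:68]
```

Nothing in the message ties the identity to the sender: it is simply whatever 20 bytes the client chooses to put there.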

When a client initiates a connection, it checks whether the identity it receives in the handshake matches the one it has obtained from the tracker. If this is not the case, it drops the connection. The recipient of the connection, however, cannot perform this sort of checking as it has only a limited view of the peers in the swarm. Moreover, trackers do not provide an interface to perform online identity checks.
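The asymmetry between the two ends of a connection can be sketched as follows. This is a simplified model rather than the logic of any particular client: the dictionary mapping (address, port) to identity stands in for the peer list returned by the tracker, and both function names are hypothetical.

```python
def initiator_accepts(tracker_peers: dict, addr: tuple, handshake_id: bytes) -> bool:
    """Check performed by the side that opened the connection.

    tracker_peers maps (ip, port) -> identity as announced by the tracker.
    The connection is kept only if the identity in the handshake matches.
    """
    expected = tracker_peers.get(addr)
    return expected is not None and expected == handshake_id


def recipient_accepts(known_peers: dict, addr: tuple, handshake_id: bytes) -> bool:
    """Check available to the side that received the connection.

    The recipient holds only a partial peer list, so an unknown caller
    cannot be verified and is accepted as-is (the gap described above).
    """
    expected = known_peers.get(addr)
    if expected is None:
        return True             # nothing to compare against
    return expected == handshake_id
```

In this model a peer that connects outward with an arbitrary identity passes recipient_accepts whenever the recipient has no entry for it, which is the situation the text describes.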

A client that wanted to fake its own identity could do so very easily. As long as trackers do not allow online identity checks (based on IP addresses and ports), it is sufficient to have one identity to interact with the tracker and with the connecting peers, plus one distinct identity for every other peer it connects to. Given how permissive current implementations are, it is not even necessary to remember the identity used with a given peer: the (fake) identity could be created randomly for each new outgoing connection. Should trackers and (legitimate) clients become more cautious, it would be necessary to open a new port for each identity used and register with the tracker using these parameters. As long as trackers support clients connecting from behind a firewall, this subterfuge cannot be defeated.

Why does BitTorrent work?

Given that, currently at least, BitTorrent is “king” of the file-sharers and yet, it would seem, is so easy to cheat, why does it work? Also, as stated previously, the system relies to some extent on the pure altruism of “seeders” who have nothing to gain from continuing to serve the file. Why is there so much cooperation going on? Why doesn’t selfish behaviour swamp the system? One obvious answer is that people are simply more cooperative and altruistic than a worst-case economic rationality would suggest - that people don’t always act selfishly when they could.

Leaving Meta-Data to the Users

One aspect that sets BitTorrent apart from many other file-sharing systems is the way that meta-information concerning content is distributed. BitTorrent, in fact, does not distribute meta-information at all. In order to download a particular file using BitTorrent, the user must supply the details of the specific file, which are included in a .torrent file. How the user gets this .torrent file is not a concern of the BitTorrent client.


Torrents as Tribes

This “leave it to the user” approach to meta-data is sometimes considered a weakness of BitTorrent. However, we hypothesise that it is actually a key strength of the system and helps to support altruism and cooperation.

It does this in two ways. Firstly, it fosters cooperative in-groups of like-minded users with common interests and isolates them, to some extent, from casual users who find meta-data with simple queries: finding and registering on some torrent websites is time-consuming and requires users to know what they are looking for, which reduces the number of casual users (who may be less altruistic). Secondly, and more significantly, each individual torrent swarm is logically isolated from all other torrent swarms, even those sharing the same file through different trackers.

Conclusion

We have argued that the success of BitTorrent is unlikely to be due purely to the use of the tit-for-tat inspired protocol, as is often claimed. We argue that the real driving force behind the high cooperation might be a by-product of the lack of meta-data search within BitTorrent. This results in the creation of a number of disconnected “tribes” at both the swarm and the tracker level.

The users take an active part in the tribal dynamics by selecting those tribes that best satisfy their needs; hence tribes filled with free-riders will tend to die out. We compare this to existing simulations of tribal dynamics in both human and peer-to-peer systems.

We advance a number of hypotheses which our theory suggests, including the production of both cheating peers and unconditionally altruistic peers. We argue that releasing such peers into the “wild” of the BitTorrent ecology would not damage BitTorrent but could actually increase the system-level performance, because unconditional altruism would tend to be selected and predominate. In order to fully test these hypotheses we would need to construct and release such clients into the “wild” and collect data from them. This may be the subject of future work. It’s a sobering thought to consider that, perhaps, the most bandwidth-hungry applications on the internet today work by complex social mechanisms we don’t yet understand. However, this is less sobering when one realises that, since peer-to-peer systems are really just computationally supported human social systems, we should expect the same issues to arise as we observe within human social systems - namely the question “what’s going on?”.
