BitTorrent basics - protocol etc

July 10, 2007 at 01:25 PM | categories: Technical, BitTorrent | View Comments |

I started working on a BitTorrent implementation in 2006. The protocol is kind of fun and there is a need for a lightweight UNIX implementation in my opinion. rTorrent is quite usable, but absolutely huge - something like over 40,000 lines of C++ last time I looked. My implementation is ~3,500 lines at present and has most of the needed functionality - although admittedly it will not support downloading of multiple torrents at a time, nor the fancy UI of rTorrent. Even assuming it doubles in size by the time its finished, that is still quite lightweight in comparison. It is written in C, Yacc and uses libevent - and its under the BSD license. I develop it completely under OpenBSD, although across multiple platforms (to weed out the endianness and pointer-size bugs). It should be easily portable to other systems with some glue (much like OpenSSH or OpenNTPd). Anyway, it has shaped up rather well and in fact I have been able to actually download some 'real-world' torrents with it. However, I still need to overhaul the network scheduler. Once I am satisfied with this part, I will publish the source code. The network scheduler is a complicated component - with numerous research papers on the subject! - and I plan to cover it in future postings.

For now, I want to post a quick summary of how BitTorrent works. I'm going to describe 'vanilla' BitTorrent - without distributed, trackerless operation and unofficial encrypted streams. The best complete "specification" I have found is this wiki page - the official specification being rather useless. Unfortunately, the official Python implementation sources are a neceesary reference at times also.

BitTorrent splits data into a number of units called pieces. Typically these pieces are 256K. Each piece has an associated 20 byte SHA-1 hash to verify integrity. The pieces are sub-divided further into what are called 'blocks' - typically 16K. Each peer has zero, some or all pieces. Metadata is distributed in .torrent files. Very likely you are familiar with downloading these files via HTTP and then feeding them to a BitTorrent implementation in some way. These files are encoded in a mixed binary/ASCII format called 'bencode' - which is a kind of mini-language supporting nested strings, integers, dictionaries and lists. Its actually quite a pain to parse in a language like C - it took me quite some time to write the Yacc parser. The metadata contains things like, number of pieces, piece sizes, hashes for each piece, tracker URL and filenames. Pieces are all zero-indexed. This gets a little complicated in multi-file torrents, where you must determine which file piece X falls within, and at what offset. My implementation uses mmap(), and has translation routines to nicely abstract these calculations away :-)

Each peer must initially 'announce' itself to the tracker, and is supposed to send further updates periodically. Communication with the tracker is done via HTTP GET, with the tracker returning a bencode'ed dictionary as its response. This response includes a list of other peers - whose addresses are in fixed-length (IPv4) binary format, making it a little difficult to use IPv6 for BitTorrent, at least without protocol changes. There are a variety of peer messages. Data is transferred in terms of pieces, offsets and lengths (block size). Peers tell each other which pieces they have by sending a bitmap - one bit per piece.

The basics of the protocol are fairly straight forward. Aside from some poor decisions in terms of the bencode format for encoding messages, it is not hard to code support. As I said above, things start to get tricky when it comes to peer communication strategies. One thing of note is that the most efficient algorithm for downloading pieces is believed to be 'rarest first order'. To my knowledge, nobody is using a sequential download order - which is why partial BitTorrent downloads of audio or video data are notoriously unplayable.

Niall O'Higgins is an author, event organizer and software consultant. He wrote the book MongoDB and Python, published by O'Reilly. Events he organizes include We Have Tablets, the #1 Bay Area Tablet Computing Meet-up and PyWebSF. He also offers consulting services for Mobile, Tablet and Cloud Computing.

blog comments powered by Disqus