
MoonlightTorrent Protocol - Preliminary Specifications (rewrite pending [2004-04-17])

Note: This page has been temporarily removed from the main index since it will remain in "rewrite pending" state for the foreseeable future because I have no time to re-think and rewrite it.

[2004-04-17] I have started adding clarifications and doing some random corrections. Obviously, a number of things have changed over the last four months, during which I have not updated this page. Now that MT is fairly mature network-wise and I have gained many insights from practical situations, I will soon proceed with this overdue rewrite, most likely including some useful extension proposals from other sources along the way. (But not before the first build of my next series is out.)

While the BitTorrent protocol is simple and gets the job done, it has one major drawback: at high download rates and relatively small piece sizes, it can become woefully inefficient and wasteful. The changes below are aimed at creating a better behaved and more efficient file distribution protocol based on the BT protocol, and at providing developers with some means of implementing experimental extensions with no risk of breaking or clashing with the standard protocol.

2003-12-22: Now that MoonlightTorrent(.com) is working and that the stuff below has been discussed to some extent on IRC, I have concluded that most of the stuff below will not really be necessary so this page will be rewritten in the next week or in January...

What I think the BT protocol needs the most:

What makes the BT protocol so critically inefficient on highly asymmetric links?

My main gripe with the BT protocol has to do with the following from the BT protocol specifications at http://bitconjurer.org/BitTorrent/protocol.html

The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of.

Why is this a problem? Well, for one thing, Have messages are by far the most frequently sent messages and when they are the only thing being sent in one TCP packet, that packet will contain:

This means each individual Have message carrying a 5-byte payload generates up to 44 bytes of overhead, which works out to a network traffic efficiency of just over 10%.
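As a rough check of these figures, the sketch below tallies the bytes of a packet that carries a single Have message. The header sizes are assumptions on my part: a 20-byte IPv4 header and a 20-byte TCP header (both without options) plus the BT protocol's 4-byte message length prefix. The constant and function names are made up for illustration.

```python
# Assumed sizes, in bytes (IPv4 and TCP headers without options).
IP_HEADER = 20
TCP_HEADER = 20
LENGTH_PREFIX = 4   # the BT "Message Length" field
HAVE_PAYLOAD = 5    # 1-byte message ID + 4-byte piece index


def have_packet_stats():
    """Return (overhead_bytes, total_bytes, efficiency) for a lone Have."""
    overhead = IP_HEADER + TCP_HEADER + LENGTH_PREFIX
    total = overhead + HAVE_PAYLOAD
    return overhead, total, HAVE_PAYLOAD / total


overhead, total, efficiency = have_packet_stats()
print(f"{overhead} bytes of overhead, {total} bytes on the wire, "
      f"{efficiency:.1%} efficient")
```

Under these assumptions the overhead comes to 44 bytes and the efficiency to 5/49, which is where "just over 10%" comes from.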

A single Have packet every now and then is fine. Now consider the case where you have 100 peers while downloading at 360KB/s and the torrents have standard 256KB pieces...
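The arithmetic for this case can be sketched as follows. It assumes that every completed piece triggers one Have per connected peer and that each Have travels alone in a 49-byte packet (5 bytes of payload plus 44 bytes of overhead); `have_traffic` is a hypothetical helper, not part of any client.

```python
def have_traffic(download_rate_kb, piece_size_kb, peers, bytes_per_have=49):
    """Bytes per second spent announcing completed pieces to every peer.

    Assumes one Have per peer per completed piece, each in its own packet.
    """
    pieces_per_second = download_rate_kb / piece_size_kb
    return pieces_per_second * peers * bytes_per_have


rate = have_traffic(360, 256, 100)
print(f"{360 / 256:.2f} pieces/s, {rate:.0f} bytes/s spent on Have packets")
```

This lands within rounding of the 1.4 pieces/second and 6.8KB/s figures used elsewhere on this page.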

2003-12-21: If you think this theory looks bad, it actually gets worse in practice. I have started using Performance Monitor to keep an eye on NIC traffic while transferring files using BT. It turned out that actual overhead in a case similar to the above can reach 20KB/s, meaning my assumption that Haves accounted for most of the 'waste' may be only marginally accurate, since they account for only half the measured overhead.

Proposed protocol changes / MoonlightTorrent Protocol / BTv2

Behavioral Changes:

The MoonlightTorrent Protocol - A More Efficient Torrent Protocol:

Behavioral Changes

These are meant to avoid unnecessary messages, reduce message frequency and reduce the frequency at which the TCP/IP bandwidth 'tax' is paid without changing the protocol definition.

Avoid generating redundant messages

Some clients send KeepAlive messages within seconds of sending something else. Since receiving any valid message from a peer can be considered an implicit KeepAlive, the extra KeepAlive is very much unnecessary. Thankfully, even with 500 connected clients sending KeepAlives every minute, this still accounts for less than 0.5KB/s including TCP/IP overhead. However, that does not make these messages any less unnecessary and wasteful. Since the clients are also supposed to keep advertising their [dis]interest, there is hardly any justification for using the KeepAlive message.

Note:
[2004-04-17] One exception to this is when waiting for a slo/no-poke to send overdue requested data. Since I have added KeepAlive for this condition (every 10 seconds after waiting for a chunk for more than 30 seconds), I have not noticed any effect on no-senders, so I conclude KeepAlives are mostly useless.
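One way to express the two rules above (treat any sent message as an implicit KeepAlive, and only nag a peer that is overdue on a request) is sketched below. This is not MT's actual code: the class, its method names and the 60-second idle interval (taken from "every minute") are assumptions, while the 30-second and 10-second values come from the note.

```python
KEEPALIVE_INTERVAL = 60.0       # seconds of silence before a KeepAlive (assumed)
STALL_THRESHOLD = 30.0          # seconds a requested chunk may be overdue
STALL_KEEPALIVE_PERIOD = 10.0   # KeepAlive period once the peer is stalling


class KeepAlivePolicy:
    """Decides when an explicit KeepAlive is worth sending to one peer."""

    def __init__(self, now):
        self.last_sent = now       # time we last sent this peer anything
        self.request_time = None   # time of the outstanding request, if any

    def note_sent(self, now):
        # Any message we send doubles as an implicit KeepAlive.
        self.last_sent = now

    def note_request(self, now):
        self.request_time = now
        self.note_sent(now)

    def note_chunk_received(self):
        self.request_time = None

    def should_send_keepalive(self, now):
        idle = now - self.last_sent
        stalled = (self.request_time is not None
                   and now - self.request_time > STALL_THRESHOLD)
        if stalled:
            return idle >= STALL_KEEPALIVE_PERIOD
        return idle >= KEEPALIVE_INTERVAL
```

The caller is expected to call `note_sent()` after every message it writes, including the KeepAlive itself, so that a recent Have or Interested message postpones the next KeepAlive.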

Withhold small messages

Most OSes' TCP implementations will buffer small packets for a few milliseconds and merge them until a full-size packet can be formed or the TCP window times out. By holding small messages for up to 10 seconds and sending them all at once, the TCP overhead would be incurred once every ~10 messages instead of on every single one of them. This would increase the network efficiency from 10% to about 40%, and the overhead bandwidth would drop from 6.8KB/s to a much more reasonable 1.8KB/s.
Note:
The potential savings are less while downloading from multiple torrents since the download bandwidth is spread across more files so each individual torrent's completion rate will be lower than the 1.4 pieces/second used above.
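The effect of batching can be reproduced with a short sketch. It assumes 40 bytes of TCP/IP headers per packet (20-byte IPv4 plus 20-byte TCP, no options) and standard Have messages of 9 bytes each (4-byte length prefix, 1-byte ID, 4-byte piece index); `pack_haves` and `batch_stats` are illustrative names only.

```python
import struct

TCP_IP_HEADERS = 40   # assumed: 20-byte IPv4 + 20-byte TCP, no options
HAVE_ID = 4           # message ID of Have in the BT protocol


def pack_haves(piece_indices):
    """Concatenate one standard Have message per index into one buffer."""
    return b"".join(struct.pack(">IBI", 5, HAVE_ID, i) for i in piece_indices)


def batch_stats(n_messages, haves_per_second=1.4, peers=100):
    """Return (efficiency, bytes_per_second) when n Haves share one packet."""
    wire = TCP_IP_HEADERS + len(pack_haves(range(n_messages)))
    payload = 5 * n_messages          # ID + piece index of each Have
    per_message = wire / n_messages   # wire bytes attributable to each Have
    return payload / wire, per_message * haves_per_second * peers


for n in (1, 10):
    eff, rate = batch_stats(n)
    print(f"{n:2d} Have(s) per packet: {eff:.0%} efficient, {rate:.0f} bytes/s")
```

With one Have per packet this gives about 10% and roughly 6.8KB/s. With ten per packet it gives about 38% and roughly 1.8KB/s, matching the figures quoted above.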

[2004-04-17] The above (and pretty much everything else on this page actually) was written in January while I was more concerned with making MT work without crashing every few minutes than with trying to get stable upload/download speeds. Since then, I have noticed that many BT clients already do withhold small messages and then send them in bursts to take advantage of TCP/IP's small-packet coalescing.

Mixed Trickled Have and Field

As a variant of withholding messages and packing them into one TCP packet, Have messages could be trickled.

Note:
[2004-04-17] Alternately, it would be possible to probe the other client when the first extra bitfield is sent and flag it as non-multifield-friendly if this results in a disconnect.

Example Case

Setup:

Case:

The MoonlightTorrent Protocol - A More Efficient Torrent Protocol

Why is there a "Message Length" field in the BT protocol? The only reason I can think of is to increase overhead... :)

Also, since the small messages usually should be grouped together before sending, compression could be of great help in improving network bandwidth efficiency since much of these small packets' data is highly redundant. This compression would either take the form of a compressed socket stream or a 'compressed messages' message.

Compressed Streams (most likely using zLib)

When I first considered proposing compression in BT, I was thinking about the Piece messages' payload. Much later in MT's development (roughly a month later), I started looking into effective ways of reducing bandwidth. After doing the maths from the previous section, reducing message overhead on a global scale seemed like a good idea, and the simplest way of doing this is to compress the socket stream itself. Here is why:

Compressed Messages (most likely using zLib)

For those who think full-stream compression is too much, MT will also have a "Compressed Messages" message. This one is specifically targeted at packing multiple specific messages and then compressing them as one larger message. This could be used to compress everything but Piece messages for those who think compressing Piece messages is an indisputable waste of time. Message compression will be less effective since each compressed message will be independent... unless they are made to act like a sort of sub-stream, in which case the control message compression (as compressed messages) could be better than it would be with full-stream compression.
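To give a feel for how redundant a batch of control messages is, here is a minimal sketch that runs zlib over a group of standard Have messages. The wire format of the "Compressed Messages" message is not defined on this page, so the sketch only shows the compression step itself; the function names and the choice of compression level 9 are assumptions.

```python
import struct
import zlib


def pack_haves(piece_indices):
    """Concatenate standard Have messages (length prefix 5, message ID 4)."""
    return b"".join(struct.pack(">IBI", 5, 4, i) for i in piece_indices)


def compress_batch(messages):
    """Compress a buffer of concatenated messages as one independent blob."""
    return zlib.compress(messages, 9)


def decompress_batch(blob):
    """Recover the original concatenated messages."""
    return zlib.decompress(blob)


raw = pack_haves(range(100))
blob = compress_batch(raw)
print(len(raw), "->", len(blob), "bytes")
```

Each call to `compress_batch` starts from an empty dictionary, which is the "independent messages" case described above. The sub-stream variant would instead keep one `zlib.compressobj()` per connection and flush it after each batch.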

Reduced message sizes

This is fairly straightforward:

Example Case

Setup:

Case:

Of course, with compression as mentioned previously, most of the management messages could be compressed to near nothingness and network bandwidth efficiency could certainly be much higher.



Generated on Tue Aug 24 23:57:31 2004 for MoonlightTorrent(.com) by doxygen 1.3.8