BitTorrent

BitTorrent is the name of a client application for the torrent peer-to-peer (P2P) file distribution protocol created by programmer Bram Cohen. BitTorrent is designed to distribute large amounts of data widely without incurring the corresponding consumption of server and bandwidth resources (and, typically, the monetary fees that result from it).

The original BitTorrent application is written in Python and its source code has been released under the BitTorrent Open Source License (a modified version of the Jabber Open Source License), as of version 4.0. The name "BitTorrent" refers to the distribution protocol, the original client application, and the .torrent file type.

BitTorrent: How it works

The BitTorrent protocol breaks the file(s) down into smaller fragments, typically a quarter of a megabyte (256 KB) in size. Peers download missing fragments from each other and upload those that they already have to peers that request them.
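
As a rough illustration of the fragmenting step, the sketch below cuts a byte string into 256 KB fragments. The function name split_into_pieces and the 1,000,000-byte stand-in file are made up for this example; real clients work on files on disk rather than on data held in memory.

```python
PIECE_SIZE = 256 * 1024  # 256 KB, the typical fragment size


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut data into consecutive fragments; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


payload = bytes(1_000_000)  # stand-in for a 1,000,000-byte file
pieces = split_into_pieces(payload)
print(len(pieces), len(pieces[-1]))  # 4 213568
```

Three full fragments account for 786,432 bytes, so the fourth and last fragment carries the remaining 213,568 bytes.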

The protocol is 'smart' enough to choose the peer with the best network connections for the fragments that it's requesting. To increase the overall efficiency of the swarm (the ad-hoc P2P network temporarily created to distribute a particular file), BitTorrent clients request the rarest fragments from their peers first; in other words, the fragments that are available on the fewest peers. This makes most fragments widely available across many machines and avoids bottlenecks.
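
The rarest-first idea can be sketched in a few lines. This is only an illustration under simple assumptions: the function pick_rarest, the peer names, and the rule of breaking ties by lowest fragment index are all invented here, and real clients apply additional rules.

```python
from collections import Counter
from typing import Optional


def pick_rarest(needed: set[int], peer_pieces: dict[str, set[int]]) -> Optional[int]:
    """Return the needed fragment held by the fewest peers, or None if nobody has one."""
    counts = Counter(
        idx for held in peer_pieces.values() for idx in held if idx in needed
    )
    if not counts:
        return None
    # Ties are broken by lowest index; this is an assumption made for determinism.
    return min(counts, key=lambda idx: (counts[idx], idx))


swarm = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
print(pick_rarest({1, 2, 3}, swarm))  # 2
```

Fragments 2 and 3 are each held by a single peer, while fragment 1 is held by two, so the sketch requests fragment 2 first.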

The file fragments are not usually downloaded in sequential order and need to be reassembled by the receiving machine. Clients start uploading fragments to their peers before the entire file is downloaded: each peer begins sharing as soon as it has downloaded its first complete fragment and another peer requests it. This scheme is particularly useful for trading large files such as videos and operating systems. Contrast this with conventional file serving, where high demand can saturate the host's resources as the bandwidth needed to transfer the file to many downloaders surges.

With BitTorrent, high demand can actually increase throughput as more bandwidth and additional seeds of the file become available to the group. Cohen claims that for very popular files, BitTorrent can support about a thousand times as many downloads as HTTP.

BitTorrent: Sharing files

To share a file using BitTorrent, a user creates a .torrent file, a small "pointer" file that contains:

  • the filename, size, and the hash of each block in the file (which allows users to make sure they are downloading the real thing)  
  • the address of a "tracker" server (which is discussed below)  
  • and some other data (like client instructions). 
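
To make the contents of the "pointer" file more concrete, here is a minimal sketch that gathers the same information in memory. The class TorrentInfo, its field names, the function describe, and the tracker address are hypothetical; the sketch does not reproduce the actual .torrent file format, which is bencoded. It does use SHA-1, the hash the protocol applies to each block.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TorrentInfo:
    # Field names are hypothetical, chosen to mirror the list above.
    filename: str
    size: int
    piece_hashes: list[bytes]
    tracker_url: str


def describe(filename: str, data: bytes, tracker_url: str,
             piece_size: int = 256 * 1024) -> TorrentInfo:
    """Hash every fixed-size block of data and bundle it with the tracker address."""
    hashes = [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]
    return TorrentInfo(filename, len(data), hashes, tracker_url)


info = describe("example.iso", bytes(600_000), "http://tracker.example/announce")
print(info.size, len(info.piece_hashes))  # 600000 3
```

Note that the result holds only names, sizes, hashes, and an address. The file's data itself is never part of the pointer file, which is why it stays small.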

The torrent file is then distributed to users, often by email or by posting it on a website. The BitTorrent client is started as a "seed node", allowing other users to connect and begin downloading. When other users finish downloading the entire file, they can optionally "reseed" it, becoming an additional source for the file. One outcome of this approach is that if all seeds are taken offline, the file may no longer be available for download, even if a client has a copy of the torrent file. However, everyone can eventually get the complete file as long as the remaining peers hold at least one complete distributed copy between them, even if there are no seeds.

Downloading with BitTorrent is straightforward. Each person who wants to download the file first downloads the torrent and opens it in the BitTorrent client software. The torrent file tells the client the address of the tracker, which, in turn, maintains a log of which users are downloading the file and where the file and its fragments reside. For each available source, the client considers which blocks of the file are available and then requests the rarest block it does not yet have. This makes it more likely that peers will have blocks to exchange. As soon as the client finishes importing a block, it hashes it to make sure that the block matches what the torrent file said it should be. Then it begins looking for someone to upload the block to.
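
The hash check described above might look like the following sketch. The function accept_block and its parameters are invented for illustration; the sample blocks are a few bytes long only to keep the example short.

```python
import hashlib


def accept_block(index: int, block: bytes, expected: list[bytes], have: set[int]) -> bool:
    """Keep a downloaded block only if its hash matches the torrent file's entry."""
    if hashlib.sha1(block).digest() != expected[index]:
        return False  # mismatch: discard the block and request it again
    have.add(index)   # verified: the block can now be offered to other peers
    return True


expected_hashes = [hashlib.sha1(b"first").digest(), hashlib.sha1(b"second").digest()]
have: set[int] = set()
print(accept_block(0, b"first", expected_hashes, have))     # True
print(accept_block(1, b"tampered", expected_hashes, have))  # False
print(have)                                                 # {0}
```

Only a block that passes the check is added to the set of blocks the client is willing to upload, so a corrupted block is not passed on to the rest of the swarm.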

BitTorrent gives the best download performance to the people who upload the most, a property known as "leech resistance", since it discourages "leechers" from trying to download the file without uploading it to anyone. Confusingly, when "leecher" is used in opposition to "seeds" or "seeders", as in "S/L ratio" (meaning "seed/leech ratio"), it only means someone who hasn't downloaded the full file yet.
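
One simple way to picture leech resistance is a client that serves the peers who have been sending it data fastest. The sketch below is a simplification under stated assumptions, not the algorithm any particular client uses: the function choose_unchoked, the peer names, and the rate figures are made up, and real clients apply more elaborate rules.

```python
def choose_unchoked(upload_rates: dict[str, float], max_uploads: int = 4) -> list[str]:
    """Return the peers that have sent us data fastest, up to max_uploads of them."""
    # Sort by descending rate; the peer name breaks ties so the result is deterministic.
    ranked = sorted(upload_rates, key=lambda peer: (-upload_rates[peer], peer))
    return ranked[:max_uploads]


# Bytes per second each peer has recently sent to us (made-up figures).
rates = {"peer_a": 50_000.0, "peer_b": 0.0, "peer_c": 120_000.0, "peer_d": 8_000.0}
print(choose_unchoked(rates, max_uploads=2))  # ['peer_c', 'peer_a']
```

In this toy ranking, peer_b uploads nothing and therefore never makes the cut, which is the behavior the term "leech resistance" describes.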

Though BitTorrent is a good protocol for a broadband user, it is less effective for dial-up connections, where disconnections are common. On the other hand, many HTTP servers drop connections over several hours, while many torrents exist long enough to complete a multi-day download.

BitTorrent: Terminology

  • Torrent: A torrent can mean either a .torrent metadata file or all files described by it, depending on context. The torrent file contains metadata about all the files it makes downloadable, including their names and sizes and checksums of all pieces in the torrent. It also contains the address of a tracker that coordinates communication between the peers in the swarm.  
  • Swarm: Together, all users sharing a torrent are called a swarm. Six peers and two seeds make a swarm of eight.  
  • Peer: A peer is one instance of a BitTorrent client running on a computer on the Internet that you connect to and transfer data with. Usually a peer does not have the complete file, but only parts of it. However, 'peer' can also be used to refer to any participant in the swarm (in this case, also known as a 'client').  
  • Seed: A seed is a peer that has a complete copy of the torrent and still offers it for upload. The more seeds there are, the better the chances are for completion of the file.  
  • Super Seed: When a file is new, much time can be wasted because the seeding client might send the same file piece to many different peers, while other pieces have not yet been downloaded at all. Some clients, like ABC and Shadow's Experimental, have a "superseed" mode, where they try to only send out pieces which have never been sent out before, making the initial propagation of the file much faster. This is generally used only for a new torrent, or one which must be re-seeded because no other seeds are available.  
  • Leech: A leech is usually a peer who has a negative effect on the swarm by having a very poor share ratio; in other words, downloading much more than they upload. Most leeches are users on asymmetric internet connections who do not leave their BitTorrent client open to seed the file after their download has completed. However, some leeches intentionally hurt the swarm to avoid uploading by using modified clients or excessively limiting their upload speed. The term leech is also incorrectly used to refer to what should properly be called a peer, a member of the swarm who has not yet downloaded the complete file.  
  • Tracker: A tracker is a server that keeps track of which seeds and peers are in the swarm. Clients report information to the tracker periodically and in exchange receive information about other clients that they can connect to. The tracker is not directly involved in the data transfer and does not have a copy of the file.  
  • Availability: (also called distributed copies) The number of full copies of the file available to the client. Each seed adds 1.0 to this number, as it has one complete copy of the file. A connected peer with a fraction of the file available adds that fraction to the availability (e.g., a peer with 65.3% of the file downloaded increases the availability by 0.653).  
  • Interested: Describes a downloader who wishes to obtain pieces of a file the client has. For example, the uploading client would flag a downloading client as 'interested' if the downloading client lacked a piece that the uploading client had, and wished to obtain it.  
  • Choked: Describes a peer to whom the client does not wish to upload. An uploading client 'chokes' another client in several situations: the second client is a seed, in which case it does not want any pieces (i.e., it is completely uninterested); or the uploading client is already uploading at its full capacity (i.e., the value for max_uploads has been reached).  
  • Snubbed: An uploading client is flagged as snubbed if the downloading client has not received any data from it in over 60 seconds.  
  • Scrape: This is when a client sends a request to the tracking server for information about the statistics of the torrent, like who to share the file with and how well those other users are sharing.
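
The arithmetic behind the Availability entry above can be sketched as a simple sum. The function name availability is made up for this example, and it follows only the definition given in the list; actual clients may compute the figure differently.

```python
def availability(completion: list[float]) -> float:
    """Sum the fraction of the file held by each connected participant (1.0 per seed)."""
    return sum(completion)


# Two seeds plus one peer that has 65.3% of the file.
print(round(availability([1.0, 1.0, 0.653]), 3))  # 2.653
```
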
BitTorrent: Legal issues

BitTorrent, like any other file transfer protocol, can be used to distribute files without the permission of the copyright holder. It has become well known for being used to share copyrighted files in this way.

Source / more information: wikiPedia.com
