Review of Rowstron and Druschel.
Kiran Komaravolu

In their article titled "Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems", the authors introduce a mechanism for object location and message routing in p2p systems. Pastry is not a complete p2p file sharing system by itself, but one can be built on top of its routing mechanism.

Each node in Pastry has a nodeID. Given a message and its key (a hash of the message itself), Pastry routes the message to the node that holds some information about it. This is done by sending the message to the node whose nodeID is numerically closest to the key among all currently live nodes. The message could be, for example, a request for a file, with the destination node knowing where the file is physically stored.
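The "numerically closest nodeID" rule can be illustrated with a minimal sketch. It assumes 128-bit identifiers in a circular ID space and SHA-1 as the hash; the function names `key_for`, `circular_distance` and `closest_node` are hypothetical and chosen only for this illustration. It looks at the full list of live nodes, which a real Pastry node never has; the point is only to show which node a key ultimately maps to.

```python
import hashlib

ID_BITS = 128  # assumption: 128-bit circular identifier space


def key_for(message: bytes) -> int:
    """Derive a key by hashing the message (SHA-1 is an assumption here)."""
    digest = hashlib.sha1(message).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def circular_distance(a: int, b: int) -> int:
    """Distance between two IDs when the ID space wraps around."""
    d = abs(a - b)
    return min(d, 2**ID_BITS - d)


def closest_node(key: int, live_nodes: list) -> int:
    """Return the live nodeID numerically closest to the key.

    Ties are broken in favour of the smaller nodeID (an arbitrary choice).
    """
    return min(live_nodes, key=lambda n: (circular_distance(n, key), n))
```

In Pastry itself no node performs this global search; the routing procedure reaches the same node hop by hop using only local state.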

The node state in Pastry consists of three sets of information: the leaf set, the routing table and the neighbourhood set. The leaf set holds the L/2 numerically closest larger nodeIDs and the L/2 numerically closest smaller nodeIDs, and is used for direct routing. The routing table has log N (base 2**b) rows with 2**b - 1 entries each. Each entry in row i shares its first i digits with the present node's ID, while its (i+1)th digit takes one of the 2**b - 1 other possible values. The routing table is used for incremental routing. The neighbourhood set holds the nodeIDs and IP addresses of the M nodes that are closest (by proximity) to the local node.
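The three pieces of node state can be sketched as a small data structure. This is an illustration only: it assumes b = 4 (hexadecimal digits), nodeIDs written as short digit strings, and a dictionary keyed by (row, digit) for the routing table. The names `NodeState`, `slot_for` and `learn` are hypothetical.

```python
from dataclasses import dataclass, field

B = 4                         # assumption: b = 4, so each digit is one hex character
ENTRIES_PER_ROW = 2**B - 1    # every digit value except the local node's own


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


@dataclass
class NodeState:
    node_id: str
    leaf_smaller: list = field(default_factory=list)   # L/2 closest smaller nodeIDs
    leaf_larger: list = field(default_factory=list)    # L/2 closest larger nodeIDs
    routing_table: dict = field(default_factory=dict)  # (row, digit) -> nodeID
    neighbourhood: list = field(default_factory=list)  # M closest nodes by proximity

    def slot_for(self, other_id: str):
        """Row and column where other_id belongs, or None for our own ID."""
        row = shared_prefix_len(self.node_id, other_id)
        if row == len(self.node_id):
            return None
        return row, other_id[row]

    def learn(self, other_id: str) -> None:
        """Record other_id in its routing-table slot if the slot is still empty."""
        slot = self.slot_for(other_id)
        if slot is not None:
            self.routing_table.setdefault(slot, other_id)
```

For a node with ID `10233102`, the node `10233232` shares five leading digits and differs in the sixth, so it belongs in row 5 under digit `2`.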

When a message with key D arrives, if D falls within the range covered by the leaf set, the message is routed directly to the leaf-set node closest to D. Otherwise the message is sent to a node (from the routing table) whose nodeID shares a longer prefix with the key than the current node's ID does.

If no such node exists, the message is routed to a node whose nodeID shares a prefix with the key at least as long as the current node's and is numerically closer to the key than the current node.
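The three routing cases described above can be sketched as a single next-hop decision. This is a simplified illustration, not the paper's pseudocode: it assumes short hexadecimal ID strings, ignores wrap-around of the ID space, and represents the routing table as a dictionary keyed by (row, digit). The function name `next_hop` and its parameters are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(node_id, key, leaf_set, routing_table, neighbourhood=()):
    """Return the nodeID to forward to, or node_id if this node is the destination."""
    def dist(n):
        # Numerical distance to the key (wrap-around ignored in this sketch).
        return abs(int(n, 16) - int(key, 16))

    # Case 1: the key lies within the range covered by the leaf set,
    # so deliver straight to the numerically closest node (possibly ourselves).
    members = list(leaf_set) + [node_id]
    lo = min(int(n, 16) for n in members)
    hi = max(int(n, 16) for n in members)
    if leaf_set and lo <= int(key, 16) <= hi:
        return min(members, key=dist)

    # Case 2: use the routing table entry that extends the shared prefix by one digit.
    row = shared_prefix_len(node_id, key)
    if row < len(key):
        entry = routing_table.get((row, key[row]))
        if entry is not None:
            return entry

    # Case 3: fall back to any known node whose prefix is at least as long
    # and which is numerically closer to the key than we are.
    known = set(leaf_set) | set(routing_table.values()) | set(neighbourhood)
    candidates = [n for n in known
                  if shared_prefix_len(n, key) >= row and dist(n) < dist(node_id)]
    return min(candidates, key=dist) if candidates else node_id
```

Each forwarding step either lengthens the prefix shared with the key or moves numerically closer to it, which is why the procedure terminates.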

Routing takes O(log N) (base 2**b) hops in the average case and O(N) in the worst case. Message delivery is guaranteed unless L/2 nodes with consecutive nodeIDs fail simultaneously.
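To get a feel for these numbers, the small calculation below computes the ceiling of log N (base 2**b) using integer arithmetic, together with the corresponding routing table size. The network sizes and values of b are example inputs, not figures from the paper, and the function names are hypothetical.

```python
def hop_bound(n: int, b: int = 4) -> int:
    """Smallest h with (2**b)**h >= n, i.e. the ceiling of log n in base 2**b."""
    hops, reach = 0, 1
    while reach < n:
        reach *= 2**b
        hops += 1
    return hops


def routing_table_entries(n: int, b: int = 4) -> int:
    """Rows times entries per row: ceil(log n, base 2**b) * (2**b - 1)."""
    return hop_bound(n, b) * (2**b - 1)
```

With b = 4, a network of one million nodes gives a bound of 5 hops and 75 routing table entries; with b = 1 the same network gives 20 hops, which shows how b trades state size against hop count.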

When a new node X wishes to join the network, it sends a join request with key X to some node A that is already in the network and in the proximity of X. A's neighbourhood set is used to fill X's neighbourhood set. A returns row 0 of its routing table (A0), which becomes row X0 of X's routing table. A then routes the request to a node B that shares one more prefix digit with X, and B sends its row B1; B routes to C, and C sends C2. These steps repeat at each successive node the request reaches, with the ith node on the path supplying row Xi, and these exchanges fill up X's routing table. The last node on the path, Z, is the one numerically closest to X; besides sending its row, it also supplies the leaf set for X. Finally, X sends its state to all the nodes it received replies from, and every node that handled the join updates its own state to include X.
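The join procedure can be sketched as follows, under the simplifying assumption made in the description above that the ith node on the join path shares i prefix digits with X and therefore contributes row i. Nodes are modelled as plain dictionaries; the function name `join` and the field names are hypothetical, and the network messages are replaced by direct reads of each node's state.

```python
def join(new_id, path):
    """Build the initial state of a joining node from the nodes on its join path.

    path is the list of nodes A, B, ..., Z that the join request visits. Each
    node is a dict with keys 'id', 'rows' (list of routing table rows, each a
    dict digit -> nodeID), 'neighbourhood' and 'leaf_set'.
    """
    first, last = path[0], path[-1]
    state = {
        "id": new_id,
        "neighbourhood": list(first["neighbourhood"]),  # copied from A
        "rows": [],
        "leaf_set": list(last["leaf_set"]),             # copied from Z
    }
    # The ith node on the path supplies row i of the new routing table.
    for i, node in enumerate(path):
        row = node["rows"][i] if i < len(node["rows"]) else {}
        state["rows"].append(dict(row))

    # The new node then announces itself to every node it has learned about.
    to_notify = set(state["neighbourhood"]) | set(state["leaf_set"])
    for row in state["rows"]:
        to_notify.update(row.values())
    to_notify.update(node["id"] for node in path)
    to_notify.discard(new_id)
    return state, sorted(to_notify)
```

A real join would also have X adjust the copied entries to account for its own ID and for proximity; that refinement is left out of this sketch.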

 

Overall, Pastry is a fairly robust and efficient mechanism for routing messages in an overlay network. Unfortunately it suffers from some of the same problems that have plagued other existing p2p search mechanisms (e.g. CHORD, CAN). Some of the criticism that follows may be unfair, since Pastry presents itself as a search and routing protocol only.

In this mechanism it is fairly easy for a node to become a hotspot. The upper layer using this scheme must be careful to implement some caching scheme to avoid this problem. Node failures will mean lost data, and it is hard to imagine how total data loss can be prevented. Node replication is always a desirable feature, since it can combat both the hotspot and node failure issues. A scheme where two nodes share the same nodeID (or where the same nodeID points to two IP addresses) could be introduced.

The paper does not talk about message-to-key hashing (probably out of scope and straightforward). The bigger problem is key distribution. How will new keys that are added to the system be distributed to the nodes in an efficient manner? In other words, how will users advertise their content to the rest of the world? With Pastry, implementing an advertisement scheme looks very expensive. Pastry does not provide any significant benefits over the CHORD and CAN routing mechanisms. While the article itself is interesting, the scheme cannot be used for implementation in the real world.