Scott Weber
CSE 403

Review of "Topologically-Aware Overlay Construction and Server Selection"


Peer-to-peer Internet applications build an application-level network on top of the underlying infrastructure of the Internet.  While complicated algorithms may be used when building the network in order to satisfy and simplify some application goals, they usually do not take the layout of the underlying infrastructure into account.  As a result, a node that is two application-level hops away may end up being dozens of network-level hops away.  This paper proposes a proximity estimation technique and shows that even very simple estimation can yield much better results than ignoring the underlying topology altogether.  The authors also propose using the technique for effective server selection.

The binning algorithm proposed by the authors makes use of round-trip time (RTT) and a set of landmark machines.  Each node measures the RTT between itself and each of the landmark machines.  It then sorts the landmarks by measured RTT to obtain an ordering of landmark machines.  Each ordering becomes the label for a bin, so that every node with a given ordering is a member of the corresponding bin.  Since each node is responsible for determining its own bin, the algorithm is scalable.  Each node need only send a handful of ping messages to determine its placement.  The authors spend a great deal of time explaining how binning is scalable and why it works, but do not mention how a node determines the closest node to connect with in general.  This is probably an application-dependent feature, but it is never explicitly left as such.  Without a way to communicate bin information between nodes, this algorithm is useless.
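
The binning step described above can be illustrated with a minimal sketch.  It assumes each node already holds its measured RTTs in a dictionary keyed by landmark name; the landmark names, the node names, the RTT values, and the tie-breaking rule (by landmark name) are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Bin = Tuple[str, ...]


def bin_label(rtts_ms: Dict[str, float]) -> Bin:
    """Order landmarks by increasing RTT; the ordering is the bin label.

    Ties are broken by landmark name (an assumption made here so the
    result is deterministic).
    """
    return tuple(sorted(rtts_ms, key=lambda name: (rtts_ms[name], name)))


def group_by_bin(measurements: Dict[str, Dict[str, float]]) -> Dict[Bin, List[str]]:
    """Group nodes that share the same landmark ordering into one bin."""
    bins: Dict[Bin, List[str]] = {}
    for node, rtts in measurements.items():
        bins.setdefault(bin_label(rtts), []).append(node)
    return bins


# Hypothetical RTT measurements (milliseconds) to three landmarks.
measurements = {
    "node_a": {"L1": 12.0, "L2": 85.0, "L3": 140.0},
    "node_b": {"L1": 15.0, "L2": 70.0, "L3": 160.0},
    "node_c": {"L1": 130.0, "L2": 90.0, "L3": 20.0},
}
bins = group_by_bin(measurements)
```

Each node computes its own label locally, which is the source of the scalability.  Note that group_by_bin needs every node's measurements in one place, which is exactly the communication step the paper leaves unspecified.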

The algorithm was simulated using several different Internet topologies for testing.  The use of multiple topologies is important, as there are conflicting ideas about the best way to simulate the Internet.  The fact that binning performs well on all tested topologies suggests that it should also perform well in the Internet.  The choice of latencies for the transit-stub model is counterintuitive.  The generally accepted view of the Internet is a high-speed, low-latency backbone with lower-speed, higher-latency networks connected to it.  The further you get from the backbone, the lower the speed and the higher the latency of the connections.  However, the latencies chosen by the authors for the transit-stub model are exactly the opposite: higher in the backbone, lower at the edges.  This counterintuitive choice needs validation or, at the very least, explanation.

The authors describe how to apply binning to the CAN overlay network.  The simple implementation does not attempt to balance the distribution of nodes across the coordinate space, only to place nodes that are near each other in the underlying network near each other in the overlay.  The authors leave the addition of balancing to future work.  However, balancing the distribution of nodes in the coordinate space is one of the key principles of CAN.  Since the binning method as implemented abandons this principle, the result is no longer truly a CAN overlay and should not be directly compared to a true CAN.  The authors make some concessions to try to compensate for the differences, but these may not be enough.  CAN is designed to work in a balanced environment and may behave unexpectedly when that assumption does not hold.

As with the overlay networks, the implementation of binning in server selection is neither described nor explicitly delegated to the application.  One could imagine a system in which the client sends its bin information to the server, which, knowing the bins of all mirrors, chooses the appropriate destination and informs the client.  However, no mention is ever made in the paper as to how this might work.  Modifications to DNS or HTTP would likely be necessary and difficult to deploy, perhaps making the findings of this paper moot.
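
The imagined redirection scheme can be sketched as follows.  This is only one possible reading, not a mechanism taken from the paper: it assumes the server already knows the bin of every mirror, and it scores similarity by the length of the common prefix of the two landmark orderings, which is a stand-in metric chosen here for illustration.  The mirror names and bins are hypothetical.

```python
from typing import Dict, Tuple

Bin = Tuple[str, ...]


def common_prefix_len(a: Bin, b: Bin) -> int:
    """Count how many leading landmarks two orderings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def choose_mirror(client_bin: Bin, mirror_bins: Dict[str, Bin]) -> str:
    """Pick the mirror whose bin best matches the client's bin.

    Similarity is the common-prefix length (an assumption, see lead-in).
    Ties go to the alphabetically first mirror name.
    """
    if not mirror_bins:
        raise ValueError("no mirrors registered")
    return max(
        sorted(mirror_bins),
        key=lambda name: common_prefix_len(client_bin, mirror_bins[name]),
    )


# Hypothetical mirrors and the bins they reported.
mirrors = {
    "mirror_east": ("L1", "L2", "L3"),
    "mirror_mid": ("L1", "L3", "L2"),
    "mirror_west": ("L3", "L2", "L1"),
}
selected = choose_mirror(("L1", "L3", "L2"), mirrors)
```

Even in this toy form, the client must transmit its bin to the server and the server must return a redirect, which is where the protocol changes discussed above would come in.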

This paper presents a new algorithm for easily estimating the proximity of two nodes on the underlying network.  While the authors do show that the algorithm is simple and scalable and that it yields good results, too many implementation details are left out.  It is not immediately evident how this technique could be usefully implemented in current applications.  For this reason, I would recommend this paper as a good background paper, useful only to those planning to build on this research, hopefully through application and implementation of the ideas within.

REFERENCE
Sylvia Ratnasamy, Mark Handley, Richard Karp, and Scott Shenker. "Topologically-Aware Overlay Construction and Server Selection," Proceedings of IEEE INFOCOM, 2002.