Scott Weber
CSE 498

Review of "Measuring ISP Topologies with Rocketfuel"


Simulation of the Internet is an important part of networking research.  Because the Internet is so large and is a public network, it is both easier and better to test against a model of it than to test directly on it.  To simulate the Internet effectively, it is important to have an accurate model upon which to build a simulator.  The accuracy of the model affects the quality of the results: a more accurate model will yield results that are more representative of what would happen in the real network.  To obtain an accurate model, the Internet must be measured and mapped.  This paper describes the mapping efforts of a group using a system they built called Rocketfuel.

Rocketfuel essentially combines traceroute with an informed choice of which routes to trace in order to map the router level of the Internet.  Information from routing tables is used to minimize the number of routes that must be traced for maximal coverage of the network.  Using this method, the tool is able to map 40%-80% of the routers in the Internet while using only 0.1% of the traces required by a brute-force mapping.  Further information gathered from DNS and other sources is used to combine multiple addresses that refer to the same physical router, giving a better representation of the network.
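
To make the trace-selection idea concrete, here is a minimal sketch of how routing-table information could be used to skip traces that are unlikely to cross the ISP being mapped.  It is not the authors' implementation: the function name, the table layout (destination prefix mapped to an AS path) and the AS numbers and prefixes are all made up for illustration.

```python
# Hypothetical sketch: keep only destinations whose advertised AS path
# crosses the ISP being mapped. Names and data are illustrative only.

def select_traces(bgp_table, target_asn):
    """Return the destination prefixes whose AS path includes target_asn."""
    return [prefix for prefix, as_path in bgp_table.items()
            if target_asn in as_path]


# Made-up routing table: destination prefix -> AS path toward it.
bgp_table = {
    "192.0.2.0/24": [64500, 64501, 64510],
    "198.51.100.0/24": [64500, 64502],
    "203.0.113.0/24": [64501, 64511],
}

if __name__ == "__main__":
    # Only two of the three destinations are worth tracing for AS 64501.
    print(select_traces(bgp_table, target_asn=64501))
```

A brute-force mapper would trace to every prefix in the table; the filter above only illustrates why consulting routing tables first can cut the number of traces.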

The system also uses DNS to help determine the physical location of the routers it finds.  Most ISPs appear to have a policy of placing some sort of geographical identifier in the name of each of their routers.  For a router that cannot be placed from this information, the system infers its location from the location of its neighbors.  Intuitively, most routers will be connected mostly to nearby routers, which makes this assumption seem acceptable.  However, it may not be appropriate.  One could imagine a single router located in a small city with one or more connections to "neighbors" in a nearby larger city.  If the router in the smaller city has no location-identifying information attached to it, its location will be incorrectly inferred as that of its neighbors in the larger city.  The authors should provide data to show that their assumption produces no false identifications, or if it does, they should show why this does not negatively impact the map generated.
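
The concern can be illustrated with a small sketch of name-based location with a neighbor fallback.  Everything in it is an assumption made for illustration: the city codes, the router names, the rule for splitting names into tokens, and the use of a majority vote among neighbors are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical city codes; each real ISP uses its own naming convention.
CITY_CODES = {"chi": "Chicago", "sea": "Seattle"}


def location_from_name(dns_name):
    """Return a city if a token of the name matches a known city code."""
    for token in re.split(r"[.\-]", dns_name.lower()):
        code = token.rstrip("0123456789")  # e.g. "chi2" -> "chi"
        if code in CITY_CODES:
            return CITY_CODES[code]
    return None


def infer_location(router, names, neighbors):
    """Use the router's own name, else a majority vote of its neighbors."""
    own = location_from_name(names.get(router, ""))
    if own is not None:
        return own
    votes = Counter()
    for neighbor in neighbors.get(router, []):
        loc = location_from_name(names.get(neighbor, ""))
        if loc is not None:
            votes[loc] += 1
    return votes.most_common(1)[0][0] if votes else None


# r3 carries no city code in its name, so it inherits its neighbors' city.
names = {
    "r1": "core1-chi2.example.net",
    "r2": "edge3-chi1.example.net",
    "r3": "gw7.example.net",
}
neighbors = {"r3": ["r1", "r2"]}

if __name__ == "__main__":
    print(infer_location("r3", names, neighbors))
```

In this example r3 is labeled Chicago whether or not it is physically there, which is exactly the small-city case described above.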

This study relies on the naming convention used by an ISP being stable, consistent and logical.  While it seems an ISP really has no reason to make its naming convention difficult to follow, it also has no external incentive to make it easy.  An ISP may choose to keep its network more private by using an encoded naming scheme that would obscure the information obtainable from the network, but still enable the ISP to easily locate the source of problems when they occur.  Perhaps I'm being a bit too picky here, but this seems like a major assumption that is never explicitly stated as such in the paper.

The results of this study were validated by three of the ISPs that were mapped.  The authors do not disclose anything about the three ISPs that responded, due to an agreement between the authors and the ISPs.  The validation is presented as general proof that the Rocketfuel system is effective at mapping network topology.  However, too little information about this validation is given to apply it generally.  It is possible that the three responses received were from ISPs that share resources, philosophies, network designs or other properties.  For example, how generally applicable are these results if all three have similar network sizes, similar topologies or peer in many locations?  While the ISP validation shows that Rocketfuel can get good results, it says nothing about the circumstances under which these results were obtained.

Rocketfuel is scalable in at least one aspect: adding new vantage points from which traceroutes can be run improves either the system speed or the accuracy of the result.  It would be interesting to know whether this results from a design objective or is a design side-effect.  It would also be nice to know how the rest of the system scales.  It seems that the database that is central to the system could become a bottleneck at some point.

My biggest problem with this paper is that it does not state how long it took for a single run of Rocketfuel to complete.  The longer the elapsed time, the less the results are a snapshot of the state of the network, possibly degrading the usefulness of the results.

Overall, this is a good paper that introduces what seems to be a powerful and relatively lightweight way to determine the structure of the Internet.  The results obtained may also prove useful for developing better simulations of the Internet.  I would recommend this paper to anyone studying the Internet.

REFERENCE
Neil Spring, Ratul Mahajan and David Wetherall. "Measuring ISP Topologies with Rocketfuel." Proceedings of ACM SIGCOMM, August 2002.