Review of "On the Effectiveness of DNS-based Server Selection"
David Deschenes
April 3, 2003
Advanced Topics in Networking

Summary
In this paper [2], the authors discuss the drawbacks of using the DNS for server selection. Because of its transparency, DNS-based server selection has been widely employed, and is used by the likes of Akamai and Digital Island. However, as the authors point out, DNS-based server selection has two problems:

1. It typically requires the use of very low TTL values for DNS responses, which can negatively impact the benefits of caching.

2. It requires making the assumption that clients are topologically close to their DNS servers, which is not necessarily the case.

The main contributions of the paper are quantifications of the impacts of the above two problems on the effectiveness of DNS-based server selection.

Using ISP proxy logs, the authors set out to show the negative impact of using low TTL values for DNS responses. To do so, they extracted a list of servers from their proxy logs and measured the DNS request latency to each server from each of four locations (Massachusetts, New York, Michigan, and California). Each client-to-server measurement was executed three times, with the local nameserver's cache state varied across the three executions as follows:

1. The local nameserver had neither the server address nor the authoritative nameserver address in its cache.
2. The local nameserver had the authoritative nameserver address in its cache.
3. The local nameserver had the server address in its cache.

The authors found that each improvement in the level of caching reduced DNS request latency by roughly an order of magnitude. Consequently, they concluded that because DNS-based server selection typically results in little caching, it can worsen request latency by as much as two orders of magnitude. The authors also point out that the latency in retrieving a web page is likely to be much worse, as a typical page requires 3.7 name resolution operations.
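The caching penalty described above can be illustrated with a small simulation (the request trace and TTL values below are illustrative assumptions, not the paper's data): when the TTL on a DNS response is shorter than the interval between requests, the local nameserver's cache never helps.

```python
# Simulate a local nameserver's cache hit rate for one DNS record as the
# record's TTL varies. Request times and TTLs are made up for illustration.

def cache_hit_rate(request_times, ttl):
    """Fraction of requests answered from cache, given a record TTL (seconds)."""
    hits = 0
    expires = None  # time at which the cached record becomes stale
    for t in sorted(request_times):
        if expires is not None and t < expires:
            hits += 1            # record still fresh: cache hit
        else:
            expires = t + ttl    # cache miss: re-resolve, restart the TTL
    return hits / len(request_times)

# One request per minute for an hour.
trace = [60 * i for i in range(60)]
print(cache_hit_rate(trace, ttl=20))    # TTL below the inter-arrival time: 0.0
print(cache_hit_rate(trace, ttl=1800))  # 30-minute TTL: most requests hit
```

With a 20-second TTL every request misses; with a 30-minute TTL only 2 of the 60 requests trigger a resolution, which is the caching benefit that aggressive DNS-based server selection gives up.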

To quantify the impact of the assumption of topological closeness between clients and their nameservers, the authors use traceroute to measure the proximity of a number of client/nameserver pairs. Their measurements were performed over two sets of pairs, one extracted from the ISP proxy logs mentioned earlier and the other determined from dial-up accounts for which the authors registered. They found that the median cluster size, or maximum disjoint path length, was 5 hops for the first set, and slightly higher for the second set, 7 to 8 hops depending on the measurement location. However, more than 30% of the pairs are in 8-hop clusters, regardless of the set. Additionally, the authors found that in most cases the disjoint path length is significant relative to the common path length. Finally, to further demonstrate the dissimilarity in network performance between clients and their nameservers, the authors obtained RTT measurements to all clients and nameservers. They found little correlation between nameserver latency and actual client latency.
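The proximity metric described above can be sketched as a short function (a minimal reimplementation from the summary's description, with hypothetical hop lists, not the authors' code or data): given traceroute paths from one measurement host to a client and to its nameserver, count the shared prefix and take the longer remainder as the disjoint path length.

```python
# Disjoint path length between a client and its nameserver: the maximum
# number of hops remaining after the two traceroute paths (from a common
# measurement host) diverge. Hop names below are hypothetical.

def disjoint_path_length(path_a, path_b):
    common = 0
    for hop_a, hop_b in zip(path_a, path_b):
        if hop_a != hop_b:
            break
        common += 1
    return max(len(path_a) - common, len(path_b) - common)

to_client = ["gw", "isp1", "isp2", "east-pop", "dsl-agg", "client"]
to_nameserver = ["gw", "isp1", "isp2", "west-pop", "ns"]
print(disjoint_path_length(to_client, to_nameserver))  # 3
```

A disjoint length of 0 would mean the client and nameserver sit behind the same final hop; the large median values the authors report (5 to 8 hops) indicate the pairs are often far apart.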

Review
In general this paper is well written and addresses an important topic, yet it is not sufficiently insightful to warrant a recommendation. In particular, the paper has two problems that undermine its conclusions:

1. The use of low TTL values for DNS responses may in fact negatively impact caching within the DNS hierarchy, but is unlikely to increase client-perceived latency. Because user-agents typically cache DNS responses on the order of minutes (which the authors mention), subsequent requests to the same server are not likely to require additional DNS requests.
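A toy model of this point (the 30-second floor and the request burst are illustrative assumptions, not measured values): if the user-agent caches a resolution for at least some fixed interval regardless of the returned TTL, then a CDN's low TTL costs only one lookup per burst of requests to the same server.

```python
# Model a user-agent that caches a DNS answer for at least MIN_CACHE seconds,
# regardless of the (low) TTL the CDN's nameserver returned. Count how many
# requests in a burst actually trigger a DNS lookup.

MIN_CACHE = 30.0  # hypothetical client-side floor, in seconds

def lookups_needed(request_times, ttl):
    effective = max(ttl, MIN_CACHE)  # user-agent ignores TTLs below its floor
    lookups = 0
    expires = None
    for t in sorted(request_times):
        if expires is None or t >= expires:
            lookups += 1             # cache empty or stale: resolve again
            expires = t + effective
    return lookups

# Fetching a page and its 20 embedded objects over about 5 seconds.
burst = [0.25 * i for i in range(21)]
print(lookups_needed(burst, ttl=1))  # one lookup, despite the 1-second TTL
```

Under this model the server's low TTL is invisible to the user for the whole burst, which is why the review argues the latency penalty falls mostly on the DNS hierarchy rather than on the client.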

2. Although the authors have shown that the proximity of a nameserver to a server does not imply proximity of that nameserver's clients to the same server, there is no proof that content distribution networks use only that information to make load-balancing decisions. I find it highly unlikely that CDNs are not using more sophisticated heuristics. Moreover, the authors' suggestions fly in the face of the results of [1], which suggests that many CDNs are, in fact, providing better levels of service than some very popular web sites.

I should mention that the authors' comments regarding the use of both iterative and recursive DNS queries led me to an interesting question: why are there two methods, and would it have been better to allow only iterative queries?

Citations
[1] Krishnamurthy, Wills, and Zhang. On the use and performance of content distribution networks. ACM SIGCOMM Internet Measurement Workshop, 2001.

[2] Shaikh, Tewari, and Agrawal. On the Effectiveness of DNS-based Server Selection. Proceedings of IEEE INFOCOM, 2001.