David Manura
2003-02-27
CSE-498 (Adv. Networks)

Wide-Area Traffic: The Failure of Poisson Modeling -- A Review

This paper [P95] examines a variety of TCP network traces to demonstrate the extent to which TCP network traffic is better modeled by means other than Poisson processes. In particular, the authors focus on the lack of burstiness in Poisson processes and on the ways that burstiness can instead be modeled, such as heavy-tailed distributions and self-similar processes. Burstiness is addressed on several levels, including time between connections, packet inter-arrival time, and packets per connection. Several TCP-based protocols, chiefly Telnet and FTP, are examined individually. A full model for Telnet traffic is derived based on Poisson connection arrivals, log-normal connection sizes in packets, and a Tcplib distribution of packet inter-arrivals. FTP data transmissions are found to be extremely bursty (about half of the traffic volume comes from 0.5% of the bursts of data), but a full model is not attempted.
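To get a feel for how a small fraction of bursts can carry a large share of the volume, here is a minimal sketch of my own, not taken from the paper. It compares the share of total volume carried by the largest 0.5% of bursts when burst sizes are drawn from an exponential distribution and from a Pareto distribution. The function name top_share, the sample size, and the Pareto shape of 1.2 are all assumptions chosen for illustration; they are not values fitted to the paper's traces.

```python
import random


def top_share(sizes, fraction=0.005):
    """Fraction of total volume carried by the largest `fraction` of bursts."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))  # always keep at least one burst
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(1995)  # fixed seed so repeated runs give the same numbers
n_bursts = 100_000         # hypothetical sample size

# Light-tailed reference: exponential burst sizes with mean 1.
exp_sizes = [rng.expovariate(1.0) for _ in range(n_bursts)]

# Heavy-tailed stand-in: Pareto with shape 1.2 (an assumed value, not a fit).
pareto_sizes = [rng.paretovariate(1.2) for _ in range(n_bursts)]

exp_share = top_share(exp_sizes)
pareto_share = top_share(pareto_sizes)

print(f"Top 0.5% share, exponential sizes: {exp_share:.1%}")
print(f"Top 0.5% share, Pareto sizes:      {pareto_share:.1%}")
```

With exponential sizes the largest 0.5% of bursts should account for only a few percent of the volume, while the heavy-tailed sample should concentrate a far larger share in the same 0.5%. That is the qualitative pattern described for FTP data bursts, although the exact figure depends on the assumed shape parameter.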

The results of this research are useful for some types of network modeling. In particular, the authors point out queuing behavior, which depends on the distribution of data arrival (connections, packets, bursts, or otherwise) over time. For example, the burstiness of network traffic has an obvious impact on the behavior of RED gateways, such as the propensity for queue overflow and TCP timeouts, as mentioned in a previous paper coauthored by Floyd [FJ93]. Overall, the data largely agree with previous research and mainly reinforce and elaborate on the need for non-Poisson modeling. The trace data is dated, however. At the time, HTTP was an emerging protocol, NNTP and Telnet were popular, and FTP was dominant. This is no longer the case, especially given the now-dominant HTTP. Interestingly, the authors state that preliminary evidence indicates that HTTP follows a decidedly non-Poisson process. This evidence would have been obtained on HTTP 1.0, which does not have persistent connections, so these results could apply poorly today.

Figure 2 could be reorganized for clarity. One could plot % Uncorrelated and % Exponential on separate graphs but put the 10 min and 1 hour cases on the same graph. This would make the data easier to compare (as of now, points from the same protocol are scattered).

The paper offers explanations for some partly unexpected results (e.g., user behavior resulting in non-Poisson processes), but many of these explanations, although plausible, remain hypotheses. The distribution of Telnet packets could also be due to language and keyboard layout effects. Controlled experiments could be done to test such explanations rather than relying only on existing traces.

The paper overall is a difficult read, and many ideas are presented. The argument depends on a variety of results and theories possibly unfamiliar to the reader, and these are simply referenced. This is understandable, but with a small amount of additional background provided in the paper itself, the paper would be easier to read and more accessible to a wider audience. For example, I was initially confused as to whether Tcplib was a statistical distribution or a software package (apparently both). The use of the variance-time plot is effective, and pivotal to the paper, but one result could be explained: when the counts of a Poisson process are averaged over non-overlapping blocks of size M, the variance of the averaged series is 1/M times the variance of the unaggregated series, whereas for a self-similar process the variance decays more slowly than 1/M. The replacement of a conclusions section with an implications section seemed unusual, and the section is similarly terse, but in this case it does seem efficient.
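The kind of inline background I have in mind for the variance-time plot can be sketched in a few lines. The following is my own illustration, not the authors' code or data. It averages a count series over blocks of size M and estimates the slope of log10(variance) against log10(M). The Poisson series is simulated from exponential inter-arrival times. The "bursty" series is only a crude stand-in in which a random level is held for a Pareto-distributed time; the function names, the rate of 5 arrivals per bin, the shape of 1.2, and the aggregation levels are all assumptions.

```python
import math
import random
import statistics


def poisson_counts(rate, n_bins, rng):
    """Arrivals per unit-time bin for a Poisson process (exponential gaps)."""
    counts = [0] * n_bins
    t = rng.expovariate(rate)
    while t < n_bins:
        counts[int(t)] += 1
        t += rng.expovariate(rate)
    return counts


def bursty_counts(n_bins, rng, alpha=1.2):
    """Crude bursty stand-in: a random level held for a heavy-tailed time."""
    counts = []
    while len(counts) < n_bins:
        level = rng.randint(0, 10)
        # Pareto hold time (always >= 1), capped so one draw cannot exhaust memory.
        hold = min(int(rng.paretovariate(alpha)), n_bins)
        counts.extend([level] * hold)
    return counts[:n_bins]


def aggregate(counts, m):
    """Average the series over consecutive non-overlapping blocks of size m."""
    n_blocks = len(counts) // m
    return [sum(counts[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance_time_slope(counts, levels=(1, 10, 100)):
    """Least-squares slope of log10(variance) versus log10(M)."""
    xs = [math.log10(m) for m in levels]
    ys = [math.log10(statistics.pvariance(aggregate(counts, m))) for m in levels]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


rng = random.Random(1995)  # fixed seed so repeated runs give the same numbers
n = 40_000                 # hypothetical number of time bins

poisson_slope = variance_time_slope(poisson_counts(5.0, n, rng))
bursty_slope = variance_time_slope(bursty_counts(n, rng))

print(f"Poisson slope: {poisson_slope:.2f}")
print(f"Bursty slope:  {bursty_slope:.2f}")
```

If the sketch behaves as intended, the Poisson slope should come out close to -1, which is the 1/M decay, while the bursty series should give a noticeably shallower slope. A slope shallower than -1 is the signature the variance-time plots in the paper are used to detect.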

In summary, the paper is quite dense, but it does present a fairly broad picture of a topic I hadn't thought about (the distribution of packets and connections rather than merely the average and standard deviation of these measurements), and it outlines a rich mathematical framework for this. Many of the results are neither entirely new nor conclusive, but the paper nevertheless supports, extends, and clarifies the analysis of network traffic behavior.

[FJ93] Floyd, S. and Jacobson, V. "Random Early Detection Gateways for Congestion Avoidance." /IEEE/ACM Transactions on Networking/, 1(4), pp. 397-413, 1993.

[P95] Paxson, V. and Floyd, S. "Wide-Area Traffic: The Failure of Poisson Modeling." /IEEE/ACM Transactions on Networking/, 3(3), pp. 226-244, June 1995.