The Design and Evaluation of Web Prefetching and Caching Techniques

Complete dissertation (329 pages, PDF, 4.75MB)
(Chapter by chapter version)
Brian D. Davison

Abstract
User-perceived retrieval latencies in the World Wide Web can be reduced by pre-loading a local cache with resources likely to be accessed. A user requesting content that can be served by the cache is able to avoid the delays inherent in the Web, such as congested networks and slow servers. The difficulty, then, is to determine what content to prefetch into the cache.

This work explores machine learning algorithms for user sequence prediction, both in general and specifically for sequences of Web requests. We also consider information retrieval techniques to allow the use of the content of Web pages to help predict future requests. Although history-based mechanisms can provide strong performance in predicting future requests, performance can be improved by including predictions from additional sources.
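To make the idea of history-based request prediction concrete, here is a minimal sketch of one simple approach: a first-order Markov model that counts which URL tends to follow which. It is an illustration only and is not taken from the dissertation; the class name `MarkovPredictor`, its methods, and the sample URLs are all hypothetical.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """First-order Markov model over a request sequence (illustrative only)."""

    def __init__(self):
        # transitions[a][b] = number of times b was requested right after a
        self.transitions = defaultdict(Counter)
        self.previous = None

    def observe(self, url):
        """Record one request and update the transition counts."""
        if self.previous is not None:
            self.transitions[self.previous][url] += 1
        self.previous = url

    def predict(self, url, k=1):
        """Return up to k URLs most often seen immediately after url."""
        followers = self.transitions.get(url, Counter())
        return [nxt for nxt, _ in followers.most_common(k)]


# Hypothetical request history for a single user.
history = [
    "/index.html", "/news.html",
    "/index.html", "/news.html",
    "/index.html", "/about.html",
]

predictor = MarkovPredictor()
for url in history:
    predictor.observe(url)

# Candidates a prefetcher might load after the user requests /index.html.
candidates = predictor.predict("/index.html", k=1)
print(candidates)
```

A prefetcher built on such a model would fetch the returned candidates into the cache while the user reads the current page. Predictions from other sources, such as the content of the current page, could be merged with this list before deciding what to fetch.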

While past researchers have used a variety of techniques for evaluating caching algorithms and systems, most of those methods were not applicable to the evaluation of prefetching algorithms or systems. Therefore, two new mechanisms for evaluation are introduced. The first is a detailed trace-based simulator, built from scratch, that estimates client-side response times in a simulated network of clients, caches, and Web servers with varying connectivity. This simulator is then used to evaluate various prefetching approaches. The second evaluation method presented is a novel architecture to simultaneously evaluate multiple proxy caches in a live network, which we introduce, implement, and demonstrate through experiments. The simulator is appropriate for evaluation of algorithms and research ideas, while simultaneous proxy evaluation is ideally suited to implemented systems.

We also consider the present and the future of Web prefetching, finding that changes to the HTTP standard will be required in order for Web prefetching to become commonplace.

Ph.D. Dissertation, Department of Computer Science, Rutgers University, New Brunswick, NJ, October 2002.


Last modified: 30 September 2002