Research Interests
Brian D. Davison
January 2001
The Internet, and more recently, the World Wide Web, has become a fascinating environment in which scientific research can take place. The Web is, perhaps for the first time, a real-world phenomenon that is directly of interest to computer scientists.
Scientific research can often be divided into two inter-related types: the study of some phenomenon (natural or artificial), and the experimental application of knowledge to discover mechanisms that improve a system or to extend understanding. Measurement and evaluation are common characteristics of both types of research. Typically, a careful evaluation of measurements of some phenomenon allows for conclusions about that phenomenon. Likewise, evaluating measurements of the effects of some improvement can determine its benefit or liability.
The sections below illustrate the same division in research. The first two deal predominantly with understanding and building models of natural and artificial phenomena. The last two focus on exploring and testing potential improvements.
Research Topic: The World Wide Web
As a large-scale interaction of human activity and computer systems, the Web has helped to push the modern world into a true information society. By providing random access to almost any document from almost any location, the Web has broken new ground in information access and self-publishing. Among other things, its decentralized growth at both the internetworking level and the Web page connection level is not well understood. Therefore, I find it interesting and useful to study the Web as a distinct phenomenon that can be tested, measured, and examined. To that end, I have investigated textual similarity of nearby pages and descriptors on the Web [Dav00b], the recognition of nepotistic links [Dav00a], and link analysis [DGK+99], and I plan to continue to explore both the current Web and the way it grows.
Research Topic: Modeling Human-Computer Activity
Human-computer activity has grown over time. However, there has been little improvement in personalized interfaces, even for electronic systems that deal almost exclusively with a single user (e.g. household appliances, single-user PC operating systems, etc.). In this area, I have examined the problem of modeling human actions in both the UNIX shell commands domain [HD97,DH97,DH98] and in the Web [Dav99b,DL00,Dav01a]. Often, simple machine learning techniques, alone or in combination, can provide the mechanisms to personalize a variety of systems [HBD00], and to learn to improve a system (such as Web response-times) or to make recommendations (as in a movie or paper recommendation system).
User modeling, whether to perform behind-the-scenes optimizations or recommendations, or to perform anomaly detection for potential intrusions (e.g. [Fra94,LB97]), is an area with many opportunities in which I would like to continue to work.
Research Topic: Web Architecture
Because the Web is a universal interface to much of the world's information, we have both a strong responsibility to protect the functionality of the existing Web and an awesome opportunity to find improvements that can make Web services better for humankind.
Web caching is one technology that has proven beneficial. Studying and communicating its benefits [Dav00c,Dav01c] may make the use of caching more widespread and thus optimize the use of the networks and systems involved in serving information over the Web. To measure potential improvements in caching technologies, I have studied evaluation methodologies [Dav99e,Dav99a] as well as designed and built new mechanisms for proxy cache evaluation, including simulators [Dav01b] and black-box testing architectures [Dav99d,Dav99c,DK01]. This work is ongoing, as is experimental work using speculative data dissemination to improve user-perceived latencies [Dav99b,DL00,Dav01a].
There are other areas of Web and Internet architecture that I intend to research. These include mechanisms to facilitate Web load distribution [CCY99] and peer-to-peer networking architectures and applications.
Research Topic: Web Search
Search engines are the dominant mechanism for finding information on the Web. Therefore, it is useful to examine applicable information retrieval mechanisms and to extend them in Web-specific ways. I have investigated the clustering of Web pages [MBDH98], built a search engine prototype [DGK+99], and looked at some of the problems involved in making unbiased search rankings [Dav00a]. I expect to continue to investigate search engine technologies, including crawling, retrieval, ranking, and customization.
Summary
Stated simply, my research interests are interdisciplinary and lie predominantly in three areas: The World Wide Web (Caching, Measurement and Performance); Information Retrieval; and Machine Learning. Through each of these, however, runs a strong emphasis on empirical evaluation. The efforts outlined above provide a solid framework for future scientific research to explore, measure, understand, and extend.
Bibliography
[CCY99] Valeria Cardellini, Michele Colajanni and Philip S. Yu. Dynamic Load Balancing on Web-Server Systems. IEEE Internet Computing, 3(3):28-39, May/June 1999.
[Dav99a] Brian D. Davison. A Survey of Proxy Cache Evaluation Techniques. In Proceedings of the Fourth International WWW Caching Workshop (WCW99), pages 67-77, San Diego, CA, March 1999.
[Dav99b] Brian D. Davison. Adaptive Web Prefetching. In Proceedings of the 2nd Workshop on Adaptive Systems and User Modeling on the WWW, pages 105-106, Toronto, May 1999. Position paper. Proceedings published as Computing Science Report 99-07, Dept. of Mathematics and Computing Science, Eindhoven University of Technology.
[Dav99c] Brian D. Davison. Measuring the Performance of Prefetching Proxy Caches. Poster presented at the ACM International Student Research Competition, March 24-28, 1999, and at the AT&T Student Research Symposium (a regional ACM Student Research Competition), November 13, 1998. Awarded third place and first place, respectively.
[Dav99d] Brian D. Davison. Simultaneous Proxy Evaluation. In Proceedings of the Fourth International WWW Caching Workshop (WCW99), pages 170-178, San Diego, CA, March 1999.
[Dav99e] Brian D. Davison. Web traffic logs: An imperfect resource for evaluation. In Proceedings of the Ninth Annual Conference of the Internet Society (INET'99), June 1999.
[Dav00a] Brian D. Davison. Recognizing Nepotistic Links on the Web. In Artificial Intelligence for Web Search, pages 23-28. AAAI Press, July 2000. Presented at the AAAI-2000 workshop on Artificial Intelligence for Web Search, Technical Report WS-00-01.
[Dav00b] Brian D. Davison. Topical Locality in the Web. In Proceedings of the 23rd Annual International Conference on Research and Development in Information Retrieval (SIGIR 2000), Athens, Greece, July 2000.
[Dav00c] Brian D. Davison. Is your web site cache friendly? Web Techniques, 5(3):101-103, March 2000. Invited article.
[Dav01a] Brian D. Davison. Combining multiple sources for prefetching. In preparation.
[Dav01b] Brian D. Davison. NCS: The Network and Caching Simulator. In preparation.
[Dav01c] Brian D. Davison. A Web Caching Primer. Under submission.
[DGK+99] Brian D. Davison, Apostolos Gerasoulis, Konstantinos Kleisouris, Yingfang Lu, Hyun-ju Seo, Wei Wang, and Baohua Wu. DiscoWeb: Applying Link Analysis to Web Search. In Poster proceedings of the Eighth International World Wide Web Conference, pages 148-149, Toronto, Canada, May 1999.
[DH97] Brian D. Davison and Haym Hirsh. Toward an adaptive command line interface. In Advances in Human Factors/Ergonomics: Design of Computing Systems: Social and Ergonomic Considerations, pages 505-508, San Francisco, CA, August 1997. Elsevier Science Publishers. Proceedings of the Seventh International Conference on Human-Computer Interaction.
[DH98] Brian D. Davison and Haym Hirsh. Predicting sequences of user actions. In Predicting the Future: AI Approaches to Time-Series Problems, pages 5-12, Madison, WI, July 1998. AAAI Press. Proceedings of AAAI-98/ICML-98 Workshop, published as Technical Report WS-98-07.
[DK01] Brian D. Davison and Chandrasekar Krishnan. ROPE: The Rutgers Online Proxy Evaluator. In preparation.
[DL00] Brian D. Davison and Vincenzo Liberatore. Pushing Politely: Improving Web Responsiveness One Packet at a Time (Extended Abstract). Performance Evaluation Review, 28(2):43-49, September 2000. Presented at the Performance and Architecture of Web Servers (PAWS) Workshop, June 2000.
[Fra94] J. Frank. Machine learning and intrusion detection: Current and future directions. In Proceedings of the 17th National Computer Security Conference, 1994.
[HBD00] Haym Hirsh, Chumki Basu, and Brian D. Davison. Learning to Personalize. Communications of the ACM, 43(8):102-106, August 2000.
[HD97] Haym Hirsh and Brian D. Davison. An adaptive UNIX command-line assistant. In Proceedings of the First International Conference on Autonomous Agents, pages 542-543, Marina del Rey, CA, February 1997. ACM Press.
[LB97] Terran Lane and Carla E. Brodley. An application of machine learning to anomaly detection. In Proceedings of the National Information Systems Security Conference, 1997.
[MBDH98] Sofus Attila Macskassy, Arunava Banerjee, Brian D. Davison, and Haym Hirsh. Human Performance on Clustering Web Pages: A Preliminary Study. In Proceedings of The Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), pages 264-268, New York City, August 1998.
Last modified: 17 January 2001