Searching the Web and more --- a juxtaposition of online search traces

Full Paper (24 pages)
PDF (409KB)
Brian D. Davison and Wei Zhang

Abstract
The information retrieval task is larger than the problem of searching for documents on the Web. In this paper we broaden our analysis to include search logs from many Web search engines, peer-to-peer query logs, newsgroup search logs, and queries to FTP archives. We calculate and compare the characteristics of each of these query logs from 2003 to find commonalities and differences across a wide spectrum of online query workloads. We find that Boolean operator usage is rare; that queries in peer-to-peer traffic are much longer than Web queries; that searchers click on slightly more than two results per query; that queries in peer-to-peer and FTP logs are more likely to include file-type extensions; and that caching of query results is likely to be of value for both Web and peer-to-peer traffic.

Technical Report LU-CSE-05-005, Dept. of Computer Science and Engineering, Lehigh University, 2005.



Last modified: 2 June 2005 Brian D. Davison