Purpose. This work investigates the sensitivity of ranking performance with respect to the topic distribution of queries selected for ranking evaluation.
Design/methodology/approach. We reweight the queries used in two TREC tasks so that their topic distribution matches each of three real background topic distributions, and show that the resulting performance rankings of retrieval systems differ considerably.
Findings. We find that 1) search engines tend to perform similarly on queries about the same topic; and 2) search engine performance is sensitive to the topic distribution of queries used in evaluation.
Originality/value. Using experiments with multiple real-world query logs, we demonstrate weaknesses in the current evaluation model of retrieval systems.
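The reweighting idea in the Design/methodology/approach paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the topic labels, the per-query scores, and the target distribution below are all hypothetical, and the scheme shown (splitting each topic's target probability evenly among the evaluation queries on that topic) is only one plausible way to match a background distribution.

```python
from collections import Counter


def topic_weights(query_topics, target_dist):
    """Return per-query weights whose topic mix matches target_dist.

    query_topics: dict mapping query id -> topic label (hypothetical labels).
    target_dist:  dict mapping topic label -> probability in the background log.
    """
    counts = Counter(query_topics.values())
    # Only topics that actually occur in the query set can be matched,
    # so the target is renormalized over the covered topics.
    covered = {t: p for t, p in target_dist.items() if counts.get(t, 0) > 0}
    total = sum(covered.values())
    if total == 0:
        raise ValueError("no overlap between query topics and target topics")
    # Each topic's probability mass is shared equally by its queries.
    return {
        q: covered.get(t, 0.0) / total / counts[t]
        for q, t in query_topics.items()
    }


def rank_systems(system_scores, weights):
    """Order systems by weighted mean per-query score, best first."""
    means = {
        name: sum(weights[q] * s for q, s in scores.items())
        for name, scores in system_scores.items()
    }
    return sorted(means, key=means.get, reverse=True)


# Hypothetical evaluation set: three health queries, one sports query.
query_topics = {"q1": "health", "q2": "health", "q3": "health", "q4": "sports"}

# Hypothetical per-query effectiveness scores (e.g. average precision).
system_scores = {
    "A": {"q1": 0.6, "q2": 0.6, "q3": 0.6, "q4": 0.2},
    "B": {"q1": 0.4, "q2": 0.4, "q3": 0.4, "q4": 0.7},
}

# Conventional evaluation: every query counts equally.
uniform = {q: 1 / len(query_topics) for q in query_topics}

# Hypothetical background distribution in which sports queries dominate.
weights = topic_weights(query_topics, {"health": 0.25, "sports": 0.75})

print(rank_systems(system_scores, uniform))
print(rank_systems(system_scores, weights))
```

In this toy data, system A leads under uniform weighting because health queries are over-represented in the evaluation set, while system B leads once the queries are reweighted toward the sports-heavy background distribution. That reversal is the kind of sensitivity the abstract describes.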
Online Information Review, Volume 35, Issue 6, pages 893-908, Emerald Press, 2011.