Improving IP Geolocation using Query Logs

Ovidiu Dan, Vaibhav Parikh and Brian D. Davison

Full Paper (10 pages)
Official ACM published version: http://dx.doi.org/10.1145/2835776.2835820
Author's version: PDF (1.1MB)

Abstract
IP geolocation databases map IP addresses to their geographical locations. These databases are important for several applications, such as local search engine relevance, credit card fraud protection, geotargeted advertising, and online content delivery. While they are the most popular method of geolocation, they can have low accuracy at the city level. In this paper, we evaluate and improve IP geolocation databases using data collected from search engine logs. We generate a large ground-truth dataset using real-time global positioning data extracted from search engine logs. We show that incorrect geolocation information can have a negative impact on implicit user metrics. Using the dataset, we measure the accuracy of three state-of-the-art commercial IP geolocation databases. We then introduce a technique to improve existing geolocation databases by mining explicit locations from query logs. We show significant accuracy gains in 44 to 49 out of the top 50 countries, depending on the IP geolocation database. Finally, we validate the approach with a large-scale A/B experiment that shows improvements in several user metrics.
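
As a rough illustration of what measuring a geolocation database against ground-truth coordinates can look like, the sketch below computes the fraction of IP addresses that a database places within a given distance of their ground-truth position. It is not the paper's evaluation code: the function names, the 50 km threshold, and the sample records (which use reserved documentation IP ranges) are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_within(db, ground_truth, threshold_km):
    """Fraction of ground-truth IPs that `db` locates within `threshold_km`.

    Both arguments map an IP string to a (lat, lon) tuple. An IP missing
    from `db` is counted as a miss (an assumption of this sketch).
    """
    if not ground_truth:
        return 0.0
    hits = 0
    for ip, (true_lat, true_lon) in ground_truth.items():
        predicted = db.get(ip)
        if predicted is None:
            continue
        if haversine_km(true_lat, true_lon, predicted[0], predicted[1]) <= threshold_km:
            hits += 1
    return hits / len(ground_truth)


# Hypothetical ground truth (e.g. derived from GPS positions) and database entries.
ground_truth = {
    "192.0.2.1": (40.6259, -75.3705),
    "198.51.100.7": (37.7749, -122.4194),
    "203.0.113.9": (51.5074, -0.1278),
}
database = {
    "192.0.2.1": (40.6084, -75.4902),      # nearby city: counts as a hit at 50 km
    "198.51.100.7": (34.0522, -118.2437),  # hundreds of km away: a miss
    # "203.0.113.9" is absent from the database: a miss
}

city_level_accuracy = accuracy_within(database, ground_truth, threshold_km=50.0)
```

Running the same function once per database over one shared ground-truth set gives directly comparable numbers. The threshold is a free parameter here, so it should be reported alongside any accuracy figure.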

In Proceedings of the 9th Annual ACM International Conference on Web Search and Data Mining (WSDM), pages 347-356, San Francisco, CA, February 2016.


Last modified: 14 February 2016
Brian D. Davison