Distributed Reverse DNS Geolocation

Ovidiu Dan, Vaibhav Parikh and Brian D. Davison

Short Paper (6 pages)
Official IEEE published version: DOI: 10.1109/BigData.2018.8621886
Author's version: PDF (528KB)

Abstract
IP geolocation databases map IP addresses to their geographical locations. These databases are used in a variety of online services to serve local content to users. Here we present methods for extracting locations from the reverse DNS hostnames assigned to IP addresses. We summarize a machine-learning-based approach which, given a hostname, extracts and ranks potential location candidates; these candidates can then be fused with other geolocation signals. We show that this approach significantly outperforms a state-of-the-art academic baseline, and that it is competitive with and complementary to commercial baselines. Since extracting locations from more than a billion reverse DNS hostnames at once poses a significant computational challenge, we develop a distributed version of our algorithm. We perform experiments on a cluster of 2,000 machines to demonstrate that our distributed implementation can scale. We show that, compared to the single-machine version, our distributed approach can achieve a speedup of more than 150X.
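
To make the idea of extracting and ranking location candidates from a hostname concrete, here is a minimal Python sketch. It is not the method from the paper, which is machine-learning based; it swaps in a plain dictionary lookup with a hand-written score. The gazetteer entries, the function names and the scoring rule are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy gazetteer mapping hostname tokens to (city, region, country).
# A real system would use a much larger dictionary and a trained ranker.
GAZETTEER = {
    "sea": ("Seattle", "WA", "US"),
    "seattle": ("Seattle", "WA", "US"),
    "nyc": ("New York", "NY", "US"),
    "chi": ("Chicago", "IL", "US"),
}


def tokenize(hostname):
    """Split a hostname into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", hostname.lower())


def extract_candidates(hostname):
    """Return candidate locations for a hostname, best first.

    Assumed scoring rule (not from the paper): a longer matching token
    counts as stronger evidence; ties are broken alphabetically.
    """
    scored = {}
    for token in tokenize(hostname):
        location = GAZETTEER.get(token)
        if location is not None:
            scored[location] = max(scored.get(location, 0), len(token))
    return sorted(scored, key=lambda loc: (-scored[loc], loc))


if __name__ == "__main__":
    print(extract_candidates("ae-1.r01.seattle01.wa.example.net"))
```

In this sketch each hostname is processed independently of every other hostname, so the same function could be mapped over partitions of a large hostname list on separate workers.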

In Proceedings of the 2018 IEEE International Conference on Big Data (BigData), pages 1581-1586, Seattle, WA, December 2018.


Last modified: 13 February 2019
Brian D. Davison