Full Paper (10 pages)
Official ACM published version: https://dl.acm.org/doi/10.1145/3397271.3401044
arXiv version (with extra visualization in appendix): PDF
Pretrained contextualized language models such as BERT have achieved impressive results on various natural language processing benchmarks. Benefiting from multiple pretraining tasks and large-scale training corpora, pretrained models can capture complex syntactic word relations. In this paper, we use the deep contextualized language model BERT for the task of ad hoc table retrieval. We investigate how to encode table content given the table structure and the input length limit of BERT. We also propose an approach that incorporates features from prior literature on table retrieval and jointly trains them with BERT. In experiments on public datasets, we show that our best approach outperforms the previous state-of-the-art method and BERT baselines by a large margin under different evaluation metrics.
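To make the input length constraint mentioned in the abstract concrete, the following is a minimal sketch of packing a query and a table into a single BERT-style sequence under a token budget. It is an illustration only and not the encoding used in the paper: the function names (`tokenize`, `pack_query_and_table`), the whitespace tokenizer standing in for WordPiece, and the strategy of adding the caption and then whole rows until the budget is exhausted are all assumptions made for this example.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def tokenize(text: str) -> List[str]:
    # Hypothetical stand-in for a WordPiece tokenizer: lowercase, split on whitespace.
    return text.lower().split()


def pack_query_and_table(
    query: str,
    caption: str,
    rows: List[List[str]],
    max_len: int = 512,
) -> Tuple[List[str], List[int]]:
    """Build [CLS] query [SEP] table [SEP] with at most max_len tokens.

    Assumed strategy: add the caption, then whole rows in order, and stop
    at the first piece that no longer fits in the remaining budget.
    """
    tokens = [CLS] + tokenize(query) + [SEP]
    segment_ids = [0] * len(tokens)

    # Reserve one position for the final [SEP].
    budget = max_len - len(tokens) - 1
    if budget < 0:
        raise ValueError("query alone exceeds the input length limit")

    table_tokens: List[str] = []
    for piece in [caption] + [" ".join(row) for row in rows]:
        piece_tokens = tokenize(piece)
        if len(piece_tokens) > budget:
            break  # this piece and everything after it is dropped
        table_tokens.extend(piece_tokens)
        budget -= len(piece_tokens)

    tokens += table_tokens + [SEP]
    segment_ids += [1] * (len(table_tokens) + 1)
    return tokens, segment_ids
```

Dropping whole rows rather than cutting mid-row keeps each retained row intact, which is one simple way to respect table structure; other selection strategies (for example, choosing rows or columns by relevance to the query) would fit the same interface.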
In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 589-598, July 2020.