Improved Table Retrieval Using Multiple Context Embeddings for Attributes

Mohamed Trabelsi, Brian D. Davison, and Jeff Heflin.

Short Paper (7 pages)
Official IEEE published version: DOI: 10.1109/BigData47090.2019.9005681
Authors' version: PDF (295 KB)

Abstract
Table retrieval is the task of extracting the most relevant tables to answer a user's query. Table retrieval is an important task because many domains have tables that contain useful information in a structured form. Given a user's query, the goal is to obtain a relevance ranking for query-table pairs, such that higher ranked tables should be more relevant to the query. In this paper, we present a context-aware table retrieval method that is based on a novel embedding for attribute tokens. We find that differentiated types of contexts are useful in building word embeddings. We also find that including a specialized representation of numerical cell values in our model improves table retrieval performance. We use the trained model to predict different contexts of every table. We show that the predicted contexts are useful in ranking tables against a query using a multi-field ranking approach. We evaluate our approach using public WikiTables data, and we demonstrate improvements in terms of NDCG over unsupervised baseline methods in the table retrieval task.
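
As an illustration of the evaluation metric named in the abstract, NDCG can be sketched as follows. This is a minimal sketch of one common formulation (exponential gain with a log2 rank discount), not the authors' evaluation code; the abstract does not say which NDCG variant or cutoffs were used, and the function names `dcg` and `ndcg` and the graded relevance labels are assumptions made for the example.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k items of a ranked list.

    `relevances` holds graded relevance labels in ranked order
    (hypothetical labels, e.g. 0 = irrelevant, 2 = highly relevant).
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so the discount starts at log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Tables ranked for one query, with assumed relevance labels in ranked order.
ranked_labels = [2, 0, 1]
score = ndcg(ranked_labels, k=3)
```

A ranking that places the more relevant tables first yields a score closer to 1, which matches the stated goal that higher ranked tables should be more relevant to the query.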

Published in Proceedings of IEEE Big Data 2019, pages 1238-1244, Los Angeles, CA, December 2019.


Last modified: 4 March 2020
Brian D. Davison