Full Paper (8 pages)
Official ACM published version: http://doi.acm.org/10.1145/1571941.1571973
Author's copy: PDF (155KB)
Discussion boards and online forums are important platforms for people to share information. Users post questions or problems to discussion boards and rely on others to provide possible solutions, and such question-related content sometimes dominates an entire board. However, retrieving this kind of information automatically and effectively remains a non-trivial task. In addition, the presence of other types of information (e.g., announcements, plans, and elaborations) makes it difficult to assume that every thread in a discussion board is about a question.
We treat the problems of identifying question-related threads and their potential answers as classification tasks. Experimental results across multiple datasets demonstrate that our method can significantly improve performance on both the question detection and answer finding subtasks. We also carefully compare how different types of features contribute to the final result and show that non-content features play a key role in improving overall performance. Finally, we show that a ranking scheme based on our classification approach can yield much better performance than previously published methods.
In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 171-178, Boston, July 2009.
© ACM, 2009. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution.
Back to Brian Davison's publications