Robust Document Image Understanding Technologies

Full Paper (6 pages): PostScript (98 KB), PDF (71 KB)
Henry S. Baird, Daniel Lopresti, Brian D. Davison, and William M. Pottenger.

Abstract
No existing document image understanding technology, whether experimental or commercially available, can guarantee high accuracy across the full range of documents of interest to industrial and government agency users. Ideally, users should be able to search, access, examine, and navigate among document images as effectively as they can among encoded data files, using familiar interfaces and tools as fully as possible. We are investigating novel algorithms and software tools at the frontiers of document image analysis, information retrieval, text mining, and visualization that will help integrate such document images fully into collections containing both textual document images and "born digital" documents. Our approaches put versatility first: we favor methods that work reliably across the broadest possible range of documents.

To be published in the Proceedings of the CIKM Hardcopy Document Processing Workshop, Washington, DC, November 2004.

Last modified: 9 September 2004
Brian D. Davison