Lehigh University

[DRAFT for comments: some details not yet confirmed.]

Document Analysis and Exploitation

Research Project

Building and maintaining a national resource to support critical research and development in translation, document analysis, preservation, and exploitation.

Director:  Prof. Henry S. Baird
Associate Directors:  Profs. Daniel P. Lopresti & Hank F. Korth
Faculty:  Profs. Brian Davison & Jeff Heflin
Students:  Sui-Yu Wang, Chang An, Pingping Xiu, Dawei (David) Yin, Dezhao Song

In partnership with BBN Technologies (Cambridge, MA): Prem Natarajan, Vice President, Speech, Language, and Multimedia.

Visiting Research Scientists:  Bart Lamiroy of LORIA INPL - École des Mines de Nancy, France, will visit us November 9-14 and will probably return starting in January 2010. Ergina Kavallieratou of the University of the Aegean will visit us over winter break.

[Other Potential Visitors:  Prof. George Nagy (ECSE Dept, RPI), Dr. David Doermann (Director, LAMP, UMD), Prof. Thomas Nartker (Director, ISRI, UNLV), Dr. George Thoma (NLM, NIH).]

Highlighted Technical Goals:
  • Investigate character recognition technologies for both machine print and handwriting in multiple languages.
  • Examine the impact of recognition errors on later-stage processes (e.g., information retrieval, summarization, information extraction) and develop more robust approaches that tolerate such errors.
  • Develop methods for detecting and exploiting document "meta-data" -- data that goes beyond the textual content of the document.
  • Investigate autonomously adaptive and/or semi-automated (interactive) recognition systems as a potential solution to extremely hard inputs that cannot yet be handled by machine.
  • Establish a national resource for training and testing data to be used in the design and analysis of document analysis algorithms.
  • Improve protection and preservation of public and private records currently held in hardcopy form by investigating cost-effective user interfaces for both fully-automated and semi-automated document imaging and analysis. 

Contract administered by the DARPA Information Processing Technology Office: Joseph Olive, Program Manager.  Funded by Congressional Authorization, Fiscal Year 2009 Continuing Resolution (HR 2638): U.S. Senators Arlen Specter and Bob Casey, and U.S. Congressman Charlie Dent (PA-15), sponsors.

© 2003 P.C. Rossin College of Engineering & Applied Science
Computer Science & Engineering, Packard Laboratory, Lehigh University, Bethlehem PA 18015