Robust document image understanding technologies

  • Authors:
  • Henry S. Baird; Daniel Lopresti; Brian D. Davison; William M. Pottenger

  • Affiliations:
  • Lehigh University, Bethlehem, PA; Lehigh University, Bethlehem, PA; Lehigh University, Bethlehem, PA; Lehigh University, Bethlehem, PA

  • Venue:
  • Proceedings of the 1st ACM workshop on Hardcopy document processing
  • Year:
  • 2004

Abstract

No existing document image understanding technology, whether experimental or commercially available, can guarantee high accuracy across the full range of documents of interest to industrial and government agency users. Ideally, users should be able to search, access, examine, and navigate among document images as effectively as they can among encoded data files, using familiar interfaces and tools as fully as possible. We are investigating novel algorithms and software tools at the frontiers of document image analysis, information retrieval, text mining, and visualization that will assist in the full integration of such documents into collections of textual document images as well as "born digital" documents. Our approaches emphasize versatility first: that is, methods which work reliably across the broadest possible range of documents.