Treatment of Diagrams in Document Image Analysis

  • Authors:
  • Dorothea Blostein; Edward Lank; Richard Zanibbi

  • Venue:
  • Diagrams '00 Proceedings of the First International Conference on Theory and Application of Diagrams
  • Year:
  • 2000

Abstract

Document image analysis is the study of converting documents from paper form to an electronic form that captures the information content of the document. Necessary processing includes recognition of document layout (to determine reading order and to distinguish text from diagrams), recognition of text (called Optical Character Recognition, OCR), and processing of diagrams and photographs. The processing of diagrams has been an active research area for several decades. A selection of existing diagram recognition techniques is presented in this paper. Challenging problems in diagram recognition include (1) the great diversity of diagram types, (2) the difficulty of adequately describing the syntax and semantics of diagram notations, and (3) the need to handle imaging noise. The recognition techniques discussed include blackboard systems, stochastic grammars, Hidden Markov Models, and graph grammars.