A rule learning method for academic document image processing

  • Authors:
  • A. Takasu; S. Satoh; E. Katsura

  • Venue:
  • ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 1)
  • Year:
  • 1995

Abstract

A syntactic rule learning method is presented for analyzing document images and constructing a database from them. The method is used in a digital library system named CyberMagazine, in which document images are converted into database tuples in three sequential steps: block segmentation, rough classification, and syntactic analysis. The syntactic rules can analyze symbols located in a two-dimensional plane, and their syntax is similar to that of an ordinary context-free grammar except for the concatenation of symbols. In the presented learning method, syntactic rules are generated from a set of parse trees by decomposing the trees according to their nonterminal symbols, generalizing each decomposed tree into a syntactic rule, and merging the resulting rules.
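
The decompose, generalize, and merge steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the tree encoding, the "h"/"v" concatenation labels standing in for two-dimensional layout, the symbol names, and the rule that collapses repeated symbols into "X+" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical encoding: a nonterminal node is (symbol, operator, children);
# a terminal is a plain string. The operator is an assumed stand-in for
# two-dimensional concatenation: "h" = left to right, "v" = top to bottom.


def decompose(tree):
    """Yield one (lhs, operator, rhs) production per nonterminal node."""
    if isinstance(tree, str):
        return
    symbol, op, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (symbol, op, rhs)
    for child in children:
        yield from decompose(child)


def generalize(rhs):
    """Assumed generalization: collapse a run of one symbol into 'symbol+'."""
    out = []
    for sym in rhs:
        if out and out[-1].rstrip("+") == sym:
            out[-1] = sym + "+"
        else:
            out.append(sym)
    return tuple(out)


def learn_rules(trees):
    """Merge the generalized productions of all trees, grouped by nonterminal."""
    rules = defaultdict(set)
    for tree in trees:
        for lhs, op, rhs in decompose(tree):
            rules[lhs].add((op, generalize(rhs)))
    return {lhs: sorted(alts) for lhs, alts in rules.items()}


# Two toy parse trees of article pages (invented data).
tree1 = ("article", "v", [("header", "v", ["title", "author", "author"]), "body"])
tree2 = ("article", "v", [("header", "v", ["title", "author"]), "body"])

for lhs, alternatives in learn_rules([tree1, tree2]).items():
    for op, rhs in alternatives:
        print(f"{lhs} -> [{op}] {' '.join(rhs)}")
```

Using a set per nonterminal makes the merge step a simple deduplication: both trees contribute the same `article` production, so it appears only once in the learned rule set.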