Recognition Algorithms for Structured Documents with Variable Content

  • Authors:
  • O. A. Slavin

  • Affiliations:
  • Institute of System Analysis, Russian Academy of Sciences, Moscow, Russia 117312

  • Venue:
  • Programming and Computer Software
  • Year:
  • 2005

Abstract

The paper deals with a wide class of structured documents that cannot be described using one or several models based on associations between the document fields and geometric elements. A formal model of such documents is described that is based on the concept of a multiset. Examples of structured documents of this class are given and a technique for the construction of models of structured documents is proposed. This technique is illustrated using an implementation of an automated document management system. Implemented algorithms for detecting document fields are described, and implementation problems are discussed.