Logical Labeling Using Bayesian Networks

  • Authors:
  • Affiliations:
  • Venue: ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
  • Year: 2001

Abstract

This paper discusses logical labeling in documents, a basic step in logical structure recognition, in which logical labels are assigned to the text blocks that make up the layout structure. Our study is based on physical characteristics with a visual aspect: typographic, geometric, and/or topological attributes. Our objective is to map a low-level logical structure, consisting of a set of logical labels, onto the components of the extracted layout structure, which requires building a model that supports this mapping. However, the documents we consider have varied layout and logical structures, so we chose to perform this task by supervised learning from a set of training documents. This allows us to define a generic method for the problem without imposing any constraint on document structure. We propose a probabilistic model represented by a Bayesian network (BN), a graphical model used here as a classifier. A prototype has been implemented and applied to tables of contents in periodicals.
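
To make the idea of a Bayesian network used as a classifier concrete, the sketch below trains the simplest such network, a naive Bayes model, on discrete text-block attributes. It is only an illustration under stated assumptions: the abstract does not give the network structure, the attribute set, or the label set, so the naive Bayes structure, the attribute names (`font`, `align`), the labels (`title`, `author`, `page_number`), and the toy training blocks are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy training set: each text block is described by discrete
# visual attributes and carries a logical label (all values are illustrative).
TRAIN_BLOCKS = [
    {"font": "bold", "align": "left"},
    {"font": "bold", "align": "left"},
    {"font": "italic", "align": "left"},
    {"font": "italic", "align": "left"},
    {"font": "regular", "align": "right"},
    {"font": "regular", "align": "right"},
]
TRAIN_LABELS = ["title", "title", "author", "author",
                "page_number", "page_number"]


def train(blocks, labels, alpha=1.0):
    """Count labels and (label, attribute) values for a naive Bayes model."""
    label_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, attribute) -> value counts
    domains = defaultdict(set)           # attribute -> observed values
    for block, label in zip(blocks, labels):
        for attr, value in block.items():
            value_counts[(label, attr)][value] += 1
            domains[attr].add(value)
    return label_counts, value_counts, domains, alpha


def classify(model, block):
    """Return the label maximizing log P(label) + sum log P(value | label)."""
    label_counts, value_counts, domains, alpha = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        for attr, value in block.items():
            count = value_counts[(label, attr)][value]
            # Laplace smoothing keeps unseen values from zeroing the product.
            score += math.log((count + alpha) / (n + alpha * len(domains[attr])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN_BLOCKS, TRAIN_LABELS)
predicted = classify(model, {"font": "bold", "align": "left"})
```

Naive Bayes assumes the attributes are conditionally independent given the label; a general Bayesian network can relax that assumption by adding edges between attribute nodes, at the cost of larger conditional probability tables to estimate from the training documents.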