Predictive Coding for Document Layout Characterization

  • Authors:
  • J. Sauvola; M. Pietikäinen; M. Koivusaari

  • Venue:
  • DIA '97 Proceedings of the 1997 Workshop on Document Image Analysis
  • Year:
  • 1997

Abstract

We propose a new approach to document image layout extraction using rapid feature analysis, preclassification and predictive coding. First, a set of layout features is used to compute image profile information. A knowledge base of rules then assigns layout labels to these initial regions. Each region found is given a classification tag and a degree of membership in the background, text, picture and line-drawing classes. A predictive coding method is then applied to the preclassification information to raise the confidence of each label and to merge each regional domain and its labels into a uniform class without any shape assumption. We have tested our technique on three different databases that comprise over 1000 document images. The results show a high degree of confidence in region separation and extraction. The main benefits include robust classification, shape independence and rapid computation.
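
The idea of per-region membership degrees whose confidence is raised from context can be illustrated with a minimal sketch. This is not the authors' predictive coding method, which the abstract does not specify. It substitutes a simple neighbourhood averaging step, and the grid layout, the `weight` parameter and the function names `refine` and `label` are all assumptions made for illustration only.

```python
# Illustrative sketch only: neighbourhood averaging stands in for the
# paper's predictive coding step, whose details the abstract does not give.
CLASSES = ("background", "text", "picture", "linedrawing")


def label(memberships):
    """Return the class with the highest degree of membership."""
    return max(CLASSES, key=lambda k: memberships[k])


def refine(grid, weight=0.5):
    """Blend each cell's memberships with the mean of its 4-neighbours.

    `grid` is a list of rows; each cell is a dict mapping every class in
    CLASSES to a membership degree. `weight` (hypothetical parameter)
    controls how strongly the neighbourhood influences the cell.
    """
    rows, cols = len(grid), len(grid[0])
    refined = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            neighbours = [
                grid[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            blended = {}
            for k in CLASSES:
                if neighbours:
                    context = sum(n[k] for n in neighbours) / len(neighbours)
                else:
                    context = grid[r][c][k]  # isolated cell: keep its own value
                blended[k] = (1 - weight) * grid[r][c][k] + weight * context
            total = sum(blended.values())
            new_row.append({k: blended[k] / total for k in CLASSES})
        refined.append(new_row)
    return refined


# Toy 3x3 preclassification: confident text cells around an ambiguous centre.
text_cell = {"background": 0.1, "text": 0.7, "picture": 0.1, "linedrawing": 0.1}
ambiguous = {"background": 0.1, "text": 0.4, "picture": 0.45, "linedrawing": 0.05}
grid = [[dict(text_cell) for _ in range(3)] for _ in range(3)]
grid[1][1] = dict(ambiguous)

refined = refine(grid)
print(label(grid[1][1]), "->", label(refined[1][1]))
```

In this toy case the ambiguous centre cell starts with a slight preference for the picture class and moves to the text class once its neighbours are taken into account, and no assumption is made about the shape of the surrounding region.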