The Diagonal Split: A Pre-segmentation Step for Page Layout Analysis and Classification

  • Authors:
  • Albert Gordo; Ernest Valveny

  • Affiliations:
  • Computer Vision Center - Computer Science Department, Universitat Autònoma de Barcelona, Spain; Computer Vision Center - Computer Science Department, Universitat Autònoma de Barcelona, Spain

  • Venue:
  • IbPRIA '09 Proceedings of the 4th Iberian Conference on Pattern Recognition and Image Analysis
  • Year:
  • 2009

Abstract

Document classification is an important task in all processes related to document storage and retrieval. For complex documents, structural features are needed to achieve a correct classification. Unfortunately, physical layout analysis is error-prone. In this paper we present a pre-segmentation step, based on a divide-and-conquer strategy, that can be used to improve page segmentation results independently of the segmentation algorithm used. This pre-segmentation step is evaluated on classification and retrieval tasks, using the selective CRLA algorithm for layout segmentation together with a clustering based on the Voronoi area diagram, and is tested on two different databases, MARG and Girona Archives.