A multiresolution approach for page segmentation

  • Authors:
  • L. Cinque; L. Lombardi; G. Manzini

  • Affiliations:
  • Dipartimento di Scienze dell'Informazione, Università La Sapienza di Roma, Via Salaria 113, 00198 Roma, Italy; Dipartimento di Informatica e Sistemistica, Università di Pavia, Via Ferrata 1, 27100 Pavia, Italy; Dipartimento di Informatica e Sistemistica, Università di Pavia, Via Ferrata 1, 27100 Pavia, Italy

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 1998

Abstract

In this work, we propose a new page segmentation method for recognizing text and graphics, based on a multiresolution representation of the page image. The approach analyzes a set of feature maps available at different resolution levels. The final output is a description of the physical structure of the page: the page image is broken down into blocks that represent page components such as text, line drawings, and pictures. The method uses only a small amount of memory in addition to that needed for the image, and its result may serve as the first step for a more detailed analysis such as optical character recognition.
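
The abstract does not give the feature maps or the decision rules, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It builds a resolution pyramid by 2x2 averaging and labels each cell of the coarsest level from a single stand-in feature, ink density. The function names, the ink-density feature, and the two thresholds are all hypothetical choices made for illustration.

```python
from typing import List

# Grayscale page as nested lists: 0.0 = white paper, 1.0 = black ink.
Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [full resolution, half, quarter, ...] with up to `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        top = pyramid[-1]
        if len(top) < 2 or len(top[0]) < 2:
            break  # cannot halve any further
        pyramid.append(downsample(top))
    return pyramid


def classify_cell(density: float) -> str:
    """Label one coarse cell from its ink density (hypothetical thresholds)."""
    if density < 0.05:
        return "background"
    if density < 0.5:
        return "text"
    return "picture"


def segment(img: Image, levels: int = 3) -> List[List[str]]:
    """Label every cell of the coarsest pyramid level."""
    coarse = build_pyramid(img, levels)[-1]
    return [[classify_cell(v) for v in row] for row in coarse]


# Toy 8x8 page: sparse ink on top (text-like), a solid block at the
# bottom left (picture-like), and blank paper at the bottom right.
page = [[0.0] * 8 for _ in range(8)]
for y in range(4):
    for x in range(8):
        if x % 2 == 0 and y % 2 == 0:
            page[y][x] = 1.0
for y in range(4, 8):
    for x in range(4):
        page[y][x] = 1.0

labels = segment(page, levels=3)
```

Because each coarse cell summarizes a whole region of the page, the label map is much smaller than the image itself, which is in the spirit of the low additional memory use mentioned above. A fuller version would combine several feature maps across levels rather than thresholding one feature at the coarsest level.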