Document page segmentation using neuro-fuzzy approach

  • Authors:
  • Laura Caponetti;Ciro Castiello;Przemysław Górecki

  • Affiliations:
  • Università degli Studi di Bari, Dipartimento di Informatica, Via E. Orabona 4, 70126 Bari, Italy;Università degli Studi di Bari, Dipartimento di Informatica, Via E. Orabona 4, 70126 Bari, Italy;Wydział Matematyki i Informatyki, Uniwersytet Warmińsko-Mazurski, ul. Oczapowskiego 2, 10-719 Olsztyn, Poland

  • Venue:
  • Applied Soft Computing
  • Year:
  • 2008

Abstract

In this work, we propose a new document page segmentation method that differentiates between text, graphics, and background using a neuro-fuzzy methodology. Our approach starts from the analysis of a set of features extracted from the image at different resolution levels. An initial segmentation is obtained by classifying the pixels into coherent regions, which are subsequently refined by analyzing their shape. At the core of the approach, a neuro-fuzzy methodology performs the classification processes. The proposed strategy describes the physical structure of a page accurately and proved robust against noise and page skew. Additionally, the knowledge-based neuro-fuzzy methodology makes the classification mechanisms easier to understand, unlike what happens when knowledge-free methods are applied.
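
The abstract does not give the rule base or the feature set, so the following is only a minimal sketch of how a fuzzy rule-based pixel classifier of this general kind might look. Everything specific in it is an assumption made for illustration: the two features (`edge_density` and `mean_darkness`, both scaled to [0, 1]), the Gaussian membership functions, the product used to combine them, and the hand-picked rule parameters in `RULES`. In a neuro-fuzzy system such parameters would typically be learned from labelled data rather than set by hand, and the authors' actual method may differ in every one of these choices.

```python
import math


def gaussian(x, center, width):
    """Membership degree of x in a Gaussian fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


# Hypothetical rule base (not taken from the paper).
# Each rule holds one (center, width) fuzzy set per feature and a class label.
# Assumed feature order: [edge_density, mean_darkness], both in [0, 1].
RULES = [
    {"sets": [(0.8, 0.20), (0.5, 0.25)], "label": "text"},
    {"sets": [(0.3, 0.20), (0.7, 0.25)], "label": "graphics"},
    {"sets": [(0.0, 0.15), (0.0, 0.15)], "label": "background"},
]


def classify(features, rules=RULES):
    """Return (best_label, normalized_activations) for one feature vector."""
    strengths = {}
    for rule in rules:
        # Firing strength: product of the per-feature membership degrees.
        strength = 1.0
        for x, (center, width) in zip(features, rule["sets"]):
            strength *= gaussian(x, center, width)
        strengths[rule["label"]] = strengths.get(rule["label"], 0.0) + strength

    total = sum(strengths.values())
    if total == 0.0:
        # No rule fired at all: fall back to a uniform distribution.
        normalized = {label: 1.0 / len(strengths) for label in strengths}
    else:
        normalized = {label: s / total for label, s in strengths.items()}

    best_label = max(normalized, key=normalized.get)
    return best_label, normalized


if __name__ == "__main__":
    label, degrees = classify([0.75, 0.45])
    print(label, degrees)
```

Because each rule is a readable statement about the features (for example, "high edge density and medium darkness suggests text"), the reasons behind a given classification can be inspected directly, which is the kind of interpretability the abstract attributes to knowledge-based approaches.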