Combining fuzzy clustering and morphological methods for old documents recovery

  • Authors:
  • João R. Caldas Pinto; Lourenço Bandeira; João M. C. Sousa; Pedro Pina

  • Affiliations:
  • IDMEC, Instituto Superior Técnico, Lisboa, Portugal; IDMEC, Instituto Superior Técnico, Lisboa, Portugal; IDMEC, Instituto Superior Técnico, Lisboa, Portugal; CVRM / Geo-Systems Centre, Instituto Superior Técnico, Lisboa, Portugal

  • Venue:
  • IbPRIA'05 Proceedings of the Second Iberian conference on Pattern Recognition and Image Analysis - Volume Part II
  • Year:
  • 2005

Abstract

This paper tackles the specific problem of recovering old documents. Spots, print-through, underlines and other ageing features are undesirable not only because they harm the visual appearance of the document, but also because they hinder subsequent Optical Character Recognition (OCR). We propose a new method that integrates fuzzy clustering of the color properties of the original images with mathematical morphology. We show that this technique yields higher-quality recovered images and, at the same time, delivers cleaned binary text for OCR applications. The proposed method was applied to books from the 19th century, which were cleaned very effectively.
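
The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes standard fuzzy c-means with two clusters (ink and paper) on RGB values, followed by a morphological opening with a 3x3 square structuring element. The toy image, the colour values, the initial centres and all function names are hypothetical choices made for illustration.

```python
import math

def memberships(pixels, centers, m):
    """Fuzzy c-means membership of every pixel in every cluster."""
    exp = 2.0 / (m - 1.0)
    result = []
    for p in pixels:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # Pixel coincides with a centre: share full membership among those.
            n = d.count(0.0)
            result.append([1.0 / n if di == 0.0 else 0.0 for di in d])
        else:
            result.append([1.0 / sum((di / dj) ** exp for dj in d) for di in d])
    return result

def fuzzy_c_means(pixels, centers, m=2.0, iters=30):
    """Alternate membership and centre updates for a fixed number of iterations."""
    for _ in range(iters):
        u = memberships(pixels, centers, m)
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * p[ch] for wk, p in zip(w, pixels)) / total
                for ch in range(3)))
        centers = new_centers
    return centers, memberships(pixels, centers, m)

def _morph(img, combine):
    """3x3 erosion (combine=all) or dilation (combine=any) of a binary image.
    Positions outside the image count as background."""
    h, w = len(img), len(img[0])
    return [[int(combine(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] == 1
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def opening(img):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return _morph(_morph(img, all), any)

# Hypothetical 6x7 page: yellowed paper, a 3x3 block of ink, one dark spot.
PAPER, INK, SPOT = (220, 205, 170), (30, 25, 20), (50, 40, 35)
image = [[PAPER] * 7 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = INK
image[4][5] = SPOT

pixels = [p for row in image for p in row]
centers, u = fuzzy_c_means(pixels, [(0, 0, 0), (255, 255, 255)])
ink = min(range(len(centers)), key=lambda i: sum(centers[i]))  # darkest cluster
mask = [[int(u[y * 7 + x][ink] > 0.5) for x in range(7)] for y in range(6)]
opened = opening(mask)  # binary text mask with the isolated spot removed
```

In this toy case the dark spot is first assigned to the ink cluster and then removed by the opening, while the 3x3 ink block survives. A 3x3 opening would also erase strokes thinner than three pixels, so a real pipeline would need a structuring element chosen to suit the scan resolution and typeface.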