Automatically detecting and classifying noises in document images

  • Authors:
  • Rafael Dueire Lins;Serene Banerjee;Marcelo Thielo

  • Affiliations:
  • Universidade Federal de Pernambuco, Recife - Pernambuco, Brazil;HP Labs., Bangalore, India;HP Brazil R&D, Porto Alegre, Brazil

  • Venue:
  • Proceedings of the 2010 ACM Symposium on Applied Computing
  • Year:
  • 2010

Abstract

Image filtering to remove noise in document images follows two different approaches. The first uses human classification of the noise present in an image to identify which noise filter to use. The second blindly applies a batch of filters to an image. The former approach, although widely used, may insert noise in the filtering process due to incorrect classification of the noise or even unsuitable filtering parameters. This paper presents a new paradigm for document image filtering. It aims at a more accurate and computationally efficient document cleanup by pre-characterizing the noise present in the document, based on a set of human-labeled training samples. The current focus of the project is on pre-characterization of the following types of noise: back-to-front interference (bleed-through), skew and orientation, blur, and framing.