On foreground-background separation in low quality document images

  • Authors:
  • Utpal Garain;Thierry Paquet;Laurent Heutte

  • Affiliations:
  • Computer Vision & Pattern Recognition Unit, Indian Statistical Institute, 203, B. T. Road, 700108, Kolkata, INDIA;Laboratoire PSI - FRE CNRS 2645, UFR des Sciences, University of Rouen, 76821, Mont Saint Aignan cedex, FRANCE;Laboratoire PSI - FRE CNRS 2645, UFR des Sciences, University of Rouen, 76821, Mont Saint Aignan cedex, FRANCE

  • Venue:
  • International Journal on Document Analysis and Recognition
  • Year:
  • 2006

Abstract

This paper deals with effective separation of foreground and background in low quality document images suffering from various types of degradation, including scanning noise, aging effects, and uneven background or foreground. The proposed algorithm adapts well to uneven illumination and to local changes or nonuniformity in background and foreground colors. The approach is designed primarily for (but is not restricted to) color documents, and it also works well in the gray scale domain. The test set comprises samples, in color as well as in gray scale, of old historical documents, including manuscripts of high importance. The data set used in this study consists of one hundred images selected from different sources, including image databases scanned from the working notebooks of famous writers who wrote with quill or pencil, producing very low contrast between foreground and background. The foreground extraction method is evaluated by computing the accuracy of extracting handwritten lines and words from the test images. This evaluation shows that the proposed method extracts lines and words with accuracies of about 84% and 93%, respectively. In addition to this quantitative evaluation, a qualitative evaluation compares the proposed method with one popular technique for foreground/background separation in document images.
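
The abstract does not define how extraction accuracy is computed. As a rough illustration only, the sketch below assumes accuracy is the fraction of ground-truth units (lines or words) that are extracted correctly. The function name `extraction_accuracy` and the counts are hypothetical and are not taken from the paper.

```python
def extraction_accuracy(num_correct: int, num_ground_truth: int) -> float:
    """Fraction of ground-truth units (lines or words) extracted correctly.

    Assumption: accuracy = correctly extracted units / ground-truth units.
    The paper may use a different definition.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    if not 0 <= num_correct <= num_ground_truth:
        raise ValueError("num_correct must lie between 0 and num_ground_truth")
    return num_correct / num_ground_truth


# Hypothetical counts for a single page, for illustration only.
line_accuracy = extraction_accuracy(num_correct=9, num_ground_truth=10)
word_accuracy = extraction_accuracy(num_correct=45, num_ground_truth=50)
print(f"lines: {line_accuracy:.0%}, words: {word_accuracy:.0%}")
```

Under this assumed definition, line and word accuracies are computed separately with the same ratio, matching how the abstract reports two separate figures.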