Degraded dot matrix character recognition using CSM-based feature extraction

  • Authors:
  • Abderrahmane Namane;El Houssine Soubari;Patrick Meyrueis

  • Affiliations:
  • University Saad Dahlab of Blida, Blida, Algeria;University of Strasbourg, Strasbourg, France;University of Strasbourg, Strasbourg, France

  • Venue:
  • Proceedings of the 10th ACM symposium on Document engineering
  • Year:
  • 2010

Abstract

This paper presents an OCR method for recognizing degraded characters, applied to a 15-character printed reference number (RN) on an invoice produced by a dot-matrix printer. First, the paper addresses the localization and extraction of the reference number, where the tops or bottoms of the characters may or may not touch a printed reference line of the electricity bill. When the RN touches this line, the extracted characters are severely degraded, with parts missing from their tops or bottoms. Second, a combined recognition method is used, based on the complementary similarity measure (CSM) and an MLP-based classifier. The CSM is used to accept or reject an incoming character. If the character is accepted, the CSM acts as a feature extractor and produces a ten-component feature vector. The MLP is then trained on these feature vectors. Using the CSM as a feature extractor tends to make the MLP both more powerful and well suited to rejection. Experimental results on electricity bills show that the model yields robust recognition of severely degraded printed characters.
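
The accept/reject and feature-extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly cited form of the CSM for binary images, (a·e − b·c) / sqrt(T·(n − T)), and it further assumes that the ten features are the CSM scores of the character against ten class templates, which the abstract does not state. The function names, the tiny 4-pixel templates, and the threshold value are hypothetical.

```python
import math


def csm(image, template):
    """Complementary similarity measure between two binary vectors.

    Assumed form: (a*e - b*c) / sqrt(T * (n - T)), where T is the
    number of black (1) pixels in the template and n is the vector length.
    """
    if len(image) != len(template):
        raise ValueError("image and template must have the same length")
    pairs = list(zip(image, template))
    a = sum(1 for f, t in pairs if f == 1 and t == 1)  # black in both
    b = sum(1 for f, t in pairs if f == 0 and t == 1)  # black in template only
    c = sum(1 for f, t in pairs if f == 1 and t == 0)  # black in image only
    e = sum(1 for f, t in pairs if f == 0 and t == 0)  # white in both
    n = a + b + c + e
    t_count = a + b
    denom = math.sqrt(t_count * (n - t_count))
    if denom == 0:
        return 0.0  # degenerate template (all black or all white)
    return (a * e - b * c) / denom


def csm_features(image, templates):
    """One CSM score per template (assumed feature layout)."""
    return [csm(image, t) for t in templates]


def accept_or_reject(image, templates, threshold):
    """Return the feature vector if the best score reaches the
    (hypothetical) threshold, otherwise None to signal rejection."""
    features = csm_features(image, templates)
    if max(features) < threshold:
        return None
    return features


# Toy usage with two hypothetical 4-pixel templates.
templates = [[1, 1, 0, 0], [0, 0, 1, 1]]
accepted = accept_or_reject([1, 1, 0, 0], templates, threshold=1.0)
rejected = accept_or_reject([1, 0, 1, 0], templates, threshold=1.0)
```

In this sketch an accepted character yields a vector with one score per template, which would then be passed to the MLP, while a rejected character yields `None` and is never classified.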