High Accuracy Optical Character Recognition Using Neural Networks with Centroid Dithering
IEEE Transactions on Pattern Analysis and Machine Intelligence
IEEE Transactions on Pattern Analysis and Machine Intelligence
Enhancing Degraded Document Images via Bitmap Clustering and Averaging
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Hi-index | 0.00 |
This paper presents an OCR method for degraded character recognition applied to a reference number (RN) of 15 printed characters of an invoice document produced by dot-matrix printer. First, the paper deals with the problem of the reference number localization and extraction, in which the characters tops or bottoms are or not touched with a printed reference line of the electrical bill. In case of touched RN, the extracted characters are severely degraded leading to missing parts in the characters tops or bottoms. Secondly, a combined recognition method based on the complementary similarity measure (CSM) method and MLP-based classifier is used. The CSM is used to accept or reject an incoming character. In case of acceptation, the CSM acts as a feature extractor and produces a feature vector of ten component features. The MLP is then trained using these feature vectors. The use of the CSM as a feature extractor tends to make the MLP very powerful and very well suited for rejection. Experimental results on electrical bills show the ability of the model to yield relevant and robust recognition on severely degraded printed characters.