Recognition of writer-independent off-line handwritten Arabic (Indian) numerals using hidden Markov models

Authors:
Sabri Mahmoud
Affiliations:
King Fahd University of Petroleum and Minerals, P.O. Box 1378, Dhahran 31261, Saudi Arabia
Venue:
Signal Processing
Year:
2008

Abstract

This paper describes a technique for the optical recognition of off-line handwritten Arabic (Indian) numerals using hidden Markov models (HMMs). The success of HMMs in speech recognition encouraged researchers to apply them to text recognition. In this work we did not follow the general trend of using sliding windows along the direction of the writing line to generate features. Instead, we generated features based on the digit as a unit. Angle, distance, horizontal-span, and vertical-span features are extracted from Arabic (Indian) numerals and used in training and testing the HMM. These features proved to be simple and effective. In addition to the HMM, a nearest neighbor classifier is used, and the results of the two classifiers are compared. Several experiments were conducted to estimate a suitable number of states for the HMM; the best results were achieved with a 10-state HMM. We also experimented with different numbers of features; the best results were achieved with a 120-feature vector representing each digit. The database was collected from 44 writers, each of whom wrote 48 samples of each digit, for a total of 21,120 samples. The data were size normalized to make the technique size invariant. In extracting the features, the center of gravity of the digit is used to make the technique translation invariant. A randomization technique was used to generate Arabic (Indian) numbers for training and testing the HMM classifier: both the number of digits per number and the digit sequence were randomized. In total, 2171 Arabic (Indian) numbers were generated, comprising 21,120 digits. Of these, 1700 numbers (16,657 digits) were used to train the HMM and 471 numbers (4463 digits) were used to test it. The samples of the first 24 writers were used to train the nearest neighbor classifier, and the samples of the remaining 20 writers were used to test it.
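To illustrate why measuring features relative to the center of gravity gives translation invariance, here is a minimal sketch. The abstract does not define the features precisely, so the function names, the sector and ring counts, and the histogram layout below are assumptions made for illustration and are not the paper's actual feature definitions; the horizontal- and vertical-span features are omitted.

```python
import math

def centroid(pixels):
    """Center of gravity of a list of (row, col) foreground pixels."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)

def angle_distance_features(pixels, n_sectors=8, n_rings=4):
    """Fractions of foreground pixels per angular sector and per distance
    ring around the centroid (hypothetical layout, not the paper's)."""
    cr, cc = centroid(pixels)
    offsets = [(r - cr, c - cc) for r, c in pixels]
    # Guard against a single-pixel digit, where every distance is zero.
    max_dist = max(math.hypot(dr, dc) for dr, dc in offsets) or 1.0
    sector_counts = [0] * n_sectors
    ring_counts = [0] * n_rings
    for dr, dc in offsets:
        angle = math.atan2(dr, dc) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi) * n_sectors), n_sectors - 1)
        ring = min(int(math.hypot(dr, dc) / max_dist * n_rings), n_rings - 1)
        sector_counts[sector] += 1
        ring_counts[ring] += 1
    total = len(pixels)
    return [count / total for count in sector_counts + ring_counts]

# A toy vertical stroke and the same stroke moved elsewhere in the image.
stroke = [(0, 2), (1, 2), (2, 2), (3, 2)]
shifted = [(r + 5, c + 7) for r, c in stroke]
print(angle_distance_features(stroke) == angle_distance_features(shifted))
```

Because every offset is taken from the centroid, moving the digit within the image leaves the offsets, and therefore the features, unchanged.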
The average recognition rates achieved are 97.99% and 94.35% using the HMM and the nearest neighbor classifiers, respectively. The classification errors were analyzed, including the confusions between specific digit pairs. Some errors can be attributed to bad data, some to deformation and unbalanced proportions of digit segments, some to different writing styles of certain digits, and the rest are genuine errors. For the HMM, the real misclassification rate on genuine data was nearly 1%. This demonstrates the effectiveness of the presented technique for writer-independent off-line Arabic (Indian) handwritten digit recognition. The technique is writer independent because data from one set of writers were used to train the classifiers and data from other writers were used in the testing phase.
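The writer-independent evaluation of the nearest neighbor classifier can be sketched as follows. This is a minimal illustration under stated assumptions: the sample layout `(writer_id, label, feature_vector)`, the function names, the Euclidean distance, and the toy data are all hypothetical and do not come from the paper, which trains on the first 24 writers and tests on the remaining 20.

```python
import math

def split_by_writer(samples, train_writers):
    """Split (writer_id, label, features) samples so that no writer
    contributes to both the training and the test set."""
    train = [(y, x) for w, y, x in samples if w in train_writers]
    test = [(y, x) for w, y, x in samples if w not in train_writers]
    return train, test

def nearest_neighbor(train, x):
    """Label of the training vector closest to x (Euclidean distance,
    an assumption; the abstract does not name the metric)."""
    label, _ = min(train, key=lambda item: math.dist(item[1], x))
    return label

def recognition_rate(train, test):
    """Fraction of test samples whose predicted label is correct."""
    correct = sum(1 for y, x in test if nearest_neighbor(train, x) == y)
    return correct / len(test)

# Toy data: three hypothetical writers, two digit classes, 2-D features.
samples = [
    (1, 0, [0.0, 0.0]), (1, 1, [1.0, 1.0]),
    (2, 0, [0.1, 0.0]), (2, 1, [0.9, 1.0]),
    (3, 0, [0.2, 0.1]), (3, 1, [0.8, 0.9]),
]
train, test = split_by_writer(samples, train_writers={1, 2})
print(recognition_rate(train, test))
```

Splitting by writer rather than by sample is what makes the reported rate a measure of performance on unseen handwriting.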