IbPRIA'11 Proceedings of the 5th Iberian conference on Pattern recognition and image analysis
In this paper, we report on the use of characteristic loci features to cluster printed Farsi subwords based on their holistic shapes. This yields a pictorial dictionary that can be used in a word recognition system to reduce the search space. The feature vectors are compressed using PCA. The k-means algorithm is used to group 113,340 subwords, printed in 4 fonts and 3 sizes, into 300 clusters. The smallest cluster has 59 members and the largest has 876. The mean of each cluster is used as its entry in the pictorial dictionary. To evaluate the clustering results, a minimum mean-distance classifier was tested on a set of 5000 subwords. For 78.71%, 99.01%, and 100% of these subwords, the correct cluster was the closest one, among the five closest, and among the ten closest clusters, respectively.
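The pipeline described above (PCA compression, k-means clustering, cluster means as dictionary entries, and ranking clusters by distance to the mean) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the synthetic data, the number of retained components, and the way a test sample's "correct" cluster is defined are all assumptions made for illustration, and random vectors stand in for characteristic loci features.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, components); project with (X - mean) @ components.T."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def sq_dists(X, centers):
    """Squared Euclidean distance from every row of X to every center."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers, sq_dists(X, centers).argmin(axis=1)

def rank_clusters(x, centers):
    """Minimum mean-distance ranking: cluster indices, nearest mean first."""
    return np.argsort(((centers - x) ** 2).sum(axis=1))

def top_n_rate(X_test, true_clusters, centers, n):
    """Fraction of test samples whose true cluster is among the n nearest."""
    hits = sum(true_clusters[i] in rank_clusters(x, centers)[:n]
               for i, x in enumerate(X_test))
    return hits / len(X_test)

# Synthetic stand-in for subword feature vectors (assumption, not real data).
rng = np.random.default_rng(0)
blob_means = rng.normal(scale=10.0, size=(3, 20))
train = np.vstack([m + rng.normal(size=(100, 20)) for m in blob_means])

mean, comps = pca_fit(train, n_components=5)   # 5 components is arbitrary
Z = (train - mean) @ comps.T
centers, labels = kmeans(Z, k=3)               # centers act as dictionary entries

# Assumption: a test sample is a noisy copy of a training sample, and its
# "correct" cluster is the one k-means assigned to that training sample.
test = train + rng.normal(scale=0.5, size=train.shape)
Zt = (test - mean) @ comps.T
rates = [top_n_rate(Zt, labels, centers, n) for n in (1, 2, 3)]
```

In this sketch the top-n rate can only grow with n and reaches 1.0 once n equals the number of clusters, which is why reporting the closest one, five, and ten clusters gives a useful picture of how much the candidate list can be shortened.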