Determination of the Script and Language Content of Document Images
IEEE Transactions on Pattern Analysis and Machine Intelligence
Automatic Script Identification From Document Images Using Cluster-Based Templates
IEEE Transactions on Pattern Analysis and Machine Intelligence
Rotation Invariant Texture Features and Their Use in Automatic Script Identification
IEEE Transactions on Pattern Analysis and Machine Intelligence
Trainable Script Identification Strategies for Indian Languages
ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
Language identification for printed text independent of segmentation
ICIP '95 Proceedings of the 1995 International Conference on Image Processing (Vol. 3)-Volume 3 - Volume 3
Multi-Script Line identification from Indian Documents
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
Page segmentation using texture analysis
Pattern Recognition
Gabor wavelet similarity maps for optimising hierarchical road sign classifiers
Pattern Recognition Letters
Word level multi-script identification
Pattern Recognition Letters
Curvature feature distribution based classification of Indian scripts from document images
Proceedings of the International Workshop on Multilingual OCR
Local features-based script recognition from printed bilingual document images
International Journal of Computer Applications in Technology
Word level identification of Kannada, Hindi and English scripts from a tri-lingual document
International Journal of Computational Vision and Robotics
Proceeding of the workshop on Document Analysis and Recognition
A bilingual Gurmukhi-English OCR based on multiple script identifiers and language models
Proceedings of the 4th International Workshop on Multilingual OCR
Hi-index | 0.00 |
Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching online archives of document images and for the selection of script specific OCR in a multilingual environment. In this paper, we present a scheme to identify different Indian scripts from a document image. This scheme employs hierarchical classification which uses features consistent with human perception. Such features are extracted from the responses of a multi-channel log-Gabor filter bank, designed at an optimal scale and multiple orientations. In the first stage, the classifier groups the scripts into five major classes using global features. At the next stage, a sub-classification is performed based on script-specific features. All features are extracted globally from a given text block which does not require any complex and reliable segmentation of the document image into lines and characters. Thus the proposed scheme is efficient and can be used for many practical applications which require processing large volumes of data. The scheme has been tested on 10 Indian scripts and found to be robust to skew generated in the process of scanning and relatively insensitive to change in font size. This proposed system achieves an overall classification accuracy of 97.11% on a large testing data set. These results serve to establish the utility of global approach to classification of scripts.