Script identification in multilingual document collections has numerous applications in document image analysis, such as indexing and retrieval, and it serves as an initial step towards optical character recognition. In this paper, we propose a novel hierarchical framework for script identification in bilingual documents. The framework takes a top-down approach, performing script identification at the page, block/paragraph, and word levels in successive stages. For feature extraction, we use texture- and shape-based information embedded in the documents at each level. Prediction at each level of the hierarchy is performed by a Support Vector Machine (SVM) and a rejection-based classifier defined using AdaBoost. Experimental evaluation of the proposed approach on document collections in Hindi/English and Bangla/English scripts has shown promising results.
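The top-down control flow described above can be illustrated with a minimal sketch. It is not the authors' implementation. The `Region` type, the `identify` and `assign` helpers, the `reject_below` threshold, and the one-feature `toy_classifier` are all hypothetical names introduced here, and the toy classifier merely stands in for the SVM and AdaBoost-based classifiers. The sketch also assumes that a low-confidence prediction at one level is what triggers classification at the next finer level, which the abstract does not spell out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A classifier maps a feature vector to (script label, confidence in [0, 1]).
Classifier = Callable[[List[float]], Tuple[str, float]]


@dataclass
class Region:
    level: str                      # "page", "block", or "word"
    features: List[float]           # placeholder for texture/shape features
    children: List["Region"] = field(default_factory=list)
    label: Optional[str] = None


def assign(region: Region, label: str) -> None:
    """Give a region and all of its descendants the same script label."""
    region.label = label
    for child in region.children:
        assign(child, label)


def identify(region: Region, classifiers: Dict[str, Classifier],
             reject_below: float = 0.8) -> None:
    """Label a region top-down; descend only when the prediction is rejected."""
    label, confidence = classifiers[region.level](region.features)
    if confidence >= reject_below or not region.children:
        # Accepted (or nothing finer to inspect): the label covers the subtree.
        assign(region, label)
    else:
        # Rejected (assumed rule): treat as mixed, decide at the next level.
        region.label = "mixed"
        for child in region.children:
            identify(child, classifiers, reject_below)


def toy_classifier(features: List[float]) -> Tuple[str, float]:
    """Hypothetical stand-in for a trained classifier: thresholds one feature."""
    x = features[0]
    return ("Hindi" if x >= 0.5 else "English", max(x, 1.0 - x))


# The same toy classifier is reused at every level purely for illustration.
classifiers = {"page": toy_classifier, "block": toy_classifier,
               "word": toy_classifier}

block_a = Region("block", [0.9], [Region("word", [0.7]), Region("word", [0.6])])
block_b = Region("block", [0.55], [Region("word", [0.1]), Region("word", [0.95])])
page = Region("page", [0.6], [block_a, block_b])

identify(page, classifiers)
```

In this toy run the page and the second block fall below the threshold and are resolved at finer levels, while the first block is accepted as a whole, so its words are never classified individually. Resolving most regions at a coarse level is what makes a top-down scheme cheaper than classifying every word.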