We present a framework for classifying text document images by script. We work in the domain of Indian scripts, which exhibit high inter-script similarity. Indian scripts have characteristic curvature distributions that help discriminate them visually. We use edge-direction-based features to capture the distribution of curvature, and we apply a recently proposed feature selection algorithm to obtain the most discriminating curvature features. We automatically form a hierarchy based on statistical distances between the script models. The hierarchy lets us group similar scripts at one level and then focus on classifying among those similar scripts at the next level, which improves accuracy. We report experiments and results on a large set of about 3400 images.
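To illustrate what an edge-direction-based feature can look like, here is a minimal sketch in Python with NumPy. It is not the feature extractor used in this work. The function name `edge_direction_histogram`, the finite-difference gradient, the choice of 8 bins, and the magnitude threshold are all assumptions made for the example.

```python
import numpy as np


def edge_direction_histogram(image, n_bins=8, magnitude_threshold=1e-6):
    """Return a normalized, magnitude-weighted histogram of edge directions.

    Hypothetical helper for illustration only: the bin count, the threshold,
    and the gradient operator are assumptions, not taken from the paper.
    """
    image = np.asarray(image, dtype=float)

    # Finite-difference derivatives along rows (gy) and columns (gx).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)

    # Fold gradient angles into [0, pi) so that opposite gradient signs
    # (dark-to-light vs. light-to-dark) count as the same direction.
    orientation = np.mod(np.arctan2(gy, gx), np.pi)

    # Keep only pixels that actually lie on an edge.
    mask = magnitude > magnitude_threshold
    hist, _ = np.histogram(
        orientation[mask],
        bins=n_bins,
        range=(0.0, np.pi),
        weights=magnitude[mask],
    )

    total = hist.sum()
    return hist / total if total > 0 else hist


if __name__ == "__main__":
    # Synthetic image with a single vertical stroke boundary.
    img = np.zeros((16, 16))
    img[:, 8:] = 1.0
    print(edge_direction_histogram(img))
```

A histogram of this kind summarizes how stroke boundaries are oriented across a text image. Curved strokes spread mass over many bins, while straight strokes concentrate it in a few, which is one plausible way to expose the curvature differences between scripts described above.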