Current approaches to script identification rely on hand-selected features and often require processing a significant part of the document to achieve reliable identification. We present an approach that applies a large pool of image features to a small training sample and uses subset feature selection techniques to automatically select a subset with the most discriminating power. At run time we use a classifier coupled with an evidence accumulation engine to report a script label once a preset likelihood threshold has been reached. We apply the system to a diverse corpus of printed Russian and English documents that suffer from common degradation problems. Our validation study shows promising results both in terms of the script identification accuracy and the ability to identify script on the scale of individual words and text lines.
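
The run-time step described above, a classifier feeding an evidence accumulation engine that stops at a preset likelihood threshold, can be illustrated with a minimal sketch. The abstract does not state the accumulation rule, so everything here is an assumption for illustration: the function name `accumulate_evidence`, the two-class log-odds update, the independence of per-word classifier outputs, and the default threshold of 0.95 are all hypothetical and are not taken from the paper.

```python
import math


def accumulate_evidence(posteriors, threshold=0.95, prior=0.5):
    """Sequentially combine per-word classifier outputs (hypothetical sketch).

    posteriors: iterable of P(Russian | word_i) from any two-class classifier.
    threshold:  confidence required before a label is reported (assumed value).
    prior:      prior probability of Russian before any word is seen.

    Returns (label, words_used); label is None if the threshold is never reached.
    Assumption: word-level outputs are treated as independent, so their
    log-odds simply add up.
    """
    log_odds = math.log(prior / (1.0 - prior))
    bound = math.log(threshold / (1.0 - threshold))
    used = 0
    for p in posteriors:
        # Clamp to avoid log(0) when the classifier is fully confident.
        p = min(max(p, 1e-6), 1.0 - 1e-6)
        log_odds += math.log(p / (1.0 - p))
        used += 1
        if log_odds >= bound:
            return "russian", used
        if log_odds <= -bound:
            return "english", used
    return None, used


if __name__ == "__main__":
    label, used = accumulate_evidence([0.8, 0.9, 0.7])
    print(label, used)
```

Under these assumptions the loop stops as soon as the accumulated evidence crosses the threshold, which is one way to report a label after only a few words rather than after processing a large part of the page.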