A Recognition System for Devnagri and English Handwritten Numerals
ICMI '00 Proceedings of the Third International Conference on Advances in Multimodal Interfaces
Word and Sentence Extraction Using Irregular Pyramid
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
Script Identification in Printed Bilingual Documents
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
Multi-Script Line identification from Indian Documents
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2
Script Identification Using Steerable Gabor Filters
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
Feature extraction and classification for bilingual script (Gurmukhi and Roman)
ACST'07 Proceedings of the third conference on IASTED International Conference: Advances in Computer Science and Technology
Local features-based script recognition from printed bilingual document images
International Journal of Computer Applications in Technology
In a multilingual country like India, a single document may contain more than one script. For such a document, the different scripts must be separated before each is fed to the OCR system for that script. This paper describes an automatic word segmentation approach that separates the Roman, Bangla and Devnagari scripts present in a single document. The approach has a tree structure: first, Roman script words are separated using the 'headline' feature, which is common in Bangla and Devnagari but absent in Roman. Next, Bangla and Devnagari words are separated using finer characteristics of the two character sets, without recognizing individual characters. At present, the system has an overall accuracy of 96.09%.
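The first branch of the tree can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a word arrives as a binary image (rows of 0/1 values, 1 for ink) and treats a long horizontal ink run in the upper half of the word as a headline. The function names, the upper-half search region and the 0.7 width ratio are all hypothetical choices made for illustration.

```python
from typing import List

BinaryImage = List[List[int]]  # rows of 0/1 pixels, 1 = ink (assumed input format)


def longest_run(row: List[int]) -> int:
    """Length of the longest run of consecutive ink pixels in one row."""
    best = cur = 0
    for px in row:
        cur = cur + 1 if px else 0
        best = max(best, cur)
    return best


def has_headline(word: BinaryImage, min_ratio: float = 0.7) -> bool:
    """Heuristic headline test.

    Returns True if some row in the upper half of the word image holds an
    ink run covering at least min_ratio of the word width. The 0.7 default
    is an illustrative assumption, not a value from the paper.
    """
    if not word or not word[0]:
        return False
    width = len(word[0])
    upper = word[: max(1, len(word) // 2)]
    return max(longest_run(row) for row in upper) >= min_ratio * width


def first_stage(word: BinaryImage) -> str:
    """First split of the tree: Roman versus headline-bearing scripts."""
    return "Bangla/Devnagari" if has_headline(word) else "Roman"
```

Words labeled "Bangla/Devnagari" would then go to the second stage, which the abstract describes only as relying on finer characteristics of the character sets, so it is not sketched here.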