A document page may contain two or more different scripts. For Optical Character Recognition (OCR) of such a document page, it is necessary to separate the different scripts before feeding them to their individual OCR systems. In this paper, an automatic scheme is presented to identify text lines of different Indian scripts in a document. For the separation task, the scripts are first grouped into a few classes according to script characteristics. Next, features based on the water reservoir principle, contour tracing, profiles, etc. are employed to identify them without any expensive OCR-like algorithms. At present, the system has an overall accuracy of about 97.52%.
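The abstract does not spell out how the water reservoir features are computed. As a rough illustration only, the sketch below treats a binary component as a container, pours water from the top, and counts the background pixels that would hold water. The function names `top_profile` and `top_reservoir_area`, the list-of-lists image representation, and the reduction to a per-column height profile are all assumptions made for this example, not the authors' implementation.

```python
from typing import List

def top_profile(component: List[List[int]]) -> List[int]:
    """Ink height per column: distance from the bottom row up to the
    topmost foreground pixel (0 if the column has no foreground)."""
    rows = len(component)
    cols = len(component[0]) if rows else 0
    heights = []
    for c in range(cols):
        h = 0
        for r in range(rows):
            if component[r][c]:
                h = rows - r  # topmost foreground pixel found
                break
        heights.append(h)
    return heights

def top_reservoir_area(component: List[List[int]]) -> int:
    """Pixels of water held when water is poured from the top.

    Simplifying assumption: only the top profile matters, so a column
    holds water up to the lower of the highest walls on either side.
    """
    heights = top_profile(component)
    n = len(heights)
    if n == 0:
        return 0
    # Highest wall seen so far when scanning left to right.
    left_max = [0] * n
    running = 0
    for i, h in enumerate(heights):
        running = max(running, h)
        left_max[i] = running
    # Scan right to left and accumulate the trapped water.
    area = 0
    running = 0
    for i in range(n - 1, -1, -1):
        running = max(running, heights[i])
        area += min(left_max[i], running) - heights[i]
    return area

# Hypothetical toy components: 1 = foreground (ink), 0 = background.
u_shape = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
n_shape = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]

print(top_reservoir_area(u_shape))  # cavity open at the top holds water
print(top_reservoir_area(n_shape))  # cavity open at the bottom holds none
```

In this simplified view, a component that opens upward yields a non-zero top reservoir while one that opens downward yields none, which is the kind of shape cue a script classifier could use; flipping the image vertically before the call would give the corresponding bottom reservoir.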