In some Thai documents, a single text line of a printed page may contain words in both Thai and Roman scripts. For Optical Character Recognition (OCR) of such a page, it is better to first identify the Thai and Roman portions and then apply the OCR system of the respective script to each identified portion. In this article, an SVM-based method is proposed for word-wise identification of printed Roman and Thai scripts within a single line of a document page. The document is first segmented into lines, and each line is then segmented into character groups (words). The proposed scheme identifies the script of a character group by combining character features obtained from structural shape, profile behavior, component overlapping information, topological properties, and the water reservoir concept. In an experiment on 10,000 words, the proposed scheme achieved 99.62% script identification accuracy.
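The segmentation stage described above (page into lines, lines into character groups) can be illustrated with a minimal sketch. The abstract does not say how the segmentation is performed; the projection-profile approach below is a common technique assumed here purely for illustration, and the names `runs_of_ink`, `segment_lines`, `segment_words`, and the `min_gap` parameter are hypothetical. Feature extraction and the SVM classifier are not shown.

```python
from typing import List, Tuple

# Binary image as a list of rows: 1 = ink pixel, 0 = background.
Image = List[List[int]]


def runs_of_ink(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is non-zero.

    Ranges separated by fewer than `min_gap` empty positions are merged,
    so small inter-character gaps do not split a word.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    row_profile = [sum(row) for row in image]
    return runs_of_ink(row_profile)


def segment_words(
    image: Image, line: Tuple[int, int], min_gap: int = 3
) -> List[Tuple[int, int]]:
    """Split one text line into character groups using the vertical profile.

    `min_gap` is an assumed threshold: gaps narrower than this many columns
    are treated as inter-character spacing rather than word boundaries.
    """
    top, bottom = line
    width = len(image[0]) if image else 0
    col_profile = [
        sum(image[r][c] for r in range(top, bottom)) for c in range(width)
    ]
    return runs_of_ink(col_profile, min_gap=min_gap)
```

Each column range returned by `segment_words` would then be cropped and passed to the feature extractor and script classifier. In practice the gap threshold would likely be derived from the text height or the gap distribution of the line rather than fixed.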