In this paper, we propose a new approach for identifying the language of character images. We classify individual character images and use the resulting labels to determine the language boundaries in multilingual documents. Two methods are considered for this purpose: prototype classification and support vector machines (SVM). Because our training dataset is large, we further propose a technique to speed up the training process for both methods. Applied to classifying characters as Chinese, English, or Japanese (including Hiragana and Katakana), the two methods produce highly accurate and comparable test results.
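
The per-character idea can be illustrated with a minimal sketch. It is not the method of the paper: it assumes one mean-vector prototype per class (a deliberate simplification of prototype classification), Euclidean distance, and made-up 2-D feature vectors in place of real character-image features. The function names fit_prototypes, classify, and language_runs are hypothetical. It shows how labels assigned to individual characters could be grouped into runs whose edges mark language boundaries.

```python
from itertools import groupby
from math import dist


def fit_prototypes(samples, labels):
    """Compute one mean-vector prototype per class (assumed simplification)."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def classify(prototypes, x):
    """Return the label of the nearest prototype by Euclidean distance."""
    return min(prototypes, key=lambda y: dist(prototypes[y], x))


def language_runs(labels):
    """Group consecutive identical labels into (label, start, end) runs.

    The end index is exclusive; each run edge is a language boundary.
    """
    runs, pos = [], 0
    for label, group in groupby(labels):
        n = len(list(group))
        runs.append((label, pos, pos + n))
        pos += n
    return runs


# Toy 2-D vectors standing in for character-image features (made up).
train_x = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
train_y = ["English", "English", "Chinese", "Chinese"]
protos = fit_prototypes(train_x, train_y)

# One text line of five characters, classified one character at a time.
line = [(0.1, 0.0), (0.0, 0.2), (1.0, 0.9), (1.1, 1.0), (0.1, 0.1)]
predicted = [classify(protos, x) for x in line]
runs = language_runs(predicted)
```

An SVM could be substituted for classify without changing language_runs, since the boundary step depends only on the per-character labels.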