This paper offers an overview of current research approaches in off-line multilingual OCR. Off-line OCR systems are typically designed for a particular script or language. The ideal multilingual system, however, would likely be one that can be re-targeted to a different language with minimal modification, using only language-specific training data. This remains an open area of research with many challenges, particularly for multilingual handwriting recognition, where writing styles vary even within a single script. The paper presents the challenges that multilingual OCR poses in preprocessing, feature extraction, script identification, and recognition modeling, along with a brief survey of research in each of these areas, and outlines ideas for future research.