Multilingual machine printed OCR
Hidden Markov models
Performance Improvements to the BBN Byblos OCR System
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
Robust named entity detection using an Arabic offline handwriting recognition system
Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data
SACH'06 Proceedings of the 2006 conference on Arabic and Chinese handwriting recognition
The BBN document analysis service: a platform for multilingual document translation
DAS '10 Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
Multilingual OCR research and applications: an overview
Proceedings of the 4th International Workshop on Multilingual OCR
In this paper, we present a new, script-independent, HMM-based technique for locating text lines in images containing one or more paragraphs of single-column text. The HMM parameters are trained on-line on each image using an unsupervised training procedure. We present results of line-finding experiments in Arabic, Chinese, and English to demonstrate both the performance and the script-independent nature of the technique. A comparison of HMM-based line finding with manual line finding shows that using the HMM-based technique does not lead to a significant increase in the recognition error rate.
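The abstract does not specify the HMM topology, the features, or the training algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a two-state HMM (gap rows versus text rows) with Gaussian emissions over the per-row ink fraction of a binarized image, and it substitutes Viterbi (hard-EM) re-estimation for whatever unsupervised procedure the paper uses. The names `find_text_lines`, `viterbi`, `n_iter`, and `stay_prob` are hypothetical and chosen for illustration.

```python
import numpy as np


def viterbi(log_b, log_a, log_pi):
    """Most likely state path given emission log-likelihoods log_b (T x S)."""
    T, S = log_b.shape
    delta = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_b[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_a  # scores[i, j]: state i -> j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_b[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path


def find_text_lines(image, n_iter=10, stay_prob=0.9):
    """Return (top, bottom) row ranges of text lines; bottom is exclusive.

    image: 2-D array in which nonzero pixels are ink (assumed binarized).
    The model is re-trained from scratch on this one image, unsupervised.
    """
    profile = (np.asarray(image) > 0).mean(axis=1)  # ink fraction per row

    # Assumed initialization: state 0 = gap (low ink), state 1 = text (high ink).
    means = np.array([profile.min(), profile.max()], dtype=float)
    variances = np.full(2, profile.var() + 1e-4)
    log_a = np.log(np.array([[stay_prob, 1 - stay_prob],
                             [1 - stay_prob, stay_prob]]))
    log_pi = np.log(np.array([0.5, 0.5]))

    path = np.zeros(len(profile), dtype=int)
    for it in range(n_iter):
        # Gaussian emission log-likelihood of every row under each state.
        log_b = -0.5 * (np.log(2 * np.pi * variances)
                        + (profile[:, None] - means) ** 2 / variances)
        new_path = viterbi(log_b, log_a, log_pi)

        # Hard-EM update: re-estimate each state from the rows assigned to it.
        for s in range(2):
            rows = profile[new_path == s]
            if rows.size:
                means[s] = rows.mean()
                variances[s] = rows.var() + 1e-4  # floor keeps variance positive

        converged = it > 0 and np.array_equal(new_path, path)
        path = new_path
        if converged:
            break

    # Collect maximal runs of the text state as line boundaries.
    padded = np.concatenate(([0], path, [0]))
    changes = np.flatnonzero(np.diff(padded))
    return [(int(a), int(b)) for a, b in zip(changes[::2], changes[1::2])]
```

Because the sketch looks only at pixel rows and learns its parameters from the image at hand, it uses no script-specific knowledge, which is the property the abstract emphasizes. A real system would likely need richer features and handling for skew, noise, and touching lines, none of which is attempted here.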