Advances in the BBN BYBLOS OCR System
ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
Videotext OCR Using Hidden Markov Models
ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
Script-Independent, HMM-Based Text Line Finding for OCR
ICPR '00 Proceedings of the International Conference on Pattern Recognition - Volume 4
Offline Arabic Handwriting Recognition: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence
In this paper, we describe four recent enhancements to the BBN Byblos OCR System, a multilingual HMM-based character recognition system which has been demonstrated on a variety of languages, including English, Arabic, Chinese, and Japanese. These enhancements are implemented as optional extensions to the system and provide improved performance for certain scripts or domains. Projection-based reestimation of line boundaries reduces instability in the presence of some types of noise. An alternate modeling strategy used in the first of two recognition search passes substantially increases speed on languages with a large number of characters. Another speed improvement comes from automatic discovery and modeling of sub-characters. The use of Heteroscedastic Linear Discriminant Analysis (HLDA) makes modeling more tractable by reducing feature-space dimensionality.
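The abstract does not give the details of the projection-based reestimation, so the following is only a minimal sketch of the general idea, not the Byblos implementation. It assumes a binarized page stored as a list of rows of 0/1 values and an initial, possibly noisy, estimate of a line's top and bottom rows. The names `row_projection`, `refine_line_bounds`, `margin`, and `rel_threshold` are hypothetical and chosen for illustration.

```python
def row_projection(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def refine_line_bounds(image, top, bottom, margin=2, rel_threshold=0.2):
    """Re-estimate a text line's top and bottom rows from its row projection.

    top and bottom are an initial estimate (inclusive row indices).
    margin and rel_threshold are illustrative parameters, not values
    taken from the paper.
    """
    profile = row_projection(image)
    # Search a window slightly larger than the initial estimate.
    lo = max(0, top - margin)
    hi = min(len(profile) - 1, bottom + margin)
    window = profile[lo:hi + 1]
    peak = max(window)
    if peak == 0:
        return top, bottom  # no ink in the window; keep the initial estimate
    # Rows with little ink relative to the peak are treated as noise.
    cutoff = rel_threshold * peak
    inked = [lo + i for i, count in enumerate(window) if count >= cutoff]
    return inked[0], inked[-1]


# Toy page: 8 rows of 10 pixels. Row 2 holds one speck of noise,
# rows 3 to 5 hold the actual text line.
blank = [0] * 10
speck = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
text = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0]
demo_image = [blank, blank, speck, text, text, text, blank, blank]

refined = refine_line_bounds(demo_image, top=2, bottom=6)
print(refined)
```

In this toy case the initial estimate spans the noise speck and a blank row, and the relative threshold on the projection profile pulls the boundaries back to the rows that carry most of the ink.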