HMM-based script identification for OCR
Proceedings of the 4th International Workshop on Multilingual OCR
Recognition of Bangla compound characters using structural decomposition
Pattern Recognition
Optical character recognition is carried out using techniques borrowed from statistical machine translation. In particular, combining multiple simple feature functions linearly, together with minimum-error-rate training, integrated decoding, and $N$-gram language modeling, is found to be remarkably effective across several scripts and languages. Results are presented on both synthetic and real data in five languages.
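As a rough illustration of what a linear combination of simple feature functions can look like, the sketch below scores candidate transcriptions with a weighted sum of two toy features and keeps the highest-scoring one. Everything in it is an assumption made for illustration, not the paper's system: the feature functions (`length_penalty` and a character-bigram scorer), the bigram log-probabilities, and the hand-set weights are hypothetical. In the approach the abstract describes, the weights would be tuned by minimum-error-rate training rather than set by hand, and the search would be carried out by an integrated decoder rather than over a fixed candidate list.

```python
from typing import Callable, Dict, List, Sequence

Feature = Callable[[str], float]


def length_penalty(hyp: str) -> float:
    """Hypothetical feature: penalize each output character."""
    return -float(len(hyp))


def make_bigram_lm(logprobs: Dict[str, float], unseen: float = -5.0) -> Feature:
    """Build a toy character-bigram language-model feature.

    `logprobs` maps two-character strings to log-probabilities; bigrams
    that are not in the table receive the `unseen` score (an assumption).
    """
    def score(hyp: str) -> float:
        padded = "^" + hyp  # "^" marks the start of the string
        return sum(
            logprobs.get(padded[i:i + 2], unseen)
            for i in range(len(padded) - 1)
        )
    return score


def combined_score(hyp: str, features: Sequence[Feature],
                   weights: Sequence[float]) -> float:
    """Linear combination: sum of weight * feature value."""
    return sum(w * f(hyp) for f, w in zip(features, weights))


def best_hypothesis(candidates: List[str], features: Sequence[Feature],
                    weights: Sequence[float]) -> str:
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda h: combined_score(h, features, weights))


# Toy data, invented for this example only.
toy_lm = make_bigram_lm({"^c": -1.0, "ca": -1.0, "at": -1.0})
features = [toy_lm, length_penalty]
weights = [1.0, 0.1]  # hand-set here; MERT would tune these on held-out data

best = best_hypothesis(["cat", "cot"], features, weights)
```

With these toy values, "cat" wins because all three of its bigrams appear in the table, while "cot" contains two unseen bigrams. The length penalty is identical for both candidates, so it does not affect the ranking here.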