Although Indian-language OCR systems have shown significant improvement in classification rates in recent years, recognition of degraded words still poses a major challenge for building robust OCR systems. We formulate degraded word recognition in a generic, formal structure as a probabilistic parsing problem. A probabilistic parsing framework ranks and validates the possible hypotheses. We combine it with an alternate word generator, a symbol recognizer, and a verification unit to improve recognition rates on degraded words without compromising the recognition of good characters. We demonstrate the method on Malayalam, evaluating it on a complete annotated book, where around 65% of the degraded words are correctly recognized.
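To make the idea of ranking hypotheses concrete, the following is a minimal sketch and not the method from the paper. It assumes each candidate word is a sequence of symbols with per-symbol recognizer confidences, and it swaps in a simple bigram symbol model for the probabilistic parser. The function names, the floor value, the toy probabilities, and the Latin placeholder symbols are all hypothetical.

```python
import math

def score_hypothesis(symbols, recognizer_probs, bigram_probs, floor=1e-6):
    """Return the log-probability of one candidate symbol sequence.

    Assumption: the score is the product of per-symbol recognizer
    confidences and bigram transition probabilities (a stand-in for
    a full probabilistic parser).
    """
    # Recognizer term: how well each symbol matches its image segment.
    log_p = sum(math.log(max(p, floor)) for p in recognizer_probs)
    # Language term: bigram transitions, with "^" and "$" as word boundaries.
    padded = ["^"] + list(symbols) + ["$"]
    for prev, cur in zip(padded, padded[1:]):
        # Unseen transitions get the floor probability instead of zero.
        log_p += math.log(bigram_probs.get((prev, cur), floor))
    return log_p

def rank_hypotheses(hypotheses, bigram_probs):
    """Sort (symbols, recognizer_probs) pairs from most to least probable."""
    scored = [(score_hypothesis(s, p, bigram_probs), s) for s, p in hypotheses]
    return sorted(scored, key=lambda item: item[0], reverse=True)

# Toy bigram model over placeholder symbols (hypothetical values).
toy_bigrams = {
    ("^", "a"): 0.5,
    ("a", "b"): 0.6,
    ("b", "$"): 0.7,
    ("a", "d"): 0.01,
    ("d", "$"): 0.1,
}

# Two alternatives for one degraded word: the recognizer slightly prefers
# "d" for the second symbol, but the sequence "a d" is rare in the model.
toy_hypotheses = [
    (("a", "d"), [0.9, 0.6]),
    (("a", "b"), [0.9, 0.4]),
]

ranking = rank_hypotheses(toy_hypotheses, toy_bigrams)
best_score, best_symbols = ranking[0]
```

In this toy setup the sequence model overrides the recognizer's local preference, so `("a", "b")` is ranked first. A verification step, as mentioned in the abstract, could then accept or reject the top candidate; that step is not sketched here.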