Towards recognition of degraded words by probabilistic parsing

  • Authors:
  • Karthika Mohan;K. J. Jinesh;C. V. Jawahar

  • Affiliations:
  • CVIT, IIIT, Hyderabad, AP, India;CVIT, IIIT, Hyderabad, AP, India;CVIT, IIIT, Hyderabad, AP, India

  • Venue:
  • Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing
  • Year:
  • 2010

Abstract

Though Indian language OCRs have shown significant improvement in classification rates in recent years, recognition of degraded words still poses a big challenge for the development of robust OCR systems. This work attempts to formulate degraded word recognition in a generic and formal structure by casting it as a probabilistic parsing problem. A probabilistic parsing based framework is used to rank and validate the possible hypotheses for a word. We combine it with an alternate word generator, a symbol recognizer and a verification unit to improve recognition rates of degraded words without compromising the recognition of good characters. We demonstrate our method on Malayalam. We evaluate it on a complete annotated book, where around 65% of the degraded words are correctly recognized using this approach.
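
The abstract does not give the grammar or the scoring function, so the following is only a minimal sketch of how hypothesis ranking of this kind might look, not the authors' method. It swaps in a simple bigram model over symbols as a stand-in for the probabilistic parser, and all names (`TRANSITIONS`, `log_score`, `rank_hypotheses`), the symbols and the probabilities are hypothetical values chosen for illustration.

```python
import math

# Hypothetical bigram probabilities over symbols ("^" = word start, "$" = word end).
# The values are invented for illustration; they are not taken from the paper.
TRANSITIONS = {
    ("^", "k"): 0.6, ("^", "a"): 0.4,
    ("k", "a"): 0.9, ("k", "k"): 0.1,
    ("a", "k"): 0.3, ("a", "$"): 0.7,
}
FLOOR = 1e-6  # small probability assigned to transitions the model has not seen


def log_score(symbols, confidences):
    """Log-probability of one hypothesis: sequence model plus recognizer confidences."""
    path = ["^"] + list(symbols) + ["$"]
    sequence = sum(math.log(TRANSITIONS.get(pair, FLOOR))
                   for pair in zip(path, path[1:]))
    recognizer = sum(math.log(max(c, FLOOR)) for c in confidences)
    return sequence + recognizer


def rank_hypotheses(hypotheses):
    """Sort (symbols, confidences) pairs from most to least probable."""
    return sorted(hypotheses, key=lambda h: log_score(*h), reverse=True)


# Alternate readings of one degraded word, as an alternate word generator
# might propose them (assumed input format: symbols plus per-symbol confidence).
candidates = [
    (("k", "a"), (0.8, 0.9)),
    (("k", "k"), (0.8, 0.5)),
    (("a", "k"), (0.2, 0.9)),
]
best_symbols, _ = rank_hypotheses(candidates)[0]
```

Working in log space avoids numerical underflow on long words, and the floor value keeps an unseen transition from ruling a hypothesis out entirely; a verification step could then accept or reject the top-ranked reading.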