Shape-DNA: Effective Character Restoration and Enhancement for Arabic Text Documents

  • Authors:
  • Gulcin Caner; Ismail Haritaoglu

  • Venue:
  • ICPR '10: Proceedings of the 2010 20th International Conference on Pattern Recognition
  • Year:
  • 2010

Abstract

We present a novel learning-based image restoration and enhancement technique that improves the character recognition performance of OCR products on degraded documents and on documents or text captured with mobile devices such as camera phones. The proposed technique is language independent. It simultaneously increases the effective resolution and restores broken characters whose artifacts stem either from the image capture device, such as a low-quality or low-resolution camera, or from earlier pre-processing, such as extracting the text region from the document image. The technique learns a predictive relationship between high-resolution training images and their low-resolution or degraded counterparts, and exploits this relationship in a probabilistic scheme to generate a high-resolution image from a low-quality, low-resolution text image. We also present a fast and scalable implementation of the character restoration algorithm that improves text recognition for document and text images captured by mobile phones. Experimental results demonstrate that the system raises OCR performance on documents captured by mobile imaging devices from about 50% to over 80% for non-Latin document and scene text images at 120 dpi.
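The abstract does not give the algorithm itself, so the following is only a toy sketch of the general idea it describes: learn which high-resolution patches co-occur with which low-resolution patches, then pick the most frequent high-resolution patch for each observed low-resolution patch. Everything here is an assumption made for illustration and not the authors' method: binary images stored as lists of lists, non-overlapping tiles, a block-majority degradation model, and the hypothetical function names `downsample`, `patches`, `train`, and `restore`.

```python
from collections import Counter, defaultdict


def downsample(img, f=2):
    """Assumed degradation model: an f x f block becomes ink (1)
    when at least half of its pixels are ink."""
    h, w = len(img), len(img[0])
    return [
        [
            1 if 2 * sum(img[y * f + dy][x * f + dx]
                         for dy in range(f) for dx in range(f)) >= f * f else 0
            for x in range(w // f)
        ]
        for y in range(h // f)
    ]


def patches(img, size):
    """Yield (row, col, patch) for non-overlapping size x size tiles."""
    for y in range(0, len(img) - size + 1, size):
        for x in range(0, len(img[0]) - size + 1, size):
            yield y, x, tuple(tuple(img[y + dy][x:x + size]) for dy in range(size))


def train(high_res_images, f=2, lo=2):
    """Count how often each high-res tile co-occurs with each low-res tile."""
    model = defaultdict(Counter)
    for hi_img in high_res_images:
        lo_img = downsample(hi_img, f)
        # Index high-res tiles by the position of their low-res counterpart.
        hi_tiles = {(y // f, x // f): p for y, x, p in patches(hi_img, lo * f)}
        for y, x, p in patches(lo_img, lo):
            model[p][hi_tiles[(y, x)]] += 1
    return model


def restore(lo_img, model, f=2, lo=2):
    """Upscale by replacing each known low-res tile with its most frequent
    high-res counterpart; unseen tiles fall back to pixel replication."""
    h, w = len(lo_img), len(lo_img[0])
    out = [[lo_img[y // f][x // f] for x in range(w * f)] for y in range(h * f)]
    for y, x, p in patches(lo_img, lo):
        if p in model:
            best = model[p].most_common(1)[0][0]
            for dy, row in enumerate(best):
                out[y * f + dy][x * f:x * f + len(row)] = row
    return out


# Tiny demonstration on one hand-made 4 x 4 "stroke".
sharp = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 1, 1]]
model = train([sharp])
restored = restore(downsample(sharp), model)
```

In this sketch the frequency counts stand in for the probabilistic scheme mentioned in the abstract; a practical system would presumably use overlapping patches, grayscale features, and a smoothness prior between neighbouring patches, none of which are shown here.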