Development of an Assamese OCR using Bangla OCR

  • Authors:
  • Subhankar Ghosh; P. K. Bora; Sanjib Das; B. B. Chaudhuri

  • Affiliations:
  • IIT Guwahati, Guwahati, India; IIT Guwahati, Guwahati, India; IIT Guwahati, Guwahati, India; CVPR Unit, ISI, Kolkata, India

  • Venue:
  • Proceeding of the workshop on Document Analysis and Recognition
  • Year:
  • 2012

Abstract

This paper describes the development of an OCR system for the Assamese language by modifying an existing OCR system for the Bangla language. This modification is feasible because the Assamese script is similar to the Bangla script, differing in only a few characters. The OCR incorporates a two-stage recognizer using an SVM classifier, with no post-processing. A spell-checker capable of detecting most errors and interactively recommending some corrections is implemented. The OCR is tested on about 1800 pages of good-quality printed documents. The accuracy achieved is about 97%.
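
The abstract does not say how the spell-checker detects errors or ranks its recommendations. As a rough illustration only, the following sketch assumes a lexicon-lookup approach: words absent from a word list are flagged, and near matches from the list are offered as candidate corrections. The lexicon, the function name `flag_and_suggest`, and the similarity cutoff are all hypothetical and are not taken from the paper.

```python
from difflib import get_close_matches

# Hypothetical toy lexicon; a real system would load a full Assamese word list.
LEXICON = ["অসম", "ভাষা", "কিতাপ", "মানুহ"]


def flag_and_suggest(words, lexicon, max_suggestions=3, cutoff=0.6):
    """Return (word, suggestions) pairs for words not found in the lexicon.

    Assumption: an OCR error shows up as a word missing from the lexicon,
    and plausible corrections are lexicon entries with a similar spelling.
    """
    known = set(lexicon)
    report = []
    for word in words:
        if word in known:
            continue  # word is recognized, nothing to flag
        # difflib ranks lexicon entries by string similarity to the word.
        suggestions = get_close_matches(
            word, lexicon, n=max_suggestions, cutoff=cutoff
        )
        report.append((word, suggestions))
    return report


# Simulated OCR output: the second word has its last character misrecognized.
ocr_output = ["অসম", "কিতাগ"]
report = flag_and_suggest(ocr_output, LEXICON)
for word, suggestions in report:
    print(word, "->", suggestions)
```

In an interactive setting, the suggestions for each flagged word would be shown to the user, who accepts one or keeps the original. An empty suggestion list means the word was flagged but no close lexicon entry was found.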