A Bilingual OCR for Hindi-Telugu Documents and its Applications

  • Authors:
  • C. V. Jawahar;M. N. S. S. K. Pavan Kumar;S. S. Ravi Kiran

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
  • Year:
  • 2003

Abstract

This paper describes the character recognition process from printed documents containing Hindi and Telugu text. Hindi and Telugu are among the most popular languages in India. The bilingual recognizer is based on Principal Component Analysis followed by support vector classification. This attains an overall accuracy of approximately 96.7%. Extensive experimentation is carried out on an independent test set of approximately 200,000 characters. Applications based on this OCR are sketched.
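
The pipeline named in the abstract (dimensionality reduction with Principal Component Analysis, then support vector classification) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn is available, uses the library's bundled handwritten digits dataset as a stand-in for Hindi and Telugu character images, and the number of components, the kernel, and the train/test split are arbitrary illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def train_and_score(n_components=30, test_size=0.25, seed=0):
    """Fit PCA followed by an SVM and return accuracy on a held-out test set."""
    # Stand-in data (assumption): 8x8 digit images flattened to 64 features.
    # The paper works on Hindi and Telugu character images instead.
    features, labels = load_digits(return_X_y=True)

    # Keep the test set separate from training, mirroring the idea of
    # evaluating on an independent test set.
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=test_size, random_state=seed, stratify=labels
    )

    # PCA projects each image onto a lower-dimensional subspace;
    # the SVM then classifies the projected feature vectors.
    model = make_pipeline(PCA(n_components=n_components), SVC(kernel="rbf"))
    model.fit(x_train, y_train)

    # Fraction of held-out samples classified correctly.
    return model.score(x_test, y_test)


accuracy = train_and_score()
print(f"held-out accuracy: {accuracy:.3f}")
```

The accuracy printed by this sketch describes the stand-in digits data only and is not comparable to the 96.7% figure reported in the abstract.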