Managing multilingual OCR project using XML

  • Authors:
  • Gaurav Harit; K. J. Jinesh; Ritu Garg; C. V. Jawahar; Santanu Chaudhury

  • Affiliations:
  • IIT, Kharagpur; IIIT, Hyderabad; IIT, Delhi; IIIT, Hyderabad; IIT, Delhi

  • Venue:
  • Proceedings of the International Workshop on Multilingual OCR
  • Year:
  • 2009

Abstract

This paper presents an XML-based scheme for managing a large multilingual OCR project. In particular, we describe how a new XML-based tagging scheme has been exploited to achieve the objectives of the project. Managing a large multilingual OCR project in which multiple research groups collaboratively develop script-specific and script-independent technologies is a challenging problem. We present some of the software and data management strategies designed for the project, which aims to develop OCR for 11 scripts of Indian origin for which mature OCR technology was not available.