A semi-automatic adaptive OCR for digital libraries

  • Authors:
  • Sachin Rawat;K. S. Sesh Kumar;Million Meshesha;Indraneel Deb Sikdar;A. Balasubramanian;C. V. Jawahar

  • Affiliations:
  • Centre for Visual Information Technology, International Institute of Information Technology, Hyderabad, India (all authors)

  • Venue:
  • DAS'06 Proceedings of the 7th international conference on Document Analysis Systems
  • Year:
  • 2006

Abstract

This paper presents a novel approach to designing a semi-automatic adaptive OCR for large document image collections in digital libraries. We describe and implement an interactive system that continuously improves the results of the OCR. The applicability of our design to the recognition of Indian languages is demonstrated. Recognition errors are used to retrain the OCR so that it adapts and improves its accuracy. Limited human intervention is allowed for evaluating the output of the system and taking corrective actions during the recognition process.
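
The feedback loop described in the abstract (recognise, let a human correct the output, retrain on the errors) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the nearest-mean classifier, the two-dimensional feature tuples, and the names `NearestMeanClassifier`, `adaptive_pass`, and `review` are all hypothetical stand-ins, since the abstract does not specify the classifier or the features used.

```python
import math
from collections import defaultdict


class NearestMeanClassifier:
    """Toy stand-in for the OCR's character classifier (hypothetical)."""

    def __init__(self):
        self.samples = defaultdict(list)  # label -> list of feature tuples
        self.means = {}                   # label -> mean feature tuple

    def add(self, features, label):
        self.samples[label].append(features)

    def train(self):
        # Recompute one mean vector per label from all stored samples.
        self.means = {
            label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
            for label, vecs in self.samples.items()
        }

    def predict(self, features):
        # Return the label whose mean is closest in Euclidean distance.
        return min(self.means, key=lambda lab: math.dist(features, self.means[lab]))


def adaptive_pass(clf, page, review):
    """Recognise one page, collect human corrections, retrain on the errors."""
    errors = 0
    for features in page:
        predicted = clf.predict(features)
        truth = review(features, predicted)  # human confirms or corrects
        if truth != predicted:
            clf.add(features, truth)         # misrecognised sample becomes training data
            errors += 1
    if errors:
        clf.train()
    return errors


# Seed the classifier with one sample per class (assumed "training font").
clf = NearestMeanClassifier()
clf.add((0.0, 0.0), "a")
clf.add((10.0, 0.0), "b")
clf.train()

# A page in a different font: the 'a' glyphs have drifted towards 'b'.
ground_truth = {(6.0, 0.0): "a", (6.0, 1.0): "a", (12.0, 0.0): "b", (12.0, 1.0): "b"}
page = list(ground_truth)


def review(features, predicted):
    # Simulated reviewer: looks up the correct label for each glyph.
    return ground_truth[features]


first = adaptive_pass(clf, page, review)   # errors before adaptation
second = adaptive_pass(clf, page, review)  # errors after retraining
print(first, second)  # 2 0
```

In this toy run the two drifted 'a' glyphs are misread on the first pass and are recognised correctly once the corrected samples have been folded back into training. A real system would presumably limit the reviewer's effort to a subset of the output, in keeping with the "limited human intervention" the abstract mentions.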