Language identification for interactive handwriting transcription of multilingual documents

  • Authors:
  • Miguel A. del Agua; Nicolás Serrano; Alfons Juan

  • Affiliations:
  • DSIC/ITI, Universitat Politècnica de València; DSIC/ITI, Universitat Politècnica de València; DSIC/ITI, Universitat Politècnica de València

  • Venue:
  • IbPRIA'11: Proceedings of the 5th Iberian Conference on Pattern Recognition and Image Analysis
  • Year:
  • 2011

Abstract

An effective approach to handwriting transcription of (old) documents is to transcribe the whole document sequentially, line by line, while a continuously retrained system interacts with the user. In the case of multilingual documents, however, a minor yet important issue for this interactive approach is to first identify the language of the current text line image to be transcribed. In this paper, we propose a probabilistic framework and three techniques for this purpose. Empirical results are reported on an entire 764-page multilingual document, for which previous empirical tests were limited to its first 180 pages, written only in Spanish.