Ground truth creation for handwriting recognition in historical documents

  • Authors:
  • Andreas Fischer; Emanuel Indermühle; Horst Bunke; Gabriel Viehhauser; Michael Stolz

  • Affiliations:
  • Institute of Computer Science and Applied Mathematics, Bern, Switzerland; Institute of Computer Science and Applied Mathematics, Bern, Switzerland; Institute of Computer Science and Applied Mathematics, Bern, Switzerland; Institut für Germanistik, Bern, Switzerland; Institut für Germanistik, Bern, Switzerland

  • Venue:
  • DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
  • Year:
  • 2010

Abstract

Handwriting recognition in historical documents is vital for the creation of digital libraries. The creation of readily available ground truth data plays a central role in the development of new recognition technologies. For historical documents, ground truth creation is more difficult and time-consuming than for modern documents. In this paper, we present a semi-automatic ground truth creation procedure for historical documents that takes into account noisy backgrounds and transcription alignment. The proposed procedure is demonstrated on the IAM Historical Handwriting Database (IAM-HistDB), which is currently under construction and will include several hundred Old German manuscripts. We show how laypersons can efficiently create a ground truth for handwriting recognition with a small set of algorithmic tools and few manual interactions.