Towards Semi-supervised Transcription of Handwritten Historical Weather Reports

  • Authors:
  • Jan Richarz;Szilard Vajda;Gernot A. Fink

  • Venue:
  • DAS '12 Proceedings of the 2012 10th IAPR International Workshop on Document Analysis Systems
  • Year:
  • 2012

Abstract

This paper addresses the automatic transcription of handwritten documents with a regular tabular structure. A method for extracting machine-printed tables from images is proposed, using very little prior knowledge about the document layout. The detected table serves as a query for retrieving and fitting a structural template, which is then used to extract handwritten text fields. A semi-supervised learning approach is applied to these fields, aiming at minimizing the human labeling effort for recognizer training. The effectiveness of the proposed approach is demonstrated experimentally on a set of historical weather reports. Compared to using all labels, competitive recognition performance is achieved by labeling only a small fraction of the data, keeping the required human effort very low.