Glyph extraction from historic document images

  • Authors:
  • Lothar Meyer-Lerbs;Arne Schuldt;Björn Gottfried

  • Affiliations:
  • University of Bremen, Bremen, Germany;University of Bremen, Bremen, Germany;University of Bremen, Bremen, Germany

  • Venue:
  • Proceedings of the 10th ACM symposium on Document engineering
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper is about the reproduction of ancient texts with vectorised fonts. While for OCR only recognition rates count, a reproduction process does not necessarily require the recognition of characters. Our system aims at extracting all characters from printed historic documents without the employment of knowledge of language, font, or writing system. It searches for the best prototypes and creates a document-specific font from these glyphs. To reach this goal, many common OCR preprocessing steps are no longer adequate. We describe the necessary changes of our system that deals particularly with documents typeset in Fraktur. On the one hand, algorithms are described that extract glyphs accurately for the purpose of precise reproduction. On the other hand, classification results of extracted Fraktur glyphs are presented for different shape descriptors.