Poor access to digitised historical texts: the solutions of the IMPACT project

  • Authors: Hildelies Balk
  • Affiliations: National Library of the Netherlands, Netherlands
  • Venue: Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data
  • Year: 2009

Abstract

While demand for digitally available material is increasing (text that is not digital is becoming virtually invisible), digitised material is becoming available too slowly and in quantities that are too small. Even when material has been digitised, OCR (optical character recognition) technology often does not produce satisfactory results, especially for historical documents. This is due to various problems such as historical fonts, complex layouts, ink shining through the page and historical spelling variants.