Retrieval in text collections with historic spelling using linguistic and spelling variants

  • Authors: Andrea Ernst-Gerlach, Norbert Fuhr
  • Affiliations: University of Duisburg-Essen, Duisburg, Germany (both authors)
  • Venue: Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
  • Year: 2007

Abstract

We present a new approach to retrieving texts with non-standard spelling, which is important for historic texts, e.g. in English or German. In this paper, we describe the overall architecture of our system and then its evaluation. Given a search term as a lemma, we use a dictionary of contemporary German to find all inflected and derived forms of the lemma. We then apply transformation rules (derived from training data) to generate historic spelling variants. For the evaluation, we consider the resulting retrieval quality. The experimental results show that our approach substantially improves retrieval quality for historic collections.
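
The two-stage query expansion described in the abstract (lemma to contemporary word forms, then word forms to historic spelling variants) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy dictionary `CONTEMPORARY_FORMS`, the hand-picked substitution rules in `RULES`, and the function names are hypothetical and are not taken from the paper, whose rules are derived from training data and whose dictionary is a full dictionary of contemporary German.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy dictionary: lemma -> inflected and derived contemporary forms.
# A real system would query a full dictionary of contemporary German.
CONTEMPORARY_FORMS: Dict[str, List[str]] = {
    "teil": ["teil", "teile", "teilen", "teils"],
}

# Hypothetical transformation rules (modern substring -> historic substring).
# The paper derives its rules from training data; these two are hand-picked.
RULES: List[Tuple[str, str]] = [
    ("t", "th"),
    ("ei", "ey"),
]


def expand_lemma(lemma: str) -> List[str]:
    """Return the contemporary forms of a lemma, or the lemma itself if unknown."""
    return CONTEMPORARY_FORMS.get(lemma, [lemma])


def historic_variants(word: str) -> Set[str]:
    """Generate spelling variants by optionally applying each rule once, in order."""
    variants: Set[str] = {word}
    for modern, historic in RULES:
        # Keep the existing variants and add the ones with this rule applied.
        variants |= {v.replace(modern, historic) for v in variants}
    return variants


def expand_query(lemma: str) -> List[str]:
    """Expand a lemma into all contemporary forms and their spelling variants."""
    terms: Set[str] = set()
    for form in expand_lemma(lemma):
        terms |= historic_variants(form)
    return sorted(terms)


if __name__ == "__main__":
    for term in expand_query("teil"):
        print(term)
```

In this sketch each rule is either applied to all of its matches in a word or not applied at all, so a word yields at most 2^n variants for n rules. The expanded term list would then be submitted to the retrieval engine as a disjunction; how the actual system ranks or filters the generated variants is not stated in the abstract.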