Using contextual spelling correction to improve retrieval effectiveness in degraded text collections

  • Authors: Patrick Ruch
  • Affiliations: Swiss Federal Institute of Technology, Lausanne, Switzerland
  • Venue: COLING '02: Proceedings of the 19th International Conference on Computational Linguistics, Volume 1
  • Year: 2002

Abstract

This study presents the design and evaluation of an improved IR system able to cope with textual misspellings. After selecting an optimal weighting scheme for the engine, we evaluate the effect of misspellings on retrieval effectiveness. We then compare the improvement brought to the engine by adding one of two non-interactive spelling correction strategies: a classical one, based on a string-to-string edit distance calculation, and a contextual one, which adds linguistically motivated features to the string distance module. The results for the latter suggest that the loss of average precision on degraded texts can be reduced to a few percent (4%).
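
To make the classical strategy concrete, here is a minimal sketch of non-interactive correction driven by a string-to-string edit distance. It is not the system evaluated in the paper: the Levenshtein variant, the function names `edit_distance` and `correct`, the toy lexicon, and the `max_distance` threshold are all assumptions chosen for illustration, and the contextual features of the second strategy are not modeled.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct(token, lexicon, max_distance=2):
    """Return the closest lexicon word, or the token itself if none is close.

    Hypothetical helper: ties are broken alphabetically, and max_distance
    is an assumed cut-off, not a value taken from the paper.
    """
    if not lexicon or token in lexicon:
        return token
    best = min(lexicon, key=lambda w: (edit_distance(token, w), w))
    return best if edit_distance(token, best) <= max_distance else token


if __name__ == "__main__":
    lexicon = ["retrieval", "relevance", "precision"]
    print(correct("retreival", lexicon))
```

Because the correction is non-interactive, the best candidate is substituted automatically before indexing or querying. A contextual strategy would rerank the candidates using features of the surrounding words instead of relying on string distance alone.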