CLEF Experiments at Maryland: Statistical Stemming and Backoff Translation

  • Authors:
  • Douglas W. Oard, Gina-Anne Levow, Clara I. Cabezas

  • Venue:
  • CLEF '00 Revised Papers from the Workshop of Cross-Language Evaluation Forum on Cross-Language Information Retrieval and Evaluation
  • Year:
  • 2000

Abstract

The University of Maryland participated in the CLEF 2000 multilingual task, submitting three official runs that explored the impact of applying language-independent stemming techniques to dictionary-based cross-language information retrieval. The paper begins by describing a cross-language information retrieval architecture based on balanced document translation. A four-stage backoff strategy for improving the coverage of dictionary-based translation techniques is then introduced, and an implementation based on automatically trained statistical stemming is presented. Results indicate that competitive performance can be achieved using four-stage backoff translation in conjunction with freely available bilingual dictionaries, but that the usefulness of the statistical stemming algorithms that were tried varies considerably across the three languages to which they were applied.
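
The abstract does not enumerate the four backoff stages, so the following is only a minimal sketch of how such a strategy might look, assuming the stages progressively relax matching between a document term and the dictionary headwords: surface form to surface form, stemmed term to surface form, surface term to stemmed headword, and finally stem to stem. The function name `backoff_translate`, the toy dictionary, and the five-character truncation stemmer are all hypothetical stand-ins; in particular, the truncation rule is not the automatically trained statistical stemmer the paper describes.

```python
from typing import Callable, Dict, List, Tuple


def truncate_stem(word: str) -> str:
    """Hypothetical stand-in stemmer: keep the first five characters."""
    return word[:5]


def backoff_translate(
    term: str,
    dictionary: Dict[str, List[str]],
    stem: Callable[[str], str] = truncate_stem,
) -> Tuple[List[str], int]:
    """Return (translations, stage). Stage 0 means no match was found."""
    # Stage 1 (assumed): surface form of the term vs. surface headwords.
    if term in dictionary:
        return list(dictionary[term]), 1

    # Stage 2 (assumed): stem of the term vs. surface headwords.
    term_stem = stem(term)
    if term_stem in dictionary:
        return list(dictionary[term_stem]), 2

    # Index the dictionary by stemmed headword for the last two stages.
    stemmed: Dict[str, List[str]] = {}
    for headword, translations in dictionary.items():
        stemmed.setdefault(stem(headword), []).extend(translations)

    # Stage 3 (assumed): surface form of the term vs. stemmed headwords.
    if term in stemmed:
        return list(stemmed[term]), 3

    # Stage 4 (assumed): stem of the term vs. stemmed headwords.
    if term_stem in stemmed:
        return list(stemmed[term_stem]), 4

    # No stage matched: pass the term through untranslated.
    return [term], 0


# Toy French-to-English dictionary, invented for illustration only.
toy_dictionary = {
    "maison": ["house"],
    "parle": ["speak"],
    "chanter": ["sing"],
}

for word in ["maison", "parlent", "chant", "chantons", "xyz"]:
    print(word, backoff_translate(word, toy_dictionary))
```

Each later stage is tried only when the earlier, stricter ones fail, which is how a backoff scheme of this kind could raise dictionary coverage without giving up exact matches when they exist.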