Dictionary-based techniques for cross-language information retrieval

  • Authors:
  • Gina-Anne Levow;Douglas W. Oard;Philip Resnik

  • Affiliations:
  • Department of Computer Science, University of Chicago, 1100 E. 58th Street, Chicago, IL;College of Information Studies and Institute for Advanced Computer Studies, University of Maryland, College Park, MD;Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland, College Park, MD

  • Venue:
  • Information Processing and Management: an International Journal - Special issue: Cross-language information retrieval
  • Year:
  • 2005

Abstract

Cross-language information retrieval (CLIR) systems allow users to find documents written in a language different from that of their query. Simple knowledge structures such as bilingual term lists have proven to be a remarkably useful basis for bridging that language gap. A broad array of dictionary-based techniques has demonstrated utility, but comparison across techniques has been difficult because evaluation results often span only a limited range of conditions. This article identifies the key issues in dictionary-based CLIR, develops unified frameworks for term selection and term translation that help to explain the relationships among existing techniques, and illustrates the effect of those techniques through systematic experiments on four contrasting languages with a uniform query translation architecture. Key results include the identification of a previously unseen dependence of pre- and post-translation expansion on orthographic cognates and the development of a query-specific measure of translation fanout that helps to explain the utility of structured query methods.
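
As a rough illustration of the query translation idea described in the abstract, the following minimal Python sketch looks up each query term in a bilingual term list and keeps the alternatives grouped by source term. Everything in it is an assumption made for illustration: the toy English-to-Spanish term list, the function names, and in particular `mean_fanout`, which is only one plausible reading of "translation fanout" (average number of translations per query term) and is not the measure defined in the article.

```python
from typing import Dict, List

# Hypothetical toy bilingual term list (English -> Spanish); not taken from the article.
TERM_LIST: Dict[str, List[str]] = {
    "bank": ["banco", "orilla", "ribera"],
    "interest": ["interés"],
    "rate": ["tasa", "tipo", "ritmo"],
}


def translate_query(query: str, term_list: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each query term to its list of candidate translations."""
    translations: Dict[str, List[str]] = {}
    for term in query.lower().split():
        # Terms missing from the term list are passed through unchanged,
        # so names and cognates can still match in the document language.
        translations[term] = term_list.get(term, [term])
    return translations


def mean_fanout(translations: Dict[str, List[str]]) -> float:
    """Average number of translation alternatives per query term (assumed definition)."""
    if not translations:
        return 0.0
    return sum(len(alts) for alts in translations.values()) / len(translations)


translated = translate_query("bank interest rate Chicago", TERM_LIST)

# Unstructured translation: every alternative becomes an independent query term.
flat_query = [t for alts in translated.values() for t in alts]

print(translated)
print(flat_query)
print(mean_fanout(translated))
```

The `translated` mapping keeps alternatives grouped per source term, which is the shape a structured query approach would need if it treats the translations of one term as a single unit. The flattened list instead lets high-fanout terms such as "bank" contribute more query terms than low-fanout ones.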