Advances in multilingual text retrieval

  • Authors:
  • Mark Davis

  • Affiliations:
  • New Mexico State University, Las Cruces, NM

  • Venue:
  • TIPSTER '96 Proceedings of a workshop held at Vienna, Virginia: May 6-8, 1996
  • Year:
  • 1996

Abstract

Multilingual text retrieval extends the basic monolingual detection task to include retrieving relevant documents in languages other than the query language. The task therefore merges efforts in machine translation with efforts in text retrieval, but the machine translation component may be substantially simplified due to some basic assumptions about the design and implementation of high-performance text retrieval systems. A primary consideration is that most modern text retrieval systems regard queries and documents as unordered "bags" of words. The translation of an unordered set of terms is therefore approximately the translation of the terms themselves. Although a linearity assumption such as this breaks down when considering phrasal elements in most languages, it is reasonably accurate for many terms and becomes increasingly accurate at the sentence level and above.
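The bag-of-words simplification described in the abstract can be illustrated with a minimal sketch. It is not the system reported in the paper: the toy English-to-Spanish lexicon, the function names translate_bag and overlap_score, and the simple term-overlap scoring are all assumptions chosen for illustration. The sketch translates each query term independently, keeps every candidate translation of an ambiguous term, and shows that word order in the query has no effect on the result.

```python
from collections import Counter

# Hypothetical toy English-to-Spanish lexicon (an assumption for illustration).
# A real system would draw on a large bilingual dictionary or corpus statistics.
TOY_LEXICON = {
    "text": ["texto"],
    "document": ["documento"],
    "bank": ["banco", "orilla"],  # ambiguous term: every candidate is kept
}


def translate_bag(query_terms, lexicon):
    """Translate an unordered bag of terms one term at a time."""
    translated = Counter()
    for term, count in Counter(query_terms).items():
        # Terms missing from the lexicon pass through unchanged.
        for candidate in lexicon.get(term, [term]):
            translated[candidate] += count
    return translated


def overlap_score(query_bag, document_terms):
    """Score a document by weighted term overlap with the translated query."""
    doc_bag = Counter(document_terms)
    return sum(weight * doc_bag[term] for term, weight in query_bag.items())


if __name__ == "__main__":
    bag = translate_bag(["text", "document"], TOY_LEXICON)
    print(overlap_score(bag, ["el", "documento", "contiene", "texto"]))
```

Because the query is treated as a bag, translate_bag(["text", "document"], ...) and translate_bag(["document", "text"], ...) yield the same result. The sketch also makes the abstract's caveat visible: a phrase whose meaning is not the sum of its words would be mistranslated by this term-by-term lookup.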