Advances in multilingual text retrieval

Authors:
Mark Davis
Affiliations:
New Mexico State University, Las Cruces, NM
Venue:
TIPSTER '96 Proceedings of a workshop held at Vienna, Virginia: May 6-8, 1996
Year:
1996

Abstract

Multilingual text retrieval extends the basic monolingual detection task to include retrieving relevant documents in languages other than the query language. The task therefore merges efforts in machine translation with efforts in text retrieval, but the machine translation component may be substantially simplified due to some basic assumptions about the design and implementation of high-performance text retrieval systems. A primary consideration is that most modern text retrieval systems regard queries and documents as unordered "bags" of words. The translation of an unordered set of terms is therefore approximately the translation of the terms themselves. Although a linearity assumption such as this breaks down when considering phrasal elements in most languages, it is reasonably accurate for many terms and becomes increasingly accurate at the sentence level and above.
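
The bag-of-words simplification described in the abstract can be illustrated with a minimal sketch. It is not the author's system: the toy English-to-Spanish lexicon, the function names translate_bag and overlap_score, and the choice to carry untranslatable terms over unchanged are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy bilingual lexicon (English -> Spanish); illustration only.
TOY_LEXICON = {
    "text": ["texto"],
    "retrieval": ["recuperación"],
    "machine": ["máquina"],
    "translation": ["traducción"],
}


def translate_bag(query_terms, lexicon):
    """Translate a query term by term, treating it as an unordered bag."""
    translated = Counter()
    for term, count in Counter(query_terms).items():
        # Assumption: terms missing from the lexicon are kept unchanged.
        for target in lexicon.get(term, [term]):
            translated[target] += count
    return translated


def overlap_score(query_bag, doc_terms):
    """Score a document by the number of term occurrences it shares with the query bag."""
    doc_bag = Counter(doc_terms)
    return sum(min(count, doc_bag[term]) for term, count in query_bag.items())


if __name__ == "__main__":
    query = ["text", "retrieval"]
    spanish_doc = ["recuperación", "de", "texto", "multilingüe"]
    print(overlap_score(translate_bag(query, TOY_LEXICON), spanish_doc))
```

Because word order is discarded, translate_bag returns the same bag for any permutation of the query terms. The sketch also shows where the linearity assumption weakens: the phrase "machine translation" maps term by term to "máquina" and "traducción", whereas the usual Spanish rendering is "traducción automática", so only one of the two terms would match a document that uses the phrase.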