Using Corpus-Based Approaches in a System for Multilingual Information Retrieval

Authors:
Martin Braschler; Peter Schäuble
Affiliations:
Eurospider Information Technology AG, Schaffhauserstrasse 18, CH-8006 Zürich, Switzerland. braschler@eurospider.com; Eurospider Information Technology AG, Schaffhauserstrasse 18, CH-8006 Zürich, Switzerland. schauble@eurospider.com
Venue:
Information Retrieval
Year:
2000

Abstract

We present a system for multilingual information retrieval that allows users to formulate queries in their preferred language and retrieve relevant information from a collection containing documents in multiple languages. The system is based on document-level alignment: documents in different languages are paired according to their similarity. The resulting mapping yields a multilingual comparable corpus, which has several applications. First, it allows us to build a data structure for query translation in cross-language information retrieval (CLIR). Second, we perform pseudo-relevance feedback on the alignments to improve our retrieval results. Finally, multiple retrieval runs can be merged into one unified result list. The resulting system is inexpensive, adaptable to domain-specific collections and new languages, and performed very well in the CLIR system comparison at the TREC-7 conference.
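
The idea of using aligned documents for cross-language pseudo-relevance feedback can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `translate_query_via_alignments`, the toy term-overlap scoring, the parameters `top_k` and `n_terms`, and the example documents are all assumptions made for illustration. The sketch retrieves source-language documents for a query, follows the alignments to their target-language partners, and takes the most frequent terms of those partners as a target-language query.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Toy relevance score (assumption): count query-term occurrences in a document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def translate_query_via_alignments(query, source_docs, target_docs,
                                   alignments, top_k=2, n_terms=3):
    """Build a target-language query from documents aligned to the top source hits.

    source_docs / target_docs: dict mapping document id -> text.
    alignments: dict mapping source document id -> target document id
                (hypothetical output of a document-level alignment step).
    """
    query_terms = query.lower().split()

    # Score every source-language document against the query.
    scores = {doc_id: score(query_terms, text.lower().split())
              for doc_id, text in source_docs.items()}

    # Keep the top_k documents that match at least one query term.
    ranked = sorted(scores, key=scores.get, reverse=True)
    top_hits = [doc_id for doc_id in ranked if scores[doc_id] > 0][:top_k]

    # Follow the alignments and pool the terms of the target-language partners.
    pooled = Counter()
    for doc_id in top_hits:
        target_id = alignments.get(doc_id)
        if target_id is not None:
            pooled.update(target_docs[target_id].lower().split())

    # The most frequent pooled terms form the target-language query.
    return [term for term, _ in pooled.most_common(n_terms)]


# Hypothetical toy collection, invented for this example.
source_docs = {
    "en1": "swiss bank merger announced",
    "en2": "football cup final",
    "en3": "bank merger talks in zurich",
}
target_docs = {
    "de1": "bankenfusion schweiz bankenfusion",
    "de2": "fussball cup final",
    "de3": "bankenfusion zürich gespräche",
}
alignments = {"en1": "de1", "en2": "de2", "en3": "de3"}

print(translate_query_via_alignments("bank merger", source_docs,
                                     target_docs, alignments, n_terms=1))
```

A real system would replace the overlap count with a proper weighting scheme and would remove stopwords before pooling terms; both are left out here to keep the sketch short.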