A Language-Independent Approach to European Text Retrieval

Authors:
Paul McNamee;James Mayfield;Christine D. Piatko
Venue:
CLEF '00 Revised Papers from the Workshop of Cross-Language Evaluation Forum on Cross-Language Information Retrieval and Evaluation
Year:
2000

Citing 3
Cited 11

Word association norms, mutual information, and lexicography

Computational Linguistics
A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A hidden Markov model information retrieval system

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

Improving stemming for Arabic information retrieval: light stemming and co-occurrence analysis

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Cross-language information retrieval: experiments based on CLEF 2000 corpora

Information Processing and Management: an International Journal
Haircut: a system for multilingual text retrieval in java

Journal of Computing Sciences in Colleges
Cross-Language Evaluation Forum: Objectives, Results, Achievements

Information Retrieval
Combining Multiple Strategies for Effective Monolingual and Cross-Language Retrieval

Information Retrieval
Character N-Gram Tokenization for European Language Text Retrieval

Information Retrieval
Embedding web-based statistical translation models in cross-language information retrieval

Computational Linguistics - Special issue on web as corpus
Technical issues of cross-language information retrieval: a review

Information Processing and Management: an International Journal - Special issue: Cross-language information retrieval
Entity extraction without language-specific resources

COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Word normalization and decompounding in mono- and bilingual IR

Information Retrieval
Report on thomson legal and regulatory experiments at CLEF-2004

CLEF'04 Proceedings of the 5th conference on Cross-Language Evaluation Forum: multilingual Information Access for Text, Speech and Images

Abstract

We present an approach to multilingual information retrieval that does not depend on the existence of specific linguistic resources such as stemmers or thesauri. Using the HAIRCUT system, we participated in the monolingual, bilingual, and multilingual tasks of the CLEF-2000 evaluation. Our approach, based on combining the benefits of words and character n-grams, was effective both for language-independent monolingual retrieval and for cross-language retrieval using translated queries. After describing our monolingual retrieval approach, we compare a translation method using aligned parallel corpora to commercial machine translation software.
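
As a rough illustration of the two kinds of indexing terms the abstract mentions, the sketch below produces both word tokens and overlapping character n-grams from the same text. It is not the HAIRCUT implementation: the function names, the lowercasing and whitespace handling, and the default n-gram length are all assumptions made for this example, and the abstract does not say how the two term types are combined at retrieval time.

```python
def word_tokens(text):
    """Split text into lowercase, whitespace-delimited words.

    Assumption: no stemming or stopword removal, in keeping with the
    abstract's goal of avoiding language-specific resources.
    """
    return text.lower().split()


def char_ngrams(text, n=6):
    """Return overlapping character n-grams, spanning word boundaries.

    Assumptions: n=6 is an illustrative default, runs of whitespace are
    collapsed to one space, and the text is padded with a single space
    on each side so that word-initial and word-final n-grams appear.
    """
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "
    # An input shorter than n yields an empty list.
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def index_terms(text, n=6):
    """Produce both term streams for one document or query (hypothetical helper)."""
    return {"words": word_tokens(text), "ngrams": char_ngrams(text, n)}
```

Because the n-grams are taken from the raw character stream, no language-specific tokenizer or stemmer is needed, which is the property the abstract emphasizes.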