We present an approach to multilingual information retrieval that does not depend on the existence of specific linguistic resources such as stemmers or thesauri. Using the HAIRCUT system, we participated in the monolingual, bilingual, and multilingual tasks of the CLEF-2000 evaluation. Our approach, which combines the benefits of words and character n-grams, was effective both for language-independent monolingual retrieval and for cross-language retrieval using translated queries. After describing our monolingual retrieval approach, we compare a translation method that uses aligned parallel corpora with commercial machine translation software.
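To make the idea of indexing with both words and character n-grams concrete, here is a minimal sketch in Python. It is an illustration only and not the HAIRCUT implementation: the function names `char_ngrams` and `index_terms`, the default n-gram length, the lowercasing, and the choice to let n-grams span word boundaries are all assumptions made for this example.

```python
def char_ngrams(text, n=6):
    """Return overlapping character n-grams of the text.

    Assumptions for this sketch: text is lowercased, runs of whitespace
    are collapsed, one space is padded on each side, and n-grams may
    span word boundaries. The default n is illustrative only.
    """
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "
    if len(padded) < n:
        # Text shorter than n: keep it as a single short term.
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def index_terms(text, n=6):
    """Return word terms followed by character n-gram terms.

    A hypothetical way to produce both kinds of indexing terms from the
    same text without any stemmer or thesaurus.
    """
    words = text.lower().split()
    return words + char_ngrams(text, n)


if __name__ == "__main__":
    print(index_terms("Information retrieval", n=6))
```

Because neither function consults a stemmer, stopword list, or dictionary, the same code applies unchanged to any language written with whitespace-delimited words, which is the property the abstract emphasizes.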