Statistical cross-language information retrieval using n-best query translations

Authors:
Marcello Federico; Nicola Bertoldi
Affiliations:
ITC-irst Centro per la Ricerca Scientifica e Tecnologica, Trento, Italy; ITC-irst Centro per la Ricerca Scientifica e Tecnologica, Trento, Italy
Venue:
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2002

Abstract

This paper presents a novel statistical model for cross-language information retrieval. Given a written query in the source language, documents in the target language are ranked by integrating probabilities computed by two statistical models: a query-translation model, which generates the most probable term-by-term translations of the query, and a query-document model, which evaluates the likelihood of each document given each translation. The two scores are integrated over the set of the N most probable translations of the query. Experimental results for N = 1, 5, and 10 are presented on the Italian-English bilingual track data used in the CLEF 2000 and 2001 evaluation campaigns.
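
The integration over N-best translations described above can be illustrated with a small sketch. The abstract does not give the exact formula, so this is only one plausible reading: it assumes a document's score is the sum, over the N most probable translations, of the translation probability times a query likelihood. The unigram query-likelihood model with linear interpolation smoothing, the function names, the weight `lam`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, coll_counts, coll_size, lam=0.5):
    """Log-probability of the query terms under a smoothed unigram document model.

    Assumption: linear interpolation between document and collection
    frequencies; the paper's actual query-document model may differ.
    """
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_p = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = coll_counts[term] / coll_size if coll_size else 0.0
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            # Term unseen in the whole collection: the likelihood is zero.
            return float("-inf")
        log_p += math.log(p)
    return log_p


def rank_documents(nbest, docs, n, lam=0.5):
    """Rank documents by summing over the n most probable query translations.

    nbest: list of (translation_terms, probability), sorted by descending
           probability (hypothetical output of a query-translation model).
    docs:  dict mapping document id to a list of target-language terms.
    Returns (ranking, scores).
    """
    coll_counts = Counter(t for terms in docs.values() for t in terms)
    coll_size = sum(coll_counts.values())
    scores = {}
    for doc_id, doc_terms in docs.items():
        total = 0.0
        for terms, p_translation in nbest[:n]:
            ll = query_log_likelihood(terms, doc_terms, coll_counts, coll_size, lam)
            # exp(-inf) is 0.0, so impossible translations contribute nothing.
            total += p_translation * math.exp(ll)
        scores[doc_id] = total
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Toy, made-up data: three candidate translations of a two-term query.
NBEST = [
    (["bank", "river"], 0.6),
    (["bench", "river"], 0.3),
    (["bank", "stream"], 0.1),
]
DOCS = {
    "d1": "the river bank was flooded".split(),
    "d2": "the bank raised interest rates".split(),
    "d3": "a stream ran past the bench".split(),
}

if __name__ == "__main__":
    for n in (1, 3):
        ranking, _ = rank_documents(NBEST, DOCS, n)
        print(n, ranking)
```

With N = 1 the sketch reduces to retrieval with the single best translation; larger N lets lower-ranked translations contribute in proportion to their probability, which is the effect the experiments with N = 1, 5, and 10 compare.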