Iterative translation disambiguation for cross-language information retrieval

Authors:
Christof Monz; Bonnie J. Dorr
Affiliations:
University of Maryland, College Park, MD; University of Maryland, College Park, MD
Venue:
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2005

Abstract

Finding a suitable distribution of translation probabilities is one of the most important factors in the effectiveness of a cross-language information retrieval system. In this paper, we present a new approach that computes translation probabilities for a given query using only a bilingual dictionary and a monolingual corpus in the target language. The algorithm combines term association measures with an iterative machine learning approach based on expectation maximization. Because our approach considers only pairs of translation candidates, it is less sensitive to data sparseness than approaches that use higher-order n-grams. The learned translation probabilities are used as query term weights and integrated into a vector-space retrieval system. Results for English-German cross-lingual retrieval show substantial improvements over a baseline that uses dictionary lookup without term weighting.
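
The abstract does not spell out the update rule, so the following Python sketch only illustrates the general idea of iterative, pairwise disambiguation and should not be read as the authors' exact algorithm. It assumes that each candidate's weight is reinforced by its association with the candidates of the other query terms and then renormalized per source term. The function name `disambiguate`, the `assoc` callback, and the toy association scores are hypothetical; in practice the association scores would be estimated from co-occurrence counts in the target-language corpus.

```python
def disambiguate(candidates, assoc, iterations=10, tol=1e-6):
    """Estimate translation weights for each source query term.

    candidates: dict mapping a source term to its dictionary translations.
    assoc: function (t1, t2) -> non-negative association score between two
           target-language terms (assumed to come from a monolingual corpus).
    """
    # Start from a uniform distribution over each term's candidates.
    weights = {s: {t: 1.0 / len(ts) for t in ts} for s, ts in candidates.items()}

    for _ in range(iterations):
        updated = {}
        for s, ts in candidates.items():
            scores = {}
            for t in ts:
                # Support from candidates of the *other* source terms only,
                # so the model never looks beyond pairs of candidates.
                support = sum(
                    assoc(t, t2) * w2
                    for s2, dist in weights.items() if s2 != s
                    for t2, w2 in dist.items()
                )
                scores[t] = weights[s][t] + support
            # Renormalize so the weights for each source term sum to 1.
            total = sum(scores.values())
            updated[s] = {t: v / total for t, v in scores.items()}

        change = max(
            abs(updated[s][t] - weights[s][t])
            for s in candidates for t in candidates[s]
        )
        weights = updated
        if change < tol:  # stop early once the weights have stabilized
            break
    return weights


# Toy English-German example; the scores below are made up for illustration.
TOY_ASSOC = {
    frozenset(("Bank", "Zins")): 0.8,
    frozenset(("Ufer", "Zins")): 0.1,
}

def toy_assoc(a, b):
    return TOY_ASSOC.get(frozenset((a, b)), 0.0)

query = {"bank": ["Bank", "Ufer"], "interest": ["Zins"]}
weights = disambiguate(query, toy_assoc)
```

In this toy query, "Bank" ends up with a higher weight than "Ufer" because it is more strongly associated with "Zins", the only candidate for "interest". The resulting per-term distributions could then serve as query term weights in a vector-space retrieval system, as the abstract describes.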