One key to cross-language information retrieval is how to efficiently resolve the translation ambiguity of queries, given their short length. This problem is even more challenging when only bilingual dictionaries are available, which is the focus of this paper. In previous research on cross-language information retrieval using bilingual dictionaries, word co-occurrence statistics are used to determine the most likely translations of queries. In this paper, we propose a novel statistical model, named the "maximum coherence model", which estimates translation probabilities for query words that are consistent with the word co-occurrence statistics. Unlike previous work, where a binary decision is made when selecting translations, the new model maintains the uncertainty in translating query words when their sense ambiguity is difficult to resolve. Furthermore, the new model estimates the translations of multiple query words simultaneously, in contrast to many previous approaches, where translations of individual query words are determined independently. Empirical studies with TREC datasets show that the maximum coherence model achieves a relative improvement of 10% to 40% in cross-language information retrieval compared with other approaches that also use word co-occurrence statistics for sense disambiguation.
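To make the idea of soft, jointly estimated translation probabilities concrete, the following is a minimal illustrative sketch and not the paper's actual formulation or optimization procedure. It assumes a hypothetical co-occurrence table `cooc` over target-language words and uses a temperature-controlled softmax update (an assumption introduced here) so that each query word keeps a probability distribution over its dictionary translations instead of a single hard choice. The function name `soft_translations`, its parameters, and the toy vocabulary are all invented for illustration.

```python
import math


def soft_translations(candidates, cooc, temperature=1.0, iterations=50):
    """Estimate a probability distribution over translations for each query word.

    candidates: one list of candidate translations per query word.
    cooc: dict mapping frozenset({a, b}) to a nonnegative co-occurrence score
          (hypothetical data; missing pairs count as 0).
    """
    # Start from a uniform distribution over each word's candidates.
    probs = [{t: 1.0 / len(c) for t in c} for c in candidates]
    for _ in range(iterations):
        for i, cands in enumerate(candidates):
            # Support of a candidate: its expected co-occurrence with the
            # current translation distributions of all other query words.
            support = {}
            for t in cands:
                total = 0.0
                for j, other in enumerate(probs):
                    if j == i:
                        continue
                    for u, p in other.items():
                        total += p * cooc.get(frozenset((t, u)), 0.0)
                support[t] = total
            # Softmax keeps the decision soft; equal support gives equal mass.
            top = max(support.values())
            weights = {t: math.exp((s - top) / temperature)
                       for t, s in support.items()}
            norm = sum(weights.values())
            probs[i] = {t: w / norm for t, w in weights.items()}
    return probs


# Toy example with made-up scores: two query words whose translations
# reinforce each other, plus a third word with no co-occurrence evidence.
candidates = [["banque", "rive"], ["interet", "attrait"], ["mot_a", "mot_b"]]
cooc = {
    frozenset(("banque", "interet")): 0.9,
    frozenset(("rive", "attrait")): 0.1,
}
result = soft_translations(candidates, cooc)
```

In this toy setting, the first two words shift probability toward the mutually coherent pair, while the third word, for which the table offers no evidence, stays at an even split rather than being forced into a binary decision. Lowering `temperature` would push the distributions closer to hard selections.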