Word ambiguity and vocabulary mismatch are critical problems in information retrieval. To address them, this paper proposes enriching document representation with translated words, going beyond the words of the original source language. In our approach, each original document is automatically translated into an auxiliary language, and the resulting translated document serves as a semantically enhanced representation that supplements the original bag of words. The core of our translation representation is the expected term frequency of a word in a translated document, which is calculated by averaging the term frequencies over all possible translations rather than relying on the 1-best translation alone. To make translation more efficient, we do not rely on full-fledged machine translation but instead use monotonic translation, which removes the time-consuming reordering component. Experiments on standard TREC test collections show that the proposed translation representation yields statistically significant improvements over using only the original language of the document collection.
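The expected term frequency described above can be illustrated with a small sketch. The abstract does not give the exact formula, so this example assumes a finite list of candidate translations, each with a non-negative score, as a stand-in for "all possible translations"; scores are normalized into probabilities and each word's count is weighted by them. The function name `expected_term_frequencies` and the toy candidates are hypothetical, not taken from the paper.

```python
from collections import Counter, defaultdict


def expected_term_frequencies(translations):
    """Probability-weighted average of term counts over candidate translations.

    `translations` is a list of (tokens, weight) pairs. The weights are
    assumed to be non-negative scores; they are normalized here so that
    they behave like probabilities over the candidate list.
    """
    total = sum(weight for _, weight in translations)
    if total <= 0:
        raise ValueError("translation weights must sum to a positive value")

    expected = defaultdict(float)
    for tokens, weight in translations:
        prob = weight / total
        # Count each term in this candidate, then add its weighted share.
        for term, count in Counter(tokens).items():
            expected[term] += prob * count
    return dict(expected)


# Hypothetical candidate translations of one short document.
candidates = [
    ("banque centrale".split(), 0.6),
    ("rive centrale".split(), 0.3),
    ("banque banque".split(), 0.1),
]
etf = expected_term_frequencies(candidates)
for term in sorted(etf):
    print(term, round(etf[term], 3))
```

Unlike a 1-best representation, which would keep only the terms of the top-scoring candidate, this keeps lower-probability words such as "rive" with a correspondingly small weight.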