Combining labeled and unlabeled data with co-training
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
IR evaluation methods for retrieving highly relevant documents
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Improving query translation for cross-language information retrieval using statistical models
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Optimizing search engines using clickthrough data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Query type classification for web document retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Cross-Language Evaluation Forum: Objectives, Results, Achievements
Information Retrieval
An efficient boosting algorithm for combining preferences
The Journal of Machine Learning Research
Learning random walk models for inducing word dependency distributions
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Learning to rank using gradient descent
ICML '05 Proceedings of the 22nd international conference on Machine learning
Weakly supervised named entity transliteration and discovery from multilingual comparable corpora
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Mining key phrase translations from web corpora
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Proceedings of the 24th international conference on Machine learning
A support vector method for optimizing average precision
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Learning to rank relational objects and its application to web search
Proceedings of the 17th international conference on World Wide Web
CLEF 2005: multilingual retrieval by combining multiple multilingual ranked lists
CLEF'05 Proceedings of the 6th international conference on Cross-Language Evaluation Forum: Accessing Multilingual Information Repositories
Selection and merging strategies for multilingual information retrieval
CLEF'04 Proceedings of the 5th conference on Cross-Language Evaluation Forum: Multilingual Information Access for Text, Speech and Images
Multilingual PRF: English lends a helping hand
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Multilingual pseudo-relevance feedback: performance study of assisting languages
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Fractional similarity: cross-lingual feature selection for search
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
From bilingual dictionaries to interlingual document representations
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Learning regional transliteration variants
Information Processing and Management: an International Journal
The leading web search engines have spent a decade building highly specialized ranking functions for English web pages. One reason these ranking functions are effective is that they are designed around features such as PageRank, automatic query and domain taxonomies, and click-through information. Unfortunately, many of these features are absent or altered in other languages. In this work, we show how to exploit these English features for a subset of Chinese queries, which we call linguistically non-local (LNL). An LNL Chinese query has a minimally ambiguous English translation that also functions as a good English query. We first show how to identify pairs of Chinese LNL queries and their English counterparts from Chinese and English query logs. We then show how to exploit these pairs to improve Chinese relevance ranking. Our improved relevance ranker proceeds by (1) translating a query into English, (2) computing a cross-lingual relational graph between the Chinese and English documents, and (3) employing the relational ranking method of Qin et al. [15] to rank the Chinese documents. Our technique gives consistent improvements over a state-of-the-art monolingual Chinese ranker on web search data from the Microsoft Live China search engine.
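The three-step pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the lookup table of mined LNL pairs, the edge weights, and the `beta` parameter are all hypothetical. Step (3) uses a simple graph-smoothing rule as a stand-in for the relational ranking method of Qin et al. [15]: each Chinese document's score is pulled toward the scores of the English documents it is linked to.

```python
from typing import Dict, List, Optional, Tuple


def translate_query(zh_query: str, lnl_pairs: Dict[str, str]) -> Optional[str]:
    """Step 1 (assumed form): look up the English counterpart of an LNL query.

    `lnl_pairs` stands in for pairs mined from query logs; queries that are
    not in the table are treated as non-LNL and return None.
    """
    return lnl_pairs.get(zh_query)


def rerank_chinese_docs(
    chinese_scores: Dict[str, float],
    english_scores: Dict[str, float],
    edges: Dict[Tuple[str, str], float],
    beta: float = 1.0,
) -> List[Tuple[str, float]]:
    """Steps 2-3 (simplified stand-in): smooth scores over a cross-lingual graph.

    `edges` maps (chinese_doc, english_doc) to a non-negative similarity.
    With English scores held fixed, each smoothed score f_c minimizes
    (f_c - h_c)^2 + beta * sum_e w_ce * (f_c - s_e)^2.
    """
    smoothed: Dict[str, float] = {}
    for zh_doc, base in chinese_scores.items():
        weighted_sum = 0.0
        weight_total = 0.0
        for (zh, en), weight in edges.items():
            if zh == zh_doc and en in english_scores:
                weighted_sum += weight * english_scores[en]
                weight_total += weight
        smoothed[zh_doc] = (base + beta * weighted_sum) / (1.0 + beta * weight_total)
    # Highest smoothed score first.
    return sorted(smoothed.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative data only; none of these values come from the paper.
lnl_pairs = {"哈利波特": "harry potter"}
english_query = translate_query("哈利波特", lnl_pairs)

chinese_scores = {"zh1": 0.5, "zh2": 0.6}       # monolingual ranker scores
english_scores = {"en1": 1.0, "en2": 0.0}       # English ranker scores
edges = {("zh1", "en1"): 1.0}                   # cross-lingual links

ranked = rerank_chinese_docs(chinese_scores, english_scores, edges, beta=1.0)
```

In this toy example, `zh1` is linked to a highly ranked English document and so moves above `zh2`, which has no cross-lingual link and keeps its monolingual score. Setting `beta` to zero recovers the monolingual ranking.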