Fractional similarity: cross-lingual feature selection for search

Authors:
Jagadeesh Jagarlamudi;Paul N. Bennett
Affiliations:
University of Maryland, Computer Science, College Park MD;Microsoft Research, One Microsoft Way, Redmond WA
Venue:
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Year:
2011

Abstract

Training data as well as supplementary data such as usage-based click behavior may abound in one search market (i.e., a particular region, domain, or language) and be much scarcer in another market. Transfer methods attempt to improve performance in these resource-scarce markets by leveraging data across markets. However, differences in feature distributions across markets can change the optimal model. We introduce a method called Fractional Similarity, which uses query-based variance within a market to obtain more reliable estimates of feature deviations across markets. An empirical analysis demonstrates that using this scoring method as a feature selection criterion in cross-lingual transfer improves relevance ranking in the foreign language and compares favorably to a baseline based on KL divergence.
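
To make the KL-divergence baseline mentioned in the abstract concrete, the following is a minimal sketch of how features might be ranked by how much their value distributions differ between two markets. It is not the authors' implementation and it does not implement Fractional Similarity itself. The function names (`histogram`, `kl_divergence`, `rank_features_by_kl`), the equal-width binning, the smoothing constant, and the toy feature values are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def histogram(values: Sequence[float], lo: float, hi: float,
              bins: int = 4, eps: float = 1e-6) -> List[float]:
    """Smoothed, normalized equal-width histogram (assumed binning scheme)."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values) + eps * bins
    # eps keeps every bin non-zero so the log ratio below is defined
    return [(c + eps) / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features_by_kl(source: Dict[str, Sequence[float]],
                        target: Dict[str, Sequence[float]],
                        bins: int = 4) -> List[Tuple[str, float]]:
    """Return (feature, KL) pairs, most similar across markets first."""
    scores = []
    for name in source:
        lo = min(min(source[name]), min(target[name]))
        hi = max(max(source[name]), max(target[name]))
        p = histogram(source[name], lo, hi, bins)
        q = histogram(target[name], lo, hi, bins)
        scores.append((name, kl_divergence(p, q)))
    return sorted(scores, key=lambda item: item[1])


# Hypothetical feature values from a resource-rich and a resource-scarce market.
source_market = {"bm25": [0.1, 0.2, 0.3, 0.4], "clicks": [0.0, 0.1, 0.1, 0.2]}
target_market = {"bm25": [0.1, 0.2, 0.3, 0.4], "clicks": [0.8, 0.9, 0.9, 1.0]}
ranking = rank_features_by_kl(source_market, target_market)
```

In this toy setup `bm25` has identical distributions in both markets and ranks first, while `clicks` diverges strongly and ranks last. A selection step under this sketch would keep only the lowest-divergence features for transfer. Note that this pools all values per feature and ignores per-query variation, which is the aspect the abstract says Fractional Similarity is designed to account for.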