A comparison of two methods for Boolean query relevancy feedback
Information Processing and Management: an International Journal
Identifying word correspondence in parallel texts
HLT '91 Proceedings of the workshop on Speech and Natural Language
Query expansion using local and global document analysis
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Accurate unlexicalized parsing
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Improving web search ranking by incorporating user behavior information
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Identifying "best bet" web search results by mining past user behavior
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Coupling feature selection and machine learning methods for navigational query identification
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Random walks on the click graph
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A regression framework for learning ranking functions using relative relevance judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Active exploration for learning rankings from clickthrough data
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
An experimental comparison of click position-bias models
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Mining the search trails of surfing crowds: identifying relevant websites from user activity
Proceedings of the 17th international conference on World Wide Web
A user browsing model to predict search engine click data from past observations.
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Smoothing clickthrough data for web search ranking
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Hi-index | 0.00 |
User clicks on a URL in response to a query are extremely useful predictors of the URL's relevance to that query. Exact match click features tend to suffer from severe data sparsity issues in web ranking. Such sparsity is particularly pronounced for new URLs or long queries where each distinct query-url pair will rarely occur. To remedy this, we present a set of straightforward yet informative query-url n-gram features that allows for generalization of limited user click data to large amounts of unseen query-url pairs. The method is motivated by techniques leveraged in the NLP community for dealing with unseen words. We find that there are interesting regularities across queries and their preferred destination URLs; for example, queries containing "form" tend to lead to clicks on URLs containing "pdf". We evaluate our set of new query-url features on a web search ranking task and obtain improvements that are statistically significant at a p-value