Exploring reductions for long web queries

Authors:
Niranjan Balasubramanian;Giridhar Kumaran;Vitor R. Carvalho
Affiliations:
University of Massachusetts Amherst, Amherst, MA, USA;One Microsoft Way, Redmond, WA, USA;One Microsoft Way, Redmond, WA, USA
Venue:
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Year:
2010

Abstract

Long queries form a difficult but increasingly important segment for web search engines. Query reduction, a technique for dropping unnecessary query terms from long queries, improves the performance of ad-hoc retrieval on TREC collections. It also has great potential for improving long web queries (up to 25% improvement in NDCG@5). However, query reduction on the web is hampered by the lack of accurate query performance predictors and by the constraints imposed by search engine architectures and ranking algorithms. In this paper, we present query reduction techniques for long web queries that leverage effective and efficient query performance predictors. We propose three learning formulations that combine these predictors to perform automatic query reduction. These formulations enable trading average improvement for the number of queries impacted, and enable easy integration into the search engine's architecture for rank-time query reduction. Experiments on a large collection of long queries issued to a commercial search engine show that the proposed techniques significantly outperform baselines, with more than 12% improvement in NDCG@5 on the impacted set of queries. Extensions to the formulations, such as result interleaving, further improve results. We find that the proposed techniques deliver consistent retrieval gains where it matters most: poorly performing long web queries.
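
To make the idea of predictor-driven query reduction concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not spell out the three learning formulations, so the sketch assumes that candidates are formed by dropping exactly one term and that a predictor returns an estimated gain for a reduced query over the original. The names `candidate_reductions`, `reduce_query`, `toy_predict_gain`, and `FILLER` are hypothetical, and the toy predictor is a stand-in for a learned query performance predictor.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical filler terms used only by the toy predictor below.
FILLER = {"please", "the", "a", "of", "for"}


def candidate_reductions(query: str) -> List[str]:
    """Return every version of the query with exactly one term dropped.

    Assumption: single-term drops only; a one-term query has no candidates.
    """
    terms = query.split()
    if len(terms) <= 1:
        return []
    return [" ".join(terms[:i] + terms[i + 1:]) for i in range(len(terms))]


def toy_predict_gain(original: str, reduced: str) -> float:
    """Stand-in predictor: positive gain if the dropped term is a filler word."""
    dropped = list((Counter(original.split()) - Counter(reduced.split())).elements())
    return 0.2 if dropped and dropped[0] in FILLER else -0.2


def reduce_query(
    query: str,
    predict_gain: Callable[[str, str], float],
    threshold: float = 0.0,
) -> Tuple[str, float]:
    """Pick the reduced query with the highest predicted gain above `threshold`.

    Falls back to the original query (gain 0.0) when no candidate qualifies.
    """
    best_query, best_gain = query, 0.0
    for candidate in candidate_reductions(query):
        gain = predict_gain(query, candidate)
        if gain > threshold and gain > best_gain:
            best_query, best_gain = candidate, gain
    return best_query, best_gain


if __name__ == "__main__":
    q = "please find cheap flights boston"
    print(reduce_query(q, toy_predict_gain, threshold=0.0))
    print(reduce_query(q, toy_predict_gain, threshold=0.5))
```

In this sketch, raising `threshold` means fewer queries are rewritten, but only when the predicted gain is large. That is one simple way to picture the trade-off the abstract describes between average improvement and the number of queries impacted.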