Domain dependent query reformulation for web search

Authors:
Van Dang;Giridhar Kumaran;Adam Troy
Affiliations:
University of Massachusetts Amherst, Amherst, MA, USA;Microsoft Corporation, Redmond, WA, USA;Microsoft Corporation, Redmond, WA, USA
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Citing 25
Cited 0

Agglomerative clustering of a search engine query log

Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering user queries of a search engine

Proceedings of the 10th international conference on World Wide Web
Probabilistic query expansion using query logs

Proceedings of the 11th international conference on World Wide Web
Cumulated gain-based evaluation of IR techniques

ACM Transactions on Information Systems (TOIS)
Optimizing search engines using clickthrough data

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
A systematic comparison of various statistical alignment models

Computational Linguistics
Query word deletion prediction

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Query chains: learning to rank from implicit feedback

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Concept-based interactive query expansion

Proceedings of the 14th ACM international conference on Information and knowledge management
The Alignment Template Approach to Statistical Machine Translation

Computational Linguistics
Generating query substitutions

Proceedings of the 15th international conference on World Wide Web
Exploring distributional similarity based models for query spelling correction

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Learning a spelling error model from search query logs

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Context sensitive stemming for web search

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A unified and discriminative model for query refinement

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Query suggestion using hitting time

Proceedings of the 17th ACM conference on Information and knowledge management
Mining term association patterns from search logs for effective query reformulation

Proceedings of the 17th ACM conference on Information and knowledge management
Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs

Proceedings of the 17th ACM conference on Information and knowledge management
Smoothing clickthrough data for web search ranking

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Query reformulation using anchor text

Proceedings of the third ACM international conference on Web search and data mining
Result enrichment in commerce search using browse trails

Proceedings of the fourth ACM international conference on Web search and data mining
Query rewriting using monolingual statistical machine translation

Computational Linguistics
Efficient query rewrite for structured web queries

Proceedings of the 20th ACM international conference on Information and knowledge management
Query recommendation using query logs in search engines

EDBT'04 Proceedings of the 2004 international conference on Current Trends in Database Technology

Quantified Score

Hi-index	0.00

Visualization

Abstract

Query reformulation has been studied as a domain independent task. Existing work attempts to expand a query or substitute its terms with the same set of candidates regardless of the domain of this query. Since terms might be semantically related in one domain but not in others, it is more effective to provide candidates for queries with respect to their domain. This paper demonstrates the advantage of this domain dependent query reformulation approach, which learns its candidates, using a standard technique, for each domain from a separate sample of data derived automatically from a generic query log. Our results show that our approach statistically significantly outperforms the domain independent approach, which learns to reformulate from the same log using the same technique, on a large query set consisting of both health and commerce queries. Our results have very practical interpretation: while building different reformulation systems to handle queries from different domains does not require additional manual effort, it provides substantially better retrieval effectiveness than having a single system handling all queries. Additionally, we show that leveraging domain specific manually labelled data leads to further improvement.