A unified and discriminative model for query refinement

Authors:
Jiafeng Guo; Gu Xu; Hang Li; Xueqi Cheng
Affiliations:
Institute of Computing Technology, CAS, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; Institute of Computing Technology, CAS, Beijing, China
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008


Abstract

This paper addresses query refinement: reformulating ill-formed search queries to improve the relevance of search results. Query refinement typically comprises several tasks, such as spelling error correction, word splitting, word merging, phrase segmentation, word stemming, and acronym expansion. Previous research addressed these tasks separately or with generative models. This paper proposes a unified and discriminative model for query refinement. Specifically, it proposes a Conditional Random Field (CRF) model suited to the problem, referred to as Conditional Random Field for Query Refinement (CRF-QR). Given a sequence of query words, CRF-QR predicts a sequence of refined query words together with the corresponding refinement operations. In this respect CRF-QR differs from conventional CRF models, which predict only a sequence of labels. Two types of CRF-QR models, a basic model and an extended model, are introduced. One merit of CRF-QR is that different refinement tasks can be performed simultaneously, which can improve refinement accuracy. It also brings the advantages of discriminative models over generative models. Experimental results show that CRF-QR significantly outperforms the baseline methods, and that using CRF-QR in web search yields a significant improvement in relevance.
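
To make the prediction task concrete, here is a minimal Python sketch of jointly choosing a refinement operation and a refined word for each query word. It is not the paper's model. The candidate table, the operation names (`keep`, `spell`, `stem`), and all scores are hypothetical and hand-set, whereas a trained CRF would use learned feature weights. The sketch assumes one refined word per query word, so splitting and merging are not covered, and it uses Viterbi search over a linear chain to find the highest-scoring sequence.

```python
from typing import Dict, List, Tuple

# A candidate is an (operation, refined_word) pair.
Candidate = Tuple[str, str]

# Hypothetical candidates per observed query word (illustration only).
CANDIDATES: Dict[str, List[Candidate]] = {
    "machin": [("keep", "machin"), ("spell", "machine")],
    "lerning": [("keep", "lerning"), ("spell", "learning")],
    "papers": [("keep", "papers"), ("stem", "paper")],
}

# Hypothetical local scores for (observed word, operation, refined word).
LOCAL: Dict[Tuple[str, str, str], float] = {
    ("machin", "keep", "machin"): 0.2,
    ("machin", "spell", "machine"): 1.0,
    ("lerning", "keep", "lerning"): 0.1,
    ("lerning", "spell", "learning"): 1.0,
    ("papers", "keep", "papers"): 0.6,
    ("papers", "stem", "paper"): 0.5,
}

# Hypothetical transition scores between adjacent refined words.
TRANS: Dict[Tuple[str, str], float] = {
    ("machine", "learning"): 1.5,
    ("learning", "papers"): 0.3,
    ("learning", "paper"): 0.8,
}


def local_score(word: str, cand: Candidate) -> float:
    """Score of applying cand's operation to the observed word."""
    return LOCAL.get((word, cand[0], cand[1]), 0.0)


def transition_score(prev_word: str, word: str) -> float:
    """Score of two refined words appearing next to each other."""
    return TRANS.get((prev_word, word), 0.0)


def decode(query: List[str]) -> List[Candidate]:
    """Viterbi search for the best (operation, refined word) sequence."""
    if not query:
        return []
    # Words without an entry in the table can only be kept unchanged.
    lattice = [CANDIDATES.get(w, [("keep", w)]) for w in query]
    # scores[i][j]: best score of any path ending in candidate j at position i.
    scores = [[local_score(query[0], c) for c in lattice[0]]]
    back = [[-1] * len(lattice[0])]
    for i in range(1, len(query)):
        row, ptr = [], []
        for cand in lattice[i]:
            # Total score of reaching cand from each previous candidate.
            totals = [
                scores[i - 1][j] + transition_score(prev[1], cand[1])
                for j, prev in enumerate(lattice[i - 1])
            ]
            best_j = max(range(len(totals)), key=totals.__getitem__)
            row.append(totals[best_j] + local_score(query[i], cand))
            ptr.append(best_j)
        scores.append(row)
        back.append(ptr)
    # Trace back from the best final candidate.
    j = max(range(len(scores[-1])), key=scores[-1].__getitem__)
    path: List[Candidate] = []
    for i in range(len(query) - 1, -1, -1):
        path.append(lattice[i][j])
        j = back[i][j]
    return path[::-1]


if __name__ == "__main__":
    for op, word in decode(["machin", "lerning", "papers"]):
        print(op, word)
```

With these toy scores, the query `machin lerning papers` decodes to `machine learning paper`: two spelling corrections and one stemming step. The decision to stem `papers` depends on the neighbouring refined word `learning`, a small illustration of why handling the refinement tasks jointly can help.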