Identifying ambiguous queries in web search

Authors:
Ruihua Song;Zhenxiao Luo;Ji-Rong Wen;Yong Yu;Hsiao-Wuen Hon
Affiliations:
Microsoft Research Asia;Fudan University;Microsoft Research Asia;Shanghai Jiao Tong University;Microsoft Research Asia
Venue:
Proceedings of the 16th international conference on World Wide Web
Year:
2007

Abstract

It is widely believed that some queries submitted to search engines are by nature ambiguous (e.g., java, apple). However, few studies have investigated the questions of "how many queries are ambiguous?" and "how can we automatically identify an ambiguous query?" This paper deals with these issues. First, we construct the taxonomy of query ambiguity, and ask human annotators to manually classify queries based upon it. From manually labeled results, we find that query ambiguity is to some extent predictable. We then use a supervised learning approach to automatically classify queries as being ambiguous or not. Experimental results show that we can correctly identify 87% of labeled queries. Finally, we estimate that about 16% of queries in a real search log are ambiguous.
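
The abstract does not say which features or learner the authors use, so the following is only a minimal sketch of the general idea of supervised ambiguity classification, under loud assumptions. It assumes each query's top results have already been tagged with a topic label, uses the entropy of those labels as a single hypothetical feature, and learns a cut-off from a handful of hand-labeled toy queries. The function names, the topic labels, and the toy data are all invented for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def category_entropy(result_categories):
    """Shannon entropy (in bits) of the topic labels of a query's top results."""
    counts = Counter(result_categories)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def fit_threshold(examples):
    """Choose the entropy cut-off with the best training accuracy.

    examples: list of (entropy, is_ambiguous) pairs from labeled queries.
    This is a one-feature decision stump, used here as a stand-in learner.
    """
    values = sorted({entropy for entropy, _ in examples})
    if len(values) < 2:
        return values[0]
    # Candidate cut-offs are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def accuracy(threshold):
        hits = sum((entropy > threshold) == label for entropy, label in examples)
        return hits / len(examples)

    return max(candidates, key=accuracy)


def is_ambiguous(result_categories, threshold):
    """Predict ambiguity: results spread over many topics exceed the cut-off."""
    return category_entropy(result_categories) > threshold


# Hypothetical hand-labeled queries: (topic labels of top results, ambiguous?)
LABELED = {
    "java": (["programming", "programming", "travel", "coffee"], True),
    "apple": (["technology", "food", "technology", "food"], True),
    "weather seattle": (["weather"] * 4, False),
    "python list sort": (["programming"] * 4, False),
}

TRAINING = [(category_entropy(cats), label) for cats, label in LABELED.values()]
THRESHOLD = fit_threshold(TRAINING)

if __name__ == "__main__":
    print("learned threshold:", THRESHOLD)
    print("jaguar ->", is_ambiguous(["animals", "cars", "cars", "animals"], THRESHOLD))
```

A single-feature cut-off keeps the example short. A realistic system would combine many features and be evaluated on held-out labeled queries, which is how an accuracy figure such as the 87% reported above would be measured.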