Query segmentation is the problem of identifying those keywords in a query that together form compound concepts or phrases, such as "new york times". Such segments can help a search engine better interpret a user's intent and tailor the search results accordingly. Our contributions to this problem are threefold. (1) We conduct the first large-scale study of human segmentation behavior, based on more than 500,000 segmentations. (2) We show that the traditionally applied segmentation accuracy measures are not appropriate for such large-scale corpora and introduce new, more robust measures. (3) We develop a new query segmentation approach based on the idea that, in cases of doubt, it is often better to leave queries (partially) without any segmentation. This new in-doubt-without approach chooses different segmentation strategies depending on the query type. A large-scale evaluation shows substantial improvement over the state of the art in terms of segmentation accuracy. To draw a complete picture, we also evaluate the impact of segmentation strategies on retrieval performance in a TREC setting. It turns out that more accurate segmentation does not necessarily yield better retrieval performance. Based on this insight, we propose an in-doubt-without variant that achieves the best retrieval performance despite leaving many queries unsegmented. But there is still room for improvement: the optimum segmentation strategy, which always chooses the segmentation that maximizes retrieval performance, significantly outperforms all other tested approaches.
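To make the "in doubt, leave it unsegmented" idea concrete, the following is a minimal Python sketch under stated assumptions. It is not the approach evaluated in the paper: the phrase counts in `TOY_PHRASE_COUNTS`, the scoring function `toy_score`, the `min_gain` threshold, and all function names are hypothetical and chosen only for illustration. The sketch enumerates every segmentation of a query, scores each one, and falls back to the unsegmented query when the best candidate does not beat the unsegmented baseline by a clear margin.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Segment = Tuple[str, ...]
Segmentation = Tuple[Segment, ...]

# Hypothetical phrase frequencies, invented for this illustration only.
TOY_PHRASE_COUNTS: Dict[Segment, int] = {
    ("new", "york"): 50,
    ("new", "york", "times"): 40,
    ("york", "times"): 5,
}


def all_segmentations(words: Sequence[str]) -> List[Segmentation]:
    """Enumerate all 2^(n-1) segmentations of an n-word query."""
    if not words:
        return []
    result = []
    # Each of the n-1 gaps between words is either a break or not.
    for breaks in product([False, True], repeat=len(words) - 1):
        segments, current = [], [words[0]]
        for word, is_break in zip(words[1:], breaks):
            if is_break:
                segments.append(tuple(current))
                current = [word]
            else:
                current.append(word)
        segments.append(tuple(current))
        result.append(tuple(segments))
    return result


def toy_score(segmentation: Segmentation) -> float:
    """Assumed score: count times length, summed over multiword segments."""
    return float(sum(
        TOY_PHRASE_COUNTS.get(seg, 0) * len(seg)
        for seg in segmentation
        if len(seg) > 1
    ))


def in_doubt_without(
    words: Sequence[str],
    score: Callable[[Segmentation], float],
    min_gain: float,
) -> Segmentation:
    """Return the best-scoring segmentation, or the unsegmented query
    if the best one gains less than min_gain over the unsegmented one."""
    unsegmented: Segmentation = tuple((w,) for w in words)
    candidates = all_segmentations(words)
    if not candidates:
        return unsegmented
    best = max(candidates, key=score)
    if score(best) - score(unsegmented) < min_gain:
        return unsegmented  # in doubt: leave the query as single keywords
    return best
```

Under these toy counts, `in_doubt_without(["new", "york", "times"], toy_score, 50)` keeps the three words together as one segment, while a query with no known phrases, such as `["cheap", "flights", "berlin"]`, is returned as three single-word segments. A real system would replace `toy_score` with evidence from n-gram statistics or query logs and would tune the threshold per query type.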