We introduce an unsupervised query segmentation scheme that uses query logs as the only resource and can effectively capture the structural units in queries. We believe that Web search queries have a unique syntactic structure which is distinct from that of English or a bag-of-words model. The segments discovered by our scheme help understand this underlying grammatical structure. We apply a statistical model based on Hoeffding's Inequality to mine significant word n-grams from queries and subsequently use them for segmenting the queries. Evaluation against manually segmented queries shows that this technique can detect rare units that are missed by our Pointwise Mutual Information (PMI) baseline.
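To make the baseline concrete, the following is a minimal sketch of how a PMI-based segmenter over a query log might look. It is not the authors' implementation: the function names (`train_counts`, `pmi`, `segment`), the use of adjacent-word bigrams only, and the fixed `threshold` are all assumptions made for illustration. The Hoeffding-based n-gram mining described above is not shown here.

```python
import math
from collections import Counter


def train_counts(queries):
    """Count unigrams and adjacent-word bigrams over a list of query strings."""
    unigrams, bigrams = Counter(), Counter()
    for query in queries:
        words = query.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def pmi(w1, w2, unigrams, bigrams):
    """Pointwise mutual information (base 2) of the adjacent pair (w1, w2).

    Returns -inf for pairs never seen adjacent in the log.
    """
    joint = bigrams[(w1, w2)]
    if joint == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_joint = joint / n_bi
    p_w1 = unigrams[w1] / n_uni
    p_w2 = unigrams[w2] / n_uni
    return math.log2(p_joint / (p_w1 * p_w2))


def segment(query, unigrams, bigrams, threshold=0.0):
    """Split a query wherever the PMI of adjacent words is at or below threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    words = query.lower().split()
    if not words:
        return []
    segments = [[words[0]]]
    for prev, cur in zip(words, words[1:]):
        if pmi(prev, cur, unigrams, bigrams) > threshold:
            segments[-1].append(cur)   # strong association: extend current segment
        else:
            segments.append([cur])     # weak association: start a new segment
    return [" ".join(seg) for seg in segments]


if __name__ == "__main__":
    # Toy query log, invented purely for illustration.
    log = ["new york hotels", "new york pizza", "cheap hotels",
           "cheap pizza", "new york"]
    uni, bi = train_counts(log)
    print(segment("new york hotels", uni, bi, threshold=2.0))
```

Because PMI depends on how often a pair co-occurs, a pair that is a genuine unit but appears only a handful of times in the log can score poorly or be unreliable, which is consistent with the abstract's remark that the baseline misses rare units.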