Syntactic analysis of search queries is important for a variety of information-retrieval tasks; however, the lack of annotated data makes training query analysis models difficult. We propose a simple, efficient procedure in which part-of-speech tags are transferred from retrieval-result snippets to queries at training time. Unlike previous work, our final model does not require any additional resources at run-time. Compared to a state-of-the-art approach, we achieve more than 20% relative error reduction. Additionally, we annotate a corpus of search queries with part-of-speech tags, providing a resource for future work on syntactic query analysis.
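The tag-transfer idea described above can be sketched in a few lines. This is a minimal, hypothetical illustration assuming simple surface-form matching between query tokens and an already-tagged snippet; the function name, the default-tag fallback, and the example data are all assumptions for illustration, not the paper's actual training procedure.

```python
def transfer_tags(query_tokens, tagged_snippet, default_tag="NN"):
    """Assign each query token the POS tag of its first case-insensitive
    match in the tagged snippet; fall back to default_tag otherwise.
    (Illustrative sketch only, assuming surface-form matching.)"""
    lookup = {}
    for word, tag in tagged_snippet:
        # Keep the tag of the first occurrence of each word form.
        lookup.setdefault(word.lower(), tag)
    return [(tok, lookup.get(tok.lower(), default_tag)) for tok in query_tokens]

# Hypothetical example: a query and a POS-tagged retrieval-result snippet.
query = ["cheap", "flights", "london"]
snippet = [("Cheap", "JJ"), ("flights", "NNS"), ("to", "TO"), ("London", "NNP")]
print(transfer_tags(query, snippet))
# [('cheap', 'JJ'), ('flights', 'NNS'), ('london', 'NNP')]
```

The key property this sketch captures is that snippets, being full sentences, are far easier to tag reliably than bare keyword queries, so projected tags can serve as training signal without requiring snippet retrieval at run-time.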