Syntactic analysis of search queries is important for a variety of information-retrieval tasks; however, the lack of annotated data makes training query analysis models difficult. We propose a simple, efficient procedure in which part-of-speech tags are transferred from retrieval-result snippets to queries at training time. Unlike previous work, our final model does not require any additional resources at run-time. Compared to a state-of-the-art approach, we achieve more than 20% relative error reduction. Additionally, we annotate a corpus of search queries with part-of-speech tags, providing a resource for future work on syntactic query analysis.
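The tag-transfer idea described above can be sketched in a few lines. This is a minimal, hypothetical illustration assuming simple surface-form matching between query tokens and an already-tagged snippet; the function name, the default-tag fallback, and the example data are all assumptions for illustration, not the paper's actual training procedure.

```python
def transfer_tags(query_tokens, tagged_snippet, default_tag="NN"):
    """Assign each query token the POS tag of its first case-insensitive
    match in the tagged snippet; fall back to default_tag otherwise.
    (Illustrative sketch only, assuming surface-form matching.)"""
    lookup = {}
    for word, tag in tagged_snippet:
        # Keep the tag of the first occurrence of each word form.
        lookup.setdefault(word.lower(), tag)
    return [(tok, lookup.get(tok.lower(), default_tag)) for tok in query_tokens]

# Hypothetical example: a query and a POS-tagged retrieval-result snippet.
query = ["cheap", "flights", "london"]
snippet = [("Cheap", "JJ"), ("flights", "NNS"), ("to", "TO"), ("London", "NNP")]
print(transfer_tags(query, snippet))
# [('cheap', 'JJ'), ('flights', 'NNS'), ('london', 'NNP')]
```

The key property this sketch captures is that snippets, being full sentences, are far easier to tag reliably than bare keyword queries, so projected tags can serve as training signal without requiring snippet retrieval at run-time.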