This paper describes the development of QuestionBank, a corpus of 4000 parse-annotated questions for (i) use in training parsers employed in QA, and (ii) evaluation of question parsing. We present a series of experiments to investigate the effectiveness of QuestionBank as both an exclusive and a supplementary training resource for a state-of-the-art parser, on both question and non-question test sets. We introduce a new method for recovering empty nodes and their antecedents (capturing long-distance dependencies) from parser output in CFG trees using LFG f-structure reentrancies. Our main findings are: (i) using QuestionBank training data improves parser performance to 89.75% labelled bracketing f-score, an increase of almost 11% over the baseline; (ii) back-testing experiments on non-question data (Penn-II WSJ Section 23) show that the retrained parser does not suffer a performance drop on non-question material; (iii) ablation experiments show that the size of training material provided by QuestionBank is sufficient to achieve optimal results; (iv) our method for recovering empty nodes captures long-distance dependencies in questions from the ATIS corpus with high precision (96.82%) and low recall (39.38%). In summary, QuestionBank provides a useful new resource for parser-based QA research.
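For readers unfamiliar with the metrics quoted above, the following is a minimal Python sketch of how labelled bracketing precision, recall and f-score are conventionally computed. It assumes the standard PARSEVAL-style definition with a balanced f-score; the abstract does not say which evaluation tool or settings were used. The function names and the toy bracket sets are hypothetical and are not taken from QuestionBank.

```python
from collections import Counter


def f_score(precision, recall):
    """Balanced f-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def labelled_bracket_scores(gold, predicted):
    """Score predicted constituents against gold ones.

    Each constituent is a (label, start, end) tuple over token positions.
    A predicted bracket counts as correct only if label and span both match.
    """
    # Multiset intersection, so duplicated brackets are not over-counted.
    matched = sum((Counter(gold) & Counter(predicted)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall, f_score(precision, recall)


# Hypothetical brackets for a four-token question (illustration only).
gold = [("SBARQ", 0, 4), ("WHNP", 0, 2), ("SQ", 2, 4), ("VP", 2, 4), ("NP", 3, 4)]
predicted = [("SBARQ", 0, 4), ("WHNP", 0, 2), ("SQ", 2, 4), ("VP", 2, 4)]

p, r, f = labelled_bracket_scores(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f-score={f:.4f}")

# Combining the empty-node precision and recall reported in the abstract.
print(f"{f_score(0.9682, 0.3938):.4f}")
```

Under the same balanced definition, the reported empty-node precision (96.82%) and recall (39.38%) would combine to an f-score of roughly 56%. That combined figure is derived here and is not stated in the abstract.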