This paper reports on guiding parser development by extracting information from the output of a large-scale parser applied to Wikipedia documents. Data-driven parser improvement is especially important for applications where the corpus may differ from the one originally used to develop the core grammar, and where efficiency concerns affect whether a new construction should be added or existing analyses modified. The large size of the corpus in question also brings scalability concerns to the foreground.