We present a validation methodology for a cross-linguistic grammar resource that produces small grammars from elicited typological descriptions. Evaluating the resource entails sampling from a very large space of language types, whose type and range preclude the use of standard test suite development techniques. We produce a database from which gold-standard test suites for these grammars can be generated on demand, including well-formed strings paired with all of their valid semantic representations as well as a sample of ill-formed strings. These string-semantics pairs are selected from a set of candidates by a system of regular-expression-based filters. The filters amount to an alternative grammar-building system whose generative capacity is limited compared to that of the actual grammars. We perform error analysis of the discrepancies between the test suites and the grammars for a range of language types, and update both systems accordingly. The resulting resource serves as a point of comparison for regression testing in future development.
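As an illustration of the filtering idea only, the following minimal Python sketch selects well-formed string-semantics pairs from a candidate set using regular-expression filters. Everything in it is an assumption made for the example rather than the system described above: the toy vocabulary (`n1`, `n2`, `tv`), the semantics label `tv(n1,n2)`, the `Candidate` and `Filter` classes, and the two filters standing in for a hypothetical strict SOV language type.

```python
import re
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Candidate:
    string: str     # space-separated surface string
    semantics: str  # label for one candidate semantic representation


@dataclass(frozen=True)
class Filter:
    applies_to: str  # regex over the semantics label: when the filter is relevant
    reject: str      # regex over the string: a match marks the pair ill-formed


def partition(candidates, filters):
    """Split candidates into (well_formed, ill_formed) lists.

    A pair is ill-formed if any relevant filter's reject pattern matches it.
    """
    well_formed, ill_formed = [], []
    for cand in candidates:
        rejected = any(
            re.search(f.applies_to, cand.semantics)
            and re.search(f.reject, cand.string)
            for f in filters
        )
        (ill_formed if rejected else well_formed).append(cand)
    return well_formed, ill_formed


# Hypothetical filters for a strict SOV language type (assumed for illustration).
SOV_FILTERS = [
    # Verb-final: reject any string in which material follows the verb.
    Filter(applies_to=r"^tv\(", reject=r"tv \S"),
    # Subject before object: reject strings in which n2 precedes n1.
    Filter(applies_to=r"^tv\(n1,n2\)", reject=r"n2.*n1"),
]

# Candidate set: every ordering of the three words, paired with one semantics.
CANDIDATES = [
    Candidate(" ".join(words), "tv(n1,n2)")
    for words in permutations(["n1", "n2", "tv"])
]

GOOD, BAD = partition(CANDIDATES, SOV_FILTERS)
```

With these assumed filters, `GOOD` holds the single pair whose string is `n1 n2 tv`, and `BAD` holds the other five orderings, which could serve as a sample of ill-formed strings. Expressing a language type as a list of reject patterns keeps the filters independent of the grammars under test, so a disagreement between the two points to an error in one of them.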