TextTree construction for parser and treebank development

Authors:
Paula S. Newman
Venue:
Software '05 Proceedings of the Workshop on Software
Year:
2005

Abstract

TextTrees, introduced in Newman (2005), are skeletal representations formed by systematically converting parser output trees into unlabeled, indented strings with minimal bracketing. Files of TextTrees can be read rapidly to evaluate the results of parsing long documents, and they can be edited easily, which supports low-cost treebank development. This paper reviews the TextTree concept and then describes the implementation of a TextTree generator that is almost entirely parser- and grammar-independent, along with auxiliary methods for producing parser review files and inputs to bracket scoring tools. The results of some limited experiments in TextTree usage are also provided.
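
To make the idea of an unlabeled, indented rendering concrete, the following is a minimal sketch and not the paper's generator. It assumes a hypothetical tree encoding (a leaf is a word, an internal node is a `(label, children)` tuple) and a hypothetical layout rule: a constituent is printed on one line if its words fit within `max_width` characters; otherwise its first child stays at the current depth and the remaining children are indented one level deeper. Category labels are simply dropped, and the minimal bracketing mentioned in the abstract is not modeled. The names `leaves`, `to_indented`, `max_width`, and `indent` are illustrative only.

```python
from typing import List, Tuple, Union

# Hypothetical tree encoding (an assumption, not the paper's format):
# a leaf is a word (str); an internal node is a (label, children) tuple.
Tree = Union[str, Tuple[str, list]]


def leaves(tree: Tree) -> List[str]:
    """Collect the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    _label, children = tree
    words: List[str] = []
    for child in children:
        words.extend(leaves(child))
    return words


def to_indented(tree: Tree, depth: int = 0, max_width: int = 20,
                indent: str = "  ") -> List[str]:
    """Render a tree as unlabeled, indented lines (assumed layout rule)."""
    flat = " ".join(leaves(tree))
    # A word, or a constituent short enough to fit, goes on a single line.
    if isinstance(tree, str) or len(flat) <= max_width:
        return [indent * depth + flat]
    # Otherwise split: first child keeps the current depth, the rest go deeper.
    _label, children = tree
    lines: List[str] = []
    for i, child in enumerate(children):
        child_depth = depth if i == 0 else depth + 1
        lines.extend(to_indented(child, child_depth, max_width, indent))
    return lines


sentence = ("S", [
    ("NP", ["The", "parser"]),
    ("VP", ["reads", ("NP", ["long", "documents"])]),
])

print("\n".join(to_indented(sentence)))
```

Lowering `max_width` forces deeper splitting, which gives a rough feel for how indentation alone can convey structure once labels are removed.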