Definitional and human constraints on structural annotation of English

Authors:
Geoffrey Sampson; Anna Babarczy
Affiliations:
Department of Informatics, University of Sussex, Falmer, Brighton, BN1 9QJ, England e-mail: grs2@sussex.ac.uk; Department of Cognitive Science, Budapest University of Technology & Economics, 1111 Budapest, Stoczek utca 2, Hungary e-mail: babarczy@cogsci.bme.hu
Venue:
Natural Language Engineering
Year:
2008

Abstract

The limits on predictability and refinement of English structural annotation are examined by comparing independent annotations, by experienced analysts using the same detailed published guidelines, of a common sample of written texts. Three conclusions emerge. First, while it is not easy to define watertight boundaries between the categories of a comprehensive structural annotation scheme, limits on inter-annotator agreement are in practice set more by the difficulty of conforming to a well-defined scheme than by the difficulty of making a scheme well defined. Secondly, although usage is often structurally ambiguous, commonly the alternative analyses are logical distinctions without a practical difference – which raises questions about the role of grammar in human linguistic behaviour. Finally, one specific area of annotation is strikingly more problematic than any other area examined, though this area (classifying the functions of clause-constituents) seems a particularly significant one for human language use. These findings should be of interest both to computational linguists and to students of language as an aspect of human cognition.