Corpora with high-quality linguistic annotations are an essential component in many NLP applications and a valuable resource for linguistic research. Obtaining these annotations requires substantial manual effort, which makes the creation of such resources time-consuming and costly. One attempt to speed up the annotation process is to use supervised machine-learning systems to automatically assign (possibly erroneous) labels to the data and to ask human annotators to correct them where necessary. However, it is not clear to what extent such automatic pre-annotation actually reduces human annotation effort, or what impact it has on the quality of the resulting resource. In this article, we present the results of an experiment in which we assess the usefulness of partial semi-automatic annotation for frame labeling. We investigate the impact of automatic pre-annotations of differing quality on annotation time, consistency, and accuracy. While we found no conclusive evidence that automatic pre-annotation speeds up human annotation, we did find that it increases the overall quality of the annotations.
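To make the described workflow concrete, the sketch below illustrates the general pre-annotation loop: a trained classifier proposes a frame label for each target instance, and a human annotator then confirms or corrects each suggestion. This is a minimal, illustrative sketch only; the classifier and annotator interfaces (`classifier.predict`, `annotator.review`) are hypothetical placeholders and are not the system used in the experiment.

```python
# Minimal sketch of a semi-automatic annotation loop, assuming a hypothetical
# frame classifier and annotation interface; all names are illustrative only.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    sentence: str
    target_verb: str
    pre_label: Optional[str] = None   # label suggested by the automatic system
    gold_label: Optional[str] = None  # label confirmed or corrected by the annotator


def pre_annotate(instances: List[Instance], classifier) -> List[Instance]:
    """Assign (possibly erroneous) frame labels with a trained classifier."""
    for inst in instances:
        inst.pre_label = classifier.predict(inst.sentence, inst.target_verb)
    return instances


def correct(instances: List[Instance], annotator) -> List[Instance]:
    """Present each pre-labeled instance to a human, who keeps or overwrites the label."""
    for inst in instances:
        inst.gold_label = annotator.review(
            inst.sentence, inst.target_verb, suggested=inst.pre_label
        )
    return instances
```

In this setup, the quality of the pre-annotation can be varied (e.g., by using classifiers trained on more or less data) to study its effect on annotation time, consistency, and accuracy, which mirrors the comparison described in the abstract.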