Reducing the need for double annotation

Authors:
Dmitriy Dligach; Martha Palmer
Affiliations:
University of Colorado at Boulder; University of Colorado at Boulder
Venue:
LAW V '11 Proceedings of the 5th Linguistic Annotation Workshop
Year:
2011

Abstract

The quality of annotated data is crucial for supervised learning. To eliminate errors in singly annotated data, a second round of annotation is often used. But is it really necessary to double-annotate every example? We show that the amount of second-round annotation can be reduced by more than half without sacrificing performance.