Compensating for annotation errors in training a relation extractor

Authors:
Bonan Min;Ralph Grishman
Affiliations:
New York University, Broadway, New York, NY;New York University, Broadway, New York, NY
Venue:
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Year:
2012

Citing 15
Cited 0

Transductive Inference for Text Classification using Support Vector Machines

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Kernel methods for relation extraction

The Journal of Machine Learning Research
A novel use of statistical parsing to extract information from text

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations

ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
Extracting relations with integrated information using kernel methods

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Exploring various knowledge in relation extraction

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
A composite kernel to extract relations between entities with both flat and structured features

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A shortest path dependency kernel for relation extraction

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Exploring syntactic features for relation extraction using a convolution tree kernel

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Exploiting constituent dependencies for tree kernel-based semantic relation extraction

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Learning to classify texts using positive and unlabeled data

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
To annotate more accurately or to annotate more

LAW IV '10 Proceedings of the Fourth Linguistic Annotation Workshop
Semi-supervised relation extraction with large-scale word clustering

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Exploiting syntactico-semantic structures for relation extraction

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Mining inter-entity semantic relations using improved transductive learning

IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

The well-studied supervised Relation Extraction algorithms require training data that is accurate and has good coverage. To obtain such a gold standard, the common practice is to do independent double annotation followed by adjudication. This takes significantly more human effort than annotation done by a single annotator. We do a detailed analysis on a snapshot of the ACE 2005 annotation files to understand the differences between single-pass annotation and the more expensive nearly three-pass process, and then propose an algorithm that learns from the much cheaper single-pass annotation and achieves a performance on a par with the extractor trained on multi-pass annotated data. Furthermore, we show that given the same amount of human labor, the better way to do relation annotation is not to annotate with high-cost quality assurance, but to annotate more.