Redundancy-based correction of automatically extracted facts

Authors:
Roman Yangarber;Lauri Jokipii
Affiliations:
University of Helsinki, Finland;University of Helsinki, Finland
Venue:
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Year:
2005

Citing 5
Cited 4

A Mutually Beneficial Integration of Data Mining and Information Extraction

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
Information extraction for enhanced access to disease outbreak reports

Journal of Biomedical Informatics - Special issue: Sublanguage
Complexity of event structure in IE scenarios

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Improving name tagging by reference resolution and relation detection

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Extracting information about outbreaks of infectious epidemics

HLT-Demo '05 Proceedings of HLT/EMNLP on Interactive Demonstrations

Extracting information about outbreaks of infectious epidemics

HLT-Demo '05 Proceedings of HLT/EMNLP on Interactive Demonstrations
Can document selection help semi-supervised learning?: a case study on event extraction

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Employing compositional semantics and discourse consistency in Chinese event extraction

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Using compositional semantics and discourse consistency to improve Chinese trigger identification

Information Processing and Management: an International Journal

Abstract

The accuracy of event extraction is limited by a number of complicating factors, with errors compounded at all stages inside the Information Extraction pipeline. In this paper, we present methods for recovering automatically from errors committed in the pipeline processing. Recovery is achieved via post-processing facts aggregated over a large collection of documents, and suggesting corrections based on evidence external to the document. A further improvement is derived from propagating multiple, locally non-best slot fills through the pipeline. Evaluation shows that the global analysis is over 10 times more likely to suggest valid corrections to the local-only analysis than it is to suggest erroneous ones. This yields a substantial overall gain, with no supervised training.
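
The general idea of redundancy-based correction can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. The sketch assumes each extracted fact carries several candidate slot fills with local scores, and that facts can be grouped by a hypothetical `key` identifying a shared context. The function name `suggest_corrections`, the field names, and the score-summing rule are all assumptions made for illustration.

```python
from collections import defaultdict


def suggest_corrections(facts):
    """Suggest slot-fill corrections from evidence pooled across documents.

    Each fact is a dict with hypothetical fields:
      'doc'        - document identifier
      'key'        - grouping key for facts assumed to share a context
      'candidates' - mapping from candidate fill to local score
    Returns a list of (doc, local_best, suggested_fill) tuples.
    """
    # Pool the local scores of every candidate fill within each group.
    totals = defaultdict(lambda: defaultdict(float))
    for fact in facts:
        for fill, score in fact["candidates"].items():
            totals[fact["key"]][fill] += score

    suggestions = []
    for fact in facts:
        candidates = fact["candidates"]
        local_best = max(candidates, key=candidates.get)
        pooled = totals[fact["key"]]
        global_best = max(pooled, key=pooled.get)
        # Only suggest a fill that the local analysis itself proposed
        # as a (non-best) candidate for this fact.
        if global_best != local_best and global_best in candidates:
            suggestions.append((fact["doc"], local_best, global_best))
    return suggestions


# Toy data, invented for illustration only.
facts = [
    {"doc": "d1", "key": "ctx-A", "candidates": {"cholera": 0.9, "malaria": 0.1}},
    {"doc": "d2", "key": "ctx-A", "candidates": {"cholera": 0.8, "malaria": 0.2}},
    {"doc": "d3", "key": "ctx-A", "candidates": {"malaria": 0.6, "cholera": 0.4}},
]
print(suggest_corrections(facts))
```

In this toy data, only the fact from `d3` disagrees with the pooled evidence, so it is the only one flagged. Keeping the locally non-best candidates is what makes a correction possible here: if `d3` had retained only its top fill, the globally preferred fill would not be available to it.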