Analysis and improvement of minimally supervised machine learning for relation extraction

Authors:
Hans Uszkoreit;Feiyu Xu;Hong Li
Affiliations:
DFKI GmbH, LT Lab, Saarbrücken;DFKI GmbH, LT Lab, Saarbrücken;DFKI GmbH, LT Lab, Saarbrücken
Venue:
NLDB'09 Proceedings of the 14th international conference on Applications of Natural Language to Information Systems
Year:
2009

Abstract

The main contribution of this paper is a systematic analysis of a minimally supervised machine learning method for relation extraction grammars. The method follows a bootstrapping approach in which learning is triggered by semantic seeds. The starting point of our analysis is the pattern-learning graph, which is a subgraph of the bipartite graph representing all connections between linguistic patterns and relation instances exhibited by the data. We show that the performance of such a general learning framework on actual tasks depends on certain properties of the data and on the selection of seeds. Several experiments were conducted to gain explanatory insights into the interaction of these two factors. From the investigation of more effective seeds and benevolent data, we learn how to improve learning in less fortunate configurations. A relation extraction method based only on positive examples cannot avoid all false positives, especially when the data properties yield high recall. Therefore, negative seeds are employed to learn negative patterns, which boost precision.
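
The seed-triggered bootstrapping over a bipartite pattern/instance graph described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `bootstrap`, the edge-list input, the toy labels, and the rule that any pattern matching a negative seed is discarded are all assumptions made for illustration.

```python
from collections import defaultdict


def bootstrap(edges, seeds, negative_seeds=(), max_iterations=10):
    """Alternate between learning patterns and extracting instances.

    edges: iterable of (pattern, instance) pairs, i.e. the bipartite graph.
    Returns (learned_patterns, extracted_instances) reachable from the seeds.
    Assumption: a pattern that matches any negative seed is never learned.
    """
    by_instance = defaultdict(set)
    by_pattern = defaultdict(set)
    for pattern, instance in edges:
        by_instance[instance].add(pattern)
        by_pattern[pattern].add(instance)

    # Patterns connected to a negative seed are treated as negative patterns.
    negative_patterns = set().union(
        *(by_instance.get(i, set()) for i in negative_seeds)
    )

    known_instances = set(seeds)
    known_patterns = set()
    frontier = set(seeds)
    for _ in range(max_iterations):
        # Step 1: learn the patterns that mention the newest instances.
        new_patterns = set().union(
            *(by_instance.get(i, set()) for i in frontier)
        )
        new_patterns -= known_patterns | negative_patterns
        known_patterns |= new_patterns
        # Step 2: apply the new patterns to extract further instances.
        new_instances = set().union(*(by_pattern[p] for p in new_patterns))
        new_instances -= known_instances
        if not new_instances:
            break  # nothing new is reachable from the seeds
        known_instances |= new_instances
        frontier = new_instances
    return known_patterns, known_instances


# Toy graph: "d" sits on an island that the seed "a" cannot reach,
# and "x" stands for a false positive linked only through pattern "p4".
edges = [("p1", "a"), ("p1", "b"), ("p2", "b"), ("p2", "c"),
         ("p3", "d"), ("p4", "c"), ("p4", "x")]
print(bootstrap(edges, {"a"}))
print(bootstrap(edges, {"a"}, negative_seeds={"x"}))
```

In this toy graph, the seed `a` never reaches instance `d`, which mirrors the abstract's point that results depend on both the data's connectivity and the choice of seeds. Passing `x` as a negative seed blocks pattern `p4`, so `x` is no longer extracted.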