An analysis of open information extraction based on semantic role labeling

Authors:
Janara Christensen; Mausam; Stephen Soderland; Oren Etzioni
Affiliations:
University of Washington, Seattle, WA, USA; University of Washington, Seattle, WA, USA; University of Washington, Seattle, WA, USA; University of Washington, Seattle, WA, USA
Venue:
Proceedings of the Sixth International Conference on Knowledge Capture
Year:
2011

Abstract

Open Information Extraction (Open IE) extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic features, we investigate the use of semantic role labeling (SRL) techniques for the task of Open IE. SRL and Open IE, although developed mostly in isolation, are closely related. We compare SRL-based open extractors, which perform computationally expensive, deep syntactic analysis, with TextRunner, an open extractor that uses shallow syntactic analysis but can analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics. Our evaluation answers several questions about these systems: Can SRL extractors, which are trained on PropBank, cope with the heterogeneous text found on the Web? Which extractor attains better precision, recall, F-measure, or running time? How does extractor performance vary for binary, n-ary, and nested relations? How much do we gain by running multiple extractors? How do we select the optimal extractor given the amount of data, the available time, and the types of extractions desired?
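
The comparison the abstract describes rests on precision, recall, and F-measure over extracted relation tuples. A minimal Python sketch of those metrics follows, including the union of two extractors to illustrate the "multiple extractors" question. It assumes exact-match scoring against a gold set, which may differ from the paper's actual evaluation protocol. The names `Extraction`, `precision_recall_f1`, `extractor_a`, and `extractor_b` are hypothetical, and the tuples are toy data, not results from the paper.

```python
from typing import Iterable, NamedTuple, Tuple


class Extraction(NamedTuple):
    """A binary relation tuple (hypothetical representation)."""
    arg1: str
    relation: str
    arg2: str


def precision_recall_f1(
    predicted: Iterable[Extraction], gold: Iterable[Extraction]
) -> Tuple[float, float, float]:
    """Score extractions by exact match against a gold set."""
    predicted_set, gold_set = set(predicted), set(gold)
    correct = len(predicted_set & gold_set)
    precision = correct / len(predicted_set) if predicted_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy gold standard (illustrative only).
g1 = Extraction("Edison", "invented", "the phonograph")
g2 = Extraction("Seattle", "is located in", "Washington")
g3 = Extraction("Einstein", "was born in", "Ulm")
g4 = Extraction("Curie", "won", "the Nobel Prize")
gold = {g1, g2, g3, g4}

# Two hypothetical extractors with different error profiles.
wrong = Extraction("Edison", "was born in", "the phonograph")
extractor_a = {g1, g2, wrong}   # finds more, but includes one error
extractor_b = {g2, g3}          # finds less, but makes no errors

# Running both extractors and taking the union of their outputs.
combined = extractor_a | extractor_b

for name, output in [("A", extractor_a), ("B", extractor_b), ("A+B", combined)]:
    p, r, f = precision_recall_f1(output, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

In this toy setting the union recovers tuples that each extractor misses on its own, at the cost of inheriting the erroneous tuple from the first extractor. That is the kind of trade-off the abstract's question about running multiple extractors is asking about.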