TextRunner: open information extraction on the web

Authors:
Alexander Yates;Michael Cafarella;Michele Banko;Oren Etzioni;Matthew Broadhead;Stephen Soderland
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA
Venue:
NAACL-Demonstrations '07 Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
Year:
2007

Citing 4
Cited 25

Unsupervised named-entity extraction from the web: an experimental study

Artificial Intelligence
Integrating probabilistic extraction models and data mining to discover relations and patterns in text

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Preemptive information extraction using unrestricted relation discovery

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Open information extraction from the web

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence

Entity relation discovery from web tables and links

Proceedings of the 19th international conference on World wide web
From information to knowledge: harvesting entities and relationships from web sources

Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Learning arguments and supertypes of semantic relations using recursive patterns

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Find your advisor: robust knowledge gathering from the web

Proceedings of the 13th International Workshop on the Web and Databases
UTD: Classifying semantic relations by combining lexical and semantic resources

SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation
Automatic rule refinement for information extraction

Proceedings of the VLDB Endowment
Towards technology structure mining from scientific literature

ISWC'10 Proceedings of the 9th international semantic web conference on The semantic web - Volume Part II
FACTO: a fact lookup engine based on web tables

Proceedings of the 20th international conference on World wide web
Integrating linked open data with unstructured text for intelligence gathering tasks

Proceedings of the 8th International Workshop on Information Integration on the Web: in conjunction with WWW 2011
WebSets: extracting sets of entities from the web using unsupervised information extraction

Proceedings of the fifth ACM international conference on Web search and data mining
A generative model for unsupervised discovery of relations and argument classes from clinical texts

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Random walk inference and learning in a large scale knowledge base

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
DIADEM: domain-centric, intelligent, automated data extraction methodology

Proceedings of the 21st international conference companion on World Wide Web
Using a lexical dictionary and a folksonomy to automatically construct domain ontologies

AI'11 Proceedings of the 24th international conference on Advances in Artificial Intelligence
Crowdsourced comprehension: predicting prerequisite structure in Wikipedia

Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Pattern learning for relation extraction with a hierarchical topic model

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Annotated Gigaword

AKBC-WEKEX '12 Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction
Large-scale learning of relation-extraction rules with distant supervision from the web

ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part I
Numeric Query Answering on the Web

International Journal on Semantic Web & Information Systems
A model for information extraction in Portuguese based on text patterns

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2
Provenance-based dictionary refinement in information extraction

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Topic hierarchy construction for the organization of multi-source user generated contents

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Methods for exploring and mining tables on Wikipedia

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics
Self-supervised automated wrapper generation for weblog data extraction

BNCOD'13 Proceedings of the 29th British National conference on Big Data
Effective named entity recognition for idiosyncratic web collections

Proceedings of the 23rd international conference on World wide web

Abstract

Traditional information extraction systems have focused on satisfying precise, narrow, pre-specified requests from small, homogeneous corpora. In contrast, the TextRunner system demonstrates a new kind of information extraction, called Open Information Extraction (OIE), in which the system makes a single, data-driven pass over the entire corpus and extracts a large set of relational tuples without requiring any human input (Banko et al., 2007). TextRunner is a fully implemented, highly scalable example of OIE. Its extractions are indexed, which allows them to be queried quickly.
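
The idea in the abstract, a single pass that emits relational tuples followed by an index that supports fast lookup, can be illustrated with a toy sketch. This is not TextRunner's extractor: the regular expression, the capitalized-argument heuristic, the sample sentences, and the names `extract_tuples`, `build_index`, and `query` are all assumptions made here for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical heuristic: arguments are runs of capitalized words, and the
# relation is the lowercase phrase between them. Illustrative only.
ARG = r"[A-Z]\w*(?: [A-Z]\w*)*"
TRIPLE = re.compile(rf"({ARG}) ([a-z][a-z ]*?) ({ARG})")

# Made-up sample corpus, one sentence per string.
CORPUS = [
    "Seattle is located in Washington.",
    "Paris is the capital of France.",
    "Einstein was born in Ulm.",
]


def extract_tuples(corpus):
    """Make one pass over the sentences, yielding (arg1, relation, arg2)."""
    for sentence in corpus:
        for match in TRIPLE.finditer(sentence):
            yield match.group(1), match.group(2), match.group(3)


def build_index(tuples):
    """Build an inverted index from each lowercase token to its tuples."""
    index = defaultdict(set)
    for tup in tuples:
        for field in tup:
            for token in field.lower().split():
                index[token].add(tup)
    return index


def query(index, *terms):
    """Return the tuples that contain every query term (case-insensitive)."""
    hits = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*hits) if hits else set()


index = build_index(extract_tuples(CORPUS))
print(query(index, "capital", "france"))
```

No relation names are specified in advance in this sketch; the relation phrase is whatever text appears between the two arguments, which is the sense in which the extraction is "open". A real system would need far more robust argument and relation detection than this pattern provides.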