Open information extraction using Wikipedia

Authors:
Fei Wu; Daniel S. Weld
Affiliations:
University of Washington, Seattle, WA; University of Washington, Seattle, WA
Venue:
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Year:
2010

Abstract

Information-extraction (IE) systems seek to distill semantic relations from natural-language text, but most systems use supervised learning of relation-specific examples and are thus limited by the availability of training data. Open IE systems such as TextRunner, on the other hand, aim to handle the unbounded number of relations found on the Web. But how well can these open systems perform? This paper presents WOE, an open IE system which improves dramatically on TextRunner's precision and recall. The key to WOE's performance is a novel form of self-supervised learning for open extractors: it uses heuristic matches between Wikipedia infobox attribute values and corresponding sentences to construct training data. Like TextRunner, WOE's extractor eschews lexicalized features and handles an unbounded set of semantic relations. WOE can operate in two modes: when restricted to POS tag features, it runs as quickly as TextRunner, but when set to use dependency-parse features, its precision and recall rise even higher.
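
The self-supervision idea in the abstract, pairing infobox attribute values with sentences that mention them, can be illustrated with a minimal sketch. This is not WOE's actual matcher: the abstract only says the matches are heuristic, so the case-insensitive substring test, the naive sentence splitter, and the names `TrainingExample`, `split_sentences`, and `match_infobox` are all assumptions made for illustration.

```python
import re
from typing import Dict, List, NamedTuple


class TrainingExample(NamedTuple):
    """One heuristically labelled sentence (hypothetical structure)."""
    sentence: str
    subject: str
    attribute: str
    value: str


def split_sentences(text: str) -> List[str]:
    # Naive splitter, for illustration only: break on whitespace
    # that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def match_infobox(subject: str,
                  infobox: Dict[str, str],
                  article_text: str) -> List[TrainingExample]:
    """Pair infobox values with sentences mentioning the subject and the value.

    Assumption: plain case-insensitive substring matching stands in for
    the heuristic matching that the abstract mentions but does not specify.
    """
    examples = []
    for sentence in split_sentences(article_text):
        lowered = sentence.lower()
        if subject.lower() not in lowered:
            continue  # keep only sentences that mention the article subject
        for attribute, value in infobox.items():
            if value.lower() in lowered:
                examples.append(
                    TrainingExample(sentence, subject, attribute, value))
    return examples


if __name__ == "__main__":
    text = ("Ada Lovelace was born in London. "
            "She worked with Charles Babbage. "
            "Ada Lovelace wrote notes on the Analytical Engine.")
    infobox = {"birth_place": "London", "known_for": "Analytical Engine"}
    for example in match_infobox("Ada Lovelace", infobox, text):
        print(example.attribute, "<-", example.sentence)
```

Each matched sentence becomes a positive training instance for the relation between the subject and the value. In keeping with the abstract, an open extractor trained on such instances would then use unlexicalized features (POS tags or dependency-parse paths) rather than the attribute name itself, so the learned patterns are not tied to any one relation.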