Knowledge-based weak supervision for information extraction of overlapping relations

  • Authors:
  • Raphael Hoffmann;Congle Zhang;Xiao Ling;Luke Zettlemoyer;Daniel S. Weld

  • Affiliations:
  • University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA

  • Venue:
  • HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
  • Year:
  • 2011

Abstract

Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web's natural language text. Knowledge-based weak supervision, using structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint --- for example, they cannot extract the pair Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Freebase. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.
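
To make the two-level structure described in the abstract concrete, here is a minimal Python sketch, assuming a toy feature-based sentence-level classifier whose per-sentence decisions are combined by a deterministic-OR style corpus-level aggregation. The relation set, featurizer, and weights below are hypothetical stand-ins for illustration, not the authors' model; the point is only that aggregation over sentences lets several relations hold for the same entity pair.

```python
# Sketch: sentence-level prediction + corpus-level aggregation for one entity pair.
# All names (RELATIONS, extract_features, weights) are hypothetical placeholders.

from collections import defaultdict

RELATIONS = ["NONE", "Founded", "CEO-of", "BornIn"]  # NONE first so ties default to it


def extract_features(sentence, pair):
    """Hypothetical lexical features: words between the two entity mentions."""
    e1, e2 = pair
    tokens = sentence.split()
    if e1 in tokens and e2 in tokens:
        i, j = sorted((tokens.index(e1), tokens.index(e2)))
        return ["between:" + w for w in tokens[i + 1 : j]]
    return []


def sentence_level_predict(sentence, pair, weights):
    """Score each relation for one sentence mentioning the pair and return the
    highest-scoring label (a stand-in for the sentence-level extraction model)."""
    feats = extract_features(sentence, pair)
    scores = {rel: sum(weights[rel].get(f, 0.0) for f in feats) for rel in RELATIONS}
    return max(scores, key=scores.get)


def aggregate(corpus, pair, weights):
    """Corpus-level aggregation: a fact is asserted if at least one sentence
    expresses it (a deterministic-OR style combination, one simple way to
    realize the 'corpus-level component' mentioned in the abstract)."""
    facts = set()
    for sentence in corpus:
        label = sentence_level_predict(sentence, pair, weights)
        if label != "NONE":
            facts.add(label)
    return facts  # may contain several overlapping relations for the same pair


if __name__ == "__main__":
    corpus = [
        "Jobs founded Apple in 1976 .",
        "Jobs served as CEO of Apple .",
    ]
    # Toy weights that fire on connecting words; a real model would learn these
    # from weakly labeled (e.g., Freebase-aligned) training data.
    weights = defaultdict(dict, {
        "Founded": {"between:founded": 1.0},
        "CEO-of": {"between:CEO": 1.0, "between:of": 0.5},
    })
    print(aggregate(corpus, ("Jobs", "Apple"), weights))
    # -> {'Founded', 'CEO-of'}: both relations survive for the same pair,
    #    which a disjoint-relation model could not represent.
```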