Supervised relation extraction from text relies on annotated data. Distant supervision is a scheme for obtaining noisy training data: a knowledge base of relational tuples serves as the ground truth, and entity-pair matches are found in a text corpus. We propose and evaluate two feature-based models for improving the quality of distant supervision extraction patterns. The first model is an extension of a hierarchical topic model that induces background, relation-specific, and argument-pair-specific feature distributions. The second model is a perceptron trained to match an objective function that enforces two constraints: 1) an at-least-one semantics, i.e., at least one training example per relational tuple is assumed to be correct; 2) high scores for a dedicated NIL label that accounts for the noise in the training data. Neither algorithm requires explicit negative data or the ratio of negative examples. Both algorithms improve over a maximum likelihood baseline as well as over a previous topic model without features, evaluated on TAC KBP data.
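To make the distant supervision scheme concrete, the following is a minimal sketch of how noisy training examples are generated by matching knowledge-base tuples against a corpus. The data structures, the `distant_supervision` function, and the toy knowledge base are illustrative assumptions, not the paper's actual pipeline or datasets.

```python
def distant_supervision(kb, sentences):
    """Label every sentence mentioning a known entity pair with the KB
    relation; pairs absent from the KB receive a NIL label. The result is
    noisy: a matching sentence need not actually express the relation,
    which is why the models above assume only that *at least one*
    example per relational tuple is correct."""
    examples = []
    for sent, (e1, e2) in sentences:
        rel = kb.get((e1, e2), "NIL")
        examples.append((sent, e1, e2, rel))
    return examples

# Toy knowledge base of relational tuples (hypothetical example data).
kb = {("Barack Obama", "Hawaii"): "born_in"}

# Corpus sentences paired with the entity mentions found in them.
sentences = [
    ("Barack Obama was born in Hawaii.", ("Barack Obama", "Hawaii")),
    ("Barack Obama visited Hawaii last week.", ("Barack Obama", "Hawaii")),
]

train = distant_supervision(kb, sentences)
# Both sentences receive the label "born_in", but only the first one
# expresses the relation -- the second is exactly the kind of labeling
# noise the NIL label and at-least-one constraint are meant to absorb.
```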