Towards learning rules from natural texts

Authors:
Janardhan Rao Doppa;Mohammad NasrEsfahani;Mohammad S. Sorower;Thomas G. Dietterich;Xiaoli Fern;Prasad Tadepalli
Affiliations:
Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR
Venue:
FAM-LbR '10 Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading
Year:
2010


Abstract

In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging for two reasons. First, natural texts are radically incomplete, since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. We first present a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules. Second, we present an iterative ensemble colearning approach, where multiple decision trees are learned from bootstrap samples of the incomplete training data, and facts are imputed based on weighted majority.
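The learn-then-impute loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the single-condition rule form, the confidence and support thresholds, the predicate names `a`, `b`, `c`, and the function names are all assumptions made for illustration. The sketch also simply ignores missing values when scoring rules, which is only one possible implicit observation model.

```python
from itertools import product

def learn_rules(records, predicates, min_conf=0.9, min_support=2):
    """Learn rules of the form (p == a) -> (q == b), scored only on
    records where both the body and the head are observed (assumption)."""
    rules = []
    for p, q in product(predicates, repeat=2):
        if p == q:
            continue
        for a in (True, False):
            # Head values in records where the body holds and the head is known.
            heads = [r[q] for r in records if r[p] is a and r[q] is not None]
            if len(heads) < min_support:
                continue
            for b in (True, False):
                conf = heads.count(b) / len(heads)
                if conf >= min_conf:
                    rules.append((p, a, q, b, conf))
    # Apply the most confident rules first.
    rules.sort(key=lambda rule: rule[4], reverse=True)
    return rules

def impute(records, rules):
    """Fill in missing facts implied by the rules; return how many were added."""
    added = 0
    for r in records:
        for p, a, q, b, _ in rules:
            if r[p] is a and r[q] is None:
                r[q] = b
                added += 1
    return added

def bootstrap(records, predicates, max_iters=10):
    """Alternate rule learning and fact imputation until nothing new is added."""
    records = [dict(r) for r in records]  # work on a copy
    rules = []
    for _ in range(max_iters):
        rules = learn_rules(records, predicates)
        if impute(records, rules) == 0:
            break
    return records, rules

# Hypothetical extracted facts; None marks a fact the text did not mention.
data = [
    {"a": True, "b": True, "c": None},
    {"a": True, "b": True, "c": True},
    {"a": True, "b": None, "c": True},
    {"a": None, "b": True, "c": True},
]
completed, rules = bootstrap(data, ["a", "b", "c"])
```

Here `completed` holds the records after imputation and `rules` the rules learned in the final iteration; the input `data` is left unmodified. A faithful version would replace the scoring step with one that accounts for how facts come to be mentioned or omitted in text.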