Towards learning rules from natural texts

  • Authors:
  • Janardhan Rao Doppa;Mohammad NasrEsfahani;Mohammad S. Sorower;Thomas G. Dietterich;Xiaoli Fern;Prasad Tadepalli

  • Affiliations:
  • Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR;Oregon State University, Corvallis, OR

  • Venue:
  • FAM-LbR '10 Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading
  • Year:
  • 2010

Abstract

In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging for two reasons. First, natural texts are radically incomplete, since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. First, we present a Multiple-predicate Bootstrapping approach that iteratively learns if-then rules based on an implicit observation model and then imputes new facts implied by the learned rules. Second, we present an iterative ensemble co-learning approach, in which multiple decision trees are learned from bootstrap samples of the incomplete training data and facts are imputed based on a weighted majority vote.
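The learn-then-impute loop behind the first approach can be illustrated with a small sketch. This is not the authors' implementation. It assumes Boolean predicates, single-condition if-then rules, and a plain confidence threshold in place of the observation model, which the abstract does not specify. All names (learn_rules, impute, bootstrap, min_support, min_conf) and the toy facts are hypothetical.

```python
from itertools import product


def learn_rules(facts, predicates, min_support=2, min_conf=0.9):
    """Learn rules (p == v) -> (q == w) from records where both p and q are observed."""
    rules = []
    for p, q in product(predicates, repeat=2):
        if p == q:
            continue
        for v, w in product([True, False], repeat=2):
            # Missing predicates are simply absent from a record, so they never match.
            body = [r for r in facts.values() if r.get(p) == v and q in r]
            if len(body) < min_support:
                continue
            conf = sum(r[q] == w for r in body) / len(body)
            if conf >= min_conf:
                rules.append((p, v, q, w, conf))
    # Apply the most confident rules first when several could fill the same gap.
    rules.sort(key=lambda rule: rule[4], reverse=True)
    return rules


def impute(facts, rules):
    """Fill in missing values implied by the rules; return how many facts were added."""
    added = 0
    for record in facts.values():
        for p, v, q, w, _ in rules:
            if record.get(p) == v and q not in record:
                record[q] = w
                added += 1
    return added


def bootstrap(facts, predicates, max_iters=10, **kwargs):
    """Alternate rule learning and imputation until no new fact is added."""
    facts = {key: dict(record) for key, record in facts.items()}  # leave the input untouched
    rules = []
    for _ in range(max_iters):
        rules = learn_rules(facts, predicates, **kwargs)
        if impute(facts, rules) == 0:
            break
    return facts, rules


# Toy, made-up facts: each record lists only the predicates that were mentioned.
facts = {
    "g1": {"a": True, "b": True, "c": True},
    "g2": {"a": True, "b": True, "c": True},
    "g3": {"a": True},
    "g4": {"a": False, "b": False},
}
completed, rules = bootstrap(facts, ["a", "b", "c"])
print(completed["g3"])
```

In this toy run, the rules learned from g1 and g2 fill in the missing values of g3, while g4 keeps its gap because no rule with enough support covers it. The second approach in the abstract would replace the single rule learner with an ensemble of decision trees trained on bootstrap samples and a weighted vote at imputation time; that variant is not shown here.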