Inducing classification and regression trees in first order logic

  • Authors:
  • Stefan Kramer;Gerhard Widmer

  • Affiliations:
  • Univ. Freiburg, Freiburg, Germany;Univ. of Vienna, Vienna, Austria and Austrian Research Institute for Artificial Intelligence, Vienna, Austria

  • Venue:
  • Relational Data Mining
  • Year:
  • 2001

Abstract

In this chapter, we present a system that enhances the representational capabilities of decision and regression tree learning by extending it to first-order logic, i.e., relational representations as commonly used in Inductive Logic Programming. We describe an algorithm named Structural Classification and Regression Trees (S-CART), which is capable of inducing first-order trees for both classification and regression problems, i.e., for the prediction of either discrete classes or numerical values. We arrive at this algorithm by a strategy called upgrading: we start from a propositional induction algorithm and turn it into a relational learner by devising suitable extensions of the representation language and the associated algorithms. In particular, we have upgraded CART, the classical method for learning classification and regression trees, to handle relational examples and background knowledge. The system constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns either a discrete class or a numerical value to each leaf. In addition, we have extended the CART methodology by adding linear regression models to the leaves of the trees; this has no counterpart in CART, but was inspired by its approach to pruning. The regression variant of S-CART is one of the few systems applicable to relational regression problems. Experiments in several real-world domains demonstrate that the approach is useful and competitive with existing methods, indicating that the advantage of relatively small and comprehensible models does not come at the expense of predictive accuracy.
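
To make the tree structure described in the abstract concrete, the following is a minimal Python sketch of how a first-order tree might be applied to one relational example. It is an illustration under stated assumptions, not the S-CART implementation: the names `Node`, `Leaf`, `satisfy`, and `predict`, the tuple encoding of facts, and the molecule data are all hypothetical. It assumes that a node's test succeeds when the conjunction of all literals on the "yes" path so far, plus the node's own literals, can be satisfied by the example's facts. Negated literals, background knowledge, tree induction, and linear models in the leaves are omitted.

```python
from dataclasses import dataclass
from typing import Union


def is_var(term):
    # Assumption: Prolog-style convention, capitalised strings are variables.
    return isinstance(term, str) and term[:1].isupper()


def match(literal, fact, binding):
    """Unify one literal with one ground fact; return the extended binding or None."""
    if len(literal) != len(fact) or literal[0] != fact[0]:
        return None
    new = dict(binding)
    for term, value in zip(literal[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to a different constant
        elif term != value:
            return None  # constant mismatch
    return new


def satisfy(literals, facts, binding):
    """Return a binding satisfying the whole conjunction, or None (backtracking search)."""
    if not literals:
        return binding
    for fact in facts:
        extended = match(literals[0], fact, binding)
        if extended is not None:
            result = satisfy(literals[1:], facts, extended)
            if result is not None:
                return result
    return None


@dataclass
class Leaf:
    prediction: Union[str, float]  # discrete class or numerical value


@dataclass
class Node:
    literals: list  # conjunction of literals tested at this node
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def predict(tree, facts):
    path = []  # literals accumulated along the "yes" branches taken so far
    while isinstance(tree, Node):
        query = path + tree.literals
        if satisfy(query, facts, {}) is not None:
            path, tree = query, tree.yes  # variables stay shared further down
        else:
            tree = tree.no
    return tree.prediction


# Hypothetical tree: "does the molecule contain an oxygen atom A,
# and is there a double bond from some atom B to that same A?"
tree = Node(
    [("atom", "A", "o")],
    yes=Node([("bond", "B", "A", "double")], Leaf("active"), Leaf("inactive")),
    no=Leaf("inactive"),
)

molecule = [
    ("atom", "a1", "c"),
    ("atom", "a2", "o"),
    ("atom", "a3", "o"),
    ("bond", "a1", "a3", "double"),
]
print(predict(tree, molecule))
```

Re-evaluating the accumulated conjunction at each node, rather than committing to the first binding found, is what lets the second test succeed on `a3` even though `a2` is the first oxygen encountered. A regression tree would use the same structure with numerical values in the leaves, e.g. `Leaf(3.2)`.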