HTILDE: scaling up relational decision trees for very large databases

  • Authors: Carina Lopes; Gerson Zaverucha

  • Affiliations: Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, RJ, Brazil (both authors)

  • Venue: Proceedings of the 2009 ACM Symposium on Applied Computing
  • Year: 2009

Abstract

Many organizations now hold relational databases with millions of records, and an important question is how to extract information from them. This work proposes HTILDE (Hoeffding TILDE), a learner for very large relational databases based on the Inductive Logic Programming (ILP) system TILDE (Top-down Induction of Logical Decision Trees) and the propositional Very Fast Decision Tree (VFDT) learner. HTILDE is an incremental, anytime algorithm that uses the Hoeffding bound to determine the number of examples that must be considered before choosing the best test for a node. The results show that, compared to TILDE, HTILDE generates theories from very large relational datasets more efficiently without harming their quality measures (F-measure, precision, recall and accuracy). HTILDE also learns less complex theories than TILDE.
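
The abstract does not spell out how the Hoeffding bound is applied, so the following is only a minimal sketch of the general idea as it is commonly stated for VFDT-style learners: after n examples, the observed mean of a variable with range R is within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of its true mean with probability 1 - delta. The function names, the heuristic scores, and the values chosen for R and delta are illustrative assumptions, not taken from the paper.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon for n observations of a variable with the given range.

    With probability 1 - delta, the true mean lies within epsilon of the
    observed mean (standard Hoeffding bound).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_choose_best_test(best_score: float, second_score: float,
                         value_range: float, delta: float, n: int) -> bool:
    """Hypothetical split check: commit to the best candidate test once its
    observed advantage over the runner-up exceeds epsilon."""
    return (best_score - second_score) > hoeffding_bound(value_range, delta, n)


def examples_needed(gap: float, value_range: float, delta: float) -> int:
    """Smallest n for which epsilon drops strictly below the observed gap."""
    if gap <= 0.0:
        raise ValueError("gap must be positive")
    threshold = value_range ** 2 * math.log(1.0 / delta) / (2.0 * gap ** 2)
    return math.floor(threshold) + 1


if __name__ == "__main__":
    # Assumed settings: heuristic range R = 1.0, confidence parameter 1e-6.
    for n in (100, 1000, 10000):
        print(n, round(hoeffding_bound(1.0, 1e-6, n), 4))
    print(examples_needed(gap=0.1, value_range=1.0, delta=1e-6))
```

Because epsilon shrinks with the square root of n, a node facing two nearly tied candidate tests must see many more examples before committing than a node where one test is clearly better, which is what lets a learner of this kind process a stream incrementally instead of scanning the whole database at every node.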