A Note on Two Simple Transformations for Improving the Efficiency of an ILP System

  • Authors:
  • Vítor Santos Costa;Ashwin Srinivasan;Rui Camacho

  • Affiliations:
  • -;-;-

  • Venue:
  • ILP '00 Proceedings of the 10th International Conference on Inductive Logic Programming
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

Inductive Logic Programming (ILP) systems have had noteworthy successes in extracting comprehensible and accurate models for data drawn from a number of scientific and engineering domains. These results suggest that ILP methods could enhance the model-construction capabilities of software tools being developed for the emerging discipline of "knowledge discovery from databases." One significant concern in the use of ILP for this purpose is that of efficiency. The performance of modern ILP systems is principally affected by two issues: (1) they often have to search through very large numbers of possible rules (usually in the form of definite clauses); (2) they have to score each rule on the data (usually in the form of ground facts) to estimate "goodness". Stochastic and greedy approaches have been proposed to alleviate the complexity arising from each of these issues. While these techniques can result in order-of-magnitude improvements in the worst-case search complexity of an ILP system, they do so at the expense of exactness. As this may be unacceptable in some situations, we examine two methods that result in admissible transformations of clauses examined in a search. While the methods do not alter the size ofthe search space (that is, the number of clauses examined), they can alleviate the theorem-proving effort required to estimate goodness. The first transformation simply involves eliminating literals using a weak test for redundancy. The second involves partitioning the set of literals within a clause into groups that can be executed independently of each other. The efficacy ofthe se transformations are evaluated empirically on a number of well-known ILP datasets. The results suggest that for problems that require the use of highly nondeterminate predicates, the transformations can provide significant gains as the complexity of clauses sought increases.