Experimental investigation of pruning methods for relational pattern discovery

  • Authors:
  • Irene Weber

  • Affiliations:
  • Institut für Informatik, Universität Stuttgart, Stuttgart, Germany

  • Venue:
  • ILP'02 Proceedings of the 12th international conference on Inductive logic programming
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Finding all interesting patterns in a database is a data mining task that typically requires a complete search through the hypothesis space. Several ILP systems address this task, e.g., [Deh98, Wro97, FL01]. Safe pruning techniques that reduce the size of the hypothesis space without the risk of missing interesting patterns are very important for this task. This paper is concerned with the effectiveness of pruning techniques in this setting. The addressed pruning techniques are (1) optimum estimates, (2) a pruning technique based on subset tests that is derived from the Apriori search algorithm, (3) pruning based on taxonomies, and (4) to consider only most general patterns as interesting. Methods (1) to (3) are safe pruning techniques that find all interesting patterns; method (4) reduces the number of accepted patterns. The effect of these pruning methods is investigated by experiments within a range of different specific task settings and two databases. Experimental results indicate that optimum estimates and Apriori-style pruning are effective and reliable pruning techniques that produce little additional cost. The effect of taxonomies for pruning is smaller, and it varies over different task settings. In the experiments, the restriction to most general patterns considerably reduces the search costs as well as the set of accepted patterns.