Optimization of a language for data mining

  • Authors:
  • Rosa Meo

  • Affiliations:
  • Università degli Studi di Torino, corso Svizzera 185 - 10149 - Torino - Italy

  • Venue:
  • Proceedings of the 2003 ACM symposium on Applied computing
  • Year:
  • 2003

Abstract

Constraint-based mining has attracted the interest of the data mining research community in recent years because it increases the relevance of the result set and reduces both its volume and the workload. However, constraint-based mining will be fully feasible only when efficient optimizers for mining languages are available. This paper is a first step towards the construction of optimizers for a constraint-based mining language. It provides guidelines for comparing classes of statements by means of the relationships between their result sets. Furthermore, it identifies the presence of unique constraints and functional dependencies in the database schema as information useful for optimization. We show the practical implications of the discussed principles with a set of algorithms designed for a specific mining language. These algorithms also use a newly designed index, called the mining index, which reduces the portion of the database that must be read in response to some classes of queries. In these cases the workload of the mining engine is greatly reduced, and in a significant subset of them it is avoided entirely.