The iZi project: easy prototyping of interesting pattern mining algorithms

  • Authors: Frédéric Flouvat; Fabien De Marchi; Jean-Marc Petit

  • Affiliations: University of New Caledonia, PPME, Noumea, New Caledonia; Université de Lyon, CNRS, LIRIS, France; Université de Lyon, INSA-Lyon, LIRIS, France

  • Venue: PAKDD'09: Proceedings of the 13th Pacific-Asia International Conference on Knowledge Discovery and Data Mining: New Frontiers in Applied Data Mining
  • Year: 2009

Abstract

In the last decade, many data mining tools have been developed. They address most of the classical data mining problems, such as classification, clustering, or pattern mining. However, providing classical solutions for classical problems is not always sufficient. This is especially true for pattern mining problems known to be "representable as sets", an important class of problems with many applications in data mining, databases, artificial intelligence, and software engineering. A common assumption is that solutions devised so far for classical pattern mining problems, such as frequent itemset mining, should be useful for these tasks. Unfortunately, it seems rather optimistic to expect most publicly available tools to be applicable, even to closely related problems. In this context, the main contribution of this paper is a modular and efficient tool in which users can easily adapt and control several pattern mining algorithms. From a theoretical point of view, this work takes advantage of the common theoretical background of pattern mining problems isomorphic to Boolean lattices. This tool, a C++ library called iZi, has been devised and applied to several problems, such as itemset mining, constraint mining in relational databases, and query rewriting in data integration systems. According to our first results, the programs obtained using the library perform very well considering how simple they are to develop. The library is open source and freely available on the Web.
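To make the idea of a shared theoretical background concrete, the following is a minimal sketch of a generic levelwise search over a Boolean lattice of subsets. It is not the iZi API: the names `Pattern`, `Predicate`, `levelwise`, and `frequent_in` are hypothetical and chosen only for illustration. The sketch assumes the interestingness predicate is anti-monotone (every subset of an interesting pattern is itself interesting) and that patterns fit in a 32-bit mask.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// A pattern is a subset of {0, ..., n-1}, encoded as a bitmask (illustrative choice).
using Pattern = std::uint32_t;
using Predicate = std::function<bool(Pattern)>;

// Generic part: levelwise traversal of the subset lattice.
// Assumes `interesting` is anti-monotone; returns all interesting patterns,
// level by level (smallest patterns first).
std::vector<Pattern> levelwise(int n_items, const Predicate& interesting) {
    assert(n_items >= 0 && n_items < 32);
    std::vector<Pattern> found;
    std::set<Pattern> level;
    if (interesting(0)) level.insert(0);  // level 0: the empty set

    while (!level.empty()) {
        found.insert(found.end(), level.begin(), level.end());
        std::set<Pattern> next;  // interesting patterns of the next level
        std::set<Pattern> seen;  // candidates already examined at this level
        for (Pattern p : level) {
            for (int i = 0; i < n_items; ++i) {
                const Pattern bit = Pattern{1} << i;
                if (p & bit) continue;                   // item already in p
                const Pattern cand = p | bit;
                if (!seen.insert(cand).second) continue; // duplicate candidate
                // Prune: every subset obtained by removing one item
                // must be interesting, i.e. present in the current level.
                bool all_subsets_ok = true;
                for (int j = 0; j < n_items && all_subsets_ok; ++j) {
                    const Pattern b = Pattern{1} << j;
                    if ((cand & b) && level.count(cand & ~b) == 0)
                        all_subsets_ok = false;
                }
                if (all_subsets_ok && interesting(cand)) next.insert(cand);
            }
        }
        level = std::move(next);
    }
    return found;
}

// Problem-specific part (hypothetical): frequent itemsets over transactions
// that are themselves encoded as bitmasks.
Predicate frequent_in(std::vector<Pattern> transactions, int min_support) {
    return [transactions = std::move(transactions), min_support](Pattern p) {
        int support = 0;
        for (Pattern t : transactions)
            if ((t & p) == p) ++support;  // t contains every item of p
        return support >= min_support;
    };
}
```

In this sketch only the predicate is problem-specific: calling `levelwise(3, frequent_in({3, 7, 5, 1}, 2))` mines frequent itemsets, while another problem of the same class would, under the same assumptions, supply a different predicate and reuse the traversal unchanged.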