Integrating constraint programming and itemset mining

  • Authors:
  • Siegfried Nijssen;Tias Guns

  • Affiliations:
  • K.U. Leuven, Leuven, Belgium;K.U. Leuven, Leuven, Belgium

  • Venue:
  • ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Over the years many pattern mining tasks and algorithms have been proposed. Traditionally, the focus of these studies was on the efficiency of the computation and the scalability towards very large databases. Little research has however been done on a general framework that encompasses several of these problems. In earlier work we showed how constraint programming (CP) can offer such a general framework; unfortunately, however, we also found that out-of-the-box CP solvers lack the efficiency and scalability achieved by specialized itemset mining systems, which could discourage their use. Here we study the question whether a framework can be built that inherits the generality of CP systems and the efficiency of specialized algorithms. We propose a CP-based framework for pattern mining that avoids the redundant representations and propagations found in existing CP systems. We show experimentally that an implementation of this framework performs comparable to specialized itemset mining systems; furthermore, under certain conditions it lists itemsets with polynomial delay, which demonstrates that it also is a promising approach for analyzing pattern mining tasks from more theoretical perspectives. This is illustrated on a graph mining problem.