Extending the soft constraint based mining paradigm

  • Authors:
  • Stefano Bistarelli; Francesco Bonchi

  • Affiliations:
  • Dipartimento di Scienze, Università degli Studi "G. D'Annunzio", Pescara, Italy and Istituto di Informatica e Telematica, CNR, Pisa, Italy; Pisa KDD Laboratory, ISTI, CNR, Pisa, Italy

  • Venue:
  • KDID'06 Proceedings of the 5th international conference on Knowledge discovery in inductive databases
  • Year:
  • 2006

Abstract

Constraint-based pattern discovery has been recognized as a core technique in inductive querying: constraints give the user a tool to drive the discovery process towards potentially interesting patterns, with the positive side effect of a more efficient computation. So far, research on this paradigm has mainly focussed on the latter aspect: the development of efficient algorithms for evaluating constraint-based mining queries. Owing to the lack of research on methodological issues, the constraint-based pattern mining framework still suffers from many problems that limit its practical relevance. In our previous work [5], we analyzed these limitations and showed that they stem from a single source: in classical constraint-based mining, a constraint is a rigid boolean function that returns either true or false. To overcome these limitations, we introduced a new paradigm of pattern discovery based on soft constraints and instantiated it with fuzzy soft constraints. In this paper we extend the framework to deal with probabilistic and weighted soft constraints, providing a theoretical basis and a detailed experimental analysis. We also discuss a straightforward solution for top-k queries. Finally, we show how the ideas presented in this paper have been implemented in a real inductive database system.
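To make the contrast between a rigid boolean constraint and the three kinds of soft constraints more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual semiring reading of soft constraints: fuzzy levels in [0, 1] combined with min, probabilistic levels in [0, 1] combined with product, and weighted levels as non-negative costs combined with sum (lower is better). The names `Semiring`, `interest`, and `top_k`, as well as the example patterns and their levels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the parts of a semiring used in this sketch."""
    name: str
    combine: Callable[[float, float], float]  # combines two constraint levels
    one: float                                 # neutral element of combine (best level)
    higher_is_better: bool                     # direction of the preference order


# Assumed instances: fuzzy (min), probabilistic (product), weighted (sum of costs).
FUZZY = Semiring("fuzzy", min, 1.0, True)
PROBABILISTIC = Semiring("probabilistic", lambda a, b: a * b, 1.0, True)
WEIGHTED = Semiring("weighted", lambda a, b: a + b, 0.0, False)


def interest(semiring: Semiring, levels: List[float]) -> float:
    """Combine the levels that each soft constraint assigns to one pattern."""
    result = semiring.one
    for level in levels:
        result = semiring.combine(result, level)
    return result


def top_k(semiring: Semiring, candidates: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k patterns with the best combined interest under the semiring."""
    scored = [(interest(semiring, levels), pattern)
              for pattern, levels in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=semiring.higher_is_better)
    return [pattern for _, pattern in scored[:k]]


# Hypothetical patterns, each scored by two soft constraints.
satisfaction = {"ab": [0.8, 0.5], "cd": [0.6, 0.6]}
costs = {"ab": [2.0, 3.0], "cd": [1.0, 1.0]}

fuzzy_best = top_k(FUZZY, satisfaction, 1)
probabilistic_best = top_k(PROBABILISTIC, satisfaction, 1)
weighted_best = top_k(WEIGHTED, costs, 1)
```

In this toy setting the two orderings disagree: under the fuzzy reading a pattern is only as good as its worst constraint level, so `cd` ranks first, while under the probabilistic reading the levels multiply and `ab` ranks first. A boolean constraint with a fixed threshold would instead only accept or reject each pattern, with no ranking among the accepted ones.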