Efficient mining under rich constraints derived from various datasets

  • Authors:
  • Arnaud Soulet; Jiří Kléma; Bruno Crémilleux

  • Affiliations:
  • GREYC, Université de Caen, Caen Cédex, France; GREYC, Université de Caen, Caen Cédex, France and Department of Cybernetics, Czech Technical University, Prague; GREYC, Université de Caen, Caen Cédex, France

  • Venue:
  • KDID'06 Proceedings of the 5th international conference on Knowledge discovery in inductive databases
  • Year:
  • 2006

Abstract

Mining patterns under many kinds of constraints is key to successfully discovering new knowledge. In this paper, we propose an efficient new algorithm, MUSIC-DFS, which soundly and completely mines patterns under various constraints from large data and takes into account external data represented by several heterogeneous datasets. Constraints are freely built from a large set of primitives and make it possible to link information scattered across various knowledge sources. Efficiency is achieved thanks to a new closure operator that provides an interval pruning strategy applied during the depth-first search of the pattern space. A transcriptomic case study shows the effectiveness and scalability of our approach. It also demonstrates a way to employ background knowledge, such as free texts or gene ontologies, in the discovery of meaningful patterns.
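To make the idea of constraint-based mining over several datasets concrete, the following is a minimal sketch, not the MUSIC-DFS algorithm itself. It enumerates itemsets depth-first, prunes with a plain anti-monotone minimum-support test (a simpler stand-in for the closure-based interval pruning described in the abstract), and applies one constraint that consults a second, external dataset. All names (mine_dfs, shares_annotation, TRANSACTIONS, ANNOTATIONS) and the toy gene data are assumptions made for illustration only.

```python
from typing import Callable, Dict, FrozenSet, List, Set

# Hypothetical toy data: each transaction is the set of genes
# over-expressed in one biological situation.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g1", "g2"}),
    frozenset({"g2", "g3"}),
    frozenset({"g1", "g2", "g4"}),
]

# Hypothetical external dataset: annotation terms attached to each gene.
ANNOTATIONS: Dict[str, Set[str]] = {
    "g1": {"ribosome"},
    "g2": {"ribosome", "transport"},
    "g3": {"transport"},
    "g4": {"ribosome"},
}


def support(pattern: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def shares_annotation(pattern: FrozenSet[str]) -> bool:
    """Constraint on the external data: at least two genes, and all
    genes of the pattern have at least one annotation term in common."""
    if len(pattern) < 2:
        return False
    term_sets = [ANNOTATIONS.get(gene, set()) for gene in pattern]
    return bool(set.intersection(*term_sets))


def mine_dfs(
    transactions: List[FrozenSet[str]],
    min_support: int,
    constraint: Callable[[FrozenSet[str]], bool],
) -> List[FrozenSet[str]]:
    """Depth-first enumeration of itemsets in lexicographic order."""
    items = sorted(set().union(*transactions))
    results: List[FrozenSet[str]] = []

    def extend(pattern: FrozenSet[str], start: int) -> None:
        for i in range(start, len(items)):
            candidate = pattern | {items[i]}
            # Anti-monotone pruning: if the candidate is infrequent,
            # every superset is infrequent too, so skip the whole branch.
            if support(candidate, transactions) < min_support:
                continue
            if constraint(candidate):
                results.append(candidate)
            extend(candidate, i + 1)

    extend(frozenset(), 0)
    return results


patterns = mine_dfs(TRANSACTIONS, min_support=2, constraint=shares_annotation)
print([sorted(p) for p in patterns])  # prints [['g1', 'g2'], ['g2', 'g3']]
```

In this sketch the frequency test drives the pruning because it is anti-monotone, while the annotation constraint only filters the output. Handling constraints that are neither monotone nor anti-monotone efficiently is the kind of problem the paper's interval pruning is meant to address, and it is not reproduced here.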