CLAP: Collaborative pattern mining for distributed information systems

  • Authors:
  • Xingquan Zhu; Bin Li; Xindong Wu; Dan He; Chengqi Zhang

  • Affiliations:
  • QCIS Centre, Faculty of Eng. & Info. Technology, Univ. of Technology, Sydney, Ultimo 2007, Australia and Dept. of Computer Science & Eng., Florida Atlantic University, Boca Raton, FL 33431, USA; QCIS Centre, Faculty of Eng. & Info. Technology, Univ. of Technology, Sydney, Ultimo 2007, Australia; Dept. of Computer Science, University of Vermont, Burlington, VT 05404, USA and School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Dept. of Computer Science, Univ. of California at Los Angeles, Los Angeles, CA 90095, USA; QCIS Centre, Faculty of Eng. & Info. Technology, Univ. of Technology, Sydney, Ultimo 2007, Australia

  • Venue:
  • Decision Support Systems
  • Year:
  • 2011

Abstract

Data mining from distributed information systems usually has three goals: (1) identifying locally significant patterns in individual databases; (2) discovering patterns that become significant only after the distributed databases are unified in a single view; and (3) finding patterns that follow special relationships across different data collections. Existing research has significantly advanced the techniques for mining local and global patterns (the first two goals), but few attempts have been made to discover patterns across distributed databases (the third goal). Moreover, no framework currently supports the mining of all three types of patterns. This paper proposes solutions for discovering patterns from distributed databases. More specifically, we treat pattern mining as a query process whose purpose is to discover patterns from distributed databases such that the relationships among the patterns satisfy user-specified query constraints. We argue that existing self-contained mining frameworks are neither efficient nor feasible for this objective, mainly because their pattern pruning is oriented toward a single database. To solve the problem, we advocate a cross-database pruning concept and propose a collaborative pattern (CLAP) mining framework with cross-database pruning mechanisms for distributed pattern mining. In CLAP, distributed databases collaboratively exchange pattern information so that each site can use information from other sites to prune its own search. Experimental results show that CLAP fits a niche position, and demonstrate that it not only outperforms its peers with significant runtime gains but also finds patterns that other methods cannot discover.
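
To make the idea of cross-database pruning more concrete, the following is a minimal Python sketch and not the CLAP algorithm itself, whose details the abstract does not give. It assumes a hypothetical query constraint ("an itemset must reach a minimum support in every database") and a simple level-wise search; the names `support` and `mine_jointly_frequent` are invented for illustration. A candidate that fails the threshold at any one site is dropped for all sites, so none of its supersets is ever counted anywhere.

```python
from itertools import combinations


def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)


def mine_jointly_frequent(dbs, min_sups, max_len=3):
    """Level-wise search for itemsets that are frequent in every database.

    Assumed (hypothetical) query constraint:
        support(p, dbs[i]) >= min_sups[i] for every site i.
    """
    items = sorted(set().union(*(t for db in dbs for t in db)))
    candidates = [frozenset([i]) for i in items]
    result = []
    for k in range(1, max_len + 1):
        survivors = []
        for c in candidates:
            # Cross-database pruning: all() stops at the first site where
            # the candidate is infrequent, and the candidate is discarded
            # for every site, not only for the one that rejected it.
            if all(support(c, db) >= s for db, s in zip(dbs, min_sups)):
                survivors.append(c)
        result.extend(survivors)

        # Build (k+1)-candidates only from itemsets that survived at all
        # sites; each k-subset of a candidate must itself be a survivor.
        surv_set = set(survivors)
        next_candidates = set()
        for a, b in combinations(survivors, 2):
            u = a | b
            if len(u) == k + 1 and all(
                frozenset(s) in surv_set for s in combinations(u, k)
            ):
                next_candidates.add(u)
        candidates = sorted(next_candidates, key=sorted)
        if not candidates:
            break
    return result


if __name__ == "__main__":
    # Two toy transaction databases standing in for two sites.
    db_a = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    db_b = [{"a", "b"}, {"a", "b", "d"}, {"b", "d"}, {"a", "d"}]
    for p in mine_jointly_frequent([db_a, db_b], [0.5, 0.5]):
        print(sorted(p))
```

In this toy run, item `c` is frequent in the first database but absent from the second, so a single-database miner at the first site would go on to count `{a, c}` and `{b, c}`. With the shared check, both are never generated. A real distributed setting would exchange summaries between sites rather than call `support` on remote data directly; that communication layer is omitted here.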