Data mining over distributed information systems usually has three goals: (1) identifying locally significant patterns in individual databases; (2) discovering significant patterns that emerge after the distributed databases are unified in a single view; and (3) finding patterns that follow special relationships across different data collections. Existing research has significantly advanced the techniques for mining local and global patterns (the first two goals), but few attempts have been made to discover patterns across distributed databases (the third goal). Moreover, no framework currently supports the mining of all three types of patterns. This paper proposes solutions for discovering patterns from distributed databases. More specifically, we treat pattern mining as a query process whose purpose is to discover patterns from distributed databases such that the relationships among the patterns satisfy user-specified query constraints. We argue that existing self-contained mining frameworks are neither efficient nor feasible for this objective, mainly because their pattern pruning is oriented toward a single database. To solve the problem, we advocate a cross-database pruning concept and propose a collaborative pattern (CLAP) mining framework with cross-database pruning mechanisms for distributed pattern mining. In CLAP, the sites hosting the distributed databases exchange pattern information so that each site can use information from the other sites to prune across databases. Experimental results show that CLAP fills a niche, and demonstrate that it not only outperforms its peers with significant runtime gains, but also helps find patterns that the other methods cannot discover.
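The cross-database pruning idea can be illustrated with a small sketch. This is not the CLAP algorithm itself, whose details the abstract does not give. It assumes the simplest possible query constraint (an itemset must reach a minimum support at every site) and a level-wise, Apriori-style search. The function names `support`, `next_candidates`, and `mine_frequent_everywhere`, and the toy data, are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets all appear in `frequent`."""
    items = sorted({i for s in frequent for i in s})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def mine_frequent_everywhere(sites, min_sup):
    """Find itemsets with support >= min_sup at every site (assumed constraint).

    Returns (patterns, counted), where `counted` is the number of
    per-site support counts that were actually performed.
    """
    items = sorted({i for db in sites for t in db for i in t})
    candidates = [frozenset([i]) for i in items]
    patterns, counted, k = [], 0, 1
    while candidates:
        survivors = set(candidates)
        for db in sites:
            # Cross-database pruning: a site only counts candidates that
            # no previously consulted site has already rejected.
            counted += len(survivors)
            survivors = {c for c in survivors if support(c, db) >= min_sup}
        patterns.extend(sorted(survivors, key=sorted))
        k += 1
        # The next level grows only from itemsets that passed at all sites.
        candidates = next_candidates(survivors, k)
    return patterns, counted


# Hypothetical toy data: two sites, four transactions each.
site_a = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
site_b = [{"a", "b"}, {"a", "b", "d"}, {"a", "d"}, {"b", "d"}]

patterns, counted = mine_frequent_everywhere([site_a, site_b], min_sup=0.5)
```

On this toy data, item `c` is frequent only at the first site and item `d` only at the second, so neither contributes to any larger candidate at either site. A miner that pruned using one database alone would still expand `{a, c}` and `{b, c}` at the first site before the constraint could reject them.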