Communication-efficient distributed mining of association rules

Authors:
Assaf Schuster;Ran Wolff
Affiliations:
Computer Science Department, TECHNION - Israel Institute of Technology, Technion City, Haifa 32000, Israel;Computer Science Department, TECHNION - Israel Institute of Technology, Technion City, Haifa 32000, Israel
Venue:
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Year:
2001

References: 11
Cited by: 17

An effective hash-based algorithm for mining association rules

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Mining the most interesting rules

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Parallel data mining for association rules on shared-memory multi-processors

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
A fast distributed algorithm for mining association rules

DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
Parallel Mining of Association Rules

IEEE Transactions on Knowledge and Data Engineering
Scalable Parallel Data Mining for Association Rules

IEEE Transactions on Knowledge and Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Discovery of Multiple-Level Association Rules from Large Databases

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Mining Generalized Association Rules

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
SPRINT: A Scalable Parallel Classifier for Data Mining

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Effect of Data Skewness in Parallel Mining of Association Rules

PAKDD '98 Proceedings of the Second Pacific-Asia Conference on Research and Development in Knowledge Discovery and Data Mining

A High-Performance Distributed Algorithm for Mining Association Rules

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Association Rule Mining in Peer-to-Peer Systems

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Communication-Efficient Distributed Mining of Association Rules

Data Mining and Knowledge Discovery
A high-performance distributed algorithm for mining association rules

Knowledge and Information Systems
A sampling-based framework for parallel data mining

Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Distributed approximate mining of frequent patterns

Proceedings of the 2005 ACM symposium on Applied computing
Distributed higher order association rule mining using information extracted from textual data

ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
Toward terabyte pattern mining: an architecture-conscious solution

Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Approximate mining of frequent patterns on streams

Intelligent Data Analysis - Knowledge Discovery from Data Streams
A Fast Parallel Association Rules Mining Algorithm Based on FP-Forest

ISNN '08 Proceedings of the 5th international symposium on Neural Networks: Advances in Neural Networks, Part II
Parallel Method for Mining High Utility Itemsets from Vertically Partitioned Distributed Databases

KES '09 Proceedings of the 13th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems: Part I
Distributed threshold querying of general functions by a difference of monotonic representation

Proceedings of the VLDB Endowment
A parallel, distributed algorithm for relational frequent pattern discovery from very large data sets

Intelligent Data Analysis - Ubiquitous Knowledge Discovery
Distributed subgroup mining

PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
An efficient distributed algorithm for mining association rules

ISPA'06 Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications
Privacy-preserving frequent itemsets mining via secure collaborative framework

Security and Communication Networks
A new parallel association rule mining algorithm on distributed shared memory system

International Journal of Business Intelligence and Data Mining

Abstract

Mining for associations between items in large transactional databases is a central problem in the field of knowledge discovery. When the database is partitioned among several share-nothing machines, the problem can be addressed using distributed data mining algorithms. One such algorithm, called CD, was proposed by Agrawal and Shafer in [1] and was later enhanced by the FDM algorithm of Cheung, Han et al. [5]. The main problem with these algorithms is that they do not scale well with the number of partitions. They are thus impractical for use in modern distributed environments such as peer-to-peer systems, in which hundreds or thousands of computers may interact. In this paper we present a set of new algorithms that solve the Distributed Association Rule Mining problem using far less communication. In addition to being very efficient, the new algorithms are also extremely robust. Unlike existing algorithms, they continue to be efficient even when the data is skewed or the partition sizes are imbalanced. We present both experimental and theoretical results concerning the behavior of these algorithms and explain how they can be implemented in different settings.
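
To make the scaling concern concrete, the following is a minimal, simplified sketch of one round of a CD-style (count distribution) scheme, in which every partition reports a local support count for every candidate itemset. It is an illustration under stated assumptions, not the algorithms presented in the paper: the function names (`local_counts`, `count_distribution_round`), the toy data, and the way messages are tallied (one count per candidate per partition) are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Count, on one partition, how many transactions contain each candidate."""
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def count_distribution_round(partitions, candidates, min_support):
    """One simplified CD-style round (hypothetical helper).

    Every partition contributes a count for every candidate, so the number
    of counts exchanged grows as len(candidates) * len(partitions).
    """
    global_counts = Counter()
    messages = 0
    for partition in partitions:
        counts = local_counts(partition, candidates)
        for candidate in candidates:
            global_counts[candidate] += counts[candidate]
            messages += 1  # assumption: one count sent per candidate per partition
    total = sum(len(p) for p in partitions)
    frequent = {c for c in candidates if global_counts[c] >= min_support * total}
    return frequent, messages


# Toy data: two partitions of a transactional database (illustrative only).
partitions = [
    [["a", "b"], ["a", "c"], ["a", "b", "c"]],
    [["b", "c"], ["a", "b"]],
]
items = sorted({i for p in partitions for t in p for i in t})
candidates = [frozenset(c) for c in combinations(items, 2)]

frequent, messages = count_distribution_round(partitions, candidates, min_support=0.5)
print(frequent, messages)
```

In this sketch the tally of exchanged counts is proportional to the number of candidates times the number of partitions, which is the kind of growth the abstract describes as impractical for hundreds or thousands of machines.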