Association Rule Mining in Peer-to-Peer Systems

Authors:
Ran Wolff;Assaf Schuster
Affiliations:
-;-
Venue:
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Year:
2003

Citing 11
Cited 14

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Dynamic itemset counting and implication rules for market basket data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Communication-efficient distributed mining of association rules

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A fast distributed algorithm for mining association rules

DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
Parallel Algorithms for Discovery of Association Rules

Data Mining and Knowledge Discovery
Parallel Mining of Association Rules

IEEE Transactions on Knowledge and Data Engineering
Scalable Parallel Data Mining for Association Rules

IEEE Transactions on Knowledge and Data Engineering
Incremental Mining of Constrained Associations

HiPC '00 Proceedings of the 7th International Conference on High Performance Computing
Mining Association Rules: Anti-Skew Algorithms

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Approximate frequency counts over data streams

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

k-TTP: a new privacy model for large-scale distributed environments

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Distributed approximate mining of frequent patterns

Proceedings of the 2005 ACM symposium on Applied computing
Veracity radius: capturing the locality of distributed computations

Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing
Want scalable computing?: speculate!

ACM SIGACT News
Client-side web mining for community formation in peer-to-peer environments

ACM SIGKDD Explorations Newsletter
Learning quantifiable associations via principal sparse non-negative matrix factorization

Intelligent Data Analysis
Distributed feature extraction in a p2p setting: a case study

Future Generation Computer Systems - Special section: Data mining in grid computing environments
Approximate mining of frequent patterns on streams

Intelligent Data Analysis - Knowledge Discovery from Data Streams
Efficient algorithms for incremental Web log mining with dynamic thresholds

The VLDB Journal — The International Journal on Very Large Data Bases
Performance study of distributed Apriori-like frequent itemsets mining

Knowledge and Information Systems
Mining quantitative associations in large database

APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
A scalable distributed stream mining system for highway traffic data

PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
A local facility location algorithm for sensor networks

DCOSS'05 Proceedings of the First IEEE international conference on Distributed Computing in Sensor Systems
Efficient dynamic aggregation

DISC'06 Proceedings of the 20th international conference on Distributed Computing

Abstract

We extend the problem of association rule mining, a key data mining problem, to systems in which the database is partitioned among a very large number of computers that are dispersed over a wide area. Such computing systems include GRID computing platforms, federated database systems, and peer-to-peer computing environments. The scale of these systems poses several difficulties, such as the impracticality of global communications and global synchronization, dynamic topology changes of the network, on-the-fly data updates, the need to share resources with other applications, and the frequent failure and recovery of resources. We present an algorithm by which every node in the system can reach the exact solution, as if it were given the combined database. The algorithm is entirely asynchronous, imposes very little communication overhead, transparently tolerates network topology changes and node failures, and quickly adjusts to changes in the data as they occur. Simulations of up to 10,000 nodes show that the algorithm is local: all rules, except for those whose confidence is about equal to the confidence threshold, are discovered using information gathered from a very small vicinity, whose size is independent of the size of the system.
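
To make the phrase "the exact solution, as if it were given the combined database" concrete, the following minimal Python sketch computes the support and confidence of one rule from per-node counts and checks that the result matches a computation over the merged data. It is not the algorithm from the paper: it sums counts from every node, which is exactly the kind of global aggregation the abstract calls impractical at scale. The function names (`count`, `rule_stats`) and the toy transactions are assumptions made up for illustration.

```python
from itertools import chain


def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= set(t))


def rule_stats(partitions, lhs, rhs):
    """Support and confidence of the rule lhs -> rhs.

    Only per-partition counts are combined; transactions never
    leave the (hypothetical) node that holds them.
    """
    both = frozenset(lhs) | frozenset(rhs)
    total = sum(len(p) for p in partitions)
    lhs_count = sum(count(p, lhs) for p in partitions)
    both_count = sum(count(p, both) for p in partitions)
    support = both_count / total if total else 0.0
    confidence = both_count / lhs_count if lhs_count else 0.0
    return support, confidence


# Toy database split across three hypothetical nodes.
partitions = [
    [{"bread", "milk"}, {"bread"}],
    [{"bread", "milk", "eggs"}, {"milk"}],
    [{"bread", "milk"}],
]

# The same data merged into a single partition.
combined = list(chain.from_iterable(partitions))

distributed = rule_stats(partitions, {"bread"}, {"milk"})
centralized = rule_stats([combined], {"bread"}, {"milk"})
print(distributed, centralized)
```

Both calls return the same pair, because support and confidence depend only on counts that add up across partitions. The abstract's claim is stronger: each node reaches the same decision about a rule using information from a small vicinity rather than from the whole system.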