Association Rule Mining in Peer-to-Peer Systems

  • Authors:
  • Ran Wolff; Assaf Schuster

  • Venue:
  • ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year:
  • 2003

Abstract

We extend the problem of association rule mining, a key data mining problem, to systems in which the database is partitioned among a very large number of computers that are dispersed over a wide area. Such computing systems include GRID computing platforms, federated database systems, and peer-to-peer computing environments. The scale of these systems poses several difficulties, such as the impracticality of global communications and global synchronization, dynamic topology changes of the network, on-the-fly data updates, the need to share resources with other applications, and the frequent failure and recovery of resources.

We present an algorithm by which every node in the system can reach the exact solution, as if it were given the combined database. The algorithm is entirely asynchronous, imposes very little communication overhead, transparently tolerates network topology changes and node failures, and quickly adjusts to changes in the data as they occur. Simulations of up to 10,000 nodes show that the algorithm is local: all rules, except for those whose confidence is about equal to the confidence threshold, are discovered using information gathered from a very small vicinity, whose size is independent of the size of the system.
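To make "the exact solution, as if it were given the combined database" concrete, the following minimal Python sketch computes the support and confidence of a rule from per-node itemset counts and checks that the result matches a computation over the merged data. It is not the algorithm of the paper: it aggregates counts centrally, which is exactly the global communication the abstract calls impractical, and serves only to illustrate the target result. The function names, the toy partitions, and the item names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items in one node's partition."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def rule_stats(counts, n_transactions, lhs, rhs):
    """Return (support, confidence) of the rule lhs => rhs."""
    lhs_key = tuple(sorted(lhs))
    both = tuple(sorted(set(lhs) | set(rhs)))
    support = counts[both] / n_transactions
    confidence = counts[both] / counts[lhs_key] if counts[lhs_key] else 0.0
    return support, confidence


# Hypothetical database, partitioned among three nodes.
partitions = [
    [{"bread", "milk"}, {"bread"}],
    [{"bread", "milk"}, {"milk"}],
    [{"bread", "milk", "eggs"}],
]

# Naive centralised aggregation: sum the local counts of all nodes.
merged = Counter()
total = 0
for part in partitions:
    merged += local_counts(part)
    total += len(part)

# Summed local counts equal the counts over the combined database.
combined = [t for part in partitions for t in part]
assert merged == local_counts(combined)

support, confidence = rule_stats(merged, total, ["bread"], ["milk"])
print(support, confidence)
```

A rule is reported when its support and confidence over the combined data meet the chosen thresholds. Per the abstract, the contribution of the paper is reaching that same answer at every node without this kind of global aggregation.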