Privacy Preserving BIRCH Algorithm for Clustering over Arbitrarily Partitioned Databases

  • Authors:
  • P. Krishna Prasad; C. Pandu Rangan

  • Affiliations:
  • Department of Computer Science and Engineering, Indian Institute of Technology - Madras, Chennai - 600036, India; Department of Computer Science and Engineering, Indian Institute of Technology - Madras, Chennai - 600036, India

  • Venue:
  • ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
  • Year:
  • 2007

Abstract

BIRCH [22] is a well-known algorithm for effectively clustering large data sets. Because data is typically distributed over several sites, clustering over distributed data is an important problem. The data may be partitioned horizontally, vertically, or arbitrarily across the parties' databases. Because of privacy concerns, however, no party may share its data with the other parties. The problem is how the parties can cluster the distributed data without breaching the privacy of one another's data. Solutions for the arbitrarily partitioned setting generally also work for horizontally and vertically partitioned databases. In this work we give a procedure for securely running the BIRCH algorithm over an arbitrarily partitioned database. We introduce secure protocols for the distance metrics and give a procedure for using these metrics to securely compute clusters over an arbitrarily partitioned database.
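
To make the arbitrarily partitioned setting concrete, the following minimal sketch shows how a squared Euclidean distance between two records could be split into terms each party computes locally plus a cross term that would need a secure two-party protocol. It is an illustration under stated assumptions, not the protocol from the paper: the two parties are called Alice ("A") and Bob ("B"), the function names squared_distance_terms and combine are hypothetical, and the cross term is computed in the clear where a real system would run a secure scalar product.

```python
# Illustrative sketch only: plain arithmetic, no cryptography.
# Function names and the "A"/"B" ownership masks are assumptions for this example.

def squared_distance_terms(x, y, owners_x, owners_y):
    """Split ||x - y||^2 into per-party local sums and cross-term vectors.

    owners_x[i] (resp. owners_y[i]) is "A" or "B": the party holding
    attribute i of record x (resp. y). Ownership may differ per attribute
    and per record, which is what "arbitrarily partitioned" means here.
    """
    alice_local, bob_local = 0, 0
    alice_vec, bob_vec = [], []
    for xi, yi, ox, oy in zip(x, y, owners_x, owners_y):
        if ox == oy:
            # One party holds both values: (xi - yi)^2 is computed locally.
            term = (xi - yi) ** 2
            if ox == "A":
                alice_local += term
            else:
                bob_local += term
        else:
            # Values are split: (a - b)^2 = a^2 + b^2 - 2ab.
            # Each party squares its own value; a*b is the cross term.
            a_val, b_val = (xi, yi) if ox == "A" else (yi, xi)
            alice_local += a_val ** 2
            bob_local += b_val ** 2
            alice_vec.append(a_val)
            bob_vec.append(b_val)
    return alice_local, bob_local, alice_vec, bob_vec


def combine(alice_local, bob_local, alice_vec, bob_vec):
    """Combine the parts into the squared distance.

    The dot product below stands in for a secure scalar product protocol;
    here it is computed in the clear purely for illustration.
    """
    cross = sum(a * b for a, b in zip(alice_vec, bob_vec))
    return alice_local + bob_local - 2 * cross


parts = squared_distance_terms([1, 2, 3], [4, 6, 3], "AAB", "ABB")
print(combine(*parts))  # 25, the same as (1-4)^2 + (2-6)^2 + (3-3)^2
```

When every attribute of every record belongs to the same party per column, this reduces to the vertically partitioned case, and when each record belongs wholly to one party it reduces to the horizontally partitioned case, which is consistent with the abstract's remark that the arbitrary setting covers both.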