Large-scale k-means clustering with user-centric privacy preservation

  • Authors:
  • Jun Sakuma; Shigenobu Kobayashi

  • Affiliations:
  • Tokyo Institute of Technology, Yokohama, Japan; Tokyo Institute of Technology, Yokohama, Japan

  • Venue:
  • PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
  • Year:
  • 2008

Abstract

A k-means clustering method built on a new privacy-preserving concept, user-centric privacy preservation, is presented. In this framework, users can conduct data mining on their private information while keeping it in their own local storage. After the computation, they obtain only the mining result, without disclosing private information to others. The number of parties taking part in conventional privacy-preserving data mining has been assumed to be two. In our framework, we assume that a large number of parties join the protocol; therefore, not only scalability but also asynchronism and fault tolerance are important. With this in mind, we propose a k-means algorithm combined with a decentralized cryptographic protocol and a gossip-based protocol. The computational complexity is O(log n) with respect to the number of parties n, and experimental results show that our protocol is scalable even with one million parties.
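
To make the gossip-based part of the idea concrete, the following is a minimal sketch of one k-means centroid update computed by pairwise gossip averaging. It is not the authors' protocol: the abstract does not give its details, so every function name here (`gossip_average`, `local_contribution`, `gossip_kmeans_step`) is hypothetical, points are assumed to be one-dimensional, the exchange is simulated sequentially in one process, and values are exchanged in plaintext. The cryptographic layer that the paper relies on for privacy is omitted entirely.

```python
import random


def gossip_average(values, rounds, rng):
    """Pairwise gossip averaging (illustrative, simulated in one process).

    In each round every node picks a random peer and both replace their
    state with the element-wise average. The sum over all nodes is
    preserved, so every state drifts toward the global mean.
    """
    state = [list(v) for v in values]
    n = len(state)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n)
            if i == j:
                continue  # a node cannot gossip with itself
            avg = [(a + b) / 2 for a, b in zip(state[i], state[j])]
            state[i] = avg
            state[j] = list(avg)
    return state


def local_contribution(point, centroids):
    """Encode one party's 1-D point as [sum_0, count_0, sum_1, count_1, ...].

    Only the slot of the nearest centroid is non-zero.
    NOTE: sent in plaintext here; a real protocol would protect it.
    """
    nearest = min(range(len(centroids)),
                  key=lambda c: (point - centroids[c]) ** 2)
    vec = [0.0] * (2 * len(centroids))
    vec[2 * nearest] = point
    vec[2 * nearest + 1] = 1.0
    return vec


def gossip_kmeans_step(points, centroids, rounds=30, seed=0):
    """One centroid update: each party holds exactly one point (assumption)."""
    rng = random.Random(seed)
    contributions = [local_contribution(p, centroids) for p in points]
    state = gossip_average(contributions, rounds, rng)
    # After gossip, any node holds roughly (mean of sums, mean of counts)
    # per cluster; their ratio is the cluster mean.
    estimate = state[0]
    updated = []
    for c in range(len(centroids)):
        total, count = estimate[2 * c], estimate[2 * c + 1]
        # Keep the old centroid if no point was assigned to this cluster.
        updated.append(total / count if count > 1e-12 else centroids[c])
    return updated


if __name__ == "__main__":
    pts = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    print(gossip_kmeans_step(pts, centroids=[0.0, 10.0]))
```

With the six points above and initial centroids 0 and 10, the estimate at any node should end up close to the true cluster means 1 and 11. Averaging works here because the ratio of the averaged sum to the averaged count equals the ratio of the global sum to the global count, so no party needs to know n. In gossip averaging of this kind, the number of rounds needed for a fixed accuracy typically grows logarithmically with the number of nodes, which is the kind of scaling the abstract's O(log n) figure refers to; the exact cost model of the paper is not reproduced here.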