Large-scale k-means clustering with user-centric privacy preservation

Authors:
Jun Sakuma;Shigenobu Kobayashi
Affiliations:
Tokyo Institute of Technology, Yokohama, Japan;Tokyo Institute of Technology, Yokohama, Japan
Venue:
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2008

Abstract

A k-means clustering algorithm built on a new privacy-preserving concept, user-centric privacy preservation, is presented. In this framework, users can conduct data mining on their private information while keeping it in their local storage. After the computation, they obtain only the mining result, without disclosing private information to others. Conventional privacy-preserving data mining has assumed that the number of participating parties is two. In our framework, we assume that a large number of parties join the protocol; therefore, not only scalability but also asynchronism and fault tolerance are important. Considering this, we propose a k-means algorithm combined with a decentralized cryptographic protocol and a gossip-based protocol. The computational complexity is O(log n) with respect to the number of parties n, and experimental results show that our protocol is scalable even with one million parties.
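
As a rough illustration of why a gossip-based protocol can reach every party in a number of rounds that grows only logarithmically with n, the sketch below simulates push-sum style gossip averaging, a well-known technique for decentralized aggregation. It is not the authors' protocol: the cryptographic layer is omitted entirely, values are exchanged in the clear, and the function name `push_sum_average` and its parameters are hypothetical. In a k-means setting, the quantity being averaged could be, for example, one coordinate of the points currently assigned to a cluster.

```python
import random


def push_sum_average(values, rounds, seed=0):
    """Simulate push-sum gossip and return each node's estimate of the mean.

    Assumption: synchronous rounds and uniformly random peer choice,
    purely for illustration; no encryption or failure handling.
    """
    rng = random.Random(seed)
    n = len(values)
    sums = [float(v) for v in values]  # running sum held by each node
    weights = [1.0] * n                # running weight held by each node

    for _ in range(rounds):
        new_sums = [0.0] * n
        new_weights = [0.0] * n
        for i in range(n):
            j = rng.randrange(n)  # random peer (may be the node itself)
            half_sum = sums[i] / 2
            half_weight = weights[i] / 2
            # Keep one half locally, push the other half to the peer.
            new_sums[i] += half_sum
            new_weights[i] += half_weight
            new_sums[j] += half_sum
            new_weights[j] += half_weight
        sums, weights = new_sums, new_weights

    # Each node's local estimate is its sum divided by its weight.
    return [s / w for s, w in zip(sums, weights)]


if __name__ == "__main__":
    data = [float(i) for i in range(200)]
    estimates = push_sum_average(data, rounds=80, seed=1)
    print("true mean:", sum(data) / len(data))
    print("largest deviation:", max(abs(e - sum(data) / len(data)) for e in estimates))
```

Every node ends with its own estimate of the global mean without any central coordinator, which is the property that makes gossip attractive when the number of parties is large and some may join late or fail.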