Distributed privacy preserving k-means clustering with additive secret sharing

Authors:
Mahir Can Doganay; Thomas B. Pedersen; Yücel Saygin; Erkay Savaş; Albert Levi
Affiliations:
Sabanci University, Istanbul, Turkey (all authors)
Venue:
PAIS '08: Proceedings of the 2008 International Workshop on Privacy and Anonymity in Information Society
Year:
2008

Abstract

Recent privacy concerns have motivated data mining researchers to develop methods for mining data while preserving the privacy of individuals. However, current privacy-preserving data mining techniques suffer from communication and computation overheads that are prohibitive even for modestly sized databases. Furthermore, the proposed techniques place strict assumptions on the involved parties, which must be relaxed to reflect real-world requirements. In this paper we concentrate on a distributed scenario in which the data is partitioned vertically over multiple sites, and the involved sites would like to perform clustering without revealing their local databases. For this setting, we propose a new protocol for privacy-preserving k-means clustering based on additive secret sharing. We show that the new protocol is more secure than the state of the art. Experiments conducted on real and synthetic data sets show that, in realistic scenarios, the communication and computation costs of our protocol are considerably lower than those of the state of the art, which is crucial for data mining applications.
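To make the central building block concrete, the following is a minimal sketch of generic additive secret sharing, not the protocol proposed in the paper. It assumes each site holds an integer partial squared distance computed from its own attributes (the usual situation with vertically partitioned data) and that all sites agree on a public modulus. The names `share`, `reconstruct`, `secure_sum`, and `MODULUS` are hypothetical and chosen for illustration only. A full clustering protocol would also need to compare distances without revealing them, which this sketch does not attempt.

```python
import secrets

# Assumption: a public modulus larger than any sum the sites will compute.
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    # The first n-1 shares are uniformly random; the last one fixes the sum.
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def reconstruct(shares, modulus=MODULUS):
    """Recover the shared value by adding all shares mod `modulus`."""
    return sum(shares) % modulus


def secure_sum(local_values, modulus=MODULUS):
    """Simulate n sites summing their private values via additive shares."""
    n = len(local_values)
    # Row i holds the shares that site i creates from its own private value.
    distributed = [share(v, n, modulus) for v in local_values]
    # Site j adds the j-th share received from every site. Each partial sum
    # is a sum of random-looking shares and reveals no single input.
    partial_sums = [
        sum(row[j] for row in distributed) % modulus for j in range(n)
    ]
    # Publishing the partial sums reveals only the total.
    return reconstruct(partial_sums, modulus)


# Hypothetical partial squared distances held by three sites.
partial_distances = [17, 4, 29]
total = secure_sum(partial_distances)
print(total)  # 50
```

In this simulation all shares live in one process; in a real deployment each row would be sent over the network so that no site ever sees another site's full set of shares.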