Secure two-party k-means clustering

Authors:
Paul Bunn;Rafail Ostrovsky
Affiliations:
UCLA, Los Angeles, CA;UCLA, Los Angeles, CA
Venue:
Proceedings of the 14th ACM conference on Computer and communications security
Year:
2007

References:
How to play ANY mental game. STOC '87 Proceedings of the nineteenth annual ACM symposium on Theory of computing.
Completeness theorems for non-cryptographic fault-tolerant distributed computation. STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing.
Optimal size integer division circuits. SIAM Journal on Computing.
Privacy-preserving data mining. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data.
On the design and quantification of privacy preserving data mining algorithms. PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems.
Refining Initial Points for K-Means Clustering. ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning.
Revealing information while preserving privacy. Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems.
Privacy-preserving k-means clustering over vertically partitioned data. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining.
Foundations of Cryptography: Volume 2, Basic Applications.
Privacy-preserving Bayesian network structure computation on distributed heterogeneous data. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining.
Practical privacy: the SuLQ framework. Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems.
Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining.
Oblivious Polynomial Evaluation. SIAM Journal on Computing.
The Effectiveness of Lloyd-Type Methods for the k-Means Problem. FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science.
Privacy-Preserving Two-Party K-Means Clustering via Secure Approximation. AINAW '07 Proceedings of the 21st International Conference on Advanced Information Networking and Applications Workshops - Volume 01.
Zero-knowledge from secure multiparty computation. Proceedings of the thirty-ninth annual ACM symposium on Theory of computing.
Public-key cryptosystems based on composite degree residuosity classes. EUROCRYPT'99 Proceedings of the 17th international conference on Theory and application of cryptographic techniques.
Secure computation of the mean and related statistics. TCC'05 Proceedings of the Second international conference on Theory of Cryptography.
Privacy preserving clustering. ESORICS'05 Proceedings of the 10th European conference on Research in Computer Security.
On private scalar product computation for privacy-preserving data mining. ICISC'04 Proceedings of the 7th international conference on Information Security and Cryptology.
Oblivious scalar-product protocols. ACISP'06 Proceedings of the 11th Australasian conference on Information Security and Privacy.
Calibrating noise to sensitivity in private data analysis. TCC'06 Proceedings of the Third conference on Theory of Cryptography.

Abstract

The k-Means Clustering problem is one of the most-explored problems in data mining. With the advent of protocols that have proven successful at single-database clustering, the focus has shifted in recent years to the question of how to extend these protocols to a multiple-database setting. There have been numerous attempts to create multiparty k-means clustering protocols that protect the privacy of each database, but under the standard cryptographic definitions of "privacy-protection," all such attempts so far have fallen short of providing adequate privacy. In this paper we describe a Two-Party k-Means Clustering Protocol that guarantees privacy and is more efficient than using a general multiparty "compiler" to achieve the same task. In particular, a main contribution of our result is a way to efficiently compute multiple iterations of k-means clustering without revealing the intermediate values. To achieve this, we present two techniques: one to perform two-party division and one to sample uniformly at random from a domain of unknown size. The resulting Division Protocol and Random Value Protocol are of use to any protocol that requires the secure computation of a quotient or random sampling. Our techniques can be realized from any semantically secure homomorphic encryption scheme. For concreteness, we describe our protocol based on the Paillier homomorphic encryption scheme (see [21]). We also demonstrate that our protocol is efficient in terms of communication, remaining competitive with existing protocols (such as [13]) that fail to protect privacy.
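
The abstract builds on the additive homomorphism of Paillier encryption. As a rough illustration of that property only, and not of the paper's Division Protocol or Random Value Protocol, the following minimal Python sketch implements a toy Paillier scheme. The function names (`keygen`, `encrypt`, `decrypt`, `add`, `scale`), the choice of generator g = n + 1, and the tiny hard-coded primes are all assumptions made here for illustration; the parameters are far too small to be secure.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy key generation. The small default primes are illustrative only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                   # common simple choice of generator (assumption)
    mu = pow(lam, -1, n)        # with g = n + 1, L(g^lam mod n^2) = lam mod n
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer m in [0, n) with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m from ciphertext c using L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(pub, c1, c2):
    """Ciphertext product decrypts to the sum of the plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale(pub, c, k):
    """Ciphertext raised to k decrypts to k times the plaintext (mod n)."""
    n, _ = pub
    return pow(c, k, n * n)


if __name__ == "__main__":
    pub, priv = keygen()
    ca, cb = encrypt(pub, 1234), encrypt(pub, 5678)
    print(decrypt(pub, priv, add(pub, ca, cb)))   # sum of the two plaintexts
    print(decrypt(pub, priv, scale(pub, ca, 3)))  # three times the first plaintext
```

In a two-party setting, one party could hold the private key while the other combines ciphertexts with `add` and `scale` without seeing the plaintexts. How the paper composes such operations into division and random sampling is not shown here.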