Privacy preserving clustering

  • Authors:
  • Somesh Jha;Luis Kruger;Patrick McDaniel

  • Affiliations:
  • Computer Sciences Department, University of Wisconsin, Madison, WI;Computer Sciences Department, University of Wisconsin, Madison, WI;Computer Science and Engineering, Pennsylvania State University, University Park, PA

  • Venue:
  • ESORICS'05 Proceedings of the 10th European conference on Research in Computer Security
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

The freedom and transparency of information flow on the Internet has heightened concerns of privacy. Given a set of data items, clustering algorithms group similar items together. Clustering has many applications, such as customerbehavior analysis, targeted marketing, forensics, and bioinformatics. In this paper, we present the design and analysis of a privacy-preserving k-means clustering algorithm, where only the cluster means at the various steps of the algorithm are revealed to the participating parties. The crucial step in our privacy-preserving k-means is privacy-preserving computation of cluster means.We present two protocols (one based on oblivious polynomial evaluation and the second based on homomorphic encryption) for privacy-preserving computation of cluster means. We have a JAVA implementation of our algorithm. Using our implementation, we have performed a thorough evaluation of our privacy-preserving clustering algorithm on three data sets. Our evaluation demonstrates that privacy-preserving clustering is feasible, i.e., our homomorphic-encryption based algorithm finished clustering a large data set in approximately 66 seconds.