Distributed privacy preserving k-means clustering with additive secret sharing

  • Authors:
  • Mahir Can Doganay; Thomas B. Pedersen; Yücel Saygin; Erkay Savaş; Albert Levi

  • Affiliations:
  • Sabanci University, Istanbul, Turkey; Sabanci University, Istanbul, Turkey; Sabanci University, Istanbul, Turkey; Sabanci University, Istanbul, Turkey; Sabanci University, Istanbul, Turkey

  • Venue:
  • PAIS '08: Proceedings of the 2008 International Workshop on Privacy and Anonymity in Information Society
  • Year:
  • 2008

Abstract

Recent concerns about privacy have motivated data mining researchers to develop methods for performing data mining while preserving the privacy of individuals. However, current techniques for privacy-preserving data mining suffer from high communication and computation overheads, which are prohibitive even for a modest database size. Furthermore, the proposed techniques make strict assumptions about the involved parties, and these assumptions need to be relaxed to reflect real-world requirements. In this paper we concentrate on a distributed scenario where the data is partitioned vertically over multiple sites and the involved sites would like to perform clustering without revealing their local databases. For this setting, we propose a new protocol for privacy-preserving k-means clustering based on additive secret sharing. We show that the new protocol is more secure than the state of the art. Experiments conducted on real and synthetic data sets show that, in realistic scenarios, the communication and computation costs of our protocol are considerably lower than those of the state of the art, which is crucial for data mining applications.
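
The abstract names additive secret sharing as the building block but does not describe the protocol itself. The following is a minimal sketch of generic additive secret sharing, not the authors' protocol. It assumes a vertically partitioned setting in which each site holds the squared distance between a point and a centroid over its own attributes only, and the sites want the total without exposing their individual terms. The modulus, the function names, and the single-process simulation of message passing are all assumptions made for illustration.

```python
import secrets

# Assumed public modulus; it must exceed any sum the parties compute.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def reconstruct(shares, modulus=MODULUS):
    """Recover the shared value by adding all shares mod `modulus`."""
    return sum(shares) % modulus


def secure_sum(private_values, modulus=MODULUS):
    """Simulate n parties summing their private inputs via additive shares.

    distributed[i][j] is the share that party i would send to party j.
    Each party publishes only the sum of the shares it received.
    """
    n = len(private_values)
    distributed = [make_shares(v, n, modulus) for v in private_values]
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    return reconstruct(partial_sums, modulus)


if __name__ == "__main__":
    # Hypothetical per-site partial squared distances to one centroid.
    partial_distances = [4, 9, 25]
    print(secure_sum(partial_distances))  # → 38
```

Any subset of fewer than all the shares of a value is uniformly random, so a single share reveals nothing about the input it came from. Note that this sketch still reveals the total distance; a full clustering protocol would need further steps to compare distances and update centroids, which the abstract does not detail.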