Non-metric multidimensional scaling for privacy-preserving data clustering

  • Authors:
  • Khaled Alotaibi; Victor J. Rayward-Smith; Beatriz de la Iglesia

  • Affiliations:
  • School of Computing Sciences, University of East Anglia, Norwich, UK (all authors)

  • Venue:
  • IDEAL'11: Proceedings of the 12th International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2011

Abstract

Outsourcing data to external parties for analysis is risky because the privacy of confidential variables can easily be violated. To eliminate this threat, the values of these variables should be perturbed before the data is released. However, the perturbation itself may significantly change the underlying properties of the data and so affect the analysis results. What is required is a subtle transformation that generates perturbed data which maintains, as far as possible, the statistical properties and effectiveness (i.e. the utility) of the original data whilst preserving privacy. We examine privacy-preserving transformations in the context of data clustering. In particular, this paper demonstrates how non-metric multidimensional scaling (MDS) can be used profitably as a perturbation tool and how the perturbed data can be used effectively in clustering analysis without compromising privacy or utility. We apply the proposed technique to real datasets and compare the clustering results with those obtained from the original data; in some circumstances the two were exactly the same.
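The abstract does not give algorithmic details. As background only, non-metric MDS is commonly formulated (following Kruskal) as a search for a configuration of points whose pairwise distances preserve the rank order of the original dissimilarities, with the mismatch measured by a stress value. The following is a minimal Python sketch of that stress measure, not the authors' implementation. The function names `pairwise`, `monotone_fit` and `kruskal_stress` are hypothetical, and Euclidean distance on the original data is an assumption.

```python
from itertools import combinations
from math import dist, sqrt


def pairwise(points):
    """Euclidean distances for all pairs (i < j), in a fixed pair order."""
    return [dist(p, q) for p, q in combinations(points, 2)]


def monotone_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean
        # (means are compared by cross-multiplying the positive counts).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def kruskal_stress(original, perturbed):
    """Stress-1 between original dissimilarities and perturbed distances.

    The two point sets must list the same records in the same order, but
    they may have different numbers of dimensions.
    """
    delta = pairwise(original)   # original dissimilarities (assumed Euclidean)
    d = pairwise(perturbed)      # distances in the perturbed data
    # Order the perturbed distances by the rank of the original dissimilarities.
    order = sorted(range(len(delta)), key=lambda k: delta[k])
    d_sorted = [d[k] for k in order]
    d_hat = monotone_fit(d_sorted)  # disparities from monotone regression
    num = sum((a - b) ** 2 for a, b in zip(d_sorted, d_hat))
    den = sum(a ** 2 for a in d_sorted)
    return sqrt(num / den)
```

A stress near zero means the perturbed data keeps the ordering of pairwise distances, which is the property that distance-based clustering relies on, even though the individual attribute values no longer match the originals.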