Right-protected data publishing with hierarchical clustering preservation

  • Authors: Michail Vlachos; Aleksander Wieczorek; Johannes Schneider
  • Affiliations: IBM Research - Zurich, Rueschlikon, Switzerland; Poznan University, Poznan, Poland; IBM Research - Zurich, Rueschlikon, Switzerland
  • Venue: Proceedings of the 21st ACM International Conference on Information and Knowledge Management
  • Year: 2012

Abstract

The emergence of cloud-based storage services is opening up new avenues in data exchange and data dissemination. This has amplified the interest in right-protection mechanisms for establishing ownership in case of data leakage. Current right-protection technologies, however, rarely provide strong guarantees on the dataset utility after the protection process. This work presents techniques that explicitly address this shortcoming and provably preserve the outcome of certain mining operations. In particular, we take special care to guarantee that the outcome of hierarchical clustering operations remains the same before and after right protection. We encode data ownership using watermarking principles. In the process, we derive fundamental bounds on the distortion incurred by the watermarking. We leverage our theoretical analysis to design fast algorithms for right protection without exhaustively searching the vast design space.
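
The property the abstract describes, identical hierarchical clustering before and after right protection, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the linkage, the embedding scheme, or the bounds. The sketch assumes single-linkage clustering, an additive key-seeded watermark of plus or minus `power` per coordinate, and a brute-force check of one embedding power. All names (`single_linkage_merges`, `embed_watermark`, `preserves_clustering`, `points`) are hypothetical.

```python
import math
import random


def single_linkage_merges(points):
    """Return the ordered merges of naive single-linkage clustering.

    Each merge is a frozenset holding the two clusters (frozensets of
    point indices) that were joined at that step.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merges.append(frozenset([clusters[a], clusters[b]]))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def embed_watermark(points, key, power):
    """Assumed scheme: shift every coordinate by +power or -power,
    with the sign sequence derived from the secret key."""
    rng = random.Random(key)
    return [tuple(x + power * rng.choice((-1.0, 1.0)) for x in p)
            for p in points]


def preserves_clustering(points, key, power):
    """Brute-force check that the merge sequence is unchanged."""
    marked = embed_watermark(points, key, power)
    return single_linkage_merges(points) == single_linkage_merges(marked)


# Toy dataset whose pairwise distances are well separated.
points = [(0, 0), (1, 0), (5, 0), (5, 2), (12, 7)]
print(preserves_clustering(points, key=42, power=0.01))
```

Checking each candidate power this way requires re-clustering the whole dataset every time. The paper's stated contribution is to avoid that kind of search by bounding the distortion analytically, which this sketch does not attempt.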