DPCube: Releasing Differentially Private Data Cubes for Health Information

  • Authors:
  • Yonghui Xiao; James Gardner; Li Xiong

  • Venue:
  • ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
  • Year:
  • 2012

Abstract

We propose to demonstrate DPCube, a component of our Health Information DE-identification (HIDE) framework, for releasing differentially private data cubes (or multidimensional histograms) for sensitive data. HIDE is a framework we developed for integrating heterogeneous structured and unstructured health information, and it provides methods for privacy-preserving data publishing. The DPCube component provides the differentially private multidimensional data cube release. The DPCube algorithm uses the differentially private access mechanisms provided by HIDE and guarantees differential privacy for the released data. It utilizes an innovative two-step multidimensional partitioning technique to publish a generalized data cube or multidimensional histogram that achieves good utility while satisfying the privacy requirement. We demonstrate that the released data cubes can serve as a sanitized synopsis of the raw database and, together with an optional synthesized dataset based on the data cubes, can support various Online Analytical Processing (OLAP) queries and learning tasks.
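
To make the idea of a differentially private data cube concrete, the following is a minimal sketch of the standard cell-level baseline: add Laplace noise to every cell count of a multidimensional histogram, then answer OLAP-style count queries from the noisy cells. It is not the DPCube algorithm, whose two-step partitioning the abstract does not specify, and it does not use HIDE's access mechanisms. The function names (`laplace_noise`, `noisy_cube`, `range_count`), the attribute domains, and the sample records are all hypothetical, chosen for illustration only.

```python
import random
from collections import Counter
from itertools import product


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cube(records, domains, epsilon, rng):
    # Assumption: neighbouring databases differ by adding or removing one
    # record, so the vector of cell counts has L1 sensitivity 1 and
    # Laplace noise with scale 1/epsilon per cell gives epsilon-DP.
    counts = Counter(records)
    scale = 1.0 / epsilon
    return {
        cell: counts.get(cell, 0) + laplace_noise(scale, rng)
        for cell in product(*domains)  # include empty cells as well
    }


def range_count(cube, predicate):
    # Answer a count query from the released (noisy) cells only.
    return sum(value for cell, value in cube.items() if predicate(cell))


# Hypothetical two-attribute health data: (age group, diagnosis).
domains = [("0-39", "40-64", "65+"), ("flu", "diabetes")]
records = [("0-39", "flu"), ("40-64", "diabetes"), ("65+", "diabetes"),
           ("65+", "flu"), ("40-64", "diabetes")]

cube = noisy_cube(records, domains, epsilon=1.0, rng=random.Random(7))
print(range_count(cube, lambda cell: cell[1] == "diabetes"))
```

Because each cell carries independent noise, a query that sums many cells accumulates error. Releasing coarser partitions, as the abstract's generalized data cube does, is one way to reduce that effect; the sketch above does not attempt it.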