Protecting privacy in tabular healthcare data: explicit uncertainty for disclosure control

  • Authors:
  • Brian Shand; Jem Rashbass

  • Affiliations:
  • University of Cambridge, Cambridge; University of Cambridge, Cambridge

  • Venue:
  • Proceedings of the 2005 ACM workshop on Privacy in the electronic society
  • Year:
  • 2005

Abstract

Summary medical data provides important statistical information for public health, but risks revealing confidential patient information. This risk is particularly difficult to assess when many different tables are released, each independently protected against disclosure by various techniques. In this paper, we present a new technique for disclosure control in tabular data, which uses explicit uncertainty to prevent the disclosive identification of small numbers of records. In contrast to other techniques, the bounds on the cell perturbations are also made public. The technique can be applied automatically and effectively to large datasets in their entirety, and the transformed data can then be used to create derivative tables or be hosted on a public web site. It is safe even for population-based data. Furthermore, we show that the transformation is computationally efficient while ensuring k-anonymity, and we demonstrate that the transformed data is suitable for further statistical analysis.
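The abstract does not spell out the transformation itself, so the following is only an illustrative sketch of the general idea of publishing cell counts with explicit, publicly known uncertainty. It is not the authors' algorithm: it swaps in simple fixed-width interval rounding, and the names `to_interval`, `protect_table`, and the parameter `k` as an interval width are assumptions made for this example. Each count is reported as an interval whose width is public, so any count from 0 to k - 1 maps to the same published interval.

```python
from typing import Dict, Tuple


def to_interval(count: int, k: int) -> Tuple[int, int]:
    """Report a cell count as a closed interval of public width k.

    Hypothetical helper: rounds the count down to a multiple of k and
    publishes [lower, lower + k - 1]. Counts 0..k-1 all become (0, k - 1).
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if k < 1:
        raise ValueError("k must be at least 1")
    lower = (count // k) * k
    return lower, lower + k - 1


def protect_table(table: Dict[str, int], k: int) -> Dict[str, Tuple[int, int]]:
    """Apply the interval transformation to every cell of a table.

    The table is modelled as a mapping from a cell label to its count;
    this representation is an assumption made for illustration only.
    """
    return {cell: to_interval(count, k) for cell, count in table.items()}


if __name__ == "__main__":
    # Made-up counts, purely to show the shape of the output.
    example = {"region A": 2, "region B": 7, "region C": 0}
    for cell, bounds in protect_table(example, k=5).items():
        print(cell, bounds)
```

Because the interval width is published along with the data, an analyst knows exactly how much uncertainty each cell carries and can propagate it into derived totals. Whether a scheme of this kind achieves k-anonymity depends on details that the abstract does not give.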