Protecting privacy in tabular healthcare data: explicit uncertainty for disclosure control

Authors:
Brian Shand; Jem Rashbass
Affiliations:
University of Cambridge, Cambridge; University of Cambridge, Cambridge
Venue:
Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society
Year:
2005

Abstract

Summary medical data provide important statistical information for public health, but risk revealing confidential patient information. This risk is particularly difficult to assess when many different tables are released, each independently protected against disclosure by a different technique. In this paper, we present a new technique for disclosure control in tabular data which uses explicit uncertainty to prevent the disclosive identification of small numbers of records. In contrast to other techniques, bounds on the cell perturbations are also made public. The technique can be applied automatically and effectively to large datasets in their entirety, and the transformed data can then be used to create derivative tables or be hosted on a public web site. It is safe even for population-based data. Furthermore, we show that this transformation is computationally efficient while ensuring k-anonymity, and demonstrate the suitability of the transformed data for further statistical analysis.
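
To make the idea of publicly bounded uncertainty concrete, the following is a minimal Python sketch, not the authors' algorithm, which the abstract does not specify. It assumes one simple rule: every cell count below a threshold k is published as the interval [0, k - 1], so a small non-zero count cannot be told apart from an empty cell, while counts of k or more are published exactly. The names `publish_with_bounds` and `total_bounds`, and the example cell labels, are hypothetical. The sketch ignores marginal totals and linked tables, which a real disclosure control method must also handle.

```python
from typing import Dict, Tuple

# A published cell is an inclusive (low, high) range on the true count.
Interval = Tuple[int, int]


def publish_with_bounds(counts: Dict[str, int], k: int) -> Dict[str, Interval]:
    """Publish each cell as an interval (assumed rule, for illustration only).

    Counts below k become the public interval (0, k - 1); counts of k or
    more are published exactly as (count, count).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    published: Dict[str, Interval] = {}
    for cell, count in counts.items():
        if count < 0:
            raise ValueError(f"negative count in cell {cell!r}")
        if count < k:
            # Small and empty cells share one interval, so they are
            # indistinguishable in the published table.
            published[cell] = (0, k - 1)
        else:
            published[cell] = (count, count)
    return published


def total_bounds(published: Dict[str, Interval]) -> Interval:
    """Bounds on the table total implied by the published intervals."""
    low = sum(lo for lo, _ in published.values())
    high = sum(hi for _, hi in published.values())
    return (low, high)


# Hypothetical example table: counts of patients per diagnosis group.
example_counts = {"group_a": 0, "group_b": 2, "group_c": 17}
example_published = publish_with_bounds(example_counts, k=5)
example_total = total_bounds(example_published)
```

Because the intervals are themselves public, an analyst can carry them through to derived quantities, as `total_bounds` does for the table total, instead of treating the perturbed values as exact.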