Closeness: A New Privacy Measure for Data Publishing

Authors:
Ninghui Li; Tiancheng Li; Suresh Venkatasubramanian
Affiliations:
Purdue University, West Lafayette; Purdue University, West Lafayette; University of Utah, Salt Lake City
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2010

Citing: 0
Cited by: 12

SABRE: a Sensitive Attribute Bucketization and REdistribution framework for t-closeness
The VLDB Journal: The International Journal on Very Large Data Bases

Learning latent variable models from distributed and abstracted data
Information Sciences: an International Journal

Privacy preservation for associative classification: an approximation algorithm
International Journal of Business Intelligence and Data Mining

Trajectory privacy in location-based services and data publication
ACM SIGKDD Explorations Newsletter

Privacy beyond single sensitive attribute
DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part I

Utility-driven anonymization in data publishing
Proceedings of the 20th ACM international conference on Information and knowledge management

Limiting disclosure of sensitive data in sequential releases of databases
Information Sciences: an International Journal

Publishing microdata with a robust privacy guarantee
Proceedings of the VLDB Endowment

Generically extending anonymization algorithms to deal with successive queries
Proceedings of the 21st ACM international conference on Information and knowledge management

An efficient quasi-identifier index based approach for privacy preservation over incremental data sets on cloud
Journal of Computer and System Sciences

Incremental processing and indexing for (k, e)-anonymisation
International Journal of Information and Computer Security

A near-optimal algorithm for differentially-private principal components
The Journal of Machine Learning Research

Abstract

The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain “identifying” attributes) contains at least k records. Recently, several authors have recognized that k-anonymity cannot prevent attribute disclosure. The notion of ℓ-diversity has been proposed to address this limitation; ℓ-diversity requires that each equivalence class has at least ℓ well-represented values (defined in Section 2) for each sensitive attribute. In this paper, we show that ℓ-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new notion of privacy called “closeness.” We first present the base model, t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t). We then propose a more flexible privacy model called (n,t)-closeness that offers higher utility. We describe our desiderata for designing a distance measure between two probability distributions and present two distance measures. We discuss the rationale for using closeness as a privacy measure and illustrate its advantages through examples and experiments.
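
The three requirements named in the abstract can be illustrated with a minimal Python sketch. Several parts are assumptions made for illustration and do not come from the paper: the toy table, the attribute names (`zip`, `disease`), and all function names are hypothetical; "well-represented" is simplified to "at least ℓ distinct values"; and total variation distance stands in for the distance measure, since the abstract does not name the two measures the paper presents.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Empirical probability distribution of a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation(p, q):
    """Total variation distance; an assumed stand-in for the paper's measures."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def equivalence_classes(records, quasi_identifiers):
    """Group records that agree on every quasi-identifier attribute."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        classes[key].append(record)
    return classes


def is_k_anonymous(records, quasi_identifiers, k):
    """Every equivalence class contains at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """Simplified diversity: at least l distinct sensitive values per class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len({r[sensitive] for r in group}) >= l
               for group in classes.values())


def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between any class distribution and the overall one."""
    overall = distribution([r[sensitive] for r in records])
    classes = equivalence_classes(records, quasi_identifiers)
    return max(
        total_variation(distribution([r[sensitive] for r in group]), overall)
        for group in classes.values()
    )


def is_t_close(records, quasi_identifiers, sensitive, t):
    """Every class distribution is within distance t of the overall one."""
    return max_class_distance(records, quasi_identifiers, sensitive) <= t


# Hypothetical generalized table: two equivalence classes of three records.
records = [
    {"zip": "476**", "disease": "flu"},
    {"zip": "476**", "disease": "flu"},
    {"zip": "476**", "disease": "cancer"},
    {"zip": "479**", "disease": "cancer"},
    {"zip": "479**", "disease": "cancer"},
    {"zip": "479**", "disease": "flu"},
]

print(is_k_anonymous(records, ["zip"], k=3))
print(is_distinct_l_diverse(records, ["zip"], "disease", l=2))
print(max_class_distance(records, ["zip"], "disease"))
print(is_t_close(records, ["zip"], "disease", t=0.2))
```

In this toy table each class holds a 2:1 split of the two diseases while the overall table is split evenly, so the closeness check depends only on how far that skew sits from the threshold t. Swapping in a different distance function changes only `total_variation`; the rest of the sketch is unaffected.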