Privacy-MaxEnt: integrating background knowledge in privacy quantification

Authors:
Wenliang Du; Zhouxuan Teng; Zutao Zhu
Affiliations:
Syracuse University, Syracuse, NY, USA;Syracuse University, Syracuse, NY, USA;Syracuse University, Syracuse, NY, USA
Venue:
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Year:
2008

Abstract

Privacy-Preserving Data Publishing (PPDP) deals with the publication of microdata while preserving people's private information in the data. To measure how much private information can be preserved, privacy metrics are needed. An essential element of any privacy metric is a measure of how much adversaries can learn about an individual's sensitive attributes (SA) if they know the individual's quasi-identifiers (QI); that is, we need to measure P(SA|QI). Such a measure is hard to derive when adversaries' background knowledge has to be considered. We propose a systematic approach, Privacy-MaxEnt, to integrate background knowledge in privacy quantification. Our approach is based on the maximum entropy principle. We treat all the conditional probabilities P(SA|QI) as unknown variables, we treat the background knowledge as constraints on these variables, and we also formulate constraints from the published data. Our goal becomes finding an assignment to those variables (the probabilities) that satisfies all these constraints. Although many solutions may exist, the most unbiased estimate of P(SA|QI) is the one that achieves the maximum entropy.
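
To make the idea concrete, the following is a minimal sketch of a maximum entropy estimate on a toy example. It is not the formulation or solver used in the paper. It assumes a single hypothetical published group with three QI values and three SA values, each appearing once, so the published data fixes only the row and column marginals of the joint distribution P(QI, SA). Background knowledge is limited here to statements of the form "this QI value never occurs with this SA value". For this restricted case, iterative proportional fitting started from a uniform table with the forbidden cells set to zero converges to the maximum entropy joint distribution, so it is used in place of a general constrained optimizer. All function and variable names are illustrative assumptions.

```python
def maxent_joint(row_marginals, col_marginals, forbidden=(), iterations=500):
    """Estimate the maximum entropy joint P(QI, SA) by iterative proportional fitting.

    row_marginals[i] is P(QI = q_i) and col_marginals[j] is P(SA = s_j), both
    taken from the (hypothetical) published data. `forbidden` holds (i, j)
    pairs that background knowledge rules out, i.e. P(q_i, s_j) = 0.
    """
    n_rows, n_cols = len(row_marginals), len(col_marginals)
    forbidden = set(forbidden)
    # Start from a uniform table, with zeros where background knowledge forbids a pair.
    joint = [[0.0 if (i, j) in forbidden else 1.0 for j in range(n_cols)]
             for i in range(n_rows)]
    for _ in range(iterations):
        # Rescale each row so that it matches the published QI marginal.
        for i in range(n_rows):
            total = sum(joint[i])
            if total > 0:
                scale = row_marginals[i] / total
                joint[i] = [v * scale for v in joint[i]]
        # Rescale each column so that it matches the published SA marginal.
        for j in range(n_cols):
            total = sum(joint[i][j] for i in range(n_rows))
            if total > 0:
                scale = col_marginals[j] / total
                for i in range(n_rows):
                    joint[i][j] *= scale
    return joint


def conditional(joint):
    """Turn the joint P(QI, SA) into P(SA | QI) by normalizing each row."""
    return [[v / sum(row) for v in row] for row in joint]


# Toy published data: three QI values and three SA values, each with mass 1/3.
marginals = [1 / 3, 1 / 3, 1 / 3]

# No background knowledge: every SA value is equally likely for every QI value.
no_knowledge = conditional(maxent_joint(marginals, marginals))

# Background knowledge: the individual with q_0 cannot have s_0.
with_knowledge = conditional(maxent_joint(marginals, marginals, forbidden={(0, 0)}))

for row in with_knowledge:
    print([round(p, 3) for p in row])
```

In this toy case the estimate without background knowledge is uniform, while ruling out the pair (q_0, s_0) moves P(SA | q_0) to approximately (0, 0.5, 0.5) and P(SA | q_1) to approximately (0.5, 0.25, 0.25). A single piece of knowledge about one individual therefore also sharpens the adversary's estimate for the others in the same group, which is the kind of effect a privacy metric based on P(SA|QI) needs to capture.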