Information based data anonymization for classification utility

Authors:
Jiuyong Li;Jixue Liu;Muzammil Baig;Raymond Chi-Wing Wong
Affiliations:
School of Computer & Information Science, University of South Australia, Australia;School of Computer & Information Science, University of South Australia, Australia;School of Computer & Information Science, University of South Australia, Australia;Department of Computer Science & Engineering, Hong Kong University of Science and Technology, Hong Kong
Venue:
Data & Knowledge Engineering
Year:
2011

Abstract

Anonymization is a practical approach to protecting privacy in data. The major objective of privacy-preserving data publishing is to protect private information in data while keeping the data useful for intended applications, such as building classification models. In this paper, we argue that data generalization in anonymization should be determined by the classification capability of the data rather than by the privacy requirement. We use mutual information to measure classification capability for generalization, and propose two k-anonymity algorithms that produce anonymized tables for building accurate classification models. The algorithms generalize attributes to maximize classification capability, and then suppress values according to a privacy requirement k (IACk) or distributional constraints (IACc). Experimental results show that IACk supports more accurate classification models and is faster than a benchmark utility-aware data anonymization algorithm.
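
The two ingredients named in the abstract, mutual information as a measure of classification capability and suppression driven by a privacy requirement k, can be illustrated with a minimal sketch. This is not the IACk or IACc algorithm, whose details the abstract does not give. The function names (`mutual_information`, `generalize`, `suppress_small_groups`), the toy age data, and the rule of dropping whole records in small groups are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(values, labels):
    """Empirical mutual information I(A; C) in bits between an
    attribute column `values` and a class column `labels`."""
    n = len(values)
    joint = Counter(zip(values, labels))
    count_a = Counter(values)
    count_c = Counter(labels)
    mi = 0.0
    for (a, c), n_ac in joint.items():
        # p(a, c) * log2( p(a, c) / (p(a) * p(c)) ), written with counts
        mi += (n_ac / n) * log2(n_ac * n / (count_a[a] * count_c[c]))
    return mi


def generalize(values, mapping):
    """Replace each value by its generalized form (hypothetical taxonomy);
    values missing from the mapping are left unchanged."""
    return [mapping.get(v, v) for v in values]


def suppress_small_groups(rows, k):
    """Keep only rows whose quasi-identifier value is shared by at least
    k rows. `rows` is a list of (quasi_identifier, label) pairs."""
    counts = Counter(qi for qi, _ in rows)
    return [(qi, label) for qi, label in rows if counts[qi] >= k]


# Toy data (assumed): age as the quasi-identifier, a binary class label.
ages = [23, 27, 34, 38]
labels = ["no", "no", "yes", "yes"]

# Two candidate generalizations of the same attribute.
by_decade = {23: "20-29", 27: "20-29", 34: "30-39", 38: "30-39"}
one_bucket = {23: "20-39", 27: "20-39", 34: "20-39", 38: "20-39"}

mi_decade = mutual_information(generalize(ages, by_decade), labels)
mi_bucket = mutual_information(generalize(ages, one_bucket), labels)

# Suppress groups that still violate k = 2 after generalization.
rows = list(zip(generalize(ages, by_decade), labels))
rows.append(("40-49", "yes"))  # a lone record that forms a group of size 1
published = suppress_small_groups(rows, k=2)
```

In this toy case the decade-level generalization keeps all class information (1 bit) while the single-bucket generalization keeps none (0 bits), which is the kind of comparison a mutual-information criterion would make when choosing between generalizations. The lone `"40-49"` record is then suppressed because its group has fewer than k = 2 members.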