Comparison of microaggregation approaches on anonymized data quality

Authors:
Jun-Lin Lin;Pei-Chann Chang;Julie Yu-Chih Liu;Tsung-Hsien Wen
Affiliations:
Department of Information Management, Yuan Ze University, Chung-Li 320, Taiwan;Department of Information Management, Yuan Ze University, Chung-Li 320, Taiwan;Department of Information Management, Yuan Ze University, Chung-Li 320, Taiwan;Department of Information Management, Yuan Ze University, Chung-Li 320, Taiwan
Venue:
Expert Systems with Applications: An International Journal
Year:
2010



Abstract

Microaggregation is commonly used to protect microdata against the identification of individuals: it anonymizes the records of a dataset so that the resulting dataset (the anonymized dataset) satisfies the k-anonymity constraint. Because anonymization degrades data quality, an effective microaggregation approach must preserve enough quality for the anonymized dataset to remain useful for further analysis, and its performance should therefore be measured by the quality of the anonymized dataset it generates. Previous studies often measure the quality of an anonymized dataset by its information loss. This study takes a different approach. Since an anonymized dataset should support further analysis, the study first builds a classifier from the anonymized dataset and then uses the prediction accuracy of that classifier to represent the quality of the anonymized dataset. Performance results indicate that low information loss does not necessarily translate into high prediction accuracy, and vice versa. This is particularly true when the information losses of the two anonymized datasets being compared do not differ significantly.
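
The two quality measures contrasted in the abstract can be illustrated with a minimal sketch. It is not the procedure evaluated in the paper, and every name and choice in it is an assumption made for illustration: `microaggregate` uses a simple sort-and-cut heuristic rather than any of the published microaggregation algorithms, `information_loss` uses the SSE/SST ratio (one common definition), the classifier is a plain 1-nearest-neighbour rule, class labels are assumed to be left unchanged by anonymization, and the data in `demo` are synthetic.

```python
import random
from math import dist


def microaggregate(records, k):
    """Replace each record by the centroid of a group of at least k records.

    Hypothetical heuristic: sort records by the sum of their attributes and
    cut the sorted order into blocks of k; the last block absorbs the remainder.
    """
    if not 1 <= k <= len(records):
        raise ValueError("k must be between 1 and the number of records")
    order = sorted(range(len(records)), key=lambda i: sum(records[i]))
    dims = len(records[0])
    n_groups = len(records) // k
    anonymized = [None] * len(records)
    for g in range(n_groups):
        start = g * k
        end = len(records) if g == n_groups - 1 else start + k
        members = order[start:end]
        centroid = tuple(
            sum(records[i][d] for i in members) / len(members) for d in range(dims)
        )
        for i in members:
            anonymized[i] = centroid
    return anonymized


def information_loss(original, anonymized):
    """SSE / SST: 0 means no distortion, 1 means every record became the global mean."""
    dims = len(original[0])
    mean = tuple(sum(r[d] for r in original) / len(original) for d in range(dims))
    sse = sum(dist(o, a) ** 2 for o, a in zip(original, anonymized))
    sst = sum(dist(o, mean) ** 2 for o in original)
    return sse / sst


def predict_1nn(train_x, train_y, x):
    """Return the label of the training record closest to x."""
    nearest = min(range(len(train_x)), key=lambda i: dist(train_x[i], x))
    return train_y[nearest]


def accuracy(train_x, train_y, test_x, test_y):
    """Fraction of test records whose predicted label matches the true label."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)


def demo(seed=0):
    """Synthetic two-class data; returns {k: (information loss, accuracy)}."""
    rng = random.Random(seed)

    def sample(center, label, n):
        return [((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label) for _ in range(n)]

    rows = sample(0.0, "a", 60) + sample(2.0, "b", 60)
    rng.shuffle(rows)
    train, test = rows[:80], rows[80:]
    train_x, train_y = [x for x, _ in train], [y for _, y in train]
    test_x, test_y = [x for x, _ in test], [y for _, y in test]
    results = {}
    for k in (1, 3, 5, 10):
        # The classifier is trained on anonymized records, tested on untouched ones.
        anon_x = microaggregate(train_x, k)
        results[k] = (
            information_loss(train_x, anon_x),
            accuracy(anon_x, train_y, test_x, test_y),
        )
    return results


if __name__ == "__main__":
    for k, (loss, acc) in demo().items():
        print(f"k={k:2d}  information loss={loss:.3f}  accuracy={acc:.3f}")
```

Running the script prints one information-loss value and one accuracy value per k, so the two measures can be compared side by side. Whether they rank the anonymized datasets in the same order depends on the data and on the classifier, which is the question the study examines.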