Microaggregation is commonly used to protect microdata from individual identification: it anonymizes the records of a dataset so that the resulting (anonymized) dataset satisfies the k-anonymity constraint. Because anonymization degrades data quality, an effective microaggregation approach must keep the anonymized dataset useful for further analysis, and its performance should therefore be measured by the quality of the anonymized dataset it generates. Previous studies often express the quality of an anonymized dataset in terms of information loss. This study takes a different approach. Since an anonymized dataset should support further analysis, this study first builds a classifier from the anonymized dataset and then uses the prediction accuracy of that classifier to represent the quality of the anonymized dataset. Performance results indicate that low information loss does not necessarily translate into high prediction accuracy, and vice versa. This is particularly true when the information losses of the anonymized datasets being compared do not differ significantly.
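The two quality measures contrasted above can be illustrated with a minimal sketch. It is not the method evaluated in the study: the grouping rule (sort, then cut into blocks of at least k records), the SSE/SST form of information loss, the 1-nearest-neighbor classifier, and all function names are assumptions chosen only to keep the example short. It also assumes that class labels are left unchanged by anonymization and that accuracy is measured on records held out from the anonymized training set.

```python
from statistics import mean


def microaggregate(records, k):
    """Replace each record by the centroid of a group of at least k records.

    Assumption: groups are formed by sorting the records lexicographically
    and cutting the sorted list into blocks of k; this is a toy grouping
    rule, not one of the published microaggregation heuristics.
    """
    order = sorted(range(len(records)), key=lambda i: records[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    # Merge a short trailing group into its predecessor so every group has >= k records.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    dims = range(len(records[0]))
    anonymized = [None] * len(records)
    for group in groups:
        centroid = tuple(mean(records[i][d] for i in group) for d in dims)
        for i in group:
            anonymized[i] = centroid
    return anonymized


def information_loss(original, anonymized):
    """SSE / SST: within-group squared error relative to total squared error (assumed definition)."""
    dims = range(len(original[0]))
    overall = [mean(r[d] for r in original) for d in dims]
    sse = sum((o[d] - a[d]) ** 2 for o, a in zip(original, anonymized) for d in dims)
    sst = sum((o[d] - overall[d]) ** 2 for o in original for d in dims)
    return sse / sst


def nearest_neighbor_accuracy(train_x, train_y, test_x, test_y):
    """Accuracy of a 1-nearest-neighbor classifier trained on the anonymized records."""
    correct = 0
    for x, y in zip(test_x, test_y):
        nearest = min(
            range(len(train_x)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
        )
        correct += train_y[nearest] == y
    return correct / len(test_x)


# Hypothetical toy data: two numeric attributes plus a class label per record.
records = [(1.0, 2.0), (1.2, 1.9), (5.0, 5.1), (5.2, 4.9), (9.0, 9.1), (9.1, 8.8), (1.1, 2.1)]
labels = [0, 0, 1, 1, 1, 1, 0]
held_out_x = [(1.0, 2.0), (9.0, 9.0)]
held_out_y = [0, 1]

anonymized = microaggregate(records, k=3)
print("information loss:", information_loss(records, anonymized))
print("prediction accuracy:", nearest_neighbor_accuracy(anonymized, labels, held_out_x, held_out_y))
```

Computing both numbers for the same anonymized dataset makes it possible to compare two microaggregation approaches on either measure, and to check whether they rank the approaches in the same order.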