Statistical disclosure control (SDC) has recently attracted much attention. SDC is an important part of data security that deals with protecting databases. Microaggregation is an SDC technique widely used to protect confidentiality in statistical databases released for public use. In microaggregation, similar records are clustered into groups, each containing at least k records, so that individual information is not disclosed; k is a pre-defined security threshold. For a given k, an optimal multivariate microaggregation is one with the lowest information loss. Finding such a minimum-information-loss microaggregation is NP-hard. Existing fixed-size techniques can obtain a low information loss with O(n^2) or O(n^3/k) time complexity. To reduce both execution time and information loss, this study proposes the Two Fixed Reference Points (TFRP) method, a two-phase algorithm for microaggregation. In the first phase, TFRP employs pre-computing and median-of-medians techniques to shorten its running time to O(n^2/k). In the second phase, to decrease information loss, TFRP generates variable-size groups by removing the less homogeneous groups. Experimental results show that the proposed method is significantly faster than the Diameter and the Centroid methods. On several test datasets, TFRP also significantly reduces information loss, particularly in sparse datasets with a large k.
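The grouping constraint and the notion of information loss can be made concrete with a small sketch. It is not the TFRP algorithm. It only checks that every group has at least k records and scores a given grouping with the ratio SSE/SST (within-group squared error over total squared error), a measure commonly used in the microaggregation literature and assumed here because the abstract does not define its loss measure. The function names are hypothetical.

```python
from typing import List, Sequence

Record = Sequence[float]


def centroid(records: List[Record]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length records."""
    n = len(records)
    return [sum(r[j] for r in records) / n for j in range(len(records[0]))]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def satisfies_k(groups: List[List[Record]], k: int) -> bool:
    """True if every group holds at least k records."""
    return all(len(g) >= k for g in groups)


def information_loss(groups: List[List[Record]]) -> float:
    """SSE / SST for a grouping (assumed measure): 0 means no loss, 1 means
    the groups are no more homogeneous than the data set as a whole."""
    all_records = [r for g in groups for r in g]
    overall = centroid(all_records)
    sst = sum(sq_dist(r, overall) for r in all_records)
    sse = sum(sq_dist(r, centroid(g)) for g in groups for r in g)
    return sse / sst if sst > 0 else 0.0


# Two ways of grouping the same four records with k = 2.
tight = [[(1.0, 1.0), (2.0, 2.0)], [(9.0, 9.0), (10.0, 10.0)]]
loose = [[(1.0, 1.0), (9.0, 9.0)], [(2.0, 2.0), (10.0, 10.0)]]

print(satisfies_k(tight, 2), information_loss(tight))
print(satisfies_k(loose, 2), information_loss(loose))
```

Both groupings satisfy k = 2, but the one that pairs nearby records has a far smaller loss. Finding the grouping that minimizes this loss over all valid partitions is the part the abstract describes as NP-hard.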