Efficient multivariate data-oriented microaggregation

Authors:
Josep Domingo-Ferrer;Antoni Martínez-Ballesté;Josep Maria Mateo-Sanz;Francesc Sebé
Affiliations:
Department of Computer Engineering & Maths, Rovira i Virgili University of Tarragona, Catalonia;Department of Computer Engineering & Maths, Rovira i Virgili University of Tarragona, Catalonia;Statistics Group, Rovira i Virgili University of Tarragona, Catalonia;Department of Computer Engineering & Maths, Rovira i Virgili University of Tarragona, Catalonia
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2006

Abstract

Microaggregation is a family of methods for statistical disclosure control (SDC) of microdata (records on individuals and/or companies), that is, for masking microdata so that they can be released while preserving the privacy of the underlying individuals. The principle of microaggregation is to aggregate original database records into small groups prior to publication. Each group should contain at least k records to prevent disclosure of individual information, where k is a constant preset by the data protector. Recently, microaggregation has been shown to be useful for achieving k-anonymity, in addition to being a good masking method. Optimal microaggregation (with minimum within-groups variability loss) can be computed in polynomial time for univariate data. Unfortunately, for multivariate data it is an NP-hard problem. Several heuristic approaches to microaggregation have been proposed in the literature. Heuristics yielding groups of fixed size k tend to be more efficient, whereas data-oriented heuristics yielding variable group sizes tend to result in lower information loss. This paper presents new data-oriented heuristics that improve the trade-off between computational complexity and information loss and are thus usable for large datasets.
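
The grouping principle described in the abstract can be illustrated with a minimal sketch. This is not one of the heuristics presented in the paper: it is a naive fixed-size scheme for univariate data, written only to show what "groups of at least k records" means in practice. The function names `microaggregate_univariate` and `sse`, the use of the group mean as the published value, and the use of the sum of squared errors as the information-loss measure are all assumptions made for illustration.

```python
from statistics import mean


def microaggregate_univariate(values, k):
    """Illustrative fixed-size heuristic (an assumption, not the paper's method):
    sort the records, cut them into runs of k, publish each group's mean."""
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")
    # Record indices in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    n_groups = n // k
    masked = [0.0] * n
    groups = []
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so its size lies in [k, 2k - 1].
        end = n if g == n_groups - 1 else start + k
        members = order[start:end]
        centroid = mean(values[i] for i in members)
        for i in members:
            masked[i] = centroid
        groups.append(members)
    return masked, groups


def sse(values, masked):
    """Sum of squared differences between original and masked values."""
    return sum((v - m) ** 2 for v, m in zip(values, masked))


if __name__ == "__main__":
    original = [10, 1, 2, 11, 3, 12, 13]
    masked, groups = microaggregate_univariate(original, k=3)
    print(masked, groups, sse(original, masked))
```

Every published value is shared by at least k records, which is the property that links microaggregation to k-anonymity. A data-oriented heuristic would instead let group sizes vary with the data in order to lower the loss, at a higher computational cost.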