We propose a new data perturbation method for numerical database security problems based on skew-t distributions. Unlike the normal distribution, the more general class of skew-t distributions is a flexible parametric multivariate family that can model skewness and heavy tails in the data. Because normally distributed data are seldom encountered in practice, the proposed approach, which we call the skew-t data perturbation (STDP) method, is of great interest to database managers. We also discuss how to preserve the sample mean vector and sample covariance matrix exactly under any data perturbation method. We investigate the performance of the STDP method in a Monte Carlo simulation study and compare it with existing perturbation methods. Of particular importance is the ability of STDP to reproduce characteristics of the joint tails of the distribution, which lets database users answer higher-level questions. We apply the STDP method to a medical database related to breast cancer.
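The claim that the sample mean vector and sample covariance matrix can be preserved exactly for any perturbation method can be illustrated with a standard moment-matching transformation. The sketch below is an assumption about how such a step might look and is not taken from the paper: it centers the perturbed data, whitens it with the Cholesky factor of its own covariance, recolors it with the Cholesky factor of the original covariance, and shifts it to the original mean. The function name `match_moments`, the gamma-distributed toy data, and the additive Gaussian noise standing in for a perturbation method are all hypothetical.

```python
import numpy as np


def match_moments(original, perturbed):
    """Linearly transform `perturbed` so that its sample mean vector and
    sample covariance matrix equal those of `original`.

    Hypothetical helper for illustration; assumes at least two attributes
    (columns) and positive-definite sample covariance matrices.
    """
    x = np.asarray(original, dtype=float)
    y = np.asarray(perturbed, dtype=float)

    x_mean = x.mean(axis=0)
    y_centered = y - y.mean(axis=0)

    # Cholesky factors: S_x = L_x L_x^T and S_y = L_y L_y^T.
    l_x = np.linalg.cholesky(np.cov(x, rowvar=False))
    l_y = np.linalg.cholesky(np.cov(y, rowvar=False))

    # A = L_y^{-T} L_x^T, obtained by solving L_y^T A = L_x^T.
    # Then A^T S_y A = L_x L_x^T = S_x.
    a = np.linalg.solve(l_y.T, l_x.T)

    return y_centered @ a + x_mean


# Toy data: skewed "confidential" attributes (assumed for illustration).
rng = np.random.default_rng(0)
original = rng.gamma(shape=2.0, scale=1.5, size=(500, 3))

# Stand-in for any perturbation method: simple additive noise.
perturbed = original + rng.normal(scale=0.8, size=original.shape)

released = match_moments(original, perturbed)
```

Because the adjustment is a single affine map applied to every record, it can be used as a post-processing step after whatever perturbation method generated the data. It matches only the first two sample moments; skewness and tail behavior still depend on the perturbation method itself.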