We propose using Latin Hypercube Sampling to create a synthetic data set that reproduces many of the essential features of an original data set while providing disclosure protection. The synthetic microdata can also be used to generate additive or multiplicative noise which, when merged with the original data, provides disclosure protection. The technique can further be used to create hybrid microdata sets containing predetermined mixtures of real and synthetic data. We demonstrate the basic properties of the synthetic data approach by applying Latin Hypercube Sampling to a database supported by the Energy Information Administration. The use of Latin Hypercube Sampling, together with the goal of reproducing the rank correlation structure rather than the Pearson correlation structure, has not previously been applied to the disclosure protection problem. Given its properties, this technique offers multiple alternatives to current methods for providing disclosure protection for large data sets.
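To make the idea concrete, the following is a minimal sketch of how a Latin Hypercube sample might be combined with a rank-correlation target to produce synthetic records. It is not the authors' implementation. It rests on several assumptions: the marginals are taken from the empirical quantiles of the original columns, the rank correlation is imposed with an Iman-Conover style reordering based on random normal scores, and the original data are continuous with no ties. The function names `rank_corr`, `latin_hypercube`, and `lhs_synthetic`, as well as the toy lognormal data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def rank_corr(x):
    """Spearman rank correlation matrix of the columns of x (assumes no ties)."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)


def latin_hypercube(n_samples, n_dims, rng):
    """One uniform draw in each of n_samples equal-probability strata, per column."""
    strata = np.column_stack([rng.permutation(n_samples) for _ in range(n_dims)])
    return (strata + rng.random((n_samples, n_dims))) / n_samples


def lhs_synthetic(original, n_samples, rng):
    """Synthetic records with LHS marginals and (approximately) matching rank correlation."""
    n_dims = original.shape[1]
    u = latin_hypercube(n_samples, n_dims, rng)
    # Assumption: marginals come from each column's empirical quantile function.
    values = np.column_stack(
        [np.quantile(original[:, j], u[:, j]) for j in range(n_dims)]
    )
    # Assumption: Iman-Conover style step. Build normal scores whose sample
    # Pearson correlation equals the rank correlation of the original data.
    target = rank_corr(original)
    z = rng.standard_normal((n_samples, n_dims))
    z = (z - z.mean(axis=0)) / z.std(axis=0, ddof=1)
    q = np.linalg.cholesky(np.corrcoef(z, rowvar=False))
    p = np.linalg.cholesky(target)
    scores = z @ np.linalg.inv(q).T @ p.T
    # Reorder each synthetic column so its ranks follow the ranks of the scores.
    ranks = np.argsort(np.argsort(scores, axis=0), axis=0)
    return np.take_along_axis(np.sort(values, axis=0), ranks, axis=0)


# Toy "original" data: three correlated, skewed (lognormal) variables.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.7, 0.3], [0.7, 1.0, 0.5], [0.3, 0.5, 1.0]])
original = np.exp(rng.multivariate_normal(np.zeros(3), cov, size=500))

synthetic = lhs_synthetic(original, 2000, rng)
max_gap = np.abs(rank_corr(synthetic) - rank_corr(original)).max()
print("largest rank-correlation gap:", max_gap)
```

Because the reordering step only permutes values within each column, the stratified marginals are left untouched while the rank correlation is steered toward the target. Targeting ranks rather than Pearson correlation keeps the procedure insensitive to the skewness of the individual variables, which is one plausible reason to prefer it for business or energy data.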