Why swap when you can shuffle? A comparison of the proximity swap and data shuffle for numeric data

Authors:
Krish Muralidhar; Rathindra Sarathy; Ramesh Dandekar
Affiliations:
University of Kentucky, Lexington, KY; Oklahoma State University, Stillwater, OK; Department of Energy, Energy Information Administration, Washington, DC
Venue:
PSD'06 Proceedings of the 2006 CENEX-SDC project international conference on Privacy in Statistical Databases
Year:
2006

Citing 7
Cited 2

Non-reversible privacy transformations

PODS '82 Proceedings of the 1st ACM SIGACT-SIGMOD symposium on Principles of database systems
Sensitive Micro Data Protection Using Latin Hypercube Sampling Technique

Inference Control in Statistical Databases, From Theory to Practice
The Security of Confidential Numerical Data in Databases

Information Systems Research
Information preserving statistical obfuscation

Statistics and Computing
A theoretical basis for perturbation methods

Statistics and Computing
Perturbing Nonnormal Confidential Attributes: The Copula Approach

Management Science
Data Shuffling: A New Masking Approach for Numerical Data

Management Science

Research Note---Generating Shareable Statistical Databases for Business Value: Multiple Imputation with Multimodal Perturbation

Information Systems Research
n-cycle swapping for the American community survey

PSD'12 Proceedings of the 2012 international conference on Privacy in Statistical Databases

Abstract

The rank-based proximity swap has been suggested as a data masking mechanism for numerical data. Recently, more sophisticated procedures for masking numerical data, based on the concept of “shuffling” the data, have been proposed. In this study, we compare the performance of the swapping and shuffling procedures. The results indicate that the shuffling procedures perform better than data swapping in terms of both data utility and disclosure risk.
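
For readers unfamiliar with the technique, the general idea of a rank-based proximity swap can be illustrated with a minimal Python sketch. This is a simplified illustration under stated assumptions, not the procedure evaluated in the paper: the function name `proximity_swap`, the `window` parameter (the maximum rank distance between swap partners), and the partner-selection rule are all hypothetical choices made for this example.

```python
import random


def proximity_swap(values, window, seed=None):
    """Swap each value with a partner at most `window` ranks away.

    Hypothetical, simplified illustration of a rank-based swap;
    the names and the partner-selection rule are assumptions.
    """
    rng = random.Random(seed)
    # Record indices ordered by value, so position in `order` is the rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = list(values)
    swapped = [False] * len(values)
    for rank, i in enumerate(order):
        if swapped[i]:
            continue
        # Candidate partners: unswapped records up to `window` ranks above.
        candidates = [
            j for j in order[rank + 1 : rank + 1 + window] if not swapped[j]
        ]
        if not candidates:
            continue  # no partner available; this value stays in place
        j = rng.choice(candidates)
        # Exchange the two original values between records i and j.
        masked[i], masked[j] = values[j], values[i]
        swapped[i] = swapped[j] = True
    return masked


values = [10.0, 12.5, 11.0, 40.0, 38.0, 15.0, 20.0, 22.0]
masked = proximity_swap(values, window=2, seed=1)
```

Because values are only exchanged between records, the masked column is a permutation of the original, so its marginal distribution is unchanged; a smaller `window` keeps each record closer to its original value, while a larger one moves values further in rank.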