Data Shuffling: A New Masking Approach for Numerical Data

Authors:
Krishnamurty Muralidhar; Rathindra Sarathy
Affiliations:
Gatton College of Business and Economics, University of Kentucky, Lexington, Kentucky 40506; Spears School of Business, Oklahoma State University, Stillwater, Oklahoma 74078
Venue:
Management Science
Year:
2006

Abstract

This study discusses a new procedure for masking confidential numerical data, a procedure called data shuffling, in which the values of the confidential variables are shuffled among observations. The shuffled data provides a high level of data utility and minimizes the risk of disclosure. From a practical perspective, data shuffling overcomes reservations about using perturbed or modified confidential data because it retains all the desirable properties of perturbation methods and performs better than other masking techniques in both data utility and disclosure risk. In addition, data shuffling can be implemented using only rank-order data, and thus provides a nonparametric method for masking. We illustrate the applicability of data shuffling for small and large data sets.
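
The rank-order idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that some perturbed score is already available for each record, and it then reassigns the original confidential values so that their rank order matches the rank order of those scores. The function names `rank_shuffle` and `noisy_scores` are hypothetical, and the additive Gaussian noise used to produce the scores is only a stand-in for illustration, not the perturbation model of the paper.

```python
import random


def rank_shuffle(original, perturbed):
    """Reassign original values so their ranks follow the perturbed scores.

    The record with the k-th smallest perturbed score receives the k-th
    smallest original value, so the released values are exactly the
    original values in a different order.
    """
    if len(original) != len(perturbed):
        raise ValueError("original and perturbed must have the same length")
    sorted_original = sorted(original)
    # Record indices ordered from smallest to largest perturbed score.
    order = sorted(range(len(perturbed)), key=lambda i: perturbed[i])
    shuffled = [None] * len(original)
    for rank, record in enumerate(order):
        shuffled[record] = sorted_original[rank]
    return shuffled


def noisy_scores(original, noise_sd, seed=0):
    """Hypothetical stand-in perturbation: additive Gaussian noise.

    This is an assumption made for the example only; it is not the
    perturbation step described by the authors.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0, noise_sd) for x in original]


if __name__ == "__main__":
    # Deterministic example with hand-picked scores.
    print(rank_shuffle([10, 20, 30, 40], [0.5, -1.0, 2.0, 1.0]))
    # [20, 10, 40, 30]

    # Example with the stand-in noise model on made-up salary figures.
    salaries = [52000, 61000, 48000, 75000, 58000]
    masked = rank_shuffle(salaries, noisy_scores(salaries, noise_sd=8000))
    print(sorted(masked) == sorted(salaries))  # True
```

Because the output is a permutation of the input, the marginal distribution of the confidential variable is left unchanged; how well relationships with other variables are kept, and how much disclosure risk remains, depends on how the scores are generated, which this sketch does not address.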