Data Shuffling: A New Masking Approach for Numerical Data

  • Authors:
  • Krishnamurty Muralidhar;Rathindra Sarathy

  • Affiliations:
  • Gatton College of Business and Economics, University of Kentucky, Lexington, Kentucky 40506;Spears School of Business, Oklahoma State University, Stillwater, Oklahoma 74078

  • Venue:
  • Management Science
  • Year:
  • 2006

Abstract

This study discusses a new procedure for masking confidential numerical data, a procedure called data shuffling, in which the values of the confidential variables are shuffled among observations. The shuffled data provides a high level of data utility and minimizes the risk of disclosure. From a practical perspective, data shuffling overcomes reservations about using perturbed or modified confidential data because it retains all the desirable properties of perturbation methods and performs better than other masking techniques in both data utility and disclosure risk. In addition, data shuffling can be implemented using only rank-order data, and thus provides a nonparametric method for masking. We illustrate the applicability of data shuffling for small and large data sets.
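
The abstract does not spell out the procedure, so the following is only a rough sketch of the general idea of shuffling confidential values by rank, not the authors' method. It assumes a single confidential variable and a single non-confidential variable. The function name `shuffle_confidential`, the `noise_sd` parameter, and the noisy rank score used as a stand-in for a perturbation step are all hypothetical choices made for illustration.

```python
import random


def shuffle_confidential(confidential, nonconfidential, noise_sd=1.0, seed=0):
    """Reassign the original confidential values among records by rank.

    Illustrative sketch only: the noisy rank score below is an assumed
    stand-in for a perturbation step, not the procedure from the paper.
    """
    rng = random.Random(seed)
    n = len(confidential)

    # Rank of each record on the non-confidential variable (0 = smallest).
    order_by_s = sorted(range(n), key=lambda i: nonconfidential[i])
    s_rank = [0] * n
    for rank, i in enumerate(order_by_s):
        s_rank[i] = rank

    # Hypothetical perturbed score: rank plus Gaussian noise. Only the
    # ordering of these scores is used, never their numeric values.
    score = [s_rank[i] + rng.gauss(0.0, noise_sd) for i in range(n)]

    # Shuffle step: the record with the r-th smallest score receives the
    # r-th smallest ORIGINAL confidential value.
    order_by_score = sorted(range(n), key=lambda i: score[i])
    sorted_values = sorted(confidential)
    shuffled = [None] * n
    for rank, i in enumerate(order_by_score):
        shuffled[i] = sorted_values[rank]
    return shuffled


if __name__ == "__main__":
    income = [50, 10, 30, 20, 40]   # confidential variable (made-up values)
    age = [1, 2, 3, 4, 5]           # non-confidential variable (made-up values)
    print(shuffle_confidential(income, age, noise_sd=2.0, seed=1))
```

Because the released values are a permutation of the originals, the marginal distribution of the confidential variable is unchanged, and only rank information enters the reassignment. In this sketch, a larger `noise_sd` weakens the link between a record and its released value at the cost of a weaker rank relationship with the non-confidential variable.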