Disclosure Limitation through Additive Noise Data Masking: Analysis of Skewed Sensitive Data

  • Authors:
  • Sumitra Mukherjee; George T. Duncan

  • Venue:
  • HICSS '97 Proceedings of the 30th Hawaii International Conference on System Sciences: Information System Track-Organizational Systems and Technology - Volume 3
  • Year:
  • 1997

Abstract

A widely used method for confidentiality protection in statistical databases is to add zero-mean noise to sensitive attribute values. Most studies assume that the attributes are normally distributed. Using an exponential random variable as an example, this article investigates the effect of additive noise data masking for attributes with skewed distributions. Examples of exponentially distributed sensitive attributes used for statistical analysis include the time between testing HIV positive and the manifestation of symptoms for AIDS and the time between consecutive arrests for repeat offenders. We analyze the issues of data quality and confidentiality protection. Our results indicate that skewed attributes are, in some sense, better protected than normally distributed attributes under additive noise data masking.
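
As a rough illustration of the masking scheme the abstract describes, the sketch below adds independent zero-mean noise to an exponentially distributed attribute and compares summary statistics before and after. It is not the authors' procedure: the abstract only specifies that the noise has zero mean, so the Gaussian noise, the rate 0.5, the noise standard deviation 1.0, the sample size, and the function name `mask_additive` are all assumptions chosen for the example.

```python
import random
import statistics


def mask_additive(values, noise_sd, rng):
    """Return a copy of values with independent zero-mean noise added.

    Gaussian noise is an assumption of this sketch; the abstract only
    requires the noise to have zero mean.
    """
    return [x + rng.gauss(0.0, noise_sd) for x in values]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical sensitive attribute: exponential durations with mean 1/rate = 2.0.
rate = 0.5
original = [rng.expovariate(rate) for _ in range(20000)]

# Hypothetical noise level chosen by the data custodian.
noise_sd = 1.0
masked = mask_additive(original, noise_sd, rng)

mean_orig = statistics.fmean(original)
mean_masked = statistics.fmean(masked)
var_orig = statistics.pvariance(original)
var_masked = statistics.pvariance(masked)

# Share of masked values that fall below zero, which cannot occur
# in the original nonnegative attribute.
negative_share = sum(1 for y in masked if y < 0) / len(masked)

print(f"mean:     original={mean_orig:.3f}  masked={mean_masked:.3f}")
print(f"variance: original={var_orig:.3f}  masked={var_masked:.3f}")
print(f"share of negative masked values: {negative_share:.3f}")
```

In this sketch the masked mean stays close to the original mean, because the noise has zero mean, while the masked variance grows by roughly the noise variance. Some masked values are negative even though the original attribute cannot be, one visible way in which masking a skewed, nonnegative attribute differs from masking a normally distributed one.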