Privacy Preserving Categorical Data Analysis with Unknown Distortion Parameters

  • Authors:
  • Ling Guo; Xintao Wu

  • Affiliations:
  • Software and Information Systems Department, University of North Carolina at Charlotte, Charlotte, NC 28223, USA. E-mail: lguo2@uncc.edu; Software and Information Systems Department, University of North Carolina at Charlotte, Charlotte, NC 28223, USA. E-mail: xwu@uncc.edu

  • Venue:
  • Transactions on Data Privacy
  • Year:
  • 2009

Abstract

Randomized response techniques have been investigated for privacy-preserving categorical data analysis. However, the released distortion parameters can be exploited by attackers to breach privacy. In this paper, we investigate whether data mining or statistical analysis tasks can still be conducted on randomized data when the distortion parameters are not disclosed to data miners. We first examine how various objective association measures between two variables may be affected by randomization. We then extend the analysis to multiple variables by examining the feasibility of hierarchical loglinear modeling. Finally, we show some classic data mining tasks that cannot be applied to the randomized data directly.
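
As a rough illustration of why the distortion parameters matter, the sketch below uses a textbook binary randomized response scheme, which is an assumption here: the abstract does not specify the paper's exact randomization model. The names `randomize`, `estimate_true_proportion`, and `theta` (the probability of retaining the true value) are hypothetical and chosen only for this example. Under this scheme the expected observed proportion is theta * pi + (1 - theta) * (1 - pi), so anyone who knows `theta` can invert it to recover the true proportion pi, while a data miner who does not know `theta` sees only the distorted proportion.

```python
import random


def randomize(values, theta, rng):
    """Keep each binary value with probability theta, flip it otherwise.

    theta is the (assumed) distortion parameter of a simple binary
    randomized response scheme; it is not taken from the paper.
    """
    return [v if rng.random() < theta else 1 - v for v in values]


def estimate_true_proportion(randomized, theta):
    """Invert E[observed] = theta*pi + (1 - theta)*(1 - pi) to estimate pi."""
    if abs(2 * theta - 1) < 1e-12:
        raise ValueError("theta = 0.5 makes the randomization non-invertible")
    observed = sum(randomized) / len(randomized)
    return (observed - (1 - theta)) / (2 * theta - 1)


rng = random.Random(0)                      # fixed seed for repeatability
true_values = [1] * 3000 + [0] * 7000       # true proportion of 1s is 0.3
theta = 0.8                                 # hypothetical distortion parameter

released = randomize(true_values, theta, rng)
observed = sum(released) / len(released)    # what a data miner sees
estimate = estimate_true_proportion(released, theta)  # needs theta

print(f"observed proportion: {observed:.3f}")
print(f"estimate with known theta: {estimate:.3f}")
```

With `theta` known, the estimate should land close to the true proportion of 0.3, whereas the observed proportion is pulled toward 0.38. The question the abstract raises is which analyses remain valid when only `released` is available and `theta` is withheld.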