Mining association rules with non-uniform privacy concerns

  • Authors:
  • Yi Xia; Yirong Yang; Yun Chi

  • Affiliations:
  • University of California, Los Angeles, CA; University of California, Los Angeles, CA; University of California, Los Angeles, CA

  • Venue:
  • Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
  • Year:
  • 2004

Abstract

Privacy concerns have become an important issue in data mining. A popular way to preserve privacy is to randomize the dataset in a systematic way and mine the randomized dataset instead. However, people usually have different privacy concerns for different attributes in the data; for example, in survey data the sensitivity of questions varies. Appropriate use of this information can lead to more accurate data mining results, yet many privacy-preserving association rule mining algorithms do not fully exploit it.

In this paper, we generalize the privacy-preserving association rule mining problem by allowing different attributes to have different levels of privacy, that is, by using different randomization factors for the values of different attributes in the randomization process. We also propose an efficient algorithm, RE (Recursive Estimation), to estimate the support of itemsets under this framework. Both theoretical and empirical results show that using non-uniform randomization factors improves the accuracy of the support estimates compared with using a single conservative randomization factor.
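
The idea of per-attribute randomization factors can be illustrated with a minimal Python sketch. It assumes a simple bit-flipping scheme in which the value of attribute j is kept with probability p_j and flipped otherwise; the abstract does not specify the paper's exact randomization operator, and this is not the RE algorithm itself (it only handles single items, not itemsets). The names `randomize`, `estimate_item_support`, and `estimator_std` are hypothetical and chosen for illustration.

```python
import math
import random


def randomize(rows, keep_prob, rng):
    """Keep bit j with probability keep_prob[j]; otherwise flip it (assumed scheme)."""
    return [
        [bit if rng.random() < keep_prob[j] else 1 - bit
         for j, bit in enumerate(row)]
        for row in rows
    ]


def estimate_item_support(randomized_rows, j, p):
    """Estimate the true support s of item j from randomized data.

    Under the assumed scheme, observed = p*s + (1-p)*(1-s), so
    s = (observed - (1 - p)) / (2p - 1). Requires p != 0.5.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 destroys all information about the item")
    observed = sum(row[j] for row in randomized_rows) / len(randomized_rows)
    return (observed - (1 - p)) / (2 * p - 1)


def estimator_std(s, p, n):
    """Approximate standard deviation of the support estimate over n rows."""
    observed = p * s + (1 - p) * (1 - s)
    return math.sqrt(observed * (1 - observed) / n) / abs(2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    true_support = [0.3, 0.6]          # illustrative values, not from the paper
    rows = [[int(rng.random() < s) for s in true_support] for _ in range(20000)]
    keep_prob = [0.9, 0.6]             # attribute 1 is treated as more sensitive
    noisy = randomize(rows, keep_prob, rng)
    for j, p in enumerate(keep_prob):
        print(j, round(estimate_item_support(noisy, j, p), 3))
```

In this sketch, `estimator_std` shows why a single conservative factor costs accuracy: the error of the estimate scales with 1 / |2p - 1|, so forcing the less sensitive attribute to use p = 0.6 instead of p = 0.9 widens its error even though its privacy needs did not require it.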