A Heuristic Data Reduction Approach for Associative Classification Rule Hiding
PRICAI '08 Proceedings of the 10th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
When data must be shared, preserving privacy poses a dilemma. On one hand, sensitive patterns such as classification rules should be hidden so that they cannot be discovered. On the other hand, hiding those patterns may degrade the quality of the data. In this paper, we study the sensitive classification rule hiding problem using a data reduction approach, i.e., removing selected records in their entirety. We focus on a particular type of classification rule, the canonical associative classification rule, and we evaluate the impact on data quality in terms of the number of affected non-sensitive rules. We present observations on data quality based on a geometric model. These observations let us determine the impact precisely without any re-computation, which helps improve hiding algorithms in both effectiveness and efficiency. Additionally, we present algorithmic steps for removing records so that the impact on data quality is potentially minimal. Finally, we conclude and outline directions for future work on this problem.
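
To make the problem setting concrete, the following is a minimal sketch of rule hiding by record removal. It is not the heuristic or the geometric model of the paper, only a naive greedy baseline under stated assumptions: a rule counts as discoverable when its support and confidence both meet user-chosen thresholds, and a non-sensitive rule counts as "affected" when its discoverable status changes after the removal. All names (`support_confidence`, `holds`, `hide_by_removal`) and the toy data are hypothetical. Note that this baseline re-evaluates every rule on the reduced data, which is the re-computation the abstract says the geometric observations avoid.

```python
from typing import FrozenSet, List, Tuple

Record = Tuple[FrozenSet[str], str]  # (items, class label)
Rule = Tuple[FrozenSet[str], str]    # (antecedent items, predicted class)


def support_confidence(data: List[Record], rule: Rule) -> Tuple[float, float]:
    """Return (support, confidence) of an associative classification rule."""
    antecedent, label = rule
    covered = [r for r in data if antecedent <= r[0]]
    correct = [r for r in covered if r[1] == label]
    supp = len(correct) / len(data) if data else 0.0
    conf = len(correct) / len(covered) if covered else 0.0
    return supp, conf


def holds(data: List[Record], rule: Rule, min_supp: float, min_conf: float) -> bool:
    """Assumed discoverability test: both thresholds must be met."""
    supp, conf = support_confidence(data, rule)
    return supp >= min_supp and conf >= min_conf


def hide_by_removal(data: List[Record], sensitive: Rule, others: List[Rule],
                    min_supp: float, min_conf: float) -> Tuple[List[Record], int]:
    """Greedily drop whole records supporting `sensitive` until it is hidden.

    Returns the reduced data and the number of non-sensitive rules whose
    discoverable status changed (the assumed notion of "affected").
    """
    reduced = list(data)
    before = [holds(reduced, r, min_supp, min_conf) for r in others]
    antecedent, label = sensitive
    while holds(reduced, sensitive, min_supp, min_conf):
        idx = next((i for i, (items, cls) in enumerate(reduced)
                    if antecedent <= items and cls == label), None)
        if idx is None:  # nothing left to remove (e.g. thresholds of zero)
            break
        del reduced[idx]
    after = [holds(reduced, r, min_supp, min_conf) for r in others]
    affected = sum(b != a for b, a in zip(before, after))
    return reduced, affected


# Toy data: six records, one sensitive rule, two non-sensitive rules.
data = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"a"}), "no"),
    (frozenset({"b"}), "no"),
    (frozenset({"c"}), "no"),
]
sensitive = (frozenset({"a"}), "yes")
others = [(frozenset({"a", "b"}), "yes"), (frozenset({"c"}), "no")]
reduced, affected = hide_by_removal(data, sensitive, others, 0.3, 0.7)
```

In this toy run the baseline removes the first record that supports the sensitive rule, which also happens to support the non-sensitive rule on {a, b}. Choosing which record to remove is therefore what determines the side effects, and that choice is the part a smarter hiding algorithm would optimise.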