A Knowledge Model Sharing Based Approach to Privacy-Preserving Data Mining
Transactions on Data Privacy
Privacy-preserving data mining (PPDM) is an important topic for both industry and academia. There are two general approaches to PPDM: statistics-based and cryptography-based. The statistics-based approach has the advantage of being efficient enough to handle large volumes of data. Its basic idea is to let the data owners publish sanitized versions of their data (e.g., via perturbation, generalization, or l-diversification), which are then used to extract useful knowledge models such as decision trees. In this paper, we present a new method for statistics-based PPDM. Our method differs from existing ones in that the data owners share with each other the knowledge models extracted from their own private datasets, rather than publishing any of those datasets, even in sanitized form. The knowledge models derived from the individual datasets are used to generate pseudo-data, which are then used to extract the desired "global" knowledge models. Realizing this idea involves several technical subtleties that must be carefully addressed. Specifically, we propose an algorithm for generating pseudo-data according to the paths of a decision tree, a method for adapting anonymity measures defined on datasets so that they measure the privacy of decision trees, and an algorithm that prunes a decision tree to satisfy a given anonymity requirement. Through an empirical study, we show that predictive models learned using our method are significantly more accurate than those learned using the existing l-diversity method, in both centralized and distributed environments and across different types of datasets, predictive models, and utility measures.
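The idea of generating pseudo-data according to the paths of a decision tree can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes a hypothetical `Node` structure with categorical attributes and per-leaf class counts, and it assumes that attributes not tested on a path are sampled uniformly from a known domain. The names `Node`, `leaf_paths`, `generate_pseudo_data`, and the toy tree are all illustrative.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Internal node: `attribute` is set and `children` maps each value to a subtree.
    # Leaf: `attribute` is None and `class_counts` maps class label -> record count.
    attribute: Optional[str] = None
    children: dict = field(default_factory=dict)
    class_counts: dict = field(default_factory=dict)


def leaf_paths(node, conditions=()):
    """Yield (conditions, class_counts) for every root-to-leaf path."""
    if node.attribute is None:
        yield dict(conditions), node.class_counts
        return
    for value, child in node.children.items():
        yield from leaf_paths(child, conditions + ((node.attribute, value),))


def generate_pseudo_data(tree, domains, rng):
    """Emit one pseudo-record per counted record at each leaf.

    Attributes tested on the path take the path's value. Attributes not on
    the path are drawn uniformly from `domains` (an assumption of this sketch).
    """
    records = []
    for conditions, counts in leaf_paths(tree):
        for label, n in counts.items():
            for _ in range(n):
                row = {
                    attr: conditions[attr] if attr in conditions else rng.choice(values)
                    for attr, values in domains.items()
                }
                row["class"] = label
                records.append(row)
    return records


# Toy tree shared by one data owner (hypothetical attributes and counts).
tree = Node("outlook", children={
    "sunny": Node("windy", children={
        "yes": Node(class_counts={"stay": 3}),
        "no": Node(class_counts={"play": 2}),
    }),
    "rainy": Node(class_counts={"stay": 4, "play": 1}),
})
domains = {
    "outlook": ["sunny", "rainy"],
    "windy": ["yes", "no"],
    "humidity": ["high", "low"],
}

pseudo = generate_pseudo_data(tree, domains, random.Random(0))
```

In a distributed setting, each owner's tree would produce its own pseudo-records, and the union of those records would serve as training data for the "global" model. Only the trees are exchanged, never the original records.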