The challenge of privacy-preserving data mining lies in respecting privacy requirements while still discovering the interesting patterns or structures present in the original data. Existing methods either lose the correlations among attributes by transforming the attributes independently, or cannot guarantee the minimum abstraction level required by legal policies. In this paper, we propose a novel privacy-preserving transformation framework for distance-based mining operations, based on the concept of privacy-preserving MicroClusters that satisfy both a privacy constraint and a significance constraint. Our framework extends the robustness of the state-of-the-art k-anonymity model by introducing a privacy constraint (a minimum radius), while retaining its effectiveness through a significance constraint (a minimum number of corresponding data records). The privacy-preserving MicroClusters are made public for data mining purposes, while the original data records are kept private. We present efficient methods for generating and maintaining privacy-preserving MicroClusters and show that data mining operations such as clustering can easily be adapted to operate on the public data represented by MicroClusters instead of the private data records. The experimental evaluation demonstrates that the proposed methods achieve accurate clustering results while preserving privacy.
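The two constraints named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formal definition of a MicroCluster, so everything below is an assumption made for illustration: a BIRCH-style summary (record count, per-dimension linear sum, sum of squared norms), a radius taken as the root-mean-square distance of the records to the centroid, and the hypothetical names `MicroCluster`, `min_radius`, and `min_records`. It is not the paper's algorithm for generating or maintaining MicroClusters.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class MicroCluster:
    """Public summary of a group of private records (assumed BIRCH-style layout)."""
    n: int                    # number of summarized records
    linear_sum: List[float]   # per-dimension sum of the records
    square_sum: float         # sum of squared norms of the records

    @classmethod
    def from_records(cls, records: Sequence[Sequence[float]]) -> "MicroCluster":
        dim = len(records[0])
        linear_sum = [sum(r[d] for r in records) for d in range(dim)]
        square_sum = sum(x * x for r in records for x in r)
        return cls(len(records), linear_sum, square_sum)

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Assumed definition: RMS distance to the centroid,
        # sqrt(SS / n - ||LS / n||^2), clamped at 0 against rounding error.
        centroid_norm_sq = sum(c * c for c in self.centroid())
        return sqrt(max(self.square_sum / self.n - centroid_norm_sq, 0.0))

    def is_publishable(self, min_radius: float, min_records: int) -> bool:
        # Privacy constraint: the summary must not be too tight (minimum radius).
        # Significance constraint: it must cover enough records (minimum count).
        return self.radius() >= min_radius and self.n >= min_records


# Four private 2-D records; only the summary would be released.
private_records = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mc = MicroCluster.from_records(private_records)
print(mc.centroid(), mc.radius())
print(mc.is_publishable(min_radius=1.0, min_records=3))
```

Under these assumptions, a distance-based operation such as clustering would work on the centroids, counts, and radii of the published summaries rather than on the private records themselves.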