On static and dynamic methods for condensation-based privacy-preserving data mining

Authors:
Charu C. Aggarwal; Philip S. Yu
Affiliations:
IBM T. J. Watson Research Center, Hawthorne, NY (both authors)
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2008

Abstract

In recent years, privacy-preserving data mining has become an important problem because of the large amount of personal data that is tracked by many business applications. In many cases, users are unwilling to provide personal information unless the privacy of sensitive information is guaranteed. In this paper, we propose a new framework for privacy-preserving data mining of multidimensional data. Previous work on privacy-preserving data mining uses a perturbation approach that reconstructs data distributions in order to perform the mining. Such an approach treats each dimension independently and therefore ignores the correlations between the different dimensions. In addition, it requires the development of a new distribution-based algorithm for each data mining problem, since it takes aggregate distributions of the data as input rather than the multidimensional records themselves. This leads to a fundamental redesign of data mining algorithms. We develop a new and flexible approach to privacy-preserving data mining that does not require new problem-specific algorithms, since it maps the original data set into a new anonymized data set. The anonymized data closely match the characteristics of the original data, including the correlations among the different dimensions. We also show how to extend the method to data streams. We present empirical results illustrating the effectiveness of the method and its efficiency for data streams.
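
As an illustration only, the following Python sketch shows one way a mapping from original records to anonymized records could preserve inter-dimensional correlations. The abstract does not spell out the mechanism, so everything here is an assumption made for the sketch: the function names (`condense`, `group_stats`, `cholesky`, `anonymize`, `pearson`), the greedy nearest-neighbour grouping into groups of at least `k` records, and the Gaussian draw through a Cholesky factor of each group's covariance. It should not be read as the authors' algorithm.

```python
import math
import random

def condense(records, k):
    """Greedily split records into groups of at least k mutually close points.
    Assumes len(records) >= k; grouping rule is a hypothetical choice."""
    remaining = list(records)
    groups = []
    while len(remaining) >= 2 * k:
        seed = remaining.pop()
        remaining.sort(key=lambda r: math.dist(r, seed))
        groups.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    if remaining:
        groups.append(remaining)  # final group holds between k and 2k-1 records
    return groups

def group_stats(group):
    """Return (count, mean vector, covariance matrix) for one group."""
    n, d = len(group), len(group[0])
    mean = [sum(r[j] for r in group) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in group) / n
            for j in range(d)] for i in range(d)]
    return n, mean, cov

def cholesky(cov, jitter=1e-9):
    """Lower-triangular factor L with L * L^T ~= cov; jitter guards against
    degenerate (zero-variance) groups."""
    d = len(cov)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(cov[i][i] - s, 0.0) + jitter)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def anonymize(records, k, rng):
    """Replace each group by the same number of synthetic records drawn from
    a Gaussian with the group's mean and covariance (an assumed generator)."""
    out = []
    for group in condense(records, k):
        n, mean, cov = group_stats(group)
        L = cholesky(cov)
        d = len(mean)
        for _ in range(n):
            z = [rng.gauss(0.0, 1.0) for _ in range(d)]
            out.append(tuple(mean[i] + sum(L[i][m] * z[m] for m in range(i + 1))
                             for i in range(d)))
    return out

def pearson(points):
    """Pearson correlation between the first two dimensions of the points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    return sxy / math.sqrt(sxx * syy)

# Usage: two strongly correlated dimensions, anonymized with k = 10.
rng = random.Random(0)
data = [(x, x + rng.gauss(0.0, 0.5)) for x in (rng.uniform(0, 10) for _ in range(200))]
anon = anonymize(data, k=10, rng=random.Random(1))
print(pearson(data), pearson(anon))
```

Because the output is an ordinary set of multidimensional records, it can be passed to an unmodified mining algorithm, which is the property the abstract emphasizes. Drawing each dimension independently instead would discard the off-diagonal covariance terms that this sketch keeps.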