Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems
C4.5: programs for machine learning
C4.5: programs for machine learning
Generalizing data to provide anonymity when disclosing information (abstract)
PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Protecting Respondents' Identities in Microdata Release
IEEE Transactions on Knowledge and Data Engineering
Datafly: A System for Providing Anonymity in Medical Data
Proceedings of the IFIP TC11 WG11.3 Eleventh International Conference on Database Security XI: Status and Prospects
Achieving k-anonymity privacy protection using generalization and suppression
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Transforming data to satisfy privacy constraints
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Bottom-Up Generalization: A Data Mining Solution to Privacy Protection
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Top-Down Specialization for Information and Privacy Preservation
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Data Privacy through Optimal k-Anonymization
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
On the complexity of optimal K-anonymity
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Incognito: efficient full-domain K-anonymity
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Template-Based Privacy Preservation in Classification Problems
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Mondrian Multidimensional K-Anonymity
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
ℓ-Diversity: Privacy Beyond κ-Anonymity
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Personalized privacy preservation
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Anonymizing sequential releases
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
(α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Utility-based anonymization using local recoding
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Handicapping attacker's confidence: an alternative to k-anonymization
Knowledge and Information Systems
Integrating private databases for data analysis
ISI'05 Proceedings of the 2005 IEEE international conference on Intelligence and Security Informatics
On static and dynamic methods for condensation-based privacy-preserving data mining
ACM Transactions on Database Systems (TODS)
Anonymity for continuous data publishing
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Data privacy protection in multi-party clustering
Data & Knowledge Engineering
IDEAL '08 Proceedings of the 9th International Conference on Intelligent Data Engineering and Automated Learning
Privacy-preserving data mashup
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Privacy protection for RFID data
Proceedings of the 2009 ACM symposium on Applied Computing
Privately detecting bursts in streaming, distributed time series data
Data & Knowledge Engineering
Privacy-preserving data publishing for cluster analysis
Data & Knowledge Engineering
A brief survey on anonymization techniques for privacy preserving publishing of social network data
ACM SIGKDD Explorations Newsletter
Anonymizing healthcare data: a case study on the blood transfusion service
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Anonymizing location-based RFID data
C3S2E '09 Proceedings of the 2nd Canadian Conference on Computer Science and Software Engineering
Preserving Privacy in Time Series Data Classification by Discretization
MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
A novel anonymization algorithm: Privacy protection and knowledge preservation
Expert Systems with Applications: An International Journal
Walking in the crowd: anonymizing trajectory data for pattern analysis
Proceedings of the 18th ACM conference on Information and knowledge management
Privacy-preserving data publishing: A survey of recent developments
ACM Computing Surveys (CSUR)
A data perturbation approach to sensitive classification rule hiding
Proceedings of the 2010 ACM Symposium on Applied Computing
Centralized and Distributed Anonymization for High-Dimensional Healthcare Data
ACM Transactions on Knowledge Discovery from Data (TKDD)
A granular agent evolutionary algorithm for classification
Applied Soft Computing
Verification of data pattern for interactive privacy preservation model
Proceedings of the 2011 ACM Symposium on Applied Computing
Differentially private data release for data mining
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Privacy preservation for associative classification: an approximation algorithm
International Journal of Business Intelligence and Data Mining
Anonymity meets game theory: secure data integration with malicious participants
The VLDB Journal — The International Journal on Very Large Data Bases
Hiding emerging patterns with local recoding generalization
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
Information based data anonymization for classification utility
Data & Knowledge Engineering
The application of differential privacy to health data
Proceedings of the 2012 Joint EDBT/ICDT Workshops
Secure distributed framework for achieving ε-differential privacy
PETS'12 Proceedings of the 12th international conference on Privacy Enhancing Technologies
Preserving Privacy in Time Series Data Mining
International Journal of Data Warehousing and Mining
Incremental processing and indexing for k, e-anonymisation
International Journal of Information and Computer Security
A new tool for sharing and querying of clinical documents modeled using HL7 Version 3 standard
Computer Methods and Programs in Biomedicine
Improving accuracy of classification models induced from anonymized datasets
Information Sciences: an International Journal
The Journal of Supercomputing
Classification is a fundamental problem in data analysis. Training a classifier requires access to a large collection of data. Releasing person-specific data, such as customer data or patient records, may pose a threat to an individual's privacy. Even after explicit identifying information such as Name and SSN is removed, released records can still be linked back to individuals by matching some combination of nonidentifying attributes such as {Sex, Zip, Birthdate}. A useful approach to combating such linking attacks, called k-anonymization [1], anonymizes the linking attributes so that at least k released records match each value combination of the linking attributes. Previous work attempted to find an optimal k-anonymization that minimizes some data distortion metric. We argue that minimizing the distortion to the training data is not relevant to the classification goal, which requires extracting the structure of prediction on the "future" data. In this paper, we propose a k-anonymization solution for classification. Our goal is to find a k-anonymization, not necessarily optimal in the sense of minimizing data distortion, that preserves the classification structure. We conducted intensive experiments to evaluate the impact of anonymization on the classification of future data. Experiments on real-life data show that the quality of classification can be preserved even for highly restrictive anonymity requirements.
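The k-anonymity condition described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only checks whether a table already satisfies the condition and applies one hand-picked generalization. The helper names (`is_k_anonymous`, `generalize`), the toy records, and the choice to truncate Zip and Birthdate are all assumptions made for illustration.

```python
from collections import Counter

def is_k_anonymous(records, linking_attrs, k):
    """Return True if every combination of linking-attribute values that
    appears in `records` is shared by at least k records."""
    counts = Counter(tuple(r[a] for a in linking_attrs) for r in records)
    return all(c >= k for c in counts.values())

def generalize(record):
    """Hypothetical generalization: keep a 3-digit Zip prefix and the birth year."""
    out = dict(record)
    out["Zip"] = record["Zip"][:3] + "**"
    out["Birthdate"] = record["Birthdate"][:4]
    return out

# Made-up toy table; "Class" stands in for the label a classifier would predict.
raw = [
    {"Sex": "F", "Zip": "47677", "Birthdate": "1975-03-02", "Class": "yes"},
    {"Sex": "F", "Zip": "47678", "Birthdate": "1975-11-19", "Class": "yes"},
    {"Sex": "M", "Zip": "47602", "Birthdate": "1982-06-30", "Class": "no"},
    {"Sex": "M", "Zip": "47605", "Birthdate": "1982-01-14", "Class": "no"},
]

QID = ["Sex", "Zip", "Birthdate"]  # the linking attributes
released = [generalize(r) for r in raw]

print(is_k_anonymous(raw, QID, 2))       # False: every raw combination is unique
print(is_k_anonymous(released, QID, 2))  # True: each generalized combination covers 2 records
```

In this toy table the generalized groups still separate the two class labels, which is the kind of classification structure the abstract argues matters more than minimizing distortion.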