Releasing person-specific data can reveal sensitive information about individuals. k-anonymization is a promising privacy protection mechanism in data publishing. Although substantial research has been conducted on k-anonymization and its extensions in recent years, only a few prior works have considered releasing data for a specific data analysis purpose. This paper presents a practical data publishing framework for generating a masked version of data that preserves both individual privacy and information usefulness for cluster analysis. The major challenge of masking data for cluster analysis is the lack of class labels that could be used to guide the masking process. Our approach converts the problem into the counterpart problem for classification analysis, wherein class labels encode the cluster structure in the data, and presents a framework to evaluate the cluster quality on the masked data. Experiments on real-life data suggest that focusing on preserving cluster structure during masking yields significantly better cluster quality than masking without such a focus.
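The idea of letting class labels encode cluster structure can be illustrated with a small sketch. It is not the framework from the paper: the abstract does not specify the masking algorithm or the quality measure, so the generalization step (fixed-width age intervals), the purity score, and all names here (`is_k_anonymous`, `generalize_age`, `label_purity`, the toy `rows`) are assumptions made for illustration. The `cluster` field stands in for labels obtained by clustering the raw data before masking.

```python
from collections import Counter, defaultdict

def is_k_anonymous(rows, qid, k):
    """True if every combination of quasi-identifier values occurs in at least k rows."""
    counts = Counter(tuple(r[a] for a in qid) for r in rows)
    return all(c >= k for c in counts.values())

def generalize_age(rows, width):
    """Mask the 'age' attribute by replacing each value with an interval of the given width."""
    masked = []
    for r in rows:
        lo = (r["age"] // width) * width
        masked.append({**r, "age": f"[{lo}-{lo + width})"})
    return masked

def label_purity(rows, qid, label):
    """Fraction of rows whose label equals the majority label of their quasi-identifier group.

    Used here as a simple, assumed proxy for how much cluster structure survives masking.
    """
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[a] for a in qid)].append(r[label])
    agree = sum(Counter(labels).most_common(1)[0][1] for labels in groups.values())
    return agree / len(rows)

# Toy data: 'cluster' plays the role of class labels derived from the raw data.
rows = [
    {"age": 21, "cluster": 0},
    {"age": 24, "cluster": 0},
    {"age": 27, "cluster": 0},
    {"age": 41, "cluster": 1},
    {"age": 45, "cluster": 1},
    {"age": 48, "cluster": 1},
]
qid = ["age"]

masked = generalize_age(rows, width=20)    # moderate generalization
coarse = generalize_age(rows, width=100)   # everything collapses into one interval

print(is_k_anonymous(rows, qid, 3))        # raw data: every age is unique
print(is_k_anonymous(masked, qid, 3), label_purity(masked, qid, "cluster"))
print(is_k_anonymous(coarse, qid, 3), label_purity(coarse, qid, "cluster"))
```

In this toy case both masked versions satisfy 3-anonymity, but only the moderate generalization keeps the two clusters separable. This is the kind of trade-off that label-guided masking is meant to steer.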