Bottom-Up Generalization: A Data Mining Solution to Privacy Protection

  • Authors:
  • Ke Wang; Philip S. Yu; Sourav Chakraborty

  • Affiliations:
  • Simon Fraser University; IBM T. J. Watson Research Center; Simon Fraser University

  • Venue:
  • ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
  • Year:
  • 2004

Abstract

The well-known privacy-preserving data mining modifies existing data mining techniques to work on randomized data. In this paper, we investigate data mining as a technique for masking data, hence termed data mining based privacy protection. This approach partially incorporates the requirement of a targeted data mining task into the process of masking data so that essential structure is preserved in the masked data. The idea is simple but novel: we explore the data generalization concept from data mining as a way to hide detailed information, rather than to discover trends and patterns. Once the data is masked, standard data mining techniques can be applied without modification. Our work demonstrates another positive use of data mining technology: not only can it discover useful patterns, it can also mask private information. We consider the following privacy problem: a data holder wants to release a version of the data for building classification models, but wants to protect against linking the released data to an external source to infer sensitive information. We adapt an iterative bottom-up generalization from data mining to generalize the data. The generalized data remains useful for classification but becomes difficult to link to other sources. The generalization space is specified by a hierarchical structure of generalizations. A key step is identifying the best generalization for climbing up the hierarchy at each iteration. Enumerating all candidate generalizations is impractical. We present a scalable solution that examines at most one generalization per iteration for each attribute involved in the linking.
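
The iterative climb described in the abstract can be made concrete with a small sketch. The following Python code is not the paper's algorithm: the record format, the child-to-parent taxonomies, the k-anonymity-style threshold, and the ranking of candidates by raw anonymity gain are all assumptions standing in for the paper's actual data model and selection metric. It only illustrates the shape of the loop: at most one candidate generalization per linking attribute per iteration, apply the best one, and repeat until the anonymity requirement is met or the hierarchies are exhausted.

```python
from collections import Counter

def anonymity(records, qi):
    """Size of the smallest group of records that agree on all quasi-identifiers."""
    groups = Counter(tuple(r[a] for a in qi) for r in records)
    return min(groups.values())

def generalize(records, attr, taxonomy):
    """Replace each value of `attr` by its parent in the taxonomy (roots stay put)."""
    return [dict(r, **{attr: taxonomy.get(r[attr], r[attr])}) for r in records]

def bottom_up_generalize(records, qi, taxonomies, k):
    """Climb the generalization hierarchies bottom-up until the threshold k holds.

    Each iteration examines at most one candidate generalization per attribute
    in `qi` and greedily applies the one with the largest anonymity gain
    (a placeholder for the paper's actual selection metric).
    """
    while records and anonymity(records, qi) < k:
        current = anonymity(records, qi)
        best, best_gain = None, -1
        for attr in qi:
            candidate = generalize(records, attr, taxonomies[attr])
            if candidate == records:
                continue  # this attribute is already at the top of its hierarchy
            gain = anonymity(candidate, qi) - current
            if gain > best_gain:
                best, best_gain = candidate, gain
        if best is None:
            break  # every hierarchy is exhausted; the threshold cannot be met
        records = best
    return records

# Hypothetical toy data and taxonomies, purely for illustration.
taxonomies = {
    "age": {"25": "20-29", "27": "20-29", "34": "30-39", "20-29": "*", "30-39": "*"},
    "zip": {"53711": "537**", "53715": "537**", "537**": "*"},
}
records = [
    {"age": "25", "zip": "53711", "disease": "flu"},
    {"age": "27", "zip": "53715", "disease": "cold"},
    {"age": "34", "zip": "53711", "disease": "flu"},
]
print(bottom_up_generalize(records, ["age", "zip"], taxonomies, k=2))
```

On this toy input the crude gain criterion over-generalizes the age attribute before the zip generalization finally satisfies k = 2, which hints at why identifying the best generalization at each step, rather than any gain-producing one, is the key question the abstract highlights.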