On-the-fly generalization hierarchies for numerical attributes revisited

Authors:
Alina Campan;Nicholas Cooper;Traian Marius Truta
Affiliations:
Department of Computer Science, Northern Kentucky University, Highland Heights, KY;Department of Computer Science, Northern Kentucky University, Highland Heights, KY;Department of Computer Science, Northern Kentucky University, Highland Heights, KY
Venue:
SDM'11 Proceedings of the 8th VLDB international conference on Secure data management
Year:
2011

Abstract

Generalization hierarchies are frequently used in computer science, statistics, biology, bioinformatics, and other areas when less specific values are needed for data analysis. Generalization is also one of the most widely used disclosure control techniques for anonymizing data. For numerical attributes, generalization is performed either with predefined generalization hierarchies or with a hierarchy-free model. Because hierarchy-free generalization is not suitable for anonymization in every scenario, generalization hierarchies are of particular interest for data anonymization. Traditionally, these hierarchies were created by the data owner with help from domain experts. While it is feasible to construct a small hierarchy by hand, the effort increases for hierarchies that have many levels. Newer approaches therefore generate these numerical hierarchies automatically, on the fly. In this paper we extend an existing method for creating on-the-fly generalization hierarchies, we present several existing information loss measures used to assess the quality of anonymized data, and we run a series of experiments showing that our new method improves on existing methods for automatically generating on-the-fly numerical generalization hierarchies.
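
To make the idea of an automatically generated numerical hierarchy concrete, the following is a minimal illustrative sketch. It is not the method proposed in the paper, whose details the abstract does not give. It assumes a simple bottom-up strategy: start from the distinct values of the attribute and repeatedly merge the pair of adjacent intervals whose union is narrowest, until a single root interval remains. The names `Node`, `build_hierarchy`, and `generalize` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """One interval [low, high] in the hierarchy (hypothetical structure)."""
    low: float
    high: float
    children: List["Node"]


def build_hierarchy(values: List[float]) -> Node:
    """Assumed strategy: merge the adjacent pair with the narrowest union until one root remains."""
    if not values:
        raise ValueError("values must be non-empty")
    # Leaves: one degenerate interval per distinct value, in sorted order.
    nodes = [Node(v, v, []) for v in sorted(set(values))]
    while len(nodes) > 1:
        # Index of the adjacent pair whose merged interval would be narrowest.
        i = min(range(len(nodes) - 1),
                key=lambda j: nodes[j + 1].high - nodes[j].low)
        merged = Node(nodes[i].low, nodes[i + 1].high, [nodes[i], nodes[i + 1]])
        nodes[i:i + 2] = [merged]
    return nodes[0]


def generalize(root: Node, value: float, levels_up: int) -> Tuple[float, float]:
    """Return the interval covering `value`, `levels_up` steps above the deepest covering node."""
    path = [root]
    node = root
    while node.children:
        child = next((c for c in node.children if c.low <= value <= c.high), None)
        if child is None:
            break  # value falls in a gap between child intervals
        node = child
        path.append(node)
    target = path[max(0, len(path) - 1 - levels_up)]
    return (target.low, target.high)


root = build_hierarchy([20, 22, 30, 31, 50])
print(generalize(root, 30, 1))  # interval one level above the leaf for 30
```

In an anonymization setting, a larger `levels_up` replaces a value with a wider interval, trading precision for privacy. An information loss measure would then quantify that trade-off, and a real method would choose merges with such a measure in mind rather than by interval width alone.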