Data anonymization using an improved utility measurement

Authors:
Stuart Morton;Malika Mahoui;P. Joseph Gibson
Affiliations:
IUPUI, Indianapolis, IN, USA;IUPUI, Indianapolis, IN, USA;Marion County Public Health Department, Indianapolis, IN, USA
Venue:
Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
Year:
2012

Abstract

As medical data continues to move to electronic formats, researchers gain opportunities to use this microdata to discover patterns and build knowledge that improves patient care. Protecting the identities of the patients contained in these databases is therefore more critical than ever. Even after removing obvious "identifier" attributes that clearly identify a specific person, such as social security numbers or first and last names, it is possible to join "quasi-identifier" attributes from two or more publicly available databases to identify individuals. K-anonymity is an established approach for ensuring that no individual can be distinguished within a group of at least k individuals. Most proposed approaches have focused on improving the efficiency of the algorithms that implement k-anonymity; less emphasis has been placed on ensuring the "utility" of the anonymized data from a researcher's perspective. We propose a data utility measurement, called the research value (RV), which evaluates how well common cutoffs for numerical data or groupings in categorical data are preserved during the anonymization process. The proposed algorithm, which uses the new utility function, scales efficiently when the number of attributes is large, while still ensuring that the generalization process is dictated by the data content expert's assessment of the utility of the generalized data.
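
To make the two ideas in the abstract concrete, the following Python sketch checks the k-anonymity property over a set of quasi-identifiers and scores how well a generalization preserves researcher-chosen numeric cutoffs. The abstract does not give the RV formula, so `cutoff_preservation` is a hypothetical stand-in and not the authors' measure; the function names, the sample records, and the age cutoffs (18 and 65) are all assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that appears in `records` appears at least k times."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


def cutoff_preservation(intervals, cutoffs):
    """Fraction of generalized intervals [lo, hi) that do not straddle a cutoff.

    Hypothetical illustration only; this is NOT the paper's RV formula.
    An interval straddles a cutoff c when lo < c < hi, meaning a researcher
    can no longer tell which side of c a generalized value falls on.
    """
    if not intervals:
        return 1.0
    preserved = sum(
        1 for lo, hi in intervals if not any(lo < c < hi for c in cutoffs)
    )
    return preserved / len(intervals)


# Made-up, already generalized records (not data from the paper).
records = [
    {"age": "18-64", "zip": "462**", "diagnosis": "flu"},
    {"age": "18-64", "zip": "462**", "diagnosis": "asthma"},
    {"age": "65+", "zip": "462**", "diagnosis": "flu"},
    {"age": "65+", "zip": "462**", "diagnosis": "diabetes"},
]

two_anonymous = is_k_anonymous(records, ["age", "zip"], k=2)
three_anonymous = is_k_anonymous(records, ["age", "zip"], k=3)

# Age generalizations scored against assumed cutoffs at 18 and 65.
aligned_score = cutoff_preservation([(0, 18), (18, 65), (65, 120)], [18, 65])
misaligned_score = cutoff_preservation([(0, 30), (30, 70), (70, 120)], [18, 65])
```

In this sketch both age generalizations could yield equally sized groups, yet only the first keeps the assumed cutoffs intact. That is the kind of distinction a utility measure driven by a data content expert's cutoffs is meant to capture.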