Priority-Based k-anonymity accomplished by weighted generalisation structures

Authors:
Konrad Stark;Johann Eder;Kurt Zatloukal
Affiliations:
Institute of Pathology, Medical University Graz, Graz;Department of Knowledge and Business Engineering, University of Vienna, Wien;Institute of Pathology, Medical University Graz, Graz
Venue:
DaWaK'06 Proceedings of the 8th international conference on Data Warehousing and Knowledge Discovery
Year:
2006

Citing 5
Cited 2

Protecting Respondents' Identities in Microdata Release

IEEE Transactions on Knowledge and Data Engineering
Achieving k-anonymity privacy protection using generalization and suppression

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Bottom-Up Generalization: A Data Mining Solution to Privacy Protection

ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Top-Down Specialization for Information and Privacy Preservation

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Incognito: efficient full-domain K-anonymity

Proceedings of the 2005 ACM SIGMOD international conference on Management of data

GATiB-CSCW, Medical Research Supported by a Service-Oriented Collaborative System

CAiSE '08 Proceedings of the 20th international conference on Advanced Information Systems Engineering
On the use of economic price theory to find the optimum levels of privacy and information utility in non-perturbative microdata anonymisation

Data & Knowledge Engineering

Quantified Score

Hi-index	0.00

Visualization

Abstract

Biobanks are gaining in importance by storing large collections of patient’s clinical data (e.g. disease history, laboratory parameters, diagnosis, life style) together with biological materials such as tissue samples, blood or other body fluids. When releasing these patient-specific data for medical studies privacy protection has to be guaranteed for ethical and legal reasons. k-anonymity may be used to ensure privacy by generalising and suppressing attributes in order to release sufficient data twins that mask patients’ identities. However, data transformation techniques like generalisation may produce anonymised data unusable for medical studies because some attributes become too coarse-grained. We propose a priority-driven anonymisation technique that allows to specify the degree of acceptable information loss for each attribute separately. We use generalisation and suppression of attributes together with a weighting-scheme for quantifying generalisation steps. Our approach handles both numerical and categorical attributes and provides a data anonymisation based on priorities and weights. The anonymisation algorithm described in this paper has been implemented and tested on a carcinoma data set. We discuss some general privacy protecting methods for medical data and show some medical-relevant use cases that benefit from our anonymisation technique.