On the identity anonymization of high-dimensional rating data

Authors:
Xiaoxun Sun;Hua Wang;Yanchun Zhang
Affiliations:
Australian Council for Educational Research, Vic., Australia;University of Southern Queensland, Qld., Australia;Victoria University, Vic., Australia
Venue:
Concurrency and Computation: Practice & Experience
Year:
2012

Abstract

We study the challenges of protecting the privacy of individuals in large public survey rating data. Survey rating data usually contains ratings of both sensitive and non-sensitive issues, and the ratings of sensitive issues involve personal privacy. Although the survey participants do not reveal any of their ratings, their survey records are potentially identifiable by using information from other public sources. None of the existing anonymization principles (e.g. k-anonymity, l-diversity, etc.) can effectively prevent such breaches in large survey rating data sets. In this paper, we tackle the problem by defining a principle called (k, ε, l)-anonymity. The principle requires that, for each transaction t in the given survey rating data T, at least (k − 1) other transactions in T must have ratings similar to t, where the similarity is controlled by ε, and that the standard deviation of sensitive ratings is at least l. We propose a greedy approach to anonymize the survey rating data that scales almost linearly with the input size, and we apply the method to two real-life data sets to demonstrate its efficiency and practical utility. Copyright © 2011 John Wiley & Sons, Ltd.
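The (k, ε, l)-anonymity condition stated in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the similarity measure or the exact set over which the standard deviation is taken, so the following are assumptions made only for illustration: two transactions count as similar when every pair of corresponding non-sensitive ratings differs by at most ε, each transaction carries a single sensitive rating, and the (population) standard deviation is computed over a transaction together with its similar neighbours. The names `is_similar` and `satisfies_k_eps_l` are hypothetical and do not come from the paper.

```python
from statistics import pstdev


def is_similar(a, b, eps):
    """Assumed similarity test: all corresponding ratings differ by at most eps."""
    return all(abs(x - y) <= eps for x, y in zip(a, b))


def satisfies_k_eps_l(non_sensitive, sensitive, k, eps, l):
    """Brute-force check of the (k, eps, l) condition under the stated assumptions.

    non_sensitive: list of rating vectors, one per transaction
    sensitive:     list with one sensitive rating per transaction (assumption)
    """
    n = len(non_sensitive)
    for i in range(n):
        # Group = transaction i plus every transaction similar to it.
        group = [j for j in range(n)
                 if is_similar(non_sensitive[i], non_sensitive[j], eps)]
        # "At least k - 1 other transactions" means the group has at least k members.
        if len(group) < k:
            return False
        # Sensitive ratings in the group must be spread out by at least l.
        if pstdev([sensitive[j] for j in group]) < l:
            return False
    return True


if __name__ == "__main__":
    ratings = [[1, 2], [1, 3], [2, 2]]   # toy non-sensitive ratings
    secret = [1, 5, 3]                   # toy sensitive ratings
    print(satisfies_k_eps_l(ratings, secret, k=2, eps=1, l=1))
    print(satisfies_k_eps_l(ratings, secret, k=2, eps=1, l=2))
```

This check is quadratic in the number of transactions and only verifies the condition; it is not the greedy anonymization approach proposed in the paper, which the abstract does not describe in enough detail to reproduce.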