Protecting Respondents' Identities in Microdata Release. IEEE Transactions on Knowledge and Data Engineering.
k-anonymity: a model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems.
Achieving k-anonymity privacy protection using generalization and suppression. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems.
You are what you say: privacy risks of public mentions. SIGIR '06: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
L-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD).
M-invariance: towards privacy preserving re-publication of dynamic datasets. Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data.
Robust De-anonymization of Large Sparse Datasets. SP '08: Proceedings of the 2008 IEEE Symposium on Security and Privacy.
Myths and fallacies of "Personally Identifiable Information". Communications of the ACM.
Differential privacy: a survey of results. TAMC '08: Proceedings of the 5th International Conference on Theory and Applications of Models of Computation.
Quantitative information flow, with a view. ESORICS '11: Proceedings of the 16th European Conference on Research in Computer Security.
ICALP '06: Proceedings of the 33rd International Conference on Automata, Languages and Programming - Volume Part II.
On the feasibility of user de-anonymization from shared mobile sensor data. Proceedings of the Third International Workshop on Sensing Applications on Mobile Phones.
On the use of decentralization to enable privacy in web-scale recommendation services. Proceedings of the 12th ACM Workshop on Privacy in the Electronic Society.
There is a significant body of empirical work on statistical de-anonymization attacks against databases containing microdata about individuals, e.g., their preferences, movie ratings, or transaction data. Our goal is to analytically explain why such attacks work. Specifically, we analyze a variant of the Narayanan-Shmatikov algorithm that was used to effectively de-anonymize the Netflix database of movie ratings. We prove theorems characterizing mathematical properties of the database and of the auxiliary information available to the adversary that enable two classes of privacy attacks. In the first attack, the adversary successfully identifies the individual about whom she possesses auxiliary information (an isolation attack). In the second attack, the adversary learns additional information about the individual, although she may not be able to uniquely identify him (an information amplification attack). We demonstrate the applicability of the analytical results by empirically verifying that the mathematical properties assumed of the database actually hold for a significant fraction of the records in the Netflix movie ratings database, which contains ratings from about 500,000 users.
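To make the isolation attack concrete, the following is a minimal sketch of a Narayanan-Shmatikov-style scoring attack, not the authors' exact algorithm. It assumes the common formulation from the Netflix study: auxiliary information and records are item-to-rating maps, rarely rated items get higher weight, and a candidate is accepted only if its score stands out from the runner-up by an eccentricity threshold. The function names, the `phi` threshold, and the toy data are all illustrative assumptions.

```python
import math
import statistics

def score(aux, record, support):
    """Weighted similarity between the adversary's auxiliary information and a
    candidate record. Items rated by few users (low support) carry more weight,
    mirroring the rare-attribute weighting in Narayanan-Shmatikov-style scoring.
    `aux` and `record` map item -> rating; `support` maps item -> number of
    users who rated that item."""
    total = 0.0
    for item, aux_rating in aux.items():
        # Count an item as matching if the ratings agree within one star.
        if item in record and abs(aux_rating - record[item]) <= 1:
            total += 1.0 / math.log(2 + support[item])
    return total

def isolate(aux, database, support, phi=1.5):
    """Isolation-attack sketch: return the best-matching record id only if its
    score exceeds the runner-up's by `phi` standard deviations of all scores
    (the eccentricity test), i.e. the match clearly stands out from the crowd.
    Otherwise return None: the adversary cannot isolate a unique record."""
    scores = {rid: score(aux, rec, support) for rid, rec in database.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    sigma = statistics.pstdev(scores.values()) or 1.0  # avoid divide-by-zero
    if (scores[best] - scores[runner_up]) / sigma >= phi:
        return best
    return None

# Hypothetical toy database: one rare movie makes user u1 isolatable.
db = {
    "u1": {"rare_movie": 5, "hit_movie": 4},
    "u2": {"hit_movie": 4},
    "u3": {"hit_movie": 3, "other_hit": 2},
}
support = {"rare_movie": 3, "hit_movie": 400_000, "other_hit": 350_000}

# Auxiliary info including the rare item isolates u1; auxiliary info
# consisting only of a popular item matches everyone and isolates no one.
print(isolate({"rare_movie": 5, "hit_movie": 4}, db, support))  # u1
print(isolate({"hit_movie": 4}, db, support))                   # None
```

Note the role of the eccentricity test: it is what separates a true isolation from a best-effort guess, which is why auxiliary information about rare items is so much more dangerous than information about popular ones.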