Output privacy in data mining

Authors:
Ting Wang; Ling Liu
Affiliations:
Georgia Institute of Technology; Georgia Institute of Technology
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2011


Abstract

Privacy has been identified as a vital requirement in designing and implementing data mining systems. In general, privacy preservation demands protecting both input and output privacy: the former refers to sanitizing the raw data before mining is performed, while the latter refers to protecting the mining output (models or patterns) against malicious inference attacks. This article presents a systematic study of the problem of protecting output privacy in data mining, and in particular stream mining: (i) we highlight the importance of this problem by showing that even sufficient protection of input privacy does not guarantee output privacy; (ii) we present a general inference and disclosure model that exploits the intrawindow and interwindow privacy breaches in stream mining output; (iii) we propose a lightweight countermeasure that effectively eliminates these breaches without explicitly detecting them, while minimizing the loss of output accuracy; (iv) we further optimize the basic scheme by taking into account two types of semantic constraints, aiming to preserve utility-related semantics as far as possible while maintaining a hard privacy guarantee; (v) finally, we conduct an extensive experimental evaluation over both synthetic and real data to validate the efficacy of our approach.
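
To make the notion of an output inference attack concrete, here is a minimal sketch under stated assumptions. The abstract does not spell out the attack model, so this example assumes the mining output is a set of frequent itemsets with exact support counts, and that the adversary combines them with the inclusion-exclusion principle. The function name `derived_support`, the published counts, and the threshold `K` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def derived_support(include, exclude, support):
    """Support of the pattern "all items in `include` present, all items in
    `exclude` absent", derived from published itemset supports.

    Inclusion-exclusion:
        supp(I and not E) = sum over subsets S of E of (-1)^|S| * supp(I union S)
    """
    include = frozenset(include)
    exclude = tuple(exclude)
    total = 0
    for size in range(len(exclude) + 1):
        for subset in combinations(exclude, size):
            total += (-1) ** size * support[include | frozenset(subset)]
    return total


# Hypothetical mining output for one window: every published itemset is
# frequent, so each count looks harmless on its own.
published = {
    frozenset("a"): 40,
    frozenset("ab"): 30,
    frozenset("ac"): 25,
    frozenset("abc"): 16,
}

K = 5  # hypothetical threshold below which a pattern is considered sensitive

# Records containing a but neither b nor c: 40 - 30 - 25 + 16
hidden = derived_support("a", "bc", published)
if 0 < hidden < K:
    print(f"derived pattern matches only {hidden} record(s)")
```

In this sketch none of the published counts is small, yet the derived pattern matches a single record. This is one way an output can leak information even when each released value appears safe, and it applies within a single window; comparing outputs of overlapping windows would be a separate, analogous exercise.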