Discovering frequent patterns from data is a popular exploratory technique in data mining. However, if the data are sensitive (e.g., patient health records, user behavior records), releasing information about significant patterns or trends carries a substantial risk to privacy. This paper shows how one can accurately discover and release the most significant patterns, along with their frequencies, in a data set containing sensitive information, while providing rigorous privacy guarantees for the individuals whose information is stored there. We present two efficient algorithms for discovering the k most frequent patterns in a data set of sensitive records. Our algorithms satisfy differential privacy, a recently introduced definition that provides meaningful privacy guarantees in the presence of arbitrary external information. Differentially private algorithms require a degree of uncertainty in their output to preserve privacy. Our algorithms handle this by returning 'noisy' lists of patterns that are close to the actual list of the k most frequent patterns in the data. We define a new notion of utility that quantifies the output accuracy of private top-k pattern mining algorithms. In typical data sets, our utility criterion implies low false positive and false negative rates in the reported lists. We prove that our methods meet the new utility criterion; we also demonstrate the performance of our algorithms through extensive experiments on the transaction data sets from the FIMI repository. While the paper focuses on frequent pattern mining, the techniques developed here are relevant whenever the data mining output is a list of elements ordered according to an appropriately 'robust' measure of interest.
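To make the idea of a 'noisy' top-k list concrete, the following is a minimal sketch of one generic way to release k frequent itemsets with noisy counts under differential privacy. It is not a reproduction of the two algorithms the abstract refers to. It combines two standard building blocks: the exponential mechanism applied k times ("peeling") to pick itemsets, and Laplace noise added to the counts of the picked itemsets. All names (`private_top_k`, `support_counts`, `laplace`) and the even budget split are illustrative assumptions. The sketch also assumes that neighboring data sets differ by adding or removing one transaction (so each support count changes by at most 1), and that the item universe is public and small enough to enumerate candidates directly.

```python
import itertools
import math
import random


def support_counts(transactions, universe, max_len):
    """Support count of every itemset of size <= max_len over a fixed universe.

    Candidates come from the (assumed public) universe, not from the data,
    so the candidate set itself reveals nothing about the transactions.
    """
    candidates = [
        frozenset(combo)
        for size in range(1, max_len + 1)
        for combo in itertools.combinations(sorted(universe), size)
    ]
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def laplace(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_top_k(transactions, universe, k, epsilon, max_len=2, rng=None):
    """Return k (itemset, noisy_count) pairs using a total budget of epsilon."""
    if rng is None:
        rng = random.Random()
    counts = support_counts([frozenset(t) for t in transactions], universe, max_len)
    if k > len(counts):
        raise ValueError("k exceeds the number of candidate itemsets")

    eps_select = epsilon / 2.0   # assumed split: half for selection,
    eps_count = epsilon / 2.0    # half for releasing the counts
    eps_round = eps_select / k   # sequential composition over k rounds

    remaining = dict(counts)
    chosen = []
    for _ in range(k):
        items = list(remaining)
        top = max(remaining.values())
        # Exponential mechanism with utility = support count (sensitivity 1).
        # Subtracting `top` only rescales the weights and avoids overflow.
        weights = [math.exp(eps_round * (remaining[c] - top) / 2.0) for c in items]
        pick = rng.choices(items, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]

    # One transaction changes each of the k released counts by at most 1,
    # so the L1 sensitivity of the count vector is at most k.
    scale = k / eps_count
    return [(set(c), counts[c] + laplace(scale, rng)) for c in chosen]


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]
    for itemset, noisy in private_top_k(data, {"a", "b", "c"}, k=2, epsilon=1.0):
        print(sorted(itemset), round(noisy, 2))
```

Enumerating every candidate itemset costs time exponential in `max_len`, so this sketch suits only toy inputs. With a small `epsilon` the returned list can differ noticeably from the true top k, which is the kind of inaccuracy a utility criterion for private top-k mining has to quantify.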