We investigate situations where releasing frequent sequential patterns can compromise individuals' privacy. We propose two concrete objectives for privacy protection: k-anonymity and α-dissociation. The first addresses the problem of inferring patterns with very low support, say in [1, k]; such inferred patterns can serve as quasi-identifiers in linking attacks. We show that, for all but one definition of support, it is impossible to reliably infer support values for patterns with two or more negative items (items that do not occur in a pattern) solely from the released frequent sequential patterns. For the remaining definition, we formulate privacy inference channels. α-dissociation addresses the problem of inferring sensitive attribute values with high certainty. To remove privacy threats with respect to these two objectives, we show that it suffices to examine pairs of sequential patterns whose lengths differ by 1. We then present a Privacy Inference Channels Sanitisation (PICS) algorithm which, as our experiments illustrate, reduces the privacy disclosure risk carried by frequent sequential patterns at a small computational overhead.
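To make the pattern-pair check concrete, below is a minimal sketch, not the authors' implementation, of how inference channels between patterns with length difference 1 could be detected. It assumes patterns are tuples of items, supports are absolute counts, and the support of a pattern extended by one negative item obeys supp(P, ¬x) = supp(P) − supp(P + x) under a straightforward support definition; for simplicity only the pair (pattern minus its last item, pattern) is examined, and all names (find_inference_channels, k, alpha, sensitive) are hypothetical.

```python
# Hedged sketch: flagging privacy inference channels among frequent
# sequential patterns, assuming absolute support counts and the identity
#   supp(P, NOT x) = supp(P) - supp(P + x)
# for a single appended negative item. Names are illustrative, not from
# the paper.

def find_inference_channels(supports, k, alpha, sensitive):
    """supports: dict mapping pattern (tuple of items) -> absolute support.
    Flags two kinds of threats:
      * k-anonymity: an inferable negative-item pattern with support in
        [1, k), i.e. a potential quasi-identifier for linking attacks;
      * alpha-dissociation: a sensitive item inferred from its prefix
        pattern with confidence >= alpha.
    Only pattern pairs whose lengths differ by 1 are examined.
    """
    channels = []
    for pattern, supp in supports.items():
        if len(pattern) < 2:
            continue
        prefix, last = pattern[:-1], pattern[-1]
        if prefix not in supports:
            continue
        prefix_supp = supports[prefix]
        # Inferred support of "prefix followed by NOT last".
        neg_supp = prefix_supp - supp
        if 1 <= neg_supp < k:
            channels.append(("k-anonymity", prefix, last, neg_supp))
        # Certainty of inferring the (sensitive) last item from the prefix.
        confidence = supp / prefix_supp
        if last in sensitive and confidence >= alpha:
            channels.append(("alpha-dissociation", prefix, last, confidence))
    return channels


if __name__ == "__main__":
    # Toy pattern set: supports are absolute counts over some database.
    supports = {
        ("a",): 100,
        ("a", "b"): 97,     # supp(a, NOT b) = 3, below k  -> channel
        ("a", "c"): 60,     # supp(a, NOT c) = 40          -> safe
        ("c",): 80,
        ("c", "hiv"): 76,   # confidence 0.95 >= alpha     -> channel
    }
    for channel in find_inference_channels(supports, k=5, alpha=0.9,
                                           sensitive={"hiv"}):
        print(channel)
```

A sanitisation step in the spirit of PICS would then perturb or suppress supports until this detector returns no channels; the sketch only covers detection, since the abstract does not specify the sanitisation operations.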