Frequent sequential pattern mining is a central task in many fields, such as biology and finance. However, the release of these patterns raises growing concerns about individual privacy. In this paper, we study the sequential pattern mining problem under the differential privacy framework, which provides formal and provable privacy guarantees. The problem is particularly challenging because the differential privacy mechanism perturbs frequency results with noise and because the pattern space is high-dimensional. We propose a novel two-phase algorithm for mining both prefix and substring patterns. In the first phase, our approach exploits statistical properties of the data to construct a model-based prefix tree, which is used to mine prefixes and a candidate set of substring patterns. In the second phase, the frequencies of the candidate substring patterns are refined using a novel transformation of the original data that reduces the perturbation noise. Extensive experiments on real datasets show that our approach is effective for mining both substring and prefix patterns in comparison with state-of-the-art solutions.
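To make the noise-perturbation step concrete, the following is a minimal sketch of a baseline Laplace mechanism applied to prefix counts. It is not the two-phase algorithm proposed in the paper. The function names, the even split of the privacy budget across prefix lengths, and the explicit `alphabet` parameter are all assumptions made for illustration. Noise is added to every possible prefix, including those with a true count of zero, which is what the plain mechanism requires and which also shows why a large pattern space is costly.

```python
import itertools
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def prefix_counts(sequences, length):
    """Count how many sequences start with each prefix of the given length."""
    return Counter(tuple(s[:length]) for s in sequences if len(s) >= length)


def noisy_prefix_counts(sequences, alphabet, max_len, epsilon, seed=None):
    """Baseline sketch (an assumption, not the paper's method).

    For a fixed length, each sequence contributes to at most one prefix,
    so adding or removing one sequence changes one count by 1
    (sensitivity 1). The budget is split evenly over the lengths.
    """
    rng = random.Random(seed)
    scale = max_len / epsilon  # 1 / (epsilon / max_len)
    result = {}
    for length in range(1, max_len + 1):
        true_counts = prefix_counts(sequences, length)
        # Enumerate the whole domain so zero-count prefixes get noise too.
        for prefix in itertools.product(alphabet, repeat=length):
            result[prefix] = true_counts.get(prefix, 0) + laplace_noise(scale, rng)
    return result


if __name__ == "__main__":
    data = ["ab", "ac", "abc", "b"]  # toy sequences, purely illustrative
    noisy = noisy_prefix_counts(data, "abc", max_len=2, epsilon=1.0, seed=0)
    for prefix, value in sorted(noisy.items()):
        print("".join(prefix), round(value, 2))
```

The number of noisy counts grows as the alphabet size raised to the prefix length, and the noise scale grows with the number of lengths sharing the budget. This is the difficulty the abstract attributes to the high dimensionality of the pattern space.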