Hiding informative association rule sets

  • Authors:
  • Shyue-Liang Wang, Bhavesh Parikh, Ayat Jafari

  • Affiliations:
  • Department of Computer Science, New York Institute of Technology, 1855 Broadway, New York, New York 10023, USA (all authors)

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2007


Abstract

Privacy-preserving data mining is a novel research direction in data mining and statistical databases, in which data mining algorithms are analyzed for the side effects they incur on data privacy [Verykios, V., Bertino, E., Fovino, I. G., Provenza, L. P., Saygin, Y., & Theodoridis, Y. (2004). State-of-the-art in privacy preserving data mining. SIGMOD Record 33(1), 50-57]. For example, through data mining one may infer sensitive information, including personal information or even patterns, from non-sensitive or unclassified data. Two types of privacy have been studied in data mining. The first, called output privacy, minimally alters the data so that the mining result does not disclose certain private information. The second, called input privacy, manipulates the data so that the mining result is unaffected or only minimally affected. In output privacy for hiding association rules, current approaches require the rules or patterns to be hidden to be given in advance [Dasseni, E., Verykios, V., Elmagarmid, A., & Bertino, E. (2001). Hiding association rules by using confidence and support. In Proceedings of 4th information hiding workshop, Pittsburgh, PA (pp. 369-383); Oliveira, S., & Zaiane, O. (2002). Privacy preserving frequent itemset mining. In Proceedings of IEEE international conference on data mining, November (pp. 43-54); Oliveira, S., & Zaiane, O. (2003). Algorithms for balancing privacy and knowledge discovery in association rule mining. In Proceedings of 7th international database engineering and applications symposium (IDEAS03), Hong Kong, July; Oliveira, S., & Zaiane, O. (2003). Protecting sensitive knowledge by data sanitization. In Proceedings of IEEE international conference on data mining, November 2003; Saygin, Y., Verykios, V., & Clifton, C. (2001). Using unknowns to prevent discovery of association rules. SIGMOD Record 30(4), 45-54; Verykios, V., Elmagarmid, A., Bertino, E., Saygin, Y., & Dasseni, E. (2004). Association rules hiding. IEEE Transactions on Knowledge and Data Engineering 16(4), 434-447]. This selection of rules requires the data mining process to be executed first; based on the discovered rules and the privacy requirements, the rules or patterns to hide are then selected manually. However, for some applications we are interested in hiding certain constrained classes of association rules, such as informative association rule sets [Li, J., Shen, H., & Topor, R. (2001). Mining the smallest association rule set for predictions. In Proceedings of the 2001 IEEE international conference on data mining (pp. 361-368)]. To hide such rule sets, the pre-process of finding the rules to hide can be integrated into the hiding process itself, as long as the predicting items are given. In this work, we propose two algorithms, ISL (Increase Support of LHS) and DSR (Decrease Support of RHS), that automatically hide informative association rule sets without pre-mining and manual selection of the rules to hide. Examples illustrating the proposed algorithms are given. Numerical experiments are performed to show the various effects of the algorithms. Recommendations on appropriate usage of the proposed algorithms based on the characteristics of databases are reported.
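The two hiding strategies named in the abstract can be illustrated with the standard confidence identity, confidence(X → Y) = support(X ∪ Y) / support(X): raising the denominator (ISL) or lowering the numerator (DSR) both push a rule's confidence below the mining threshold. The following is a minimal sketch of that idea, not the authors' published pseudocode; the function names, the set-of-items transaction representation, and the choice of which transactions to modify are assumptions made for illustration.

```python
# Hedged sketch of the ISL/DSR hiding idea, assuming transactions are
# Python sets of item labels. confidence(X -> Y) = support(X|Y) / support(X).

def support(db, itemset):
    """Fraction of transactions containing every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in db if itemset <= t) / len(db)

def confidence(db, lhs, rhs):
    denom = support(db, lhs)
    return support(db, set(lhs) | set(rhs)) / denom if denom else 0.0

def isl(db, lhs, rhs, min_conf):
    """Increase Support of LHS: add the LHS items to transactions that fully
    contain neither LHS nor RHS. This raises support(LHS) while leaving
    support(LHS | RHS) unchanged, so confidence falls."""
    for t in db:
        if confidence(db, lhs, rhs) < min_conf:
            break                      # rule is hidden; stop modifying data
        if not set(lhs) <= t and not set(rhs) <= t:
            t |= set(lhs)
    return db

def dsr(db, lhs, rhs, min_conf):
    """Decrease Support of RHS: remove one RHS item from transactions that
    fully support the rule, lowering support(LHS | RHS) and hence confidence."""
    victim = next(iter(rhs))           # arbitrary RHS item to drop
    for t in db:
        if confidence(db, lhs, rhs) < min_conf:
            break                      # rule is hidden; stop modifying data
        if set(lhs) | set(rhs) <= t:
            t.discard(victim)
    return db
```

As the abstract notes, the two strategies perturb the data differently: ISL inserts items (never deleting any), while DSR deletes items, so their side effects on other, non-hidden rules differ with the characteristics of the database.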