An integer programming approach for frequent itemset hiding

Authors:
Aris Gkoulalas-Divanis;Vassilios S. Verykios
Affiliations:
University of Thessaly, Volos, GREECE;University of Thessaly, Volos, GREECE
Venue:
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Year:
2006

Abstract

The rapid growth of transactional data soon drew attention to the need for its further exploitation. In this paper, we investigate the problem of preventing sensitive knowledge from being exposed in patterns extracted during association rule mining. Instead of hiding the produced rules directly, we hide the sensitive frequent itemsets that may lead to the production of these rules. As a first step, we introduce the notion of distance between two databases and a measure for quantifying it. By minimizing the distance between the original database and its sanitized version (which can safely be released), we propose a novel, exact algorithm for association rule hiding and evaluate it on real-world datasets, demonstrating its effectiveness in solving the problem.
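
The idea of hiding sensitive itemsets while staying as close as possible to the original database can be illustrated on a toy example. The sketch below is not the paper's algorithm: the abstract does not spell out the distance measure or the integer program, so the code assumes that distance is the number of item-level differences between aligned transactions, and it replaces the integer programming solver with an exhaustive search that is only feasible for very small inputs. The names `support`, `distance`, and `hide_exact` are hypothetical.

```python
from itertools import combinations


def support(db, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def distance(db_a, db_b):
    """Assumed distance: item-level differences between aligned transactions."""
    return sum(len(a ^ b) for a, b in zip(db_a, db_b))


def hide_exact(db, sensitive, min_support):
    """Exhaustive stand-in for an exact solver (not integer programming).

    Tries removal sets of increasing size and returns the first sanitized
    database in which every sensitive itemset has support < min_support,
    so the number of removed items is minimal.
    """
    # Only removing an item of a sensitive itemset from a transaction that
    # supports that itemset can lower its support.
    candidates = sorted(
        {(i, item)
         for i, t in enumerate(db)
         for s in sensitive if s <= t
         for item in s}
    )
    for k in range(len(candidates) + 1):
        for removal in combinations(candidates, k):
            sanitized = [set(t) for t in db]
            for i, item in removal:
                sanitized[i].discard(item)
            if all(support(sanitized, s) < min_support for s in sensitive):
                return sanitized
    return None
```

Because candidate removal sets are tried in order of increasing size, the first feasible result minimizes the assumed distance. The search is exponential in the number of candidate removals, which is the kind of cost a proper integer programming formulation is meant to avoid.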