In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method protects subject-specific sensitive data by anonymizing it before it is released for data mining, and it requires that every tuple in the released table be indistinguishable from no fewer than k subjects. The most common approach for achieving compliance with k-anonymity is to replace certain values with less specific but semantically consistent values. In this paper we propose a different approach: achieving k-anonymity by partitioning the original dataset into several projections, each of which adheres to k-anonymity. Moreover, any attempt to rejoin the projections results in a table that still complies with k-anonymity. A classifier is trained on each projection, and an unlabelled instance is subsequently classified by combining the classifications of all classifiers. Guided by classification accuracy and k-anonymity constraints, the proposed data mining privacy by decomposition (DMPD) algorithm uses a genetic algorithm to search for an optimal feature set partitioning. Ten separate datasets were evaluated with DMPD in order to compare its classification performance with that of other k-anonymity-based methods. The results suggest that DMPD performs better than existing k-anonymity-based algorithms and does not require domain-dependent knowledge. Using multiobjective optimization methods, we also examine the tradeoff between the two conflicting objectives in PPDM: privacy and predictive performance.
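To make the per-projection constraint concrete, the following is a minimal Python sketch rather than the DMPD algorithm itself. It assumes a table held as a list of dictionaries and a known list of quasi-identifier attributes; the function names, the toy attributes (`age`, `zip`, `sex`), and the data are hypothetical. It checks only that each projection is k-anonymous on its own; it does not verify the stronger property that rejoined projections remain k-anonymous, and it does not perform the genetic search or the classifier combination.

```python
from collections import Counter

def is_k_anonymous(rows, attrs, k):
    """Return True if every value combination over attrs occurs at least k times."""
    if not attrs:
        # A projection with no quasi-identifiers exposes nothing to link on.
        return True
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    return all(count >= k for count in counts.values())

def partition_is_k_anonymous(rows, partition, quasi_identifiers, k):
    """Check each projection separately on the quasi-identifiers it contains."""
    qi = set(quasi_identifiers)
    return all(
        is_k_anonymous(rows, [a for a in subset if a in qi], k)
        for subset in partition
    )

# Hypothetical toy table: every full row is unique, so it is not 2-anonymous.
rows = [
    {"age": 30, "zip": "100", "sex": "F"},
    {"age": 30, "zip": "200", "sex": "F"},
    {"age": 40, "zip": "100", "sex": "M"},
    {"age": 40, "zip": "200", "sex": "M"},
]
quasi_identifiers = ["age", "zip", "sex"]

# Hypothetical feature set partitioning into two projections.
partition = [["age", "sex"], ["zip"]]

full_table_ok = is_k_anonymous(rows, quasi_identifiers, 2)
partition_ok = partition_is_k_anonymous(rows, partition, quasi_identifiers, 2)
```

In this toy table the full set of attributes fails the 2-anonymity check, while each of the two projections passes it. A search procedure such as the genetic algorithm described above could use a check of this kind as a feasibility test alongside classification accuracy.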