The recent proliferation of data mining tools for the analysis of large volumes of data has been accompanied by little attention to individual privacy issues. Here, we introduce methods aimed at finding a balance between individuals' right to privacy and data miners' need to find general patterns in huge volumes of detailed records. In particular, we focus on the data-mining task of classification with decision trees. We base our security-control mechanism on noise-addition techniques used in statistical databases because (1) the multidimensional matrix model of statistical databases and the multidimensional cubes of On-Line Analytical Processing (OLAP) are essentially the same, and (2) noise-addition techniques are very robust. The main drawback of noise-addition techniques in the context of statistical databases is the low statistical quality of the released statistics. We argue that in data mining the major requirement of a security-control mechanism (in addition to protecting privacy) is not to ensure precise and bias-free statistics, but rather to preserve the high-level descriptions of knowledge constructed by artificial data mining tools.
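The last point can be illustrated with a toy sketch. It is not the mechanism proposed in the paper: it assumes a single numeric attribute, bounded uniform noise, and a one-level decision tree (a decision stump), and the names `add_noise` and `best_split` are hypothetical helpers introduced only for this example. It shows that every individual value can be distorted while the split point learned from the released data stays close to the one learned from the original data.

```python
import random


def add_noise(values, half_width, rng):
    """Return a perturbed copy: each value gets independent uniform noise
    drawn from [-half_width, half_width). (Illustrative assumption; real
    noise-addition schemes may use other distributions.)"""
    return [v + rng.uniform(-half_width, half_width) for v in values]


def best_split(xs, labels):
    """Find the threshold of a decision stump on one numeric attribute.

    Candidate thresholds are midpoints between consecutive distinct sorted
    values. Each side predicts its majority class (labels are 0 or 1), and
    the threshold with the fewest misclassified records wins.
    Returns (threshold, errors).
    """
    pairs = sorted(zip(xs, labels))
    best_t, best_err = None, len(pairs) + 1
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between two equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        err = (min(left.count(0), left.count(1))
               + min(right.count(0), right.count(1)))
        if err < best_err:
            best_t, best_err = (lo + hi) / 2, err
    return best_t, best_err


# Hypothetical data: an "age" attribute and a binary class label.
ages = list(range(20, 40)) + list(range(60, 80))
labels = [0] * 20 + [1] * 20

rng = random.Random(0)  # fixed seed so the sketch is repeatable
released = add_noise(ages, 5.0, rng)

t_orig, err_orig = best_split(ages, labels)
t_rel, err_rel = best_split(released, labels)
print("split on original data:", t_orig, "errors:", err_orig)
print("split on released data:", t_rel, "errors:", err_rel)
```

In this toy setting the two classes are separated by a gap wider than twice the noise bound, so the stump built from the released values still separates them without error and its threshold moves by at most the noise bound. With narrower gaps or unbounded noise the pattern would be preserved only approximately.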