Testing of database applications is of great importance. Although various studies have investigated testing techniques for database design, relatively few efforts have explicitly addressed the testing of database applications, which requires a large amount of representative data. Because testing over live production databases is often infeasible, owing to the high risk of disclosing confidential information or incorrectly updating real data, in this paper we investigate the problem of generating synthetic databases based on a priori knowledge about production databases. Our approach is to fit the general location model using various characteristics (e.g., constraints, statistics, rules) extracted from a production database and then to generate synthetic data using the learned model. The generated data is valid and similar to the real data in terms of statistical distribution, so it can be used for functional and performance testing. Because the extracted characteristics may contain information that attackers could use to derive confidential information about individuals, we present a disclosure analysis method that applies cell suppression to limit identity disclosure and perturbation to limit value disclosure.
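As an illustration of the modeling step, the following is a minimal sketch of a general location model, not the paper's implementation. It assumes the standard formulation: categorical attributes define cells with multinomial probabilities, and the continuous attributes within each cell are multivariate normal with a cell-specific mean and a covariance shared across cells. The function names `fit_general_location_model` and `generate_synthetic` are hypothetical, and the statistics are computed directly from a toy table rather than extracted from a production database as the abstract describes.

```python
import numpy as np


def fit_general_location_model(cats, conts):
    """Fit a general location model (illustrative sketch).

    cats:  (n,) array of cell labels (one label per combination of
           categorical attribute values).
    conts: (n, p) array of continuous attribute values.
    """
    cats = np.asarray(cats)
    conts = np.asarray(conts, dtype=float)
    n = conts.shape[0]

    # Multinomial part: relative frequency of each cell.
    cells, counts = np.unique(cats, return_counts=True)
    probs = counts / n

    # Map each row to the index of its cell (cells is sorted).
    idx = np.searchsorted(cells, cats)

    # Normal part: one mean vector per cell, one pooled covariance.
    means = np.vstack([conts[idx == k].mean(axis=0) for k in range(len(cells))])
    resid = conts - means[idx]
    cov = resid.T @ resid / n  # maximum-likelihood pooled estimate

    return cells, probs, means, cov


def generate_synthetic(model, size, rng):
    """Draw synthetic rows from a fitted model."""
    cells, probs, means, cov = model
    # Sample a cell for each row, then add normal noise around the cell mean.
    idx = rng.choice(len(cells), size=size, p=probs)
    noise = rng.multivariate_normal(np.zeros(cov.shape[0]), cov, size=size)
    return cells[idx], means[idx] + noise


# Toy data: two cells, two continuous attributes (values are made up).
toy_cats = np.array([0, 0, 1, 1])
toy_conts = np.array([[1.0, 2.0], [3.0, 2.0], [10.0, 21.0], [10.0, 19.0]])

model = fit_general_location_model(toy_cats, toy_conts)
syn_cats, syn_conts = generate_synthetic(model, size=5, rng=np.random.default_rng(0))
```

A sketch like this covers only the statistical fit and sampling. Constraints, rules, and the disclosure analysis (for example, suppressing cells with very small counts before releasing `probs`) would need to be layered on top.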