Multidimensional data models form the core of modern decision support software. The need for this kind of software is significant, and it continues to grow with the size and variety of datasets being collected today. Yet real multidimensional instances are often unavailable for testing and benchmarking, and existing data generators can only produce a limited class of such structures. In this paper, we present a new framework for scalable generation of test data from a rich class of multidimensional models. The framework provides a small, expressive language for specifying such models, and a novel solver for generating sample data from them. While the satisfiability problem for the language is NP-hard, we identify a polynomially solvable fragment that captures most practical modeling patterns. Given a model and, optionally, a statistical specification of the desired test dataset, the solver detects and instantiates a maximal subset of the model within this fragment, generating data that exhibits the desired statistical properties. We use our framework to generate a variety of high-quality test datasets from real industrial models that existing data generators cannot instantiate correctly and that general-purpose constraint solvers cannot solve as effectively.
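To make the idea of generating multidimensional test data under a statistical specification concrete, the following is a minimal Python sketch. It is not the paper's modeling language or solver, neither of which is described here. It assumes a toy model with one strict City -> Country hierarchy and a per-city weight table standing in for the statistical specification. All names (`generate_dimension`, `generate_facts`, `total_by_country`, `city_weights`) are hypothetical.

```python
import random
from collections import Counter

# Illustrative toy only: NOT the framework's language or solver.
# All function and parameter names are hypothetical.

def generate_dimension(n_countries, cities_per_country):
    """Build a strict City -> Country hierarchy (each city has one parent)."""
    rollup = {}
    for c in range(n_countries):
        for k in range(cities_per_country):
            rollup[f"city_{c}_{k}"] = f"country_{c}"
    return rollup

def generate_facts(rollup, n_facts, city_weights, seed=0):
    """Draw facts whose city frequencies follow the given relative weights."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    cities = sorted(rollup)
    # Cities without an explicit weight default to 1.0.
    weights = [city_weights.get(city, 1.0) for city in cities]
    facts = []
    for _ in range(n_facts):
        city = rng.choices(cities, weights=weights, k=1)[0]
        facts.append({"city": city, "amount": rng.randint(1, 100)})
    return facts

def total_by_country(facts, rollup):
    """Aggregate amounts from the city level up to the country level."""
    totals = Counter()
    for fact in facts:
        totals[rollup[fact["city"]]] += fact["amount"]
    return totals

rollup = generate_dimension(n_countries=2, cities_per_country=3)
facts = generate_facts(rollup, n_facts=1000, city_weights={"city_0_0": 5.0})
country_totals = total_by_country(facts, rollup)
```

Because every city rolls up to exactly one country in this toy hierarchy, the country totals add up to the grand total of the facts. Constraints of this kind are what a generator has to preserve for aggregates over the generated data to be meaningful.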