Recently, work on several important relational database tasks, such as index selection, histogram tuning, approximate query processing, and statistics selection, has recognized the importance of leveraging workloads. Often these tasks are presented with large workloads, i.e., sets of SQL DML statements, as input. A key factor affecting the scalability of such tasks is the size of the workload. In this paper, we present the novel problem of workload compression, which helps improve the scalability of such tasks. We present a principled solution to this challenging problem. Our solution is broadly applicable to a variety of workload-driven tasks, while allowing for incorporation of task-specific knowledge. We have implemented this solution, and our experiments illustrate its effectiveness in the context of two workload-driven tasks: index selection and approximate query processing.
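To make the notion of a compressed workload concrete, the following is a minimal sketch of one naive baseline, not the solution presented in the paper: statements that differ only in their constants are collapsed into a single representative carrying a weight equal to the number of statements it stands for. The function names `signature` and `compress_workload`, the regular expressions, and the sample statements are all assumptions made for illustration. This kind of grouping is lossy, since statements with the same shape but different constants can have very different costs, and it incorporates no task-specific knowledge.

```python
import re
from collections import Counter

# Hypothetical normalization rules (assumptions, not from the paper):
# quoted strings and standalone numbers are treated as constants.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMERIC_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")
_WHITESPACE = re.compile(r"\s+")


def signature(statement: str) -> str:
    """Return a template for the statement with constants replaced by '?'."""
    s = _STRING_LITERAL.sub("?", statement)
    s = _NUMERIC_LITERAL.sub("?", s)
    # Collapse whitespace and ignore case so formatting differences do not matter.
    return _WHITESPACE.sub(" ", s).strip().lower()


def compress_workload(statements):
    """Keep the first statement seen for each signature, weighted by its count.

    Returns a list of (representative_statement, weight) pairs in first-seen order.
    """
    counts = Counter()
    representative = {}
    for stmt in statements:
        sig = signature(stmt)
        counts[sig] += 1
        representative.setdefault(sig, stmt)
    return [(representative[sig], counts[sig]) for sig in representative]


# Illustrative workload (made-up statements).
workload = [
    "SELECT * FROM orders WHERE id = 1",
    "SELECT * FROM orders WHERE id = 2",
    "select *  from orders where id = 37",
    "UPDATE orders SET status = 'paid' WHERE id = 5",
]
compressed = compress_workload(workload)
for stmt, weight in compressed:
    print(weight, stmt)
```

A workload-driven task such as index selection could then be run on the weighted representatives instead of the full input. Whether the result stays close to what the full workload would produce depends on the task, which is why a principled approach that admits task-specific knowledge is needed.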