Data analysts need to understand the quality of data in the warehouse, and they often assess it by issuing many Group By queries over the sets of columns of interest. Because warehouse tables can be large and often contain many columns, this analysis typically requires executing a large number of Group By queries, which can be expensive. We show that today's database systems perform inadequately for this kind of analysis. We also show that the problem is computationally hard, and we develop efficient techniques for solving it. We demonstrate significant speedups over existing approaches on today's commercial database systems.
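
To make the workload concrete, the following is a minimal sketch of the naive approach the abstract describes: one Group By query per column set of interest. It is not the technique developed in the paper. The table `sales`, its columns, and the helper `profile_column_sets` are hypothetical names chosen for illustration, and SQLite stands in for a warehouse engine.

```python
import sqlite3
from itertools import combinations


def profile_column_sets(conn, table, column_sets):
    """Run one GROUP BY query per column set; return {column_set: rows}.

    Hypothetical helper for illustration. Table and column names are
    interpolated directly, so they must come from trusted input.
    """
    results = {}
    for cols in column_sets:
        col_list = ", ".join(cols)
        # Naive form: every column set is answered by its own query,
        # so the base table is read once per column set.
        sql = f"SELECT {col_list}, COUNT(*) FROM {table} GROUP BY {col_list}"
        results[cols] = conn.execute(sql).fetchall()
    return results


# Tiny in-memory table standing in for a warehouse table (assumed data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, channel TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "widget", "web"),
        ("east", "widget", "store"),
        ("west", "gadget", "web"),
        ("west", None, "web"),  # a missing value the analyst wants to spot
    ],
)

# All single columns and all column pairs: 3 + 3 = 6 queries for 3 columns.
columns = ["region", "product", "channel"]
column_sets = [c for k in (1, 2) for c in combinations(columns, k)]
profiles = profile_column_sets(conn, "sales", column_sets)
```

Even with three columns this issues six separate queries, and the number of column sets grows quickly as columns are added, which is why running each query independently becomes expensive on large tables.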