Efficient computation of multiple group by queries

Authors:
Zhimin Chen;Vivek Narasayya
Affiliations:
Microsoft Research;Microsoft Research
Venue:
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Year:
2005

Abstract

Data analysts need to understand the quality of data in the warehouse. This is often done by issuing many Group By queries on the sets of columns of interest. Since the volume of data in these warehouses can be large, and tables in a data warehouse often contain many columns, this analysis typically requires executing a large number of Group By queries, which can be expensive. We show that the performance of today's database systems for such data analysis is inadequate. We also show that the problem is computationally hard, and develop efficient techniques for solving it. We demonstrate significant speedup over existing approaches on today's commercial database systems.
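To make the workload concrete, here is a minimal sketch of the kind of analysis the abstract describes: several Group By queries over different column sets of one table. It contrasts running each query against the base table with one common way of sharing work, namely aggregating once on the union of the columns and deriving each result from that smaller intermediate table. This is an illustration only and is not claimed to be the technique developed in the paper. The table `sales`, its columns, and the function names `group_counts_naive` and `group_counts_shared` are hypothetical, and `COUNT(*)` is assumed as the aggregate.

```python
import sqlite3


def group_counts_naive(conn, table, column_sets):
    """Run one GROUP BY query per column set; each one scans the base table."""
    results = {}
    for cols in column_sets:
        col_list = ", ".join(cols)
        rows = conn.execute(
            f"SELECT {col_list}, COUNT(*) FROM {table} GROUP BY {col_list}"
        ).fetchall()
        results[cols] = sorted(rows)
    return results


def group_counts_shared(conn, table, column_sets):
    """Aggregate once on the union of all columns, then answer each query
    from that intermediate result by summing the partial counts."""
    union_cols = sorted({c for cols in column_sets for c in cols})
    union_list = ", ".join(union_cols)
    conn.execute("DROP TABLE IF EXISTS temp.tmp_union")
    conn.execute(
        f"CREATE TEMP TABLE tmp_union AS "
        f"SELECT {union_list}, COUNT(*) AS cnt FROM {table} GROUP BY {union_list}"
    )
    results = {}
    for cols in column_sets:
        col_list = ", ".join(cols)
        rows = conn.execute(
            f"SELECT {col_list}, SUM(cnt) FROM tmp_union GROUP BY {col_list}"
        ).fetchall()
        results[cols] = sorted(rows)
    return results


# Hypothetical demo data (no NULLs, so the result rows sort cleanly).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "pen", 1), ("east", "pen", 2), ("east", "ink", 1), ("west", "pen", 5)],
)

column_sets = [("region",), ("product",), ("region", "product")]
naive = group_counts_naive(conn, "sales", column_sets)
shared = group_counts_shared(conn, "sales", column_sets)
```

Both functions return the same counts. Whether the shared plan is actually cheaper depends on how much smaller the intermediate table is than the base table, which is the kind of trade-off that makes choosing a good plan for many queries nontrivial.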