Extending complex ad-hoc OLAP

Authors:
Theodore Johnson;Damianos Chatziantoniou
Affiliations:
Database Research Dept., AT&T Labs - Research;Dept. of CS, Stevens Institute of Technology
Venue:
Proceedings of the eighth international conference on Information and knowledge management
Year:
1999

Abstract

Large-scale data analysis and mining activities require sophisticated information extraction queries. Many queries require complex aggregation, and many of these aggregates are non-distributive. Conventional solutions to this problem involve defining User Defined Aggregate Functions (UDAFs). However, the use of UDAFs entails several problems. Defining a new UDAF can be a significant burden for the user, and optimizing queries involving UDAFs is difficult because of the “black box” nature of the UDAF.

In this paper, we present a method for expressing nested aggregates in a declarative way. A nested aggregate, which is a rollup of another aggregated value, expresses a wide range of useful non-distributive aggregates. For example, most-frequent-type aggregation can be naturally expressed using nested aggregation, e.g. “For each product, report its total sales during the month with the largest total sales of the product”. By expressing complex aggregates declaratively, we relieve the user of the burden of defining UDAFs, and allow the evaluation of the complex aggregates to be optimized.

We use the Extended Multi-Feature (EMF) syntax as the basis for expressing nested aggregation. An advantage of this approach is that EMF SQL can already express a wide range of complex aggregation in a succinct way, and EMF SQL is easily optimized into efficient query plans. We show that nested aggregation queries can be evaluated efficiently by using a small extension to the EMF SQL query evaluation algorithm. A side effect of this extension is to extend EMF SQL to permit complex aggregation of data from multiple sources.
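
To make the example query from the abstract concrete, the following is a minimal Python sketch of a nested aggregate: an inner aggregate (total sales per product and month) rolled up by an outer aggregate (the largest monthly total per product). It is not EMF SQL and not the paper's evaluation algorithm; the `SALES` records, the function name `best_month_sales`, and the tie-breaking rule (the first month encountered wins) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sales records: (product, month, amount). Illustrative data only.
SALES = [
    ("widget", "1999-01", 10.0),
    ("widget", "1999-01", 5.0),
    ("widget", "1999-02", 12.0),
    ("gadget", "1999-01", 7.0),
    ("gadget", "1999-02", 3.0),
    ("gadget", "1999-02", 6.0),
]


def best_month_sales(rows):
    """Map each product to (month, total) for its month with the largest total sales."""
    # Inner aggregate: total sales per (product, month).
    monthly = defaultdict(float)
    for product, month, amount in rows:
        monthly[(product, month)] += amount

    # Outer aggregate: roll the monthly totals up to one entry per product,
    # keeping the largest total. On ties, the first month seen is kept.
    best = {}
    for (product, month), total in monthly.items():
        if product not in best or total > best[product][1]:
            best[product] = (month, total)
    return best


result = best_month_sales(SALES)
for product, (month, total) in sorted(result.items()):
    print(product, month, total)
```

The outer maximum cannot be computed from the raw rows in a single distributive step, because it depends on the completed monthly sums. That dependency is what the abstract means by a rollup of another aggregated value.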