Large-scale data analysis and mining activities require sophisticated information extraction queries. Many of these queries require complex aggregation, and many of the aggregates are non-distributive. The conventional solution is to define User Defined Aggregate Functions (UDAFs). However, UDAFs raise several problems: defining a new UDAF can be a significant burden for the user, and optimizing queries that involve UDAFs is difficult because of the “black box” nature of the UDAF.

In this paper, we present a method for expressing nested aggregates declaratively. A nested aggregate, which is a rollup of another aggregated value, expresses a wide range of useful non-distributive aggregation. For example, most-frequent-type aggregation can be expressed naturally using nested aggregation, e.g. “For each product, report its total sales during the month with the largest total sales of the product”. By expressing complex aggregates declaratively, we relieve the user of the burden of defining UDAFs and allow the evaluation of the complex aggregates to be optimized.

We use the Extended Multi-Feature (EMF) syntax as the basis for expressing nested aggregation. An advantage of this approach is that EMF SQL can already express a wide range of complex aggregation succinctly, and EMF SQL is easily optimized into efficient query plans. We show that nested aggregation queries can be evaluated efficiently with a small extension to the EMF SQL query evaluation algorithm. A side effect of this extension is that EMF SQL can now express complex aggregation of data from multiple sources.
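To make the example query concrete, the following is a minimal Python sketch of what a nested aggregate computes: an inner aggregate (total sales per product and month) followed by an outer rollup (the maximum of those totals per product). It illustrates only the meaning of the query. It is not EMF SQL and not the evaluation algorithm of the paper, and the `SALES` rows and the `best_month_sales` function are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (product, month, amount). Illustrative data only.
SALES = [
    ("widget", "2024-01", 10.0),
    ("widget", "2024-01", 5.0),
    ("widget", "2024-02", 12.0),
    ("gadget", "2024-01", 7.0),
    ("gadget", "2024-02", 3.0),
    ("gadget", "2024-02", 6.0),
]


def best_month_sales(rows):
    """For each product, return (month, total) for its best-selling month."""
    # Inner aggregate: total sales per (product, month) group.
    monthly = defaultdict(float)
    for product, month, amount in rows:
        monthly[(product, month)] += amount

    # Outer aggregate: roll the monthly totals up per product, keeping the max.
    best = {}
    for (product, month), total in monthly.items():
        if product not in best or total > best[product][1]:
            best[product] = (month, total)
    return best


result = best_month_sales(SALES)
```

The outer step cannot be computed from partial results of the raw rows alone, because the maximum is taken over sums that are only known once each month's group is complete. This is the sense in which such an aggregate is non-distributive.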