Users frequently formulate complex data analysis queries to identify interesting trends, make unusual patterns stand out, or verify hypotheses. Being able to express these data mining queries concisely is of major importance from both the user's and the system's point of view. Recent research in OLAP has focused on datacubes and their applications; however, the expression and processing of ad hoc decision support queries have received very little attention. In this paper we present an appropriate framework for these queries and introduce a syntactic construct to support it. This SQL extension allows most OLAP queries, such as pivoting, complex intra- and inter-group comparisons, trends, and hierarchical comparisons, to be expressed in a compact, intuitive, and simple manner. This succinct representation of a complex OLAP query translates directly into a novel, simple, and efficient evaluation algorithm. We show how to optimize, analyze, and parallelize this algorithm and discuss issues such as multiple query analysis and scaling. We present several experimental results on real-life queries that show orders-of-magnitude performance improvements. We argue that this tight coupling between representation and algorithm is essential to the efficient processing of ad hoc OLAP queries.
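To make the class of queries concrete, the following is a minimal Python sketch of one intra-group comparison: for each product, count the sales whose amount exceeds that product's own average. The record layout, the names `Sale` and `count_above_group_average`, the sample data, and the two-scan evaluation are all assumptions made for illustration. The abstract gives neither the syntax of the SQL extension nor the details of the evaluation algorithm, so this sketch should not be read as a rendering of either.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Sale(NamedTuple):
    """Hypothetical fact-table row; the schema is an assumption."""
    product: str
    month: int
    amount: float


def count_above_group_average(sales: Iterable[Sale]) -> Dict[str, int]:
    """For each product, count sales whose amount exceeds the product's average."""
    rows = list(sales)

    # Scan 1: accumulate the sum and row count of each group (product).
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row.product] += row.amount
        counts[row.product] += 1
    averages = {product: totals[product] / counts[product] for product in totals}

    # Scan 2: compare every row against the average of its own group.
    above = {product: 0 for product in averages}
    for row in rows:
        if row.amount > averages[row.product]:
            above[row.product] += 1
    return above


# Illustrative data only; not taken from the paper's experiments.
SALES = [
    Sale("widget", 1, 10.0),
    Sale("widget", 2, 20.0),
    Sale("widget", 3, 60.0),
    Sale("gadget", 1, 5.0),
    Sale("gadget", 2, 5.0),
]

RESULT = count_above_group_average(SALES)
print(RESULT)
```

In standard SQL a query of this shape typically needs a subquery or a self-join to bring each group's average alongside its detail rows, which is the kind of verbosity that a more compact syntactic construct would aim to avoid.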