We report on a new, efficient encoding for the data cube that substantially speeds up OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We focus on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), departs sharply from existing relational (ROLAP) and multi-dimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called a statistics tree (ST) during a one-time scan of the detailed data. To optimize queries that involve constraints on hierarchy levels of the underlying dimensions, we select and materialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family that can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability when compared with existing MOLAP and ROLAP systems.
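To make the idea of encoding every aggregate view in the leaves of a tree more concrete, the following is a minimal sketch of a statistics-tree-like structure. It is not the CubiST++ implementation: the class name `StatisticsTree`, the `STAR` marker, the use of nested dictionaries, and the restriction to a COUNT aggregate are all assumptions made for illustration. Each tree level stands for one dimension, each node has one child per observed value plus a star child meaning "all values", and a single pass over the records updates every leaf a record contributes to.

```python
from itertools import product

STAR = "*"  # hypothetical marker meaning "all values" of a dimension


class StatisticsTree:
    """Toy statistics tree: one level per dimension, COUNT kept in the leaves."""

    def __init__(self, num_dims):
        self.num_dims = num_dims
        self.root = {}  # nested dicts; the last level maps a key to a count

    def insert(self, record):
        """Update every leaf reached by replacing any subset of values with STAR."""
        assert len(record) == self.num_dims
        for path in product(*[(value, STAR) for value in record]):
            node = self.root
            for key in path[:-1]:
                node = node.setdefault(key, {})
            node[path[-1]] = node.get(path[-1], 0) + 1

    def count(self, query):
        """Answer a cube query; each entry is STAR or a list/set of values."""
        choices = [[STAR] if q == STAR else list(q) for q in query]
        total = 0
        for path in product(*choices):
            node = self.root
            for key in path:
                # A missing key anywhere on the path contributes nothing.
                node = node.get(key, {}) if isinstance(node, dict) else {}
            total += node if isinstance(node, int) else 0
        return total


# Made-up records over three dimensions: (manufacturer, color, year).
records = [
    ("toyota", "red", "2000"),
    ("toyota", "blue", "2000"),
    ("ford", "red", "2001"),
]

tree = StatisticsTree(num_dims=3)
for record in records:  # the one-time scan of the detailed data
    tree.insert(record)

total_all = tree.count((STAR, STAR, STAR))
total_toyota = tree.count((["toyota"], STAR, STAR))
total_red = tree.count((STAR, ["red"], ["2000", "2001"]))
```

In this sketch each record touches 2^d leaves for d dimensions, so answering a query never requires rescanning the data; a query constrained to several values per dimension simply sums the matching leaves. Selecting among a family of trees built at different hierarchy levels is not shown here.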