Efficient online aggregates in dense-region-based data cube representations
Transactions on large-scale data- and knowledge-centered systems II
In-memory OLAP systems require a space-efficient representation of sparse data cubes in order to accommodate large data sets. However, the most efficient online aggregation techniques, such as prefix sums, are built on dense array-based representations. These are often not applicable to real-world data because of the size of the arrays, which usually cannot be compressed well, as most sparsity is removed during pre-processing. A possible solution is to identify dense regions in a sparse cube and represent only those as arrays, while storing sparse data separately, e.g., in a spatial index structure. Previous dense-region-based approaches have concentrated mainly on the effectiveness of the dense-region detection (i.e., on the space-efficiency of the result). However, especially in higher-dimensional cubes, data is usually more cluttered, resulting in a potentially large number of small dense regions, which degrades query performance on such a structure. In this paper, we focus not only on space-efficiency but also on time-efficiency, both for the initial dense-region extraction and for queries carried out on the resulting hybrid data structure. We describe two methods to trade available memory for increased aggregate query performance. In addition, optimizations in our approach significantly reduce the time to build the initial data structure compared to earlier systems. We also present a straightforward adaptation of our approach to multi-core and multi-processor architectures, which can further enhance query performance. Experiments with different real-world data sets show how various parameter settings can be used to adjust the efficiency and effectiveness of our algorithms.
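To make the hybrid idea concrete, the following is a minimal two-dimensional sketch, not the paper's implementation. It assumes a single dense region stored as a prefix-sum array and a plain dictionary of sparse cells; the dictionary scan stands in for the spatial index structure mentioned in the abstract. The names `HybridCube`, `build_prefix_sums`, and `dense_range_sum` are hypothetical and chosen only for illustration.

```python
def build_prefix_sums(dense):
    """Return P where P[i][j] is the sum of dense[0..i-1][0..j-1].

    P has one extra row and column of zeros so that range sums need
    no boundary checks.
    """
    rows, cols = len(dense), len(dense[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            P[i + 1][j + 1] = dense[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def dense_range_sum(P, r0, c0, r1, c1):
    """Sum of the inclusive local range [r0..r1] x [c0..c1] in O(1)."""
    return P[r1 + 1][c1 + 1] - P[r0][c1 + 1] - P[r1 + 1][c0] + P[r0][c0]


class HybridCube:
    """One dense region (prefix sums) plus separately stored sparse cells.

    Assumption: a real system would hold many dense regions and keep the
    sparse cells in a spatial index; a dict scan is used here for brevity.
    """

    def __init__(self, origin, dense, sparse_cells):
        self.origin = origin                  # cube coordinates of dense[0][0]
        self.rows, self.cols = len(dense), len(dense[0])
        self.P = build_prefix_sums(dense)
        self.sparse = dict(sparse_cells)      # {(row, col): value}

    def range_sum(self, r0, c0, r1, c1):
        """Sum over the inclusive cube range [r0..r1] x [c0..c1]."""
        orow, ocol = self.origin
        # Clip the query to the dense region and shift to local coordinates.
        lr0 = max(r0, orow) - orow
        lc0 = max(c0, ocol) - ocol
        lr1 = min(r1, orow + self.rows - 1) - orow
        lc1 = min(c1, ocol + self.cols - 1) - ocol
        total = 0
        if lr0 <= lr1 and lc0 <= lc1:
            total += dense_range_sum(self.P, lr0, lc0, lr1, lc1)
        # Add sparse cells that fall inside the query range.
        total += sum(v for (r, c), v in self.sparse.items()
                     if r0 <= r <= r1 and c0 <= c <= c1)
        return total


cube = HybridCube(origin=(10, 10),
                  dense=[[1, 2], [3, 4]],
                  sparse_cells={(0, 0): 5, (50, 50): 7})
print(cube.range_sum(0, 0, 100, 100))
```

In this sketch the dense part of a query costs constant time regardless of the region's size, while the sparse part costs time proportional to the number of sparse cells scanned. With many small dense regions, the per-region clipping step would be repeated for each one, which is the query-time overhead the abstract describes.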