Principles of database buffer management
ACM Transactions on Database Systems (TODS)
Equi-depth multidimensional histograms
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
The buddy tree: an efficient and robust access method for spatial data base
Proceedings of the sixteenth international conference on Very large databases
The R*-tree: an efficient and robust access method for points and rectangles
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Query evaluation techniques for large databases
ACM Computing Surveys (CSUR)
An overview of data warehousing and OLAP technology
ACM SIGMOD Record
An array-based algorithm for simultaneous multidimensional aggregates
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
An alternative storage organization for ROLAP aggregate views based on cubetrees
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Multidimensional access methods
ACM Computing Surveys (CSUR)
The Grid File: An Adaptable, Symmetric Multikey File Structure
ACM Transactions on Database Systems (TODS)
A survey of logical models for OLAP databases
ACM SIGMOD Record
The Quadtree and Related Hierarchical Data Structures
ACM Computing Surveys (CSUR)
Dynamic maintenance of data distribution for selectivity estimation
The VLDB Journal — The International Journal on Very Large Data Bases
A Region Splitting Strategy for Physical Database Design of Multidimensional File Organizations
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Aggregation Algorithms for Very Large Compressed Data Warehouses
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
A one-pass aggregation algorithm with the optimal buffer size in multidimensional OLAP
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Business intelligence for small and middle-sized entreprises
ACM SIGMOD Record
Hi-index | 0.07 |
Aggregation is an operation that plays a key role in multidimensional OLAP (MOLAP). Existing aggregation methods in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures are suitable for data with uniform distributions, but do not work well with skewed distributions. In this paper, we consider an aggregation method that uses dynamic multidimensional files adapting to skewed distributions. In these multidimensional files, the sizes of page regions vary according to the data density in these regions, and the pages that belong to a larger region are accessed multiple times while computing aggregations. To solve this problem, we first present an aggregation computation model that uses the new notions of disjoint-inclusive partition and induced space filling curves . Based on this model, we then present a dynamic aggregation algorithm. Using these notions, the algorithm allows us to maximize the effectiveness of the buffer--we control the page access order in such a way that a page being accessed can reside in the buffer until the next access. We have conducted experiments to show the effectiveness of our approach. Experimental results for a real data set show that the algorithm reduces the number of disk accesses by up to 5.09 times compared with a naive algorithm. The results further show that the algorithm achieves a near optimal performance (i.e., normalized I/O = 1.01) with the total main memory (needed for the buffer and the result table) less than 1.0% of the database size. We believe our work also provides an excellent formal basis for investigating further issues in computing aggregations in MOLAP.