Implementing vertical splitting for large scale multidimensional datasets and its evaluations

Authors:
Takayuki Tsuchida;Tatsuo Tsuji;Ken Higuchi
Affiliations:
Graduate School of Engineering, University of Fukui, Fukui, Japan;Graduate School of Engineering, University of Fukui, Fukui, Japan;Graduate School of Engineering, University of Fukui, Fukui, Japan
Venue:
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
Year:
2011

Abstract

History-offset encoding, which we are proposing, is a scheme for encoding multidimensional datasets. A significant problem in implementing multidimensional databases is the saturation of the address space used to address multidimensional data. One solution to this problem is to split the dimension attributes of the multidimensional data into more than one group, i.e., vertical splitting. We have implemented a vertical splitting scheme for large scale multidimensional datasets based on history-offset encoding. In this paper, we describe the implementation of the prototype system and experimentally evaluate it against other systems. These systems are PostgreSQL, a conventionally implemented relational DBMS, and the UB-tree, which takes a multidimensional approach similar to our history-offset encoding. The evaluation results show that our vertical splitting scheme can reduce retrieval I/O cost while expanding the logical address space available for storing large scale multidimensional datasets. Our method far outperforms PostgreSQL and is moderately better than the UB-tree in retrieval time. Splitting increases the storage cost, but the resulting cost is not large compared with that of the other two systems.
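
The address-space saturation problem and the idea of vertical splitting can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce history-offset encoding itself: it assumes a plain row-major offset per group, a 64-bit logical address, and a simple greedy grouping rule. The names `ADDRESS_BITS`, `split_dimensions`, and `encode` are hypothetical and chosen only for illustration.

```python
ADDRESS_BITS = 64  # assumption: one logical address is a 64-bit unsigned integer


def split_dimensions(cardinalities, bits=ADDRESS_BITS):
    """Greedily group dimension indices so that the number of cells in
    each group (the product of its cardinalities) fits in `bits` bits."""
    limit = 1 << bits
    groups, current, size = [], [], 1
    for dim, card in enumerate(cardinalities):
        if card > limit:
            raise ValueError(f"dimension {dim} alone exceeds the address space")
        # Start a new group when adding this dimension would overflow.
        if current and size * card > limit:
            groups.append(current)
            current, size = [], 1
        current.append(dim)
        size *= card
    if current:
        groups.append(current)
    return groups


def encode(coords, cardinalities, groups):
    """Encode one tuple as a list of row-major offsets, one per group."""
    offsets = []
    for group in groups:
        offset = 0
        for dim in group:
            offset = offset * cardinalities[dim] + coords[dim]
        offsets.append(offset)
    return offsets


# Five dimensions of 2**20 values each span 2**100 cells, which cannot be
# addressed with a single 64-bit offset, so the dimensions are split.
cards = [2**20] * 5
groups = split_dimensions(cards)          # [[0, 1, 2], [3, 4]]
codes = encode([1, 2, 3, 4, 5], cards, groups)
```

In this sketch each tuple is stored as one offset per group rather than a single offset, which is one way to see why splitting enlarges the addressable space at the price of some extra storage.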