Performance evaluation of functional disk system with nonuniform data distribution

Authors:
Masaru Kitsuregawa;Miyuki Nakano;Lilian Harada;Mikio Takagi
Affiliations:
Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan;Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan;Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan;Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan
Venue:
DPDS '90 Proceedings of the second international symposium on Databases in parallel and distributed systems
Year:
1990

Citing 14
Cited 3

Join processing in database systems with large main memories

ACM Transactions on Database Systems (TODS)
Comparative benchmarking of relational database systems
A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment

SIGMOD '89 Proceedings of the 1989 ACM SIGMOD international conference on Management of data
The effect of bucket size tuning in the dynamic hybrid GRACE hash join method

VLDB '89 Proceedings of the 15th international conference on Very large data bases
The art of computer programming, volume 3: (2nd ed.) sorting and searching
Implications of certain assumptions in database performance evaluation

ACM Transactions on Database Systems (TODS)
Operating system support for database management

Communications of the ACM
Implementation techniques for main memory database systems

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Accurate estimation of the number of tuples satisfying a condition

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Functional Disk System for Relational Database

Proceedings of the Third International Conference on Data Engineering
Query Execution for Large Relations on Functional Disk Systems

Proceedings of the Fifth International Conference on Data Engineering
Hashing Methods and Relational Algebra Operations

VLDB '84 Proceedings of the 10th International Conference on Very Large Data Bases
Hash-Partitioned Join Method Using Dynamic Destaging Strategy

VLDB '88 Proceedings of the 14th International Conference on Very Large Data Bases
Dataflow query processing using multiprocessor hash-partitioned algorithms (database, pipeline, parallelism)

A Parallel Hash Join Algorithm for Managing Data Skew

IEEE Transactions on Parallel and Distributed Systems
Performance Analysis of Affinity Clustering on Transaction Processing Coupling Architecture

IEEE Transactions on Knowledge and Data Engineering
Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor

VLDB '91 Proceedings of the 17th International Conference on Very Large Data Bases

Abstract

In this paper, we analyze the performance of a Functional Disk System with Relational database engine (FDS-RII) under a nonuniform data distribution. FDS-RII is a relational storage system designed to accelerate relational algebraic operations; it processes them with a hash-based algorithm. In this algorithm, a relation is first partitioned into several clusters by a split function. Each cluster is then staged into main memory, where a hash function is applied to it to perform the relational operation. A nonuniform data distribution therefore appears in the hash-based algorithm as nonuniformity of the split and hash functions. We clarify the effect of this nonuniformity of the hash and split functions on join performance. The effect of hash function nonuniformity can be attenuated by increasing the number of processors and processing the buckets in parallel. To tackle the nonuniformity of the split function, we introduce the Combined Hash Algorithm, which combines the Grace Hash Algorithm with the Nested Loop Algorithm in order to handle overflowed buckets efficiently. Using the Combined Hash Algorithm, we find that the execution time for a nonuniform data distribution is almost equal to that for a uniform data distribution. Thus FDS-RII also achieves sufficiently high performance for nonuniformly distributed data.
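
As a rough illustration of the partition-then-hash scheme described in the abstract, the following Python sketch splits both relations into clusters, joins each cluster that fits in memory with an in-memory hash table, and falls back to a nested loop for clusters that overflow. It is not the FDS-RII implementation. The function names, the `capacity` parameter standing in for main-memory size, the use of Python's built-in `hash` as the split function, and the choice of a block nested loop for overflowed clusters are all assumptions made for illustration.

```python
from collections import defaultdict


def split(relation, key, n_clusters):
    """Partition phase: route each tuple to a cluster via a split function.

    Assumption: Python's built-in hash modulo n_clusters plays the role
    of the split function.
    """
    clusters = [[] for _ in range(n_clusters)]
    for row in relation:
        clusters[hash(row[key]) % n_clusters].append(row)
    return clusters


def hash_join_cluster(r_cluster, s_cluster, key):
    """Join one cluster pair by building a hash table on the R side."""
    table = defaultdict(list)
    for r in r_cluster:
        table[r[key]].append(r)
    return [(r, s) for s in s_cluster for r in table.get(s[key], [])]


def nested_loop_cluster(r_cluster, s_cluster, key, capacity):
    """Join an overflowed cluster pair with a block nested loop.

    Assumption: the R side is consumed in chunks of at most `capacity`
    tuples, and the S side is rescanned once per chunk.
    """
    out = []
    for start in range(0, len(r_cluster), capacity):
        chunk = r_cluster[start:start + capacity]
        for s in s_cluster:
            for r in chunk:
                if r[key] == s[key]:
                    out.append((r, s))
    return out


def combined_join(r, s, key, n_clusters, capacity):
    """Hypothetical combination: hash join per cluster, nested loop on overflow."""
    result = []
    for rc, sc in zip(split(r, key, n_clusters), split(s, key, n_clusters)):
        if len(rc) <= capacity:
            result.extend(hash_join_cluster(rc, sc, key))
        else:
            result.extend(nested_loop_cluster(rc, sc, key, capacity))
    return result


if __name__ == "__main__":
    # Skewed input: most R tuples share the key 0, so one cluster overflows.
    r = [{"k": 0, "a": i} for i in range(10)]
    r += [{"k": i, "a": 100 + i} for i in range(1, 4)]
    s = [{"k": 0, "b": "x"}, {"k": 2, "b": "y"}, {"k": 9, "b": "z"}]
    print(len(combined_join(r, s, "k", n_clusters=4, capacity=4)))
```

With a skewed key such as the one in the usage example, a single cluster receives most of the tuples, which is the kind of split-function nonuniformity the abstract refers to. The sketch only shows where a fallback path would be taken and says nothing about the performance measured in the paper.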