Performance evaluation of functional disk system with nonuniform data distribution

  • Authors:
  • Masaru Kitsuregawa;Miyuki Nakano;Lilian Harada;Mikio Takagi

  • Affiliations:
  • Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan;Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan;Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan;Institute of Industrial Science, University of Tokyo, 22-1, Roppongi 7, Minato-ku, Tokyo, Japan

  • Venue:
  • DPDS '90 Proceedings of the second international symposium on Databases in parallel and distributed systems
  • Year:
  • 1990

Abstract

In this paper, we analyze the performance of a Functional Disk System with Relational database engine (FDS-RII) for nonuniform data distributions. FDS-RII is a relational storage system designed to accelerate relational algebraic operations; it processes them with a hash-based algorithm. In this algorithm, a relation is first partitioned into several clusters by a split function. Each cluster is then staged into main memory, where a hash function is applied to it to perform the relational operation. Under the hash-based algorithm, a nonuniform data distribution therefore shows up as nonuniformity of the split and hash functions. We clarify the effect of this nonuniformity on join performance. The effect of hash function nonuniformity can be attenuated by increasing the number of processors and processing the buckets in parallel. To tackle the nonuniformity of the split function, we introduce the Combined Hash Algorithm, which combines the Grace Hash Algorithm with the Nested Loop Algorithm in order to handle overflowed buckets efficiently. Using the Combined Hash Algorithm, we find that the execution time for a nonuniform data distribution is almost equal to that for a uniform data distribution. Thus FDS-RII achieves sufficiently high performance for nonuniformly distributed data as well.
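
The split-then-hash join described in the abstract, with a fallback for clusters too large to stage in memory, can be illustrated with a minimal Python sketch. This is not the paper's Combined Hash Algorithm, whose details the abstract does not give. It only assumes that a cluster fitting in memory is joined by hashing and that an overflowing cluster is joined by a blocked nested loop. All names (`split`, `hash_join_cluster`, `nested_loop_join`, `combined_join`) and the `memory_rows` capacity parameter are hypothetical.

```python
from collections import defaultdict


def split(rows, key, n_clusters):
    """Partition rows into clusters with a split function.

    Assumption: the split function is Python's hash() modulo n_clusters.
    """
    clusters = [[] for _ in range(n_clusters)]
    for row in rows:
        clusters[hash(row[key]) % n_clusters].append(row)
    return clusters


def hash_join_cluster(r_cluster, s_cluster, key):
    """Join one cluster pair by building an in-memory hash table on R."""
    table = defaultdict(list)
    for r in r_cluster:
        table[r[key]].append(r)
    return [(r, s) for s in s_cluster for r in table.get(s[key], [])]


def nested_loop_join(r_cluster, s_cluster, key, memory_rows):
    """Join an overflowing cluster pair in memory-sized chunks of R.

    Each chunk of at most memory_rows rows is compared against all of S.
    """
    out = []
    for start in range(0, len(r_cluster), memory_rows):
        chunk = r_cluster[start:start + memory_rows]
        for s in s_cluster:
            for r in chunk:
                if r[key] == s[key]:
                    out.append((r, s))
    return out


def combined_join(R, S, key, n_clusters, memory_rows):
    """Hash-join clusters that fit in memory; nested-loop the rest."""
    out = []
    r_clusters = split(R, key, n_clusters)
    s_clusters = split(S, key, n_clusters)
    for rc, sc in zip(r_clusters, s_clusters):
        if len(rc) <= memory_rows:
            out.extend(hash_join_cluster(rc, sc, key))
        else:
            # Skewed data made this cluster overflow the assumed capacity.
            out.extend(nested_loop_join(rc, sc, key, memory_rows))
    return out


# Skewed input: key 0 dominates R, so its cluster exceeds memory_rows.
R = [{"k": 0, "a": i} for i in range(10)] + [{"k": k, "a": 100 + k} for k in range(1, 5)]
S = [{"k": k, "b": k} for k in range(5)]
pairs = combined_join(R, S, "k", n_clusters=4, memory_rows=4)
```

In this toy input the skewed key sends most of R to a single cluster, which is exactly the split-function nonuniformity the abstract refers to. The other clusters stay small and take the hash path.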