Performance Analysis of Parallel Query Processing Algorithms for Object-Oriented Databases

Authors:
Stanley Y. W. Su; Sanjay Ranka; Xiang He
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2000

Abstract

In recent years, parallel processing and optimization algorithms for object-oriented databases have drawn considerable attention from the database research community. Two general types of algorithms have been introduced: hybrid-hash pointer-based algorithms and multiwavefront algorithms. In this work, we quantitatively analyze the two algorithms and develop analytical formulas to capture the main performance features of these two approaches. We study their performance in three application environments: the first is characterized by large databases having many object classes, each of which contains a large number of instances; the second is characterized by large databases having many object classes, each of which contains a relatively small number of instances; and the third is characterized by large databases having object classes of varying sizes. A horizontal data partitioning strategy, in which each object class is partitioned into horizontal segments stored across all processors, is used in the first environment. A class-per-node assignment strategy, in which instances of each object class are stored in a single processor, is used in the second environment. In the third environment, object classes are partitioned horizontally and assigned to a varying number of processors depending on their sizes. Our analytical results show that the multiwavefront algorithm has three distinguishing features which contribute to its better performance: 1) a two-phase processing strategy, 2) vertical partitioning of horizontal segments, and 3) dynamic determination of the "collision point" in multiwavefront propagations, which results in an optimized query execution plan. We show that if these features are adopted by a hybrid-hash pointer-based algorithm, its performance will be comparable with that of the multiwavefront algorithm because the difference in CPU time between them is negligible.
The assumed computing environment is a network of workstations having a shared-nothing architecture. The schema and some queries selected from the OO7 benchmark are used in the performance analyses and comparisons. The queries are modified slightly in different data environments in order to reflect the features of diverse database applications.
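
The three data placement strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the round-robin assignment of instances to segments, and the `segment_capacity` parameter used to decide how many processors a class receives are all assumptions made for illustration. The abstract states only that the number of processors depends on class size, not how it is chosen.

```python
from math import ceil


def horizontal_partition(classes, num_nodes):
    """Environment 1: split every class into segments spread over all nodes.

    Round-robin assignment of instances to nodes is an assumption.
    """
    return {
        name: {node: instances[node::num_nodes] for node in range(num_nodes)}
        for name, instances in classes.items()
    }


def class_per_node(classes, num_nodes):
    """Environment 2: store all instances of a class on a single node.

    Classes are assigned to nodes in sorted-name order (an assumption).
    """
    return {
        name: {i % num_nodes: list(instances)}
        for i, (name, instances) in enumerate(sorted(classes.items()))
    }


def size_based_partition(classes, num_nodes, segment_capacity):
    """Environment 3: larger classes are spread over more nodes.

    segment_capacity is a hypothetical tuning knob: a class receives roughly
    one node per segment_capacity instances, capped at num_nodes.
    """
    placement = {}
    next_node = 0
    for name, instances in sorted(classes.items()):
        k = max(1, min(num_nodes, ceil(len(instances) / segment_capacity)))
        nodes = [(next_node + j) % num_nodes for j in range(k)]
        placement[name] = {node: instances[j::k] for j, node in enumerate(nodes)}
        next_node = (next_node + k) % num_nodes
    return placement


# Toy data: class names and integer object identifiers are illustrative only.
toy_classes = {"A": list(range(10)), "B": list(range(3))}
by_segments = horizontal_partition(toy_classes, 4)
by_class = class_per_node(toy_classes, 4)
by_size = size_based_partition(toy_classes, 4, segment_capacity=4)
```

With this toy input, the horizontal strategy spreads both classes over all four nodes, the class-per-node strategy places each class on exactly one node, and the size-based strategy gives the larger class three nodes and the smaller class one.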