A novel approach to resource scheduling for parallel query processing on computational grids
Distributed and Parallel Databases
A parallel spatial data analysis infrastructure for the cloud
Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Modular clusters are now composed of non-uniform nodes with different CPUs, disks, or network cards so that customers can adapt the cluster configuration to the changing technologies and to their changing needs. This challenges dataflow parallelism as the primary load balancing technique of existing parallel database systems. We show in this paper that dataflow parallelism alone is ill-suited for modular clusters because running the same operation on different subsets of the data cannot fully utilize non-uniform hardware resources. We propose and evaluate new load balancing techniques that blend pipeline parallelism with data parallelism. We consider relational operators as pipelines of fine-grained operations that can be located on different cluster nodes and executed in parallel on different data subsets to best exploit non-uniform resources. We present an experimental study that confirms the feasibility and effectiveness of the new techniques in a parallel execution engine prototype based on the open-source DBMS Predator.
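The utilization argument can be illustrated with a toy throughput model. This is only a sketch under stated assumptions, not the cost model or the algorithm of the paper: an operator is assumed to split into two fine-grained operations (a disk-bound scan and a CPU-bound compute step), the `Node` class and all rates are hypothetical, and network transfer cost is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical cluster node with per-resource rates in tuples/second."""
    name: str
    scan_rate: float     # how fast the node's disk can scan tuples
    compute_rate: float  # how fast the node's CPU can process tuples


def data_parallel_throughput(nodes):
    # Each node runs the whole operator (scan + compute) on its own data
    # subset, so each node is limited by its slowest resource.
    return sum(min(n.scan_rate, n.compute_rate) for n in nodes)


def blended_throughput(nodes):
    # Scan and compute steps are placed independently on any node, so the
    # cluster is limited only by its aggregate disk or aggregate CPU
    # capacity (assumption: the network is never the bottleneck).
    total_scan = sum(n.scan_rate for n in nodes)
    total_compute = sum(n.compute_rate for n in nodes)
    return min(total_scan, total_compute)


# Two non-uniform nodes: one with a slow disk, one with a slow CPU.
nodes = [
    Node("a", scan_rate=50.0, compute_rate=100.0),
    Node("b", scan_rate=100.0, compute_rate=50.0),
]

print(data_parallel_throughput(nodes))  # 100.0
print(blended_throughput(nodes))        # 150.0
```

In this toy setting, pure data parallelism leaves half of node a's CPU and half of node b's disk idle, while moving part of the compute step for node b's data onto node a recovers that capacity. In a real system the gain would be reduced by the cost of shipping tuples between nodes.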