We introduce join scheduling algorithms that employ a balanced network utilization metric to optimize the use of all network paths in a global-scale database federation. This metric allows the algorithms to exploit excess capacity in the network while avoiding narrow, long-haul paths. We give a polynomial-time 2-approximation algorithm for serial (left-deep) join schedules. We also present extensions to this algorithm that explore parallel schedules, reduce resource usage, and define trade-offs between computation and network utilization. We evaluate these techniques within the SkyQuery federation of astronomy databases using spatial-join queries submitted by SkyQuery's users. Experiments show that our algorithms realize near-optimal network utilization with minor computational overhead.
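
To make the idea of a serial (left-deep) schedule under a balanced-utilization objective concrete, here is a minimal illustrative sketch. It is not the 2-approximation algorithm from the paper, whose details the abstract does not give. It assumes a hypothetical metric (the peak ratio of load to capacity over all directed links), a fixed intermediate-result size, and a simple greedy heuristic compared against exhaustive search. All names (`peak_utilization`, `greedy_left_deep`, `exhaustive_left_deep`) and the toy topology are invented for illustration.

```python
from itertools import permutations

def peak_utilization(order, capacity, load, size):
    """Assumed metric: highest load/capacity ratio over all links after
    shipping an intermediate result of `size` along consecutive sites."""
    used = dict(load)  # copy the background load so the caller's dict is untouched
    for link in zip(order, order[1:]):
        used[link] = used.get(link, 0.0) + size
    return max(used[link] / capacity[link] for link in used)

def greedy_left_deep(sites, capacity, load, size):
    """Hypothetical heuristic: from every start site, repeatedly visit the
    site that keeps peak utilization lowest; return the best (score, order)."""
    best = None
    for start in sites:
        order, remaining = [start], set(sites) - {start}
        while remaining:
            # sorted() makes tie-breaking deterministic
            nxt = min(sorted(remaining),
                      key=lambda s: peak_utilization(order + [s], capacity, load, size))
            order.append(nxt)
            remaining.remove(nxt)
        score = peak_utilization(order, capacity, load, size)
        if best is None or score < best[0]:
            best = (score, order)
    return best

def exhaustive_left_deep(sites, capacity, load, size):
    """Baseline for tiny inputs: try every serial visiting order."""
    return min((peak_utilization(list(p), capacity, load, size), list(p))
               for p in permutations(sites))

# Toy federation: the A<->C path is narrow, and A->B already carries traffic.
sites = ["A", "B", "C"]
capacity = {("A", "B"): 10.0, ("B", "A"): 10.0,
            ("A", "C"): 2.0,  ("C", "A"): 2.0,
            ("B", "C"): 10.0, ("C", "B"): 10.0}
load = {("A", "B"): 8.0}

print(greedy_left_deep(sites, capacity, load, size=2.0))
print(exhaustive_left_deep(sites, capacity, load, size=2.0))
```

On this toy input both searches choose the order C, B, A, which avoids the narrow A and C links and the already busy A to B link. In general a greedy heuristic like this one carries no approximation guarantee, and the exhaustive baseline is only feasible for a handful of sites.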