Grid computing has proved to be a successful paradigm for large-scale distributed data processing, and global eScience Grids (e.g., LCG and OSG) have been in production for years. The majority of applications running in these production environments can be characterized as massive CPU-intensive batch jobs (or “bag-of-tasks”), sometimes considered the “killer” application for the Grid. A deep understanding of the main characteristics of these workloads is not only necessary for realistic performance evaluation of existing systems, but also crucial for generating new insights into better resource allocation schemes. This paper presents a comprehensive statistical analysis of the workloads on production eScience Grid environments. We focus on second-order statistics and the scaling behavior of the main job characteristics, namely job arrivals and job runtimes. A range of autocorrelation structures is identified and analyzed, including pseudoperiodicity, short-range dependence (SRD), and long-range dependence (LRD). We further develop mathematical models that capture these salient properties of the workloads. These workload models, in turn, enable us to quantitatively evaluate the performance impact of autocorrelations on Grid scheduling. The results indicate that autocorrelations in workloads degrade system performance, sometimes by several orders of magnitude. Nevertheless, better performance can be achieved at the Grid level under bursty local background workloads. Such effects of workloads on systems are extensively analyzed and explained.
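As an illustration of the second-order statistics mentioned above, the following is a minimal sketch of a sample autocorrelation function applied to a job-arrival series. It is not the paper's method or model. The function name `autocorrelation` and the `arrivals` data are hypothetical and chosen only to show how pseudoperiodicity appears as peaks at multiples of the period.

```python
from typing import List, Sequence


def autocorrelation(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation of `series` for lags 0..max_lag (biased estimator)."""
    n = len(series)
    if n < 2 or max_lag >= n:
        raise ValueError("need at least 2 points and max_lag < len(series)")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Sum of squared deviations; normalizes the lag-0 value to exactly 1.
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]


# Hypothetical job-arrival counts per interval with a period of 4
# (illustrative data only, not taken from any Grid trace).
arrivals = [10, 2, 1, 3] * 6
acf = autocorrelation(arrivals, max_lag=8)

for lag, value in enumerate(acf):
    print(f"lag {lag}: {value:+.3f}")
```

In this toy series the autocorrelation peaks at lags 4 and 8, which is the signature of pseudoperiodicity. Distinguishing SRD from LRD in a real trace would require examining how slowly the values decay over much longer lags.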