Lowering Inter-datacenter Bandwidth Costs via Bulk Data Scheduling

Authors:
Thyaga Nandagopal;Krishna P. N. Puttaswamy
Affiliations:
-;-
Venue:
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Year:
2012

Abstract

Cloud service providers (CSPs) today operate multiple data centers, over which they provide resilient infrastructure, data storage, and compute services. The links between data centers have very high capacity and are typically purchased by the CSPs under established billing practices, such as 95th-percentile billing or average-usage billing. These links carry both client traffic and CSP-specific bulk data traffic, such as backup jobs. Past studies have shown a diurnal pattern of traffic over such links. However, CSPs pay for the peak bandwidth, which implies that they under-utilize the capacity for which they have paid. We propose a scheduling framework that considers the various classes of jobs encountered over such links, along with GRESE, an algorithm that attempts to minimize overall bandwidth costs to the CSP by leveraging the flexible deadlines of these bulk data jobs. We demonstrate that the problem is not a simple extension of any well-known scheduling problem, and show that the GRESE algorithm is effective in curtailing CSP bandwidth costs.
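
To make the billing argument concrete, the following is a minimal sketch of how 95th-percentile and average-usage charges might be computed from a usage trace, and how moving a deadline-flexible bulk job off-peak changes them. It is not the GRESE algorithm, which the abstract does not specify. The hourly trace, the job size, and the helper names (`percentile_95`, `average_usage`, `add_bulk`) are hypothetical and chosen only for illustration; real billing typically samples at a much finer granularity.

```python
def percentile_95(samples):
    """Billable rate under 95th-percentile billing (illustrative definition):
    sort the samples, discard the top 5%, and bill the highest remaining one."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n) - 1, avoiding floating-point rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


def average_usage(samples):
    """Billable rate under average-usage billing: the mean of all samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(samples) / len(samples)


def add_bulk(client, start, hours, rate):
    """Return a new trace with a bulk job of `rate` Gbps added for `hours`
    consecutive samples starting at index `start` (wrapping around the day)."""
    load = list(client)
    for h in range(start, start + hours):
        load[h % len(load)] += rate
    return load


# Hypothetical diurnal client traffic, one sample per hour, in Gbps.
client = [2, 2, 2, 2, 2, 2, 4, 6, 8, 8, 8, 8,
          8, 8, 8, 8, 8, 8, 6, 4, 2, 2, 2, 2]

# The same 6-hour, 4 Gbps bulk job, run during the peak or during the trough.
peak_schedule = add_bulk(client, start=10, hours=6, rate=4)
off_peak_schedule = add_bulk(client, start=0, hours=6, rate=4)

for name, trace in [("peak", peak_schedule), ("off-peak", off_peak_schedule)]:
    print(name, percentile_95(trace), average_usage(trace))
```

In this toy trace, the off-peak placement lowers the 95th-percentile rate while the average is unchanged, since the same volume is transferred either way. This suggests that the benefit of exploiting deadline flexibility depends on which billing model applies to the link.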