Datacenter networks have been designed to tolerate failures of network equipment and to provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase service latency. Unfortunately, there is an inherent tradeoff between achieving high fault tolerance and reducing bandwidth usage in the network core: spreading servers across fault domains improves fault tolerance but requires additional bandwidth, while deploying servers together reduces bandwidth usage but also decreases fault tolerance. We present a detailed analysis of a large-scale Web application and its communication patterns. Based on that analysis, we propose and evaluate a novel optimization framework that achieves high fault tolerance while significantly reducing bandwidth usage in the network core by exploiting the skewness in the observed communication patterns.
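The tradeoff described above can be illustrated with a toy calculation. The sketch below is not the optimization framework proposed in the paper; it is a minimal illustration under stated assumptions: each fault domain is a single rack, only traffic between servers in different racks crosses the network core, and fault tolerance is measured as the fraction of servers that survive the loss of the most-populated rack. The server names, rack names, traffic rates, and both helper functions are hypothetical.

```python
from collections import Counter


def core_traffic(placement, traffic):
    """Sum the traffic between server pairs placed in different fault domains.

    Assumption: only inter-domain traffic crosses the network core.
    """
    return sum(rate for (a, b), rate in traffic.items()
               if placement[a] != placement[b])


def worst_case_survival(placement):
    """Fraction of servers still available after the most-populated
    fault domain fails (a simple, assumed fault-tolerance metric)."""
    counts = Counter(placement.values())
    return 1 - max(counts.values()) / len(placement)


# Hypothetical skewed traffic matrix: two heavy pairs, two light pairs.
traffic = {
    ("s1", "s2"): 100,
    ("s3", "s4"): 100,
    ("s1", "s3"): 1,
    ("s2", "s4"): 1,
}

# Three illustrative placements of servers onto racks (fault domains).
packed = {"s1": "rack0", "s2": "rack0", "s3": "rack0", "s4": "rack0"}
spread = {"s1": "rack0", "s2": "rack1", "s3": "rack2", "s4": "rack3"}
skew_aware = {"s1": "rack0", "s2": "rack0", "s3": "rack1", "s4": "rack1"}

for name, placement in [("packed", packed), ("spread", spread),
                        ("skew_aware", skew_aware)]:
    print(name, core_traffic(placement, traffic),
          worst_case_survival(placement))
```

In this toy setting, packing all servers together uses no core bandwidth but loses every server when the rack fails, while full spreading survives best but sends all traffic through the core. Co-locating only the heavily communicating pairs keeps core traffic low and still survives a single rack failure with half of the servers, which is the kind of gain that skewed communication patterns make possible.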