In a virtualized infrastructure where physical resources are shared, a single physical server failure terminates several virtual servers and cripples the virtual infrastructures that contained them. In the worst case, further failures may cascade as the remaining servers become overloaded. To guarantee some level of reliability, each virtual infrastructure should be augmented at instantiation with backup virtual nodes and links that have sufficient capacity. This ensures that, when physical failures occur, sufficient computing resources remain available and the virtual network topology is preserved. However, reserving backups in this way may greatly reduce the utilization of the physical infrastructure. This loss can be limited if backup resources are pooled and shared across multiple virtual infrastructures and intelligently embedded in the physical infrastructure. These techniques can reduce the physical footprint of virtual backups while still guaranteeing reliability.
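The benefit of pooling can be illustrated with a rough back-of-the-envelope sketch. It is not the method of the paper: it assumes that every virtual node fails independently with the same probability `p`, that any backup can stand in for any failed node, and it ignores link capacity and placement entirely. The function names `min_backups` and `compare` and the parameters `sizes`, `p`, and `r` are hypothetical and introduced only for illustration.

```python
from math import comb


def min_backups(n, p, r):
    """Smallest k such that P(at most k of n nodes fail) >= r.

    Assumes independent failures with probability p per node
    (a simplifying binomial model, not taken from the abstract).
    """
    cumulative = 0.0
    for k in range(n + 1):
        # Probability that exactly k of the n nodes fail.
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= r:
            return k
    # Rounding may keep the sum just below r; n backups always suffice.
    return n


def compare(sizes, p, r):
    """Return (dedicated, pooled) backup counts for a list of infrastructure sizes."""
    # Dedicated: each virtual infrastructure reserves its own backups.
    dedicated = sum(min_backups(n, p, r) for n in sizes)
    # Pooled: one shared pool covers all nodes (assumes full interchangeability).
    pooled = min_backups(sum(sizes), p, r)
    return dedicated, pooled


if __name__ == "__main__":
    print(compare([10, 10, 10], p=0.01, r=0.999))
```

Under these assumptions, three infrastructures of 10 nodes each need 2 backups apiece (6 in total) when provisioned separately, whereas a single shared pool of 3 backups meets the same 99.9% target. In practice the saving would depend on how backups are embedded in the physical infrastructure and on the bandwidth their links require, which this sketch does not model.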