A very important ingredient in the computing landscape is Utility Computing Data Centres (UCDCs), large-scale computing systems that offer computational services to concurrently running jobs through virtual servers. As UCDC systems increase in size and the mean time between failure decreases, it is becoming an increasingly important challenge to expediently tolerate failures (dynamically), while distributing the effects of the failure amongst the virtual servers according to their service level agreements. We propose and evaluate a strategy for offering predictable service in fat-trees experiencing faults, by reprioritising packets. The strategy is able to distribute the effect of network faults in order to satisfy a number of quality-of-service demands. Which demands to favour depends on the computer system and the characteristics of the jobs it is running, and in the presence of a moderate number of faults it is to some degree possible to meet the demands.