Consolidation and dense aggregation of slim compute, storage, and networking hardware have resulted in data centers with high power density. The power density of current and future generations of servers necessitates detailed thermo-fluids analysis to provision the cooling resources in a given data center for reliable operation. The analysis must also predict how the thermo-fluid distribution responds to changes in hardware configuration and building infrastructure, such as a sudden failure of data center cooling resources. The objective of the analysis is to ensure that adequate cooling resources are available to match the heat load, which is typically non-uniformly distributed and characterized by highly localized power density. This study presents an analysis of an example modern data center, focusing on the magnitude of temperature variation and the impact of a failure. First, static provisioning for a given distribution of heat loads and cooling resources is performed to produce a reference state. A perturbation is then introduced into the reference state to simulate a very plausible scenario: failure of a computer room air conditioning (CRAC) unit. The transient model shows the "redlining" of the inlet temperature of systems in the area most influenced by the failed CRAC unit. In this example high-density data center, the time to reach an unacceptable inlet temperature is less than 80 seconds, based on an example temperature set point limit of 40°C (most of today's servers would require an inlet temperature below 35°C to operate). An effective approach to resolving this issue, if there is adequate capacity, is to migrate the compute workload to other available systems within the data center so that the inlet temperature to the servers drops to an acceptable level.
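The migration idea in the last sentence can be illustrated with a minimal sketch. This is not the study's method. It is a hypothetical greedy planner that assumes each server reports an inlet temperature, a current load, and a capacity in arbitrary units. The names `Server` and `plan_migration` and the 35°C default limit (taken from the abstract's remark about today's servers) are assumptions made for illustration, and the sketch does not model how inlet temperatures change after the workload moves.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical server record; all fields are illustrative assumptions."""
    name: str
    inlet_temp_c: float  # measured inlet air temperature, in degrees Celsius
    load: float          # current workload, arbitrary units
    capacity: float      # maximum workload, same units


def plan_migration(servers, limit_c=35.0):
    """Greedily plan moves from servers above limit_c to the coolest servers.

    Returns a list of (source, destination, amount) tuples. Load that cannot
    be placed because spare capacity ran out is simply left where it is.
    """
    hot = [s for s in servers if s.inlet_temp_c > limit_c and s.load > 0]
    # Prefer the coolest destinations first.
    cool = sorted(
        (s for s in servers if s.inlet_temp_c <= limit_c),
        key=lambda s: s.inlet_temp_c,
    )
    spare = {s.name: s.capacity - s.load for s in cool}

    moves = []
    for src in hot:
        remaining = src.load
        for dst in cool:
            if remaining <= 0:
                break
            amount = min(remaining, spare[dst.name])
            if amount > 0:
                moves.append((src.name, dst.name, amount))
                spare[dst.name] -= amount
                remaining -= amount
    return moves


if __name__ == "__main__":
    fleet = [
        Server("rack1-a", 41.0, 8.0, 10.0),  # near the failed CRAC unit
        Server("rack7-b", 24.0, 4.0, 10.0),
        Server("rack9-c", 28.0, 9.0, 10.0),
    ]
    for move in plan_migration(fleet):
        print(move)
```

In this sketch the "if there is adequate capacity" condition from the abstract shows up as the `spare` bookkeeping: when the cooler servers cannot absorb the whole load, part of it stays on the hot server, and a real controller would need another remedy for that remainder.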