The Hadoop Distributed File System (HDFS) presents unique challenges to existing energy-conservation techniques and makes it hard to scale down servers. We propose GreenHDFS, an energy-conserving, hybrid, logical multi-zoned variant of HDFS for managing data-processing-intensive, commodity Hadoop clusters. GreenHDFS's data-classification-driven data placement allows scale-down by guaranteeing substantially long periods (several days) of idleness in a subset of servers in the datacenter designated as the Cold Zone. These servers are then transitioned to high-energy-saving, inactive power modes. This is done without impacting the performance of the Hot Zone: studies have shown that servers in data-intensive compute clusters are under-utilized and, hence, opportunities exist for better consolidation of the workload on the Hot Zone. Analysis of traces from a Yahoo! Hadoop cluster showed significant heterogeneity in the data's access patterns, which can be used to guide energy-aware data placement policies. Trace-driven simulation with three-month-long real-life HDFS traces from a Hadoop cluster at Yahoo! shows a 26% reduction in energy consumption from Cold Zone power management alone. An analytical cost model projects savings of $14.6 million in 3-year total cost of ownership (TCO), and simulation results extrapolate to savings of $2.4 million annually when the GreenHDFS technique is applied across all of Yahoo!'s Hadoop clusters (amounting to 38,000 servers).
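
The data-classification-driven placement described above can be illustrated with a minimal sketch. The abstract does not state which criteria are used to classify data, so the rule below (a file becomes a Cold Zone candidate once it has gone unaccessed for a fixed number of days) is purely an assumption for illustration. The names `FileInfo`, `classify`, and `cold_after_days`, and the 30-day default, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileInfo:
    path: str
    last_access: float  # seconds since epoch


def classify(files, now, cold_after_days=30):
    """Split files into Hot and Cold Zone candidates by idle time.

    Assumption: a file is "cold" if it has not been accessed for more
    than cold_after_days. The real policy may use other signals.
    """
    cutoff = now - cold_after_days * SECONDS_PER_DAY
    hot, cold = [], []
    for f in files:
        # Files last touched before the cutoff go to the Cold Zone list.
        (cold if f.last_access < cutoff else hot).append(f.path)
    return hot, cold


# Hypothetical example: "now" is day 100; one file was last read on
# day 10, the other on day 99.
now = 100 * SECONDS_PER_DAY
files = [
    FileInfo("/logs/day-010", 10 * SECONDS_PER_DAY),
    FileInfo("/logs/day-099", 99 * SECONDS_PER_DAY),
]
hot, cold = classify(files, now)
```

Under this assumed rule, servers holding only files from the `cold` list would be the ones expected to stay idle long enough to enter an inactive power mode, while the `hot` list stays on always-on servers.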