We present a detailed evaluation and sensitivity analysis of Green HDFS, an energy-conserving, highly scalable variant of the Hadoop Distributed File System (HDFS). Green HDFS logically divides the servers in a Hadoop cluster into Hot and Cold Zones and relies on energy-conserving data placement, driven by data classification, to guarantee substantially long periods (several days) of idleness in a significant subset of the servers in the Cold Zone. A detailed lifespan analysis of the files in a large-scale production Hadoop cluster at Yahoo! points to the viability of Green HDFS. Simulation results with real-world Yahoo! HDFS traces show that Green HDFS can achieve a 24% reduction in energy costs by applying power management to only one top-level tenant directory in the cluster, and that it meets all the scale-down mandates in spite of the unique scale-down challenges present in a Hadoop cluster. If the Green HDFS technique is applied to all the Hadoop clusters at Yahoo! (amounting to 38,000 servers), $2.1 million can be saved in energy costs per annum. Sensitivity analysis shows that energy conservation is minimally sensitive to the thresholds in Green HDFS. Lifespan analysis also shows that one-size-fits-all energy-management policies won't suffice in a multi-tenant Hadoop cluster.
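The zone split described above can be illustrated with a minimal sketch. The abstract does not spell out the classification rule, so the code below assumes one plausible rule: a file is treated as cold once it has gone unaccessed for a threshold number of days. The names `FileInfo`, `classify_zone`, `plan_placement`, and `coldness_threshold_days`, as well as the default of 15 days, are hypothetical and are not taken from Green HDFS itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    """Hypothetical per-file metadata: path and time of last access."""
    path: str
    last_access: datetime


def classify_zone(f, now, coldness_threshold_days=15):
    """Return "cold" if the file has been idle for at least the threshold,
    otherwise "hot". The threshold value is an assumption for illustration."""
    idle = now - f.last_access
    return "cold" if idle >= timedelta(days=coldness_threshold_days) else "hot"


def plan_placement(files, now, coldness_threshold_days=15):
    """Group file paths by target zone under the assumed idleness rule."""
    plan = {"hot": [], "cold": []}
    for f in files:
        plan[classify_zone(f, now, coldness_threshold_days)].append(f.path)
    return plan
```

Under this assumed rule, files that have gone unread for the threshold period are grouped for the Cold Zone, which is what would let those servers stay idle for days at a time. Varying `coldness_threshold_days` is a simple way to reproduce the kind of threshold sensitivity check the abstract mentions.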