Analyzing job completion reliability and job energy consumption for a general MapReduce infrastructure

Authors:
Jia-Chun Lin;Fang-Yie Leu;Ying-ping Chen
Affiliations:
Department of Computer Science, National Chiao Tung University, Taiwan. E-mails: kellylin1219@gmail.com, ypchen@cs.nctu.edu.tw;Department of Computer Science, TungHai University, Taiwan. E-mail: leufy@thu.edu.tw
Venue:
Journal of High Speed Networks
Year:
2013

Abstract

MapReduce has recently become a popular distributed programming framework. It divides a job into map tasks and reduce tasks and executes these tasks in parallel over a large-scale MapReduce cluster to speed up job execution. Generally, the cluster is a master-slave infrastructure. To prevent jobs from being interrupted by node failures, current MapReduce implementations, such as Hadoop, adopt a task-reexecution policy on the slave side: when a slave node fails and cannot complete a task, the task is reassigned to another available slave for reexecution. On the master side, however, no redundancy scheme is provided by default. Since this type of infrastructure has been adopted worldwide, we call it the general MapReduce infrastructure (GMI). To achieve a more reliable and energy-efficient working environment, it is necessary to understand the impact of GMI on its job completion reliability (JCR) and job energy consumption (JEC). In this paper, we use a Poisson distribution to analyze GMI's JCR from a single-job perspective, and then derive the corresponding JEC. With these analytical results, MapReduce managers can understand how GMI behaves and how their MapReduce deployments can be improved to achieve a more reliable and energy-efficient MapReduce environment.
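
To make the idea of a Poisson-based reliability estimate more concrete, the following is a minimal sketch and not the derivation used in the paper. It assumes that node failures arrive as a Poisson process with a constant rate, so the probability that a node survives an interval of length t is exp(-rate * t). It further assumes that tasks fail independently, that every task has the same duration, and that a failed task may be retried on another slave up to a fixed number of attempts, while the master has no backup. All function and parameter names are hypothetical.

```python
import math


def survival_probability(failure_rate: float, duration: float) -> float:
    """Probability of zero failures during `duration`, assuming failures
    follow a Poisson process with constant rate `failure_rate`."""
    return math.exp(-failure_rate * duration)


def task_success_probability(slave_failure_rate: float,
                             task_duration: float,
                             max_attempts: int) -> float:
    """Probability that a task finishes within `max_attempts` tries.
    Assumption: each attempt runs on a fresh slave and fails independently."""
    p_attempt_ok = survival_probability(slave_failure_rate, task_duration)
    return 1.0 - (1.0 - p_attempt_ok) ** max_attempts


def job_completion_reliability(master_failure_rate: float,
                               slave_failure_rate: float,
                               job_duration: float,
                               task_duration: float,
                               num_tasks: int,
                               max_attempts: int) -> float:
    """Toy JCR estimate: the single master must survive the whole job
    (no redundancy), and every task must eventually succeed."""
    master_ok = survival_probability(master_failure_rate, job_duration)
    task_ok = task_success_probability(
        slave_failure_rate, task_duration, max_attempts)
    return master_ok * task_ok ** num_tasks


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the paper.
    for attempts in (1, 2, 3):
        jcr = job_completion_reliability(
            master_failure_rate=0.001, slave_failure_rate=0.01,
            job_duration=100.0, task_duration=10.0,
            num_tasks=50, max_attempts=attempts)
        print(f"max_attempts={attempts}: JCR estimate = {jcr:.4f}")
```

In this simplified model, allowing more reexecution attempts raises the slave-side term toward 1, so the unprotected master becomes the factor that bounds the overall estimate. This is consistent with the abstract's observation that the master side has no redundancy by default.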