Hadoop high availability through metadata replication

  • Authors: Feng Wang; Jie Qiu; Jie Yang; Bo Dong; Xinhui Li; Ying Li
  • Affiliations: IBM China Research Laboratory, Beijing, China (Wang, Qiu, Yang, X. Li, Y. Li); Xi'an Jiaotong University, Xi'an, China (Dong)
  • Venue: Proceedings of the first international workshop on Cloud data management
  • Year: 2009

Abstract

Hadoop is widely adopted to support data-intensive distributed applications. Many of these applications are mission-critical and require Hadoop itself to be highly available. Unfortunately, Hadoop does not yet provide high availability support, and adding it is not trivial. Based on a thorough investigation of Hadoop, this paper proposes a metadata-replication-based solution that enables Hadoop high availability by removing the single point of failure in Hadoop. The solution involves three major phases. In the initialization phase, each standby/slave node registers with the active/primary node, and its initial metadata (such as the version file and file system image) is brought up to date with that of the active/primary node. In the replication phase, the runtime metadata needed for a future failover (such as outstanding operations and lease states) is replicated. In the failover phase, the standby/newly elected primary node takes over all communications. The solution presents several unique features for Hadoop, such as a runtime-configurable synchronization mode. The experiments demonstrate the feasibility and efficiency of our solution.
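
The three phases described in the abstract can be illustrated with a toy in-memory model. This is only a minimal sketch under stated assumptions, not the authors' implementation and not a Hadoop API: the names `Node`, `Cluster`, `register`, `replicate`, and `failover` are hypothetical, metadata is modeled as plain Python dictionaries and lists, and the election step is reduced to picking the first registered standby.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical metadata node; not a real Hadoop class."""
    name: str
    # Initial metadata, e.g. version file and file system image.
    metadata: Dict[str, str] = field(default_factory=dict)
    # Runtime metadata already applied, e.g. operations and lease states.
    journal: List[str] = field(default_factory=list)
    # Records received but not yet applied (used in asynchronous mode).
    pending: List[str] = field(default_factory=list)

    def apply(self, record: str) -> None:
        self.journal.append(record)

    def drain(self) -> None:
        # Apply queued records in arrival order.
        self.journal.extend(self.pending)
        self.pending.clear()


class Cluster:
    """Toy model of the initialization, replication, and failover phases."""

    def __init__(self, primary: Node, sync: bool = True) -> None:
        self.primary = primary
        self.standbys: List[Node] = []
        self.sync = sync  # assumption: a plain flag that can be flipped at runtime

    def register(self, standby: Node) -> None:
        # Initialization phase: register the standby and copy the
        # primary's current metadata so the standby catches up.
        standby.metadata = dict(self.primary.metadata)
        standby.journal = list(self.primary.journal)
        self.standbys.append(standby)

    def replicate(self, record: str) -> None:
        # Replication phase: forward each runtime metadata record.
        self.primary.apply(record)
        for standby in self.standbys:
            if self.sync:
                standby.drain()        # keep ordering if mode was switched
                standby.apply(record)  # applied before the call returns
            else:
                standby.pending.append(record)  # applied later

    def failover(self) -> Node:
        # Failover phase: promote a standby. Simplification: the queue
        # lives on the standby, so no queued record is lost here.
        if not self.standbys:
            raise RuntimeError("no standby available for failover")
        new_primary = self.standbys.pop(0)
        new_primary.drain()
        self.primary = new_primary
        return new_primary
```

In this sketch, setting `cluster.sync = False` between calls to `replicate` switches from synchronous to queued delivery, which is one simple way to picture a synchronization mode that can be changed at runtime. A real system would also have to handle records lost in flight, which the toy model ignores.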