Map-reduce programming model and hadoop distributed file system for use in undergraduate curriculum

  • Authors: K. Madurai; B. Ramamurthy
  • Affiliations: University at Buffalo, NY (both authors)
  • Venue: Journal of Computing Sciences in Colleges
  • Year: 2009

Abstract

In this tutorial we will discuss the details of the map-reduce programming model, which was brought to prominence by Google and its Google File System (GFS) [1]. We will also demonstrate the application of MapReduce using the Hadoop Distributed File System (HDFS) [2]. We will discuss two case studies: (i) word count on a web log and (ii) financial analytics using the Markowitz model [4]. Implementation of HDFS on single-node, two-node and multiple-node clusters will be demonstrated, and the challenges in assembling an HDFS cluster and using MapReduce will be discussed. The intended audience is undergraduate teachers who want to learn about MapReduce and HDFS, and those who are interested in introducing these topics into the curriculum. Both topics featured prominently at SIGCSE 2008.
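
As an illustration of the word-count case study named in the abstract, the following is a minimal sketch of the map, shuffle and reduce steps in plain Python. It is not the tutorial's code and does not use the Hadoop API: everything runs in memory on one machine, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample log lines are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical web-log lines, invented for illustration only.
sample_log = [
    "GET /index.html 200",
    "GET /about.html 200",
    "POST /login 302",
]

counts = word_count(sample_log)
```

In a real cluster the map and reduce functions would run in parallel on different nodes and the framework would perform the shuffle. Here the three steps are kept as separate functions only to make that division of work visible.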