Simplifying MapReduce data processing

Authors:
Chih-Shan Liao;Jin-Ming Shih;Ruay-Shiung Chang
Affiliations:
Department of Computer Science and Information Engineering, National Dong Hwa University, 1, Sec. 2, Da Hsueh Rd., Shou-Feng, Hualien, 974, Taiwan (all authors)
Venue:
International Journal of Computational Science and Engineering
Year:
2013


Abstract

MapReduce is a programming model developed by Google for processing and generating large datasets in distributed environments. Many real-world tasks can be expressed as two functions, map and reduce. MapReduce plays a key role in cloud computing because it reduces the complexity of distributed programming and is easy to deploy on large clusters of commodity machines. Hadoop, an open-source project, implements the Google MapReduce architecture and is widely used by organisations such as Facebook, Yahoo, and Twitter. However, it is difficult for ordinary users to decompose an application into map and reduce functions. In this paper, we focus on making MapReduce convenient to use and propose composing a MapReduce solution from components. We develop a web-based graphical user interface that lets ordinary users utilise MapReduce without writing code. Users only have to know how to specify their tasks as target-value-action tuples. Real examples are provided for demonstration.
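
To make the map and reduce split mentioned in the abstract concrete, here is a minimal single-process sketch of a word count, written in plain Python. It is only an illustration of the programming model, not the paper's component system and not Hadoop's API; the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and the in-memory grouping step stands in for the shuffle that a real cluster performs across machines.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step (hypothetical name): emit a (word, 1) pair per word."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce step (hypothetical name): sum all counts for one word."""
    return word, sum(counts)


def run_mapreduce(lines: Iterable[str]) -> dict[str, int]:
    """Run map, group by key, then reduce, all in one process.

    The grouping dictionary is an assumption standing in for the
    distributed shuffle phase of a real MapReduce framework.
    """
    groups: dict[str, list[int]] = defaultdict(list)
    for line in lines:
        for key, value in map_words(line):
            groups[key].append(value)
    return dict(reduce_counts(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_mapreduce(["map and reduce", "map then reduce"]))
```

Even in this small case the user has to decide what the key is, what the map step emits, and how values are combined, which is the kind of decomposition the paper aims to hide behind components.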