Evaluating SPLASH-2 Applications Using MapReduce

Authors:
Shengkai Zhu; Zhiwei Xiao; Haibo Chen; Rong Chen; Weihua Zhang; Binyu Zang
Affiliations:
Parallel Processing Institute, Fudan University (all authors)
Venue:
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Year:
2009

Abstract

MapReduce has become prevalent for running data-parallel applications. By hiding non-functional concerns such as parallelism, fault tolerance, and load balancing from programmers, MapReduce significantly simplifies programming for large clusters. Because of these features, researchers have also explored the use of MapReduce in other application domains, such as machine learning, text retrieval, and statistical machine translation. In this paper, we study the feasibility of running typical supercomputing applications on the MapReduce framework. We port two applications (Water Spatial and Radix Sort) from the Stanford SPLASH-2 suite to MapReduce. By thoroughly evaluating them on Hadoop, an open-source MapReduce framework for clusters, we analyze their major performance bottlenecks in the MapReduce framework. Based on this analysis, we also provide several suggestions for enhancing the MapReduce framework to suit these applications.
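
To make the idea of porting Radix Sort to MapReduce more concrete, the following is a minimal, single-process sketch of how one least-significant-digit radix sort could be phrased as repeated map, shuffle, and reduce steps. It is not the authors' Hadoop implementation, which the abstract does not describe. The function names (`map_phase`, `shuffle`, `reduce_phase`, `radix_sort_mapreduce`) and the parameters `key_bits` and `radix_bits` are hypothetical, and the sketch assumes non-negative integer keys smaller than `2**key_bits`.

```python
from collections import defaultdict


def map_phase(keys, shift, radix_bits):
    """Map: emit (digit, key) pairs for the digit selected by `shift`."""
    mask = (1 << radix_bits) - 1
    for key in keys:
        yield (key >> shift) & mask, key


def shuffle(pairs):
    """Shuffle: group keys by digit, keeping arrival order within a group.

    Preserving arrival order is what keeps each pass stable, which
    least-significant-digit radix sort depends on.
    """
    buckets = defaultdict(list)
    for digit, key in pairs:
        buckets[digit].append(key)
    return buckets


def reduce_phase(buckets):
    """Reduce: concatenate the groups in ascending digit order."""
    output = []
    for digit in sorted(buckets):
        output.extend(buckets[digit])
    return output


def radix_sort_mapreduce(keys, key_bits=32, radix_bits=8):
    """Run one map/shuffle/reduce round per digit, lowest digit first.

    Assumption: in a real framework each round would be a separate job;
    here the rounds are simulated in a single process.
    """
    data = list(keys)
    for shift in range(0, key_bits, radix_bits):
        data = reduce_phase(shuffle(map_phase(data, shift, radix_bits)))
    return data
```

Calling `radix_sort_mapreduce([170, 45, 75, 90, 802, 24, 2, 66])` returns the keys in ascending order. In this sketch every digit pass consumes the full output of the previous pass, so an iterative algorithm of this shape needs one complete round per digit.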