Evaluating MapReduce for Multi-core and Multiprocessor Systems

Authors:
Colby Ranger;Ramanan Raghuraman;Arun Penmetsa;Gary Bradski;Christos Kozyrakis
Affiliations:
Computer Systems Laboratory, Stanford University. Email: cranger@stanford.edu;Computer Systems Laboratory, Stanford University. Email: ramananr@stanford.edu;Computer Systems Laboratory, Stanford University. Email: penmetsa@stanford.edu;Computer Systems Laboratory, Stanford University. Email: garybradski@gmail.com;Computer Systems Laboratory, Stanford University. Email: christos@ee.stanford.edu.
Venue:
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Year:
2007

Abstract

This paper evaluates the suitability of the MapReduce model for multi-core and multiprocessor systems. MapReduce was created by Google for application development on data centers with thousands of servers. It allows programmers to write functional-style code that is automatically parallelized and scheduled in a distributed system. We describe Phoenix, an implementation of MapReduce for shared-memory systems that includes a programming API and an efficient runtime system. The Phoenix runtime automatically manages thread creation, dynamic task scheduling, data partitioning, and fault tolerance across processor nodes. We study Phoenix with multi-core and symmetric multiprocessor systems and evaluate its performance potential and error recovery features. We also compare MapReduce code to code written in lower-level APIs such as Pthreads. Overall, we establish that, given a careful implementation, MapReduce is a promising model for scalable performance on shared-memory systems with simple parallel code.
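
To make the programming model in the abstract concrete, the following is a minimal sketch of MapReduce on a single shared-memory machine, using word count as the example. It is not the Phoenix API, which is not shown in this record. The names `map_reduce`, `split_into_chunks`, `word_count_map`, and `word_count_reduce` are hypothetical, and a Python thread pool stands in for the runtime's thread management and data partitioning. Fault tolerance and dynamic scheduling are left out.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Partition a list into at most num_chunks contiguous slices."""
    size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_reduce(items, map_fn, reduce_fn, num_workers=4):
    """Hypothetical helper: map chunks in threads, group by key, reduce."""
    chunks = split_into_chunks(items, num_workers)

    def run_map(chunk):
        # Each map task emits a list of (key, value) pairs for its chunk.
        return [pair for item in chunk for pair in map_fn(item)]

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Map phase: one task per input chunk.
        mapped = list(pool.map(run_map, chunks))

        # Group intermediate values by key (done serially in this sketch).
        groups = defaultdict(list)
        for pairs in mapped:
            for key, value in pairs:
                groups[key].append(value)

        # Reduce phase: one task per distinct key.
        keys = sorted(groups)
        reduced = pool.map(lambda k: reduce_fn(k, groups[k]), keys)
        return dict(zip(keys, reduced))


def word_count_map(line):
    """User-supplied map function: emit (word, 1) for every word."""
    return [(word.lower(), 1) for word in line.split()]


def word_count_reduce(word, counts):
    """User-supplied reduce function: sum the counts for one word."""
    return sum(counts)


if __name__ == "__main__":
    lines = ["the quick fox", "The lazy dog", "the end"]
    print(map_reduce(lines, word_count_map, word_count_reduce, num_workers=2))
```

The user writes only the two word-count functions, and the helper decides how the input is split and which thread runs each task. In standard CPython the global interpreter lock typically limits the speedup of CPU-bound map functions, so the sketch shows the structure of the model rather than its performance.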