Static type checking of Hadoop MapReduce programs

Authors:
Jens Dörre; Sven Apel; Christian Lengauer
Affiliations:
University of Passau, Passau, Germany; University of Passau, Passau, Germany; University of Passau, Passau, Germany
Venue:
Proceedings of the second international workshop on MapReduce and its applications
Year:
2011

Abstract

MapReduce is a programming model for the development of Web-scale programs. It is based on concepts from functional programming, namely higher-order functions, which can be strongly typed using parametric polymorphism. Yet this connection is tenuous. For example, in Hadoop, the connection between the two phases of a MapReduce computation is unsafe: there is no static type check of the generic type parameters involved. We provide a static check for Hadoop programs without requiring the user to write any additional code. To this end, we use strongly typed higher-order functions that are checked by the standard Java 5 type checker together with the Hadoop program. We also automatically generate the code needed to execute this program with a standard Hadoop implementation.
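The idea of a strongly typed higher-order function that links the two phases can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Hadoop API: `TypedMapReduce`, `MapFn`, `ReduceFn`, `mapReduce`, and `wordCount` are hypothetical names invented for illustration, and the sketch uses Java 8 lambdas for brevity, whereas the paper targets the Java 5 type checker. For context, Hadoop's `Job` API registers the mapper and reducer as class literals (for example via `setMapperClass` and `setReducerClass`), which is one way the generic type parameters can escape static checking.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from the paper or from Hadoop.
final class TypedMapReduce {

    /** Map phase: one input record yields intermediate (key, value) pairs. */
    interface MapFn<I, K, V> {
        List<Map.Entry<K, V>> map(I input);
    }

    /** Reduce phase: all values grouped under one key yield one output. */
    interface ReduceFn<K, V, O> {
        O reduce(K key, List<V> values);
    }

    /**
     * Higher-order function joining both phases. K and V occur in both
     * parameter types, so the compiler rejects a mapper/reducer pair whose
     * intermediate types differ.
     */
    static <I, K extends Comparable<K>, V, O> Map<K, O> mapReduce(
            MapFn<I, K, V> mapper, ReduceFn<K, V, O> reducer, List<I> inputs) {
        // Shuffle: group intermediate values by key (kept sorted by the TreeMap).
        Map<K, List<V>> groups = new TreeMap<>();
        for (I input : inputs) {
            for (Map.Entry<K, V> pair : mapper.map(input)) {
                groups.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                      .add(pair.getValue());
            }
        }
        // Reduce: one call per key.
        Map<K, O> result = new TreeMap<>();
        for (Map.Entry<K, List<V>> group : groups.entrySet()) {
            result.put(group.getKey(), reducer.reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    /** Word count expressed with the typed combinator. */
    static Map<String, Integer> wordCount(List<String> lines) {
        MapFn<String, String, Integer> tokenize = line -> {
            List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
            for (String word : line.split("\\s+")) {
                pairs.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
            }
            return pairs;
        };
        ReduceFn<String, Integer, Integer> sum = (word, ones) -> {
            int total = 0;
            for (int one : ones) {
                total += one;
            }
            return total;
        };
        // Passing a ReduceFn<String, String, Integer> instead of `sum` would be
        // a compile-time error: V cannot be both Integer and String.
        return mapReduce(tokenize, sum, lines);
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2, c=1}
    }
}
```

In this sketch the intermediate key and value types appear once as the output of the mapper and once as the input of the reducer, within a single method signature, so an ordinary Java compiler checks the connection between the phases without any extra annotations from the user.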