Oivos: Simple and Efficient Distributed Data Processing

Authors:
Steffen Viken Valvåg; Dag Johansen
Venue:
HPCC '08: Proceedings of the 2008 10th IEEE International Conference on High Performance Computing and Communications
Year:
2008

Cited by (2):
MapReduce optimization using regulated dynamic prioritization. Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems.
Cogset: a high performance MapReduce engine. Concurrency and Computation: Practice & Experience.

Abstract

The complexity of implementing large-scale distributed computations has motivated new programming models. Google's MapReduce model has gained widespread use and aims to hide the complex details of data partitioning and distribution, scheduling, synchronization, and fault tolerance. However, our experiences from the enterprise search business indicate that many real-life applications must be implemented as a collection of related MapReduce programs. Since the execution of these programs must be monitored and coordinated externally, several issues concerning scheduling, synchronization, and fault tolerance resurface. To address these limitations, we introduce Oivos, a high-level declarative programming model and its underlying runtime. We show how Oivos programs may specify computations that span multiple heterogeneous and interdependent data sets, how the programs are compiled and optimized, and how our runtime orchestrates and monitors their distributed execution. Our experimental evaluation reveals that Oivos programs do less I/O and execute significantly faster than the equivalent sequences of MapReduce passes.