M3: Stream Processing on Main-Memory MapReduce

Authors:
Ahmed M. Aly; Asmaa Sallam; Bala M. Gnanasekaran; Long-Van Nguyen-Dinh; Walid G. Aref; Mourad Ouzzani; Arif Ghafoor
Venue:
ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
Year:
2012

Abstract

The continuous growth of social web applications, along with the development of sensor capabilities in electronic devices, is creating countless opportunities to analyze the enormous amounts of data that are continuously streaming from these applications and devices. To process large-scale data on large-scale computing clusters, MapReduce has been introduced as a framework for parallel computing. However, most current implementations of the MapReduce framework support only the execution of fixed-input jobs. This restriction makes these implementations inapplicable to most streaming applications, in which queries are continuous in nature and input data streams are continuously received at high arrival rates. In this demonstration, we showcase M$^3$, a prototype implementation of the MapReduce framework in which continuous queries over streams of data can be efficiently answered. M$^3$ extends Hadoop, the open-source implementation of MapReduce, bypassing the Hadoop Distributed File System (HDFS) to support main-memory-only processing. Moreover, M$^3$ supports continuous execution of the Map and Reduce phases, where individual Mappers and Reducers never terminate.
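
The idea of long-running Map and Reduce tasks that exchange data purely in memory can be illustrated with a small sketch. The following is not the M$^3$ implementation and does not use Hadoop; it is a single-process Python analogy, assuming a running word count as the continuous query. All names (`mapper`, `reducer`, `run_demo`, `STOP`) are hypothetical, and the `STOP` sentinel exists only so the demo can finish, whereas the abstract describes Mappers and Reducers that never terminate.

```python
import queue
import threading
from collections import Counter

STOP = object()  # hypothetical sentinel, used only so this demo can end


def mapper(source, shuffle):
    """Long-running map task: emit (word, 1) for each line as it arrives."""
    while True:
        line = source.get()          # blocks until a new stream item arrives
        if line is STOP:
            shuffle.put(STOP)        # propagate shutdown to the reducer
            return
        for word in line.split():
            shuffle.put((word, 1))   # handed over in memory, no file system


def reducer(shuffle, counts):
    """Long-running reduce task: fold pairs into running in-memory state."""
    while True:
        item = shuffle.get()
        if item is STOP:
            return
        word, n = item
        counts[word] += n            # the query answer is updated continuously


def run_demo(lines):
    """Feed a finite list of lines through the pipeline and return the counts."""
    source, shuffle = queue.Queue(), queue.Queue()
    counts = Counter()
    workers = [
        threading.Thread(target=mapper, args=(source, shuffle)),
        threading.Thread(target=reducer, args=(shuffle, counts)),
    ]
    for w in workers:
        w.start()
    for line in lines:
        source.put(line)
    source.put(STOP)
    for w in workers:
        w.join()
    return counts


if __name__ == "__main__":
    print(run_demo(["a b a", "b c"]))
```

In this sketch the two in-memory queues stand in for the data path that a fixed-input MapReduce job would normally route through files, and the loops stand in for tasks that stay alive across input arrivals. A real deployment would distribute many such tasks across a cluster, which this analogy does not attempt.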