Parallel Processing Framework on a P2P System Using Map and Reduce Primitives

Authors:
Kyungyong Lee;Tae Woong Choi;Arijit Ganguly;David I. Wolinsky;P. Oscar Boykin;Renato Figueiredo
Venue:
IPDPSW '11 Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
Year:
2011

Cited by (2):
PonD: dynamic creation of HTC pool on demand using a decentralized resource discovery system. Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing.
MatchTree: Flexible, scalable, and fault-tolerant wide-area resource discovery with distributed matchmaking and aggregation. Future Generation Computer Systems.

Abstract

This paper presents a parallel processing framework for structured Peer-to-Peer (P2P) networks. A parallel processing task is expressed using Map and Reduce primitives inspired by functional programming models. The Map and Reduce tasks are distributed to a subset of nodes within a P2P network for execution using a self-organizing multicast tree. The distribution latency of the multicast method is $O(\log N)$, where $N$ is the number of target nodes for task processing. Each node that receives a task performs the Map task, and during the Reduce task the results are summarized and aggregated in a distributed fashion at each node of the multicast tree. We have implemented this framework on the Brunet P2P system; it currently supports both predefined Map and Reduce tasks and tasks inserted through Remote Procedure Call (RPC) invocations. Simulation results demonstrate the scalability and efficiency of the framework. An experiment on PlanetLab, which runs a distributed K-Means clustering to gather statistics on connection latencies among P2P nodes, shows the applicability of the system to applications such as monitoring overlay networks.
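The tree-based Map and Reduce flow described in the abstract can be illustrated with a small single-process sketch. This is not the Brunet implementation or its API: the function name `tree_map_reduce`, the `fanout` parameter, and the rule of splitting a sorted node list into contiguous sub-ranges are all assumptions made for illustration. The sketch only shows why delegating sub-ranges to child nodes gives a tree depth that grows logarithmically with the number of target nodes, and how partial results can be combined at each tree node on the way back up.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")  # per-node input (e.g. a node address or local data)
R = TypeVar("R")  # Map result / Reduce accumulator


def tree_map_reduce(
    nodes: Sequence[T],
    map_fn: Callable[[T], R],
    reduce_fn: Callable[[R, R], R],
    fanout: int = 2,  # hypothetical parameter: children per tree node
) -> Tuple[R, int]:
    """Simulate Map/Reduce over a multicast tree built on a node list.

    The first node of a range acts as the root of that range's subtree,
    runs map_fn locally, and delegates the rest of the range to at most
    `fanout` children. Returns (reduced result, tree depth).
    """
    if not nodes:
        raise ValueError("at least one target node is required")
    if fanout < 1:
        raise ValueError("fanout must be at least 1")

    def run(lo: int, hi: int) -> Tuple[R, int]:
        # nodes[lo] is responsible for the half-open range nodes[lo:hi].
        result = map_fn(nodes[lo])      # Map step at this node
        depth = 0
        remaining = hi - (lo + 1)
        if remaining <= 0:
            return result, depth        # leaf of the tree

        # Split the remaining range into contiguous chunks, one per child.
        chunks = min(fanout, remaining)
        base, extra = divmod(remaining, chunks)
        start = lo + 1
        for i in range(chunks):
            size = base + (1 if i < extra else 0)
            child_result, child_depth = run(start, start + size)
            # Reduce step: fold the child's partial result into ours.
            result = reduce_fn(result, child_result)
            depth = max(depth, child_depth + 1)
            start += size
        return result, depth

    return run(0, len(nodes))


if __name__ == "__main__":
    # Example: sum a value held by each of 1024 simulated nodes.
    total, depth = tree_map_reduce(
        list(range(1, 1025)), lambda x: x, lambda a, b: a + b
    )
    print(total, depth)
```

In a real overlay each recursive call would be a message to a remote node and the children would run concurrently, so the distribution latency would follow the tree depth rather than the node count. The sequential recursion here stands in for that behavior only to keep the sketch runnable.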