Camdoop: exploiting in-network aggregation for big data applications

Authors:
Paolo Costa;Austin Donnelly;Antony Rowstron;Greg O'Shea
Affiliations:
Microsoft Research Cambridge and Imperial College London;Microsoft Research Cambridge;Microsoft Research Cambridge;Microsoft Research Cambridge
Venue:
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Year:
2012

Abstract

Large companies like Facebook, Google, and Microsoft, as well as a number of small and medium enterprises, process massive amounts of data every day in batch jobs and in real-time applications. This generates high network traffic, which is hard to support using traditional, oversubscribed network infrastructures. To address this issue, several novel network topologies have been proposed, aiming to increase the bandwidth available in enterprise clusters. We observe that in many commonly used workloads, data is aggregated during processing and the output size is a fraction of the input size. This motivated us to explore a different point in the design space. Instead of increasing the bandwidth, we focus on decreasing the traffic by pushing aggregation from the edge into the network. We built Camdoop, a MapReduce-like system running on CamCube, a cluster design that uses a direct-connect network topology in which servers are linked directly to other servers. Camdoop exploits the fact that CamCube servers forward traffic themselves to perform in-network aggregation of data during the shuffle phase. Camdoop supports the same functions used in MapReduce and is compatible with existing MapReduce applications. We demonstrate that, in common cases, Camdoop significantly reduces network traffic and provides a substantial performance increase over both a version of Camdoop running over a switch and two production systems, Hadoop and Dryad/DryadLINQ.
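The idea of pushing aggregation from the edge into the network can be illustrated with a toy model. The sketch below is not Camdoop's implementation: the tree layout, the word-count style `combine` function, and every name in it (`edge_aggregation`, `in_network_aggregation`, `pairs_into_root`) are assumptions made for illustration only. It compares how many key/value pairs arrive at a single reducer when every server ships its raw map output directly, and when each server on the path merges its children's partial results before forwarding.

```python
from collections import Counter


def combine(a, b):
    """Associative, commutative merge of two partial results (sum per key)."""
    out = Counter(a)
    out.update(b)
    return out


def edge_aggregation(local, root):
    """Every server ships its raw pairs to the root, which aggregates alone."""
    result = Counter(local[root])
    pairs_into_root = 0
    for server, data in local.items():
        if server == root:
            continue
        pairs_into_root += len(data)  # each pair crosses the root's inbound links
        result = combine(result, data)
    return result, pairs_into_root


def in_network_aggregation(local, children, root):
    """Each server merges its children's partial results, then forwards one result."""
    pairs_into_root = 0

    def partial(server):
        nonlocal pairs_into_root
        acc = Counter(local[server])
        for child in children.get(server, []):
            sub = partial(child)
            if server == root:
                pairs_into_root += len(sub)  # only merged partials reach the root
            acc = combine(acc, sub)
        return acc

    return partial(root), pairs_into_root


# Hypothetical input: seven servers, each holding counts for the same three keys,
# arranged in a binary tree rooted at server 0.
local = {i: Counter({"a": 1, "b": 2, "c": i + 1}) for i in range(7)}
children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}

edge_result, edge_pairs = edge_aggregation(local, root=0)
tree_result, tree_pairs = in_network_aggregation(local, children, root=0)
print(edge_pairs, tree_pairs, edge_result == tree_result)
```

In this toy input both strategies produce the same final counts, but 18 pairs reach the root under edge aggregation against 6 with on-path merging. The saving depends on how much the keys overlap across servers, which is the property the abstract relies on when it notes that the output is a fraction of the input.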