Cluster fabric interconnects that use 3D torus topologies are increasingly being deployed in data center clusters. In our prior work, we demonstrated that using these topologies, and letting applications implement custom routing protocols and perform on-path operations, can increase performance and simplify development. However, these benefits cannot be achieved with mainstream point-to-point networking stacks such as TCP/IP or MPI, which hide the underlying topology and do not allow in-network operations to be implemented. In this paper, we describe CamCubeOS, a novel key-based communication stack designed from scratch for 3D torus fabric interconnects. Many of the applications used in clusters are key-based, so we designed CamCubeOS to support key-based operations natively. We select a virtual topology that matches the underlying physical topology exactly, and we use the keyspace to expose physical locality, thus avoiding the overhead typically incurred by overlay-based approaches. We report on our experience building several applications on top of CamCubeOS, and we evaluate their performance and feasibility using a prototype and large-scale simulations.
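To make the idea of a keyspace that exposes physical locality more concrete, the following is a minimal sketch, not the CamCubeOS API. It assumes a k x k x k torus, integer keys, and a simple row-major mapping from keys to coordinates; the names `key_to_coord`, `torus_distance`, and `next_hop` are hypothetical and chosen only for illustration.

```python
from typing import Tuple

Coord = Tuple[int, int, int]


def key_to_coord(key: int, k: int) -> Coord:
    """Map an integer key to an (x, y, z) coordinate in a k x k x k torus.

    Assumption: a plain row-major mapping, used here only for illustration.
    """
    key %= k ** 3
    x, rest = divmod(key, k * k)
    y, z = divmod(rest, k)
    return (x, y, z)


def torus_distance(a: Coord, b: Coord, k: int) -> int:
    """Minimum hop count between two coordinates, with wraparound links."""
    return sum(min((p - q) % k, (q - p) % k) for p, q in zip(a, b))


def next_hop(cur: Coord, dst: Coord, k: int) -> Coord:
    """Greedy step: fix the first differing dimension, taking the shorter way round."""
    for dim in range(3):
        if cur[dim] != dst[dim]:
            forward = (dst[dim] - cur[dim]) % k
            step = 1 if forward <= k - forward else -1
            nxt = list(cur)
            nxt[dim] = (cur[dim] + step) % k
            return (nxt[0], nxt[1], nxt[2])
    return cur  # already at the destination


def route(src: Coord, dst: Coord, k: int) -> list:
    """Full hop-by-hop path from src to dst, including both endpoints."""
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1], dst, k))
    return path
```

Because every key corresponds to a coordinate, an application can read physical distance directly from the keys, and each intermediate node returned by `route` is a point where an on-path operation could in principle be applied. A real stack would also need to handle failures and key ownership, which this sketch leaves out.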