Ricochet is a low-latency reliable multicast protocol designed for time-critical clustered applications. It uses IP Multicast to transmit data and recovers from packet loss at end hosts using Lateral Error Correction (LEC), a novel repair mechanism in which receivers exchange XORs and combine them across overlapping groups. In datacenters and clusters, application needs frequently dictate large numbers of fine-grained, overlapping multicast groups. Existing multicast reliability schemes scale poorly in such settings because their packet-recovery latency is inversely related to the data rate within a single group: the lower the data rate, the longer it takes to recover lost packets. LEC is insensitive to the data rate in any one group and allows each node to split its bandwidth among hundreds or thousands of fine-grained multicast groups without sacrificing timely packet recovery. As a result, Ricochet provides developers with a scalable, reliable, and fast multicast primitive to layer under high-level abstractions such as publish-subscribe, group communication, and replicated service/object infrastructures. We evaluate Ricochet on a 64-node cluster with up to 1024 groups per node: under various loss rates, it recovers almost all packets using LEC within tens of milliseconds and the remainder using reactive traffic within 200 milliseconds.
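The XOR-based repair idea behind LEC can be illustrated with a minimal Python sketch. This is not Ricochet's implementation: the function names (`xor_bytes`, `make_repair`, `recover`), the `(group, sequence)` packet identifiers, and the assumption that all payloads are padded to equal length are hypothetical choices made for illustration. The abstract does not say how Ricochet chooses which packets to combine or which receivers to send repairs to, so the sketch leaves that out. It shows only that a repair built from packets in several groups lets a peer that missed exactly one of them reconstruct it.

```python
from functools import reduce
from typing import Dict, List, Optional, Tuple

# Hypothetical packet identifier: (group name, sequence number).
PacketId = Tuple[str, int]
Repair = Tuple[List[PacketId], bytes]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings (assumed to be padded to equal length)."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_repair(received: Dict[PacketId, bytes], ids: List[PacketId]) -> Repair:
    """Build a repair packet: the listed ids plus the XOR of their payloads."""
    payload = reduce(xor_bytes, (received[i] for i in ids))
    return list(ids), payload


def recover(received: Dict[PacketId, bytes],
            repair: Repair) -> Optional[Tuple[PacketId, bytes]]:
    """Recover a lost packet from a repair if exactly one listed id is missing."""
    ids, payload = repair
    missing = [i for i in ids if i not in received]
    if len(missing) != 1:
        return None  # nothing to recover, or too many losses for one XOR
    for i in ids:
        if i in received:
            payload = xor_bytes(payload, received[i])
    return missing[0], payload


# Receiver A holds packets from two overlapping groups it shares with B.
a_received = {("g1", 1): b"AAAA", ("g2", 7): b"BBBB", ("g1", 2): b"CCCC"}
repair = make_repair(a_received, list(a_received))

# Receiver B lost the packet from group g2 and recovers it from A's repair.
b_received = {k: v for k, v in a_received.items() if k != ("g2", 7)}
recovered = recover(b_received, repair)
```

Because the repair mixes packets from both groups, its usefulness in this sketch depends on how many packets the sending receiver has seen in total, not on the traffic in any single group, which is the intuition behind the rate-insensitivity claim above.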