Remote control: distributed application configuration, management, and visualization with plush
LISA'07 Proceedings of the 21st conference on Large Installation System Administration Conference
DepSpace: a byzantine fault-tolerant coordination service
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Improving scalability and fault tolerance in an application management infrastructure
LASCO'08 First USENIX Workshop on Large-Scale Computing
Distributed application configuration, management, and visualization with plush
ACM Transactions on Internet Technology (TOIT)
Server-assisted latency management for wide-area distributed systems
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Solving the straggler problem with bounded staleness
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
Parrot: a practical runtime for deterministic, stable, and reliable threads
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Hi-index | 0.00 |
Traditionally, synchronization barriers ensure that no cooperating process advances beyond a specified point until all processes have reached that point. In heterogeneous large-scale distributed computing environments, with unreliable network links and machines that may become overloaded and unresponsive, traditional barrier semantics are too strict to be effective for a range of emerging applications. In this paper, we explore several relaxations, and introduce a partial barrier, a synchronization primitive designed to enhance liveness in loosely coupled networked systems. Partial barriers are robust to variable network conditions; rather than attempting to hide the asynchrony inherent to wide-area settings, they enable appropriate application-level responses. We evaluate the improved performance of partial barriers by integrating them into three publicly available distributed applications running across PlanetLab. Further, we show how partial barriers simplify a re-implementation of MapReduce that targets wide-area environments.