In this paper, we describe ZooKeeper, a service for coordinating processes of distributed applications. Because ZooKeeper is part of critical infrastructure, it aims to provide a simple, high-performance kernel for building more complex coordination primitives at the client. It incorporates elements from group messaging, shared registers, and distributed lock services in a replicated, centralized service. The interface exposed by ZooKeeper combines the wait-free aspects of shared registers with an event-driven mechanism, similar to the cache invalidations of distributed file systems, to provide a simple yet powerful coordination service. The ZooKeeper interface enables a high-performance service implementation. In addition to the wait-free property, ZooKeeper provides a per-client guarantee of FIFO execution of requests and linearizability for all requests that change the ZooKeeper state. These design decisions enable the implementation of a high-performance processing pipeline in which read requests are satisfied by local servers. We show that, for the target workloads with read-to-write ratios of 2:1 to 100:1, ZooKeeper can handle tens to hundreds of thousands of transactions per second. This performance allows ZooKeeper to be used extensively by client applications.
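To make the "shared registers plus invalidation-style events" idea concrete, here is a minimal single-process sketch. It is not the ZooKeeper API: the class `ToyRegisterStore`, its `read` and `write` methods, and the `/config` path are hypothetical names chosen for illustration. The sketch also assumes that a watcher fires once and is then discarded. It models only the interface idea (non-blocking reads, versioned writes, change notifications), not replication, ordering across clients, or failure handling.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Watcher = Callable[[str], None]


class ToyRegisterStore:
    """In-memory stand-in for a coordination service (hypothetical, not ZooKeeper's API)."""

    def __init__(self) -> None:
        # path -> (value, version); the version increases by one on every write
        self._data: Dict[str, Tuple[bytes, int]] = {}
        # path -> callbacks waiting for the next change to that path
        self._watchers: Dict[str, List[Watcher]] = defaultdict(list)

    def read(self, path: str, watcher: Optional[Watcher] = None) -> Optional[Tuple[bytes, int]]:
        """Return (value, version) without blocking; optionally register a watcher."""
        if watcher is not None:
            self._watchers[path].append(watcher)
        return self._data.get(path)

    def write(self, path: str, value: bytes) -> int:
        """Store a new value, bump the version, and notify pending watchers."""
        _, version = self._data.get(path, (b"", 0))
        version += 1
        self._data[path] = (value, version)
        # Assumption of this sketch: watchers are one-shot, so remove them before firing.
        for notify in self._watchers.pop(path, []):
            notify(path)
        return version


# A client-side cache that is invalidated by the watcher, in the spirit of
# cache invalidations in distributed file systems.
store = ToyRegisterStore()
cache: Dict[str, Tuple[bytes, int]] = {}
invalidated: List[str] = []


def on_change(path: str) -> None:
    cache.pop(path, None)
    invalidated.append(path)


store.write("/config", b"v1")
cache["/config"] = store.read("/config", watcher=on_change)
store.write("/config", b"v2")  # fires on_change once and drops the cached entry
store.write("/config", b"v3")  # no watcher is registered any more, so nothing fires
```

After the second write the cached entry is gone and the client would re-read (and, if it wants further notifications, register a new watcher). Handing the client a notification rather than blocking it on a lock is one way to read the abstract's pairing of wait-free registers with an event-driven mechanism.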