Science of Computer Programming
Using Time Instead of Timeout for Fault-Tolerant Distributed Systems.
ACM Transactions on Programming Languages and Systems (TOPLAS)
Distributed systems: methods and tools for specification. An advanced course
Distributed systems: methods and tools for specification. An advanced course
Applications of Byzantine agreement in database systems
ACM Transactions on Database Systems (TODS)
Reliable communication in the presence of failures
ACM Transactions on Computer Systems (TOCS)
Highly available distributed services and fault-tolerant distributed garbage collection
PODC '86 Proceedings of the fifth annual ACM symposium on Principles of distributed computing
SIGCOMM '88 Symposium proceedings on Communications architectures and protocols
Towards a theory of replicated processing
Proceedings of a Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems
Reliable scheduling in a TMR database system
ACM Transactions on Computer Systems (TOCS)
Preserving and using context information in interprocess communication
ACM Transactions on Computer Systems (TOCS)
Early-delivery atomic broadcast
PODC '90 Proceedings of the ninth annual ACM symposium on Principles of distributed computing
Designing distributed services using refinement mappings
Designing distributed services using refinement mappings
Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)
Replication and fault-tolerance in the ISIS system
Proceedings of the tenth ACM symposium on Operating systems principles
Synchronization in Distributed Programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
The Byzantine Generals Problem
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fail-stop processors: an approach to designing fault-tolerant computing systems
ACM Transactions on Computer Systems (TOCS)
Byzantine generals in action: implementing fail-stop processors
ACM Transactions on Computer Systems (TOCS)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Self-stabilizing systems in spite of distributed control
Communications of the ACM
Notes on Data Base Operating Systems
Operating Systems, An Advanced Course
Byzantine clock synchronization
PODC '84 Proceedings of the third annual ACM symposium on Principles of distributed computing
Fault-tolerant clock synchronization
PODC '84 Proceedings of the third annual ACM symposium on Principles of distributed computing
Principal Features of the VOLTAN Family of Reliable Node Architectures for Distributed Systems
IEEE Transactions on Computers - Special issue on fault-tolerant computing
The consensus problem in fault-tolerant computing
ACM Computing Surveys (CSUR)
Causal controversy at Le Mont St.-Michel
ACM SIGOPS Operating Systems Review
High availability in a real-time system
ACM SIGOPS Operating Systems Review
The process group approach to reliable distributed computing
Communications of the ACM
Unifying self-stabilization and fault-tolerance
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
How to securely replicate services
ACM Transactions on Programming Languages and Systems (TOPLAS)
Secure agreement protocols: reliable and atomic group multicast in rampart
CCS '94 Proceedings of the 2nd ACM Conference on Computer and communications security
A security architecture for fault-tolerant systems
ACM Transactions on Computer Systems (TOCS) - Special issue on computer architecture
Supporting Fault-Tolerant Parallel Programming in Linda
IEEE Transactions on Parallel and Distributed Systems
Programming Language Support for Writing Fault-Tolerant Distributed Software
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
A highly available scalable ITV system
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Hypervisor-based fault tolerance
ACM Transactions on Computer Systems (TOCS) - Special issue on operating system principles
Unreliable failure detectors for reliable distributed systems
Journal of the ACM (JACM)
Distributing trust with the Rampart toolkit
Communications of the ACM
From group communication to transactions in distributed systems
Communications of the ACM
A Secure Group Membership Protocol
IEEE Transactions on Software Engineering
CCS '96 Proceedings of the 3rd ACM conference on Computer and communications security
Implementing Fail-Silent Nodes for Distributed Systems
IEEE Transactions on Computers
Efficient message ordering in dynamic networks
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
Comparing primary-backup and state machines for crash failures
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
Specifying and using a partitionable group communication service
PODC '97 Proceedings of the sixteenth annual ACM symposium on Principles of distributed computing
PODC '97 Proceedings of the sixteenth annual ACM symposium on Principles of distributed computing
Design and Evaluation of a Window-Consistent Replication Service
IEEE Transactions on Computers
Path independence for authentication in large-scale systems
Proceedings of the 4th ACM conference on Computer and communications security
Cloning: a novel method for interactive parallel simulation
Proceedings of the 29th conference on Winter simulation
Fault tolerance in distributed Ada 95
IRTAW '97 Proceedings of the eighth international workshop on Real-Time Ada
Synthesis of fault-tolerant concurrent programs
PODC '98 Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing
Dynamic virtual logical processes
PADS '98 Proceedings of the twelfth workshop on Parallel and distributed simulation
Fault-tolerant wait-free shared objects
Journal of the ACM (JACM)
ACM Transactions on Computer Systems (TOCS)
Designing Masking Fault-Tolerance via Nonmasking Fault-Tolerance
IEEE Transactions on Software Engineering
Multi-μ: an Ada 95 based architecture for fault tolerance support of real-time systems
Proceedings of the 1998 annual ACM SIGAda international conference on Ada
Coyote: a system for constructing fine-grain configurable communication services
ACM Transactions on Computer Systems (TOCS)
An evaluation of flow control in group communication
IEEE/ACM Transactions on Networking (TON)
Practical Byzantine fault tolerance
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Client-Access Protocols for Replicated Services
IEEE Transactions on Software Engineering
Resilient Authentication Using Path Independence
IEEE Transactions on Computers
A Real-Time Primary-Backup Replication Service
IEEE Transactions on Parallel and Distributed Systems
Fundamentals of fault-tolerant distributed computing in asynchronous environments
ACM Computing Surveys (CSUR)
Replicated invocations in wide-area systems
Proceedings of the 8th ACM SIGOPS European workshop on Support for composing distributed applications
Replica Determinism and Flexible Scheduling in Hard Real-Time Dependable Systems
IEEE Transactions on Computers
An architecture for distributed OASIS services
IFIP/ACM International Conference on Distributed systems platforms
Efficient atomic broadcast using deterministic merge
Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing
Specifying and using a partitionable group communication service
ACM Transactions on Computer Systems (TOCS)
Consensus-based fault-tolerant total order multicast
IEEE Transactions on Parallel and Distributed Systems
Lamport on mutual exclusion: 27 years of planting seeds
Proceedings of the twentieth annual ACM symposium on Principles of distributed computing
BASE: using abstraction to improve fault tolerance
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Group communication specifications: a comprehensive study
ACM Computing Surveys (CSUR)
High availability in a real-time system
EW 5 Proceedings of the 5th workshop on ACM SIGOPS European workshop: Models and paradigms for distributed systems structuring
Proceedings of the 2001 workshop on New security paradigms
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Design and evaluation of a conit-based continuous consistency model for replicated services
ACM Transactions on Computer Systems (TOCS)
Practical byzantine fault tolerance and proactive recovery
ACM Transactions on Computer Systems (TOCS)
Active disk paxos with infinitely many processes
Proceedings of the twenty-first annual symposium on Principles of distributed computing
ACM Transactions on Computer Systems (TOCS)
The Journal of Supercomputing
An Architecture for Survivable Coordination in Large Distributed Systems
IEEE Transactions on Knowledge and Data Engineering
The Cost of Recovery in Message Logging Protocols
IEEE Transactions on Knowledge and Data Engineering
On Group Communication Support in CORBA
IEEE Transactions on Parallel and Distributed Systems
Structuring Fault-Tolerant Object Systems for Modularity in a Distributed Environment
IEEE Transactions on Parallel and Distributed Systems
Specifying and Verifying Requirements of Real-Time Systems
IEEE Transactions on Software Engineering
The Database State Machine Approach
Distributed and Parallel Databases
Design and Verification of Distributed Recovery Blocks with CSP
Formal Methods in System Design
Exception handling and resolution for transactional object groups
Advances in exception handling techniques
Addressing Scalability Issues Using the CLF Middleware
EDOC '01 Proceedings of the 5th IEEE International Conference on Enterprise Distributed Object Computing
Online Non-stop Software Update Using Replicated Execution Blocks
COMPSAC '00 24th International Computer Software and Applications Conference
DISC '00 Proceedings of the 14th International Conference on Distributed Computing
Scalable Secure Storage when Half the System Is Faulty
ICALP '00 Proceedings of the 27th International Colloquium on Automata, Languages and Programming
Fault Tolerance by Transparent Replication for Distributed Ada 95
Ada-Europe '99 Proceedings of the 1999 Ada-Europe International Conference on Reliable Software Technologies
How to Modify the GNAT Frontend to Experiment with Ada Extensions
Ada-Europe '99 Proceedings of the 1999 Ada-Europe International Conference on Reliable Software Technologies
Building Robust Applications by Reusing Non-robust Legacy Software
Ada-Europe '01 Proceedings of the 6th Ada-Europe International Conference Leuven on Reliable Software Technologies
Transparent Environment for Replicated Ravenscar Applications
Ada-Europe '02 Proceedings of the 7th Ada-Europe International Conference on Reliable Software Technologies
A Tailorable Distributed Programming Environment
Ada-Europe '02 Proceedings of the 7th Ada-Europe International Conference on Reliable Software Technologies
Building TMR-Based Reliable Servers Despite Bounded Input Lifetimes
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
MetaJava - A Platform for Adaptable Operating-System Mechanisms
ECOOP '97 Proceedings of the Workshops on Object-Oriented Technology
Bus Architectures for Safety-Critical Embedded Systems
EMSOFT '01 Proceedings of the First International Workshop on Embedded Software
An Overview of Formal Verification for the Time-Triggered Architecture
FTRTFT '02 Proceedings of the 7th International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems: Co-sponsored by IFIP WG 2.2
Agreement Problems in Fault-Tolerant Distributed Systems
SOFSEM '01 Proceedings of the 28th Conference on Current Trends in Theory and Practice of Informatics Piestany: Theory and Practice of Informatics
Broadening the Scope of Fault Tolerance within Secure Services
Revised Papers from the 8th International Workshop on Security Protocols
Exception Handling and Resolution for Transactional Object Groups
Advances in Exception Handling Techniques (the book grew out of an ECOOP 2000 workshop)
Topology-Aware Algorithms for Large-Scale Communication
Advances in Distributed Systems, Advanced Distributed Computing: From Algorithms to Systems
Integrating Group Communication with Transactions for Implementing Persistent Replicated Objects
Advances in Distributed Systems, Advanced Distributed Computing: From Algorithms to Systems
Programming Partition-Aware Network Applications
Advances in Distributed Systems, Advanced Distributed Computing: From Algorithms to Systems
Improving Scalability of Replicated Services in Mobile Agent Systems
MA '02 Proceedings of the 6th International Conference on Mobile Agents
Middleware Support for Voting and Data Fusion
DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
Distributing Trust on the Internet
DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
The Design and Use of Persistent Memory on the DNCP Hardware Fault-Tolerant Platform
DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
Byzantine Fault Tolerance Can Be Fast
DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
A Secure and Highly Available Distributed Store for Meeting Diverse Data Storage Needs
DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
Proceedings of the 13th International Symposium on Distributed Computing
Atomic Data Access in Distributed Hash Tables
IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
The Bancomat problem: an example of resource allocation in a partitionable asynchronous system
Theoretical Computer Science - Special issue: Distributed computing
Reconfiguration and transient recovery in state machine architectures
FTCS '96 Proceedings of the Twenty-Sixth Annual International Symposium on Fault-Tolerant Computing (FTCS '96)
Fault-Tolerance: Java's Missing Buzzword
HCW '98 Proceedings of the Seventh Heterogeneous Computing Workshop
Using Replication and Partitioning to Build Secure Distributed Systems
SP '03 Proceedings of the 2003 IEEE Symposium on Security and Privacy
A Method for Combining Replication with Caching
SRDS '99 Proceedings of the 18th IEEE Symposium on Reliable Distributed Systems
Responsive Security for Stored Data
ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
A Replication Technique Based on a Functional and Attribute Grammar Computation Model
ISSRE '96 Proceedings of the Seventh International Symposium on Software Reliability Engineering
BASE: Using abstraction to improve fault tolerance
ACM Transactions on Computer Systems (TOCS)
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Fault Tolerance in Safety Critical Automotive Applications: Cost of Agreement as a Limiting Factor
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Backoff Protocols for Distributed Mutual Exclusion and Ordering
ICDCS '01 Proceedings of the 21st International Conference on Distributed Computing Systems
An algorithm for Supporting Fault Tolerant Objects in Distributed Object-Oriented Operating Systems
IWOOOS '95 Proceedings of the 4th International Workshop on Object-Orientation in Operating Systems
Filtering Duplicated Invocations Using Symmetric Proxies
IWOOOS '95 Proceedings of the 4th International Workshop on Object-Orientation in Operating Systems
Separating agreement from execution for byzantine fault tolerant services
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
A Timeout-Based Message Ordering Protocol for a Lightweight Software Implementation of TMR Systems
IEEE Transactions on Parallel and Distributed Systems
Synthesis of fault-tolerant concurrent programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
Distributed communication in ML
Journal of Functional Programming
Replication Management in Reliable Real-Time Systems
Real-Time Systems
A weakest failure detector-based asynchronous consensus protocol for f
Information Processing Letters
An analysis of update ordering in distributed replication systems
Future Generation Computer Systems - Special issue: Advanced services for clusters and internet computing
Highly available, fault-tolerant, parallel dataflows
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Reliable Distributed Network Management by Replication
Journal of Network and Systems Management
The weakest failure detectors to solve certain fundamental problems in distributed computing
Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing
Implementing a replicated service with group communication
Journal of Systems Architecture: the EUROMICRO Journal
Replication for web hosting systems
ACM Computing Surveys (CSUR)
Handling message semantics with Generic Broadcast protocols
Distributed Computing
Replication algorithms for the World-Wide Web
Journal of Systems Architecture: the EUROMICRO Journal
Total order broadcast and multicast algorithms: Taxonomy and survey
ACM Computing Surveys (CSUR)
The Guardian Model and Primitives for Exception Handling in Distributed Systems
IEEE Transactions on Software Engineering
Consistent and automatic replica regeneration
ACM Transactions on Storage (TOS)
Comparison of Database Replication Techniques Based on Total Order Broadcast
IEEE Transactions on Knowledge and Data Engineering
Geographically Distributed System for Catastrophic Recovery
LISA '02 Proceedings of the 16th USENIX conference on System administration
Simple and Efficient Oracle-Based Consensus Protocols for Asynchronous Byzantine Systems
IEEE Transactions on Dependable and Secure Computing
Distributed Computing
Architectural support for mode-driven fault tolerance in distributed applications
WADS '05 Proceedings of the 2005 workshop on Architecting dependable systems
Plutus: Scalable Secure File Sharing on Untrusted Storage
FAST '03 Proceedings of the 2nd USENIX Conference on File and Storage Technologies
Implementing Trustworthy Services Using Replicated State Machines
IEEE Security and Privacy
BAR fault tolerance for cooperative services
Proceedings of the twentieth ACM symposium on Operating systems principles
Fault-scalable Byzantine fault-tolerant services
Proceedings of the twentieth ACM symposium on Operating systems principles
Proceedings of the twentieth ACM symposium on Operating systems principles
FTWeb: A Fault Tolerant Infrastructure for Web Services
EDOC '05 Proceedings of the Ninth IEEE International EDOC Enterprise Computing Conference
Dynamic data replication and consistency in mobile environments
DSM '05 Proceedings of the 2nd international doctoral symposium on Middleware
From Set Membership to Group Membership: A Separation of Concerns
IEEE Transactions on Dependable and Secure Computing
Active Replication of Multithreaded Applications
IEEE Transactions on Parallel and Distributed Systems
Trust but verify: accountability for network services
Proceedings of the 11th workshop on ACM SIGOPS European workshop
WS-replication: a framework for highly available web services
Proceedings of the 15th international conference on World Wide Web
BTS: a Byzantine fault-tolerant tuple space
Proceedings of the 2006 ACM symposium on Applied computing
Active disk Paxos with infinitely many processes
Distributed Computing - Special issue: PODC 02
MobiEyes: A Distributed Location Monitoring Service Using Moving Location Queries
IEEE Transactions on Mobile Computing
IEEE Transactions on Dependable and Secure Computing
The SMART way to migrate replicated stateful services
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Behaviour Abstraction for Communicating Sequential Processes
Fundamenta Informaticae
Specifying and using intrusion masking models to process distributed operations
Journal of Computer Security
Design and implementation of a secure wide-area object middleware
Computer Networks: The International Journal of Computer and Telecommunications Networking
Tight bounds for asynchronous randomized consensus
Proceedings of the thirty-ninth annual ACM symposium on Theory of computing
HOTDEP'06 Proceedings of the 2nd conference on Hot Topics in System Dependability - Volume 2
The case for Byzantine fault detection
HOTDEP'06 Proceedings of the 2nd conference on Hot Topics in System Dependability - Volume 2
The phoenix recovery system: rebuilding from the ashes of an internet catastrophe
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
Secure data replication over untrusted hosts
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
Consistent and automatic replica regeneration
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Proactive recovery in a Byzantine-fault-tolerant system
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
Chain replication for supporting high throughput and availability
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
A Parsimonious Approach for Obtaining Resource-Efficient and Trustworthy Execution
IEEE Transactions on Dependable and Secure Computing
Unified support for heterogeneous security policies in distributed systems
SSYM'98 Proceedings of the 7th conference on USENIX Security Symposium - Volume 7
Implementing causal logging using OrbixWeb interception
COOTS'99 Proceedings of the 5th conference on USENIX Conference on Object-Oriented Technologies & Systems - Volume 5
Filterfresh: hot replication of java RMI server objects
COOTS'98 Proceedings of the 4th conference on USENIX Conference on Object-Oriented Technologies and Systems - Volume 4
TCLTK '98 Proceedings of the 3rd Annual USENIX Workshop on Tcl/Tk - Volume 3
Asynchronous Agreement and Its Relation with Error-Correcting Codes
IEEE Transactions on Computers
Proceedings of the 16th international symposium on High performance distributed computing
Tashkent+: memory-aware load balancing and update filtering in replicated databases
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Paxos made live: an engineering perspective
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
Strong accountability for network storage
ACM Transactions on Storage (TOS)
Zyzzyva: speculative byzantine fault tolerance
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
PeerReview: practical accountability for distributed systems
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Attested append-only memory: making adversaries stick to their word
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
HQ replication: a hybrid quorum protocol for byzantine fault tolerance
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Design of a cheat-resistant P2P online gaming system
Proceedings of the 2nd international conference on Digital interactive media in entertainment and arts
Exploiting type-awareness in a self-recovering disk
Proceedings of the 2007 ACM workshop on Storage security and survivability
Flexible intrusion tolerant voting architecture
Proceedings of the 2007 ACM workshop on Scalable trusted computing
Pronto: High availability for standard off-the-shelf databases
Journal of Parallel and Distributed Computing
A survey of linguistic structures for application-level fault tolerance
ACM Computing Surveys (CSUR)
DepSpace: a byzantine fault-tolerant coordination service
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Optimistic transactional active replication
Proceedings of the 2nd international conference on Ubiquitous information management and communication
Fingerpointing correlated failures in replicated systems
SYSML'07 Proceedings of the 2nd USENIX workshop on Tackling computer systems problems with machine learning techniques
Conflict-aware load-balancing techniques for database replication
Proceedings of the 2008 ACM symposium on Applied computing
Data and code integrity in Grid environments
SMO'06 Proceedings of the 6th WSEAS International Conference on Simulation, Modelling and Optimization
Got predictability?: experiences with fault-tolerant middleware
Proceedings of the 2007 ACM/IFIP/USENIX international conference on Middleware companion
Replica placement for high availability in distributed stream processing systems
Proceedings of the second international conference on Distributed event-based systems
Nysiad: practical protocol transformation to tolerate Byzantine failures
NSDI'08 Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation
Zyzzyva: speculative Byzantine fault tolerance
Communications of the ACM - Remembering Jim Gray
Virtual infrastructure for collision-prone wireless networks
Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing
Randomized consensus in expected O(n log n) individual work
Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing
Research note: On Byzantine generals with alternative plans
Journal of Parallel and Distributed Computing
Tight bounds for asynchronous randomized consensus
Journal of the ACM (JACM)
Preserving the consistency of distributed objects with real-time transactions
NOTERE '08 Proceedings of the 8th international conference on New technologies in distributed systems
Handling Emergent Nondeterminism in Replicated Services
Architecting Dependable Systems V
Programming with Live Distributed Objects
ECOOP '08 Proceedings of the 22nd European conference on Object-Oriented Programming
Optimizing Threshold Protocols in Adversarial Structures
DISC '08 Proceedings of the 22nd international symposium on Distributed Computing
Showing correctness of a replication algorithm in a component based system
IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
Fault-tolerant stream processing using a distributed, replicated file system
Proceedings of the VLDB Endowment
Experiences in engineering active replication into a traditional three-tiered client-server system
Proceedings of the 2008 RISE/EFTS Joint International Workshop on Software Engineering for Resilient Systems
Solving Atomic Multicast When Groups Crash
OPODIS '08 Proceedings of the 12th International Conference on Principles of Distributed Systems
Reliability versus performance for critical applications
Journal of Parallel and Distributed Computing
Living with nondeterminism in replicated middleware applications
Proceedings of the ACM/IFIP/USENIX 2006 International Conference on Middleware
A simple totally ordered broadcast protocol
LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
Paxos for System Builders: an overview
LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
Configuration-space performance anomaly depiction
LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
Reducing the costs of large-scale BFT replication
LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
Design and implementation of a Byzantine fault tolerance framework for Web services
Journal of Systems and Software
CrystalBall: predicting and preventing inconsistencies in deployed distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
Tolerating latency in replicated state machines through client speculation
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
A Generic Group Communication Approach for Hybrid Distributed Systems
DAIS '09 Proceedings of the 9th IFIP WG 6.1 International Conference on Distributed Applications and Interoperable Systems
Dynamic atomic storage without consensus
Proceedings of the 28th ACM symposium on Principles of distributed computing
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Symmetric active/active metadata service for high availability parallel file systems
Journal of Parallel and Distributed Computing
Ripley: automatically securing web 2.0 applications through replicated execution
Proceedings of the 16th ACM conference on Computer and communications security
Zyzzyva: Speculative Byzantine fault tolerance
ACM Transactions on Computer Systems (TOCS)
A Decidable Probability Logic for Timed Probabilistic Systems
Fundamenta Informaticae
The Design of Finite State Machine for Asynchronous Replication Protocol
ICIC '07 Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence
Weak Synchrony Models and Failure Detectors for Message Passing (k-)Set Agreement
OPODIS '09 Proceedings of the 13th International Conference on Principles of Distributed Systems
Proactive Fortification of Fault-Tolerant Services
OPODIS '09 Proceedings of the 13th International Conference on Principles of Distributed Systems
A novel approach for component-based fault-tolerant software development
Information and Software Technology
The reliability analysis of resiliency framework for Grid Services
ACST '08 Proceedings of the Fourth IASTED International Conference on Advances in Computer Science and Technology
Predicting and preventing inconsistencies in deployed distributed systems
ACM Transactions on Computer Systems (TOCS)
Semi-passive replication and Lazy Consensus
Journal of Parallel and Distributed Computing
ACM SIGACT News
Policy-based access control for weakly consistent replication
Proceedings of the 5th European conference on Computer systems
A pattern-based approach for modeling and analyzing error recovery
Architecting dependable systems IV
A scalable and secure cryptographic service
Proceedings of the 21st annual IFIP WG 11.3 working conference on Data and applications security
Byzantine consensus with few synchronous links
OPODIS'07 Proceedings of the 11th international conference on Principles of distributed systems
Fault tolerance in finite state machines using fusion
ICDCN'08 Proceedings of the 9th international conference on Distributed computing and networking
Lithium: virtual machine storage for the cloud
Proceedings of the 1st ACM symposium on Cloud computing
Towards a practical approach to confidential Byzantine fault tolerance
Future directions in distributed computing
A data-centric approach for scalable state machine replication
Future directions in distributed computing
Best-effort group service in dynamic networks
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
ACM Transactions on Computer Systems (TOCS)
Throughput optimal total order broadcast for cluster environments
ACM Transactions on Computer Systems (TOCS)
Enabling replication in the ASSISTANT programming model
Proceedings of the 6th International Wireless Communications and Mobile Computing Conference
Scalable byzantine computation
ACM SIGACT News
The byzantine empire in the intercloud
ACM SIGACT News
Prophecy: using history for high-throughput fault tolerance
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Mencius: building efficient replicated state machines for WANs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
ZooKeeper: wait-free coordination for internet-scale systems
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
The failure detector abstraction
ACM Computing Surveys (CSUR)
Implementing fault-tolerant services using state machines: beyond replication
DISC'10 Proceedings of the 24th international conference on Distributed computing
Programming distributed systems with group IO
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
The design of a practical system for fault-tolerant virtual machines
ACM SIGOPS Operating Systems Review
Scalable virtual machine storage using local disks
ACM SIGOPS Operating Systems Review
The case for determinism in database systems
Proceedings of the VLDB Endowment
Declarative configuration management for complex and dynamic networks
Proceedings of the 6th International COnference
Storyboard: optimistic deterministic multithreading
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Scalable transactions in the cloud: partitioning revisited
OTM'10 Proceedings of the 2010 international conference on On the move to meaningful internet systems: Part II
Synoptic: summarizing system logs with refinement
SLAML'10 Proceedings of the 2010 workshop on Managing systems via log analysis and machine learning techniques
Putting events in context: aspects for event-based distributed programming
Proceedings of the tenth international conference on Aspect-oriented software development
DieCast: Testing Distributed Systems with an Accurate Scale Model
ACM Transactions on Computer Systems (TOCS)
Paxos replicated state machines as the basis of a high-performance data store
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Plutus: scalable secure file sharing on untrusted storage
FAST'03 Proceedings of the 2nd USENIX conference on File and storage technologies
The role of accountability in dependable distributed systems
HotDep'05 Proceedings of the First conference on Hot topics in system dependability
Managing self-inflicted nondeterminism
HotDep'05 Proceedings of the First conference on Hot topics in system dependability
HotDep'06 Proceedings of the Second conference on Hot topics in system dependability
The case for byzantine fault detection
HotDep'06 Proceedings of the Second conference on Hot topics in system dependability
Beyond one-third faulty replicas in byzantine fault tolerant systems
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Distributed and fault-tolerant execution framework for transaction processing
Proceedings of the 4th Annual International Conference on Systems and Storage
A latency and fault-tolerance optimizer for online parallel query plans
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
A layered approach for identifying systematic faults of component-based software systems
Proceedings of the 16th international workshop on Component-oriented programming
Multi-writer regular registers in dynamic distributed systems with byzantine failures
Proceedings of the 3rd International Workshop on Theoretical Aspects of Dynamic Distributed Systems
Scalable consistency in Scatter
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Detecting and surviving data races using complementary schedules
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
An algorithm for implementing BFT registers in distributed systems with bounded churn
SSS'11 Proceedings of the 13th international conference on Stabilization, safety, and security of distributed systems
Evaluating the viability of process replication reliability for exascale systems
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Living with nondeterminism in replicated middleware applications
Middleware'06 Proceedings of the 7th ACM/IFIP/USENIX international conference on Middleware
Extending the UMIOP specification for reliable multicast in CORBA
OTM'05 Proceedings of the 2005 Confederated international conference on On the Move to Meaningful Internet Systems - Volume Part I
Integrating the ROMIOP and ETF specifications for atomic multicast in CORBA
OTM'05 Proceedings of the 2005 Confederated international conference on On the Move to Meaningful Internet Systems - Volume Part I
Group communication: from practice to theory
SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
Run-time switching between total order algorithms
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Faults in large distributed systems and what we can do about them
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Replication predicates for dependent-failure algorithms
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Behavioral distance for intrusion detection
RAID'05 Proceedings of the 8th international conference on Recent Advances in Intrusion Detection
Commensal cuckoo: secure group partitioning for large-scale services
ACM SIGOPS Operating Systems Review
From paxos to CORFU: a flash-speed shared log
ACM SIGOPS Operating Systems Review
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
A formal model for fault-tolerance in distributed systems
SAFECOMP'05 Proceedings of the 24th international conference on Computer Safety, Reliability, and Security
TrustedPals: secure multiparty computation implemented with smart cards
ESORICS'06 Proceedings of the 11th European conference on Research in Computer Security
A fault tolerant system using collaborative agents
TAINN'05 Proceedings of the 14th Turkish conference on Artificial Intelligence and Neural Networks
Parsimonious asynchronous byzantine-fault-tolerant atomic broadcast
OPODIS'05 Proceedings of the 9th international conference on Principles of Distributed Systems
Behavioral distance measurement using hidden markov models
RAID'06 Proceedings of the 9th international conference on Recent Advances in Intrusion Detection
Architecting and implementing versatile dependability
Architecting Dependable Systems III
Architecting Dependable Systems III
Dependable Systems
Improving server applications with system transactions
Proceedings of the 7th ACM european conference on Computer Systems
Replicating for performance: case studies
Replication
State machine replication with byzantine faults
Replication
Proceedings of the Seventh Annual Workshop on Cyber Security and Information Intelligence Research
Fused state machines for fault tolerance in distributed systems
OPODIS'11 Proceedings of the 15th international conference on Principles of Distributed Systems
Byzantine fault-tolerance with commutative commands
OPODIS'11 Proceedings of the 15th international conference on Principles of Distributed Systems
A protocol for the atomic capture of multiple molecules on large scale platforms
ICDCN'12 Proceedings of the 13th international conference on Distributed Computing and Networking
Byzantine agreement with homonyms in synchronous systems
ICDCN'12 Proceedings of the 13th international conference on Distributed Computing and Networking
Beyond traces and independence
Dependable and Historic Computing
RESTGroups for resilient web services
SOFSEM'12 Proceedings of the 38th international conference on Current Trends in Theory and Practice of Computer Science
CORFU: a shared log design for flash clusters
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Don't lose sleep over availability: the GreenUp decentralized wakeup service
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Gnothi: separating data and metadata for efficient and available storage replication
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Surviving congestion in geo-distributed storage systems
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Scalability of replicated metadata services in distributed file systems
DAIS'12 Proceedings of the 12th IFIP WG 6.1 international conference on Distributed Applications and Interoperable Systems
Homonyms with forgeable identifiers
SIROCCO'12 Proceedings of the 19th international conference on Structural Information and Communication Complexity
Pushouts in software architecture design
Proceedings of the 11th International Conference on Generative Programming and Component Engineering
Behaviour Abstraction for Communicating Sequential Processes
Fundamenta Informaticae
All about Eve: execute-verify replication for multi-core servers
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Making geo-replicated systems fast as possible, consistent when necessary
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
DMME: A Distributed LTE Mobility Management Entity
Bell Labs Technical Journal
Probabilistic opaque quorum systems
DISC'07 Proceedings of the 21st international conference on Distributed Computing
Formal verification of distributed algorithms: from pseudo code to checked proofs
TCS'12 Proceedings of the 7th IFIP TC 1/WG 202 international conference on Theoretical Computer Science
ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part II
Adaptive request batching for byzantine replication
ACM SIGOPS Operating Systems Review
Abstracting context in event-based software
Transactions on Aspect-Oriented Software Development IX
Enhancing group communication with self-manageable behavior
Journal of Parallel and Distributed Computing
A study of unpredictability in fault-tolerant middleware
Computer Networks: The International Journal of Computer and Telecommunications Networking
Churn Tolerance Algorithm for State Machine Replication
WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Photon: fault-tolerant and scalable joining of continuous data streams
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
MoSQL: an elastic storage engine for MySQL
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Efficient software-based fault tolerance approach on multicore platforms
Proceedings of the Conference on Design, Automation and Test in Europe
Avoiding disruptive failovers in transaction processing systems with multiple active nodes
Journal of Parallel and Distributed Computing
Rollback-recovery without checkpoints in distributed event processing systems
Proceedings of the 7th ACM international conference on Distributed event-based systems
Escape capsule: explicit state is robust and scalable
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Towards secure and dependable software-defined networks
Proceedings of the second ACM SIGCOMM workshop on Hot topics in software defined networking
Distributing trusted third parties
ACM SIGACT News
Cooperative security in distributed networks
Computer Communications
Towards practical communication in Byzantine-resistant DHTs
IEEE/ACM Transactions on Networking (TON)
Adaptive atomic capture of multiple molecules
Journal of Parallel and Distributed Computing
The TClouds platform: concept, architecture and instantiations
Proceedings of the 2nd International Workshop on Dependability Issues in Cloud Computing
Assessing data availability of Cassandra in the presence of non-accurate membership
Proceedings of the 2nd International Workshop on Dependability Issues in Cloud Computing
Byzantine agreement with homonyms in synchronous systems
Theoretical Computer Science
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
On the use of decentralization to enable privacy in web-scale recommendation services
Proceedings of the 12th ACM workshop on privacy in the electronic society
Tango: distributed data structures over a shared log
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Leveraging sharding in the design of scalable replication protocols
Proceedings of the 4th annual Symposium on Cloud Computing
COLO: COarse-grained LOck-stepping virtual machines for non-stop service
Proceedings of the 4th annual Symposium on Cloud Computing
Proceedings of the 4th annual Symposium on Cloud Computing
Optimizing Paxos with request exchangeability for highly available web services
Proceedings of the 5th Asia-Pacific Symposium on Internetware
On the efficiency of durable state machine replication
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
CORFU: A distributed shared log
ACM Transactions on Computer Systems (TOCS)
On the performance of a retransmission-based synchronizer
Theoretical Computer Science
A protocol for implementing byzantine storage in churn-prone distributed systems
Theoretical Computer Science
Scalable service-oriented replication with flexible consistency guarantee in the cloud
Information Sciences: an International Journal
A fault tolerant platform of web services based on service composition
Multiagent and Grid Systems
Scalable and leaderless Byzantine consensus in cloud computing environments
Information Systems Frontiers
The state machine approach is a general method for implementing fault-tolerant services in distributed systems. This paper reviews the approach and describes protocols for two different failure models: Byzantine and fail-stop. System reconfiguration techniques for removing faulty components and integrating repaired components are also discussed.
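As a rough illustration of the idea summarized in the abstract, the following minimal Python sketch replicates a deterministic state machine and votes on the replicas' outputs. It is not taken from the paper: the names `CounterMachine`, `FaultyMachine`, `run_replicas`, and `vote` are hypothetical, and the sketch simply assumes that every replica already receives the same requests in the same order, which in a real system is the job of an agreement and ordering protocol. It uses the commonly stated bound that 2t+1 replicas tolerate t Byzantine failures when the client accepts an output reported by at least t+1 replicas.

```python
from collections import Counter


class CounterMachine:
    """Deterministic state machine: the same requests in the same
    order always produce the same sequence of outputs."""

    def __init__(self):
        self.value = 0

    def apply(self, request):
        op, amount = request
        if op == "add":
            self.value += amount
        elif op == "set":
            self.value = amount
        return self.value


class FaultyMachine(CounterMachine):
    """Hypothetical Byzantine replica that always reports a wrong output."""

    def apply(self, request):
        super().apply(request)
        return -1


def run_replicas(replicas, requests):
    """Feed every replica the same ordered request list and collect
    each replica's output for the final request."""
    outputs = []
    for replica in replicas:
        result = None
        for request in requests:
            result = replica.apply(request)
        outputs.append(result)
    return outputs


def vote(outputs, t):
    """Return the output reported by at least t+1 replicas, or None
    if no output reaches that threshold."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= t + 1 else None


# Assumption for illustration: t = 1 Byzantine failure, so 2t+1 = 3 replicas.
t = 1
replicas = [CounterMachine(), CounterMachine(), FaultyMachine()]
requests = [("set", 10), ("add", 5), ("add", -3)]

outputs = run_replicas(replicas, requests)
result = vote(outputs, t)
print(outputs, result)
```

Under a fail-stop assumption, where a failed replica halts instead of returning wrong answers, voting is unnecessary: t+1 replicas suffice and the output of any surviving replica can be used.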