Dependable distributed systems are difficult to build. This is particularly true if they have dependability requirements that change during the execution of an application, and are built with commercial off-the-shelf hardware. In that case, fault tolerance must be achieved using middleware software, and mechanisms must be provided to communicate the dependability requirements of a distributed application to the system and to adapt the system's configuration to try to achieve the desired dependability. The AQuA architecture allows distributed applications to request a desired level of availability using the Quality Objects (QuO) framework and includes a dependability manager that attempts to meet requested availability levels by configuring the system in response to outside requests and changes in system resources due to faults. The AQuA architecture uses the QuO runtime to process and invoke availability requests, the Proteus dependability manager to configure the system in response to faults and availability requests, and the Ensemble protocol stack to provide group communication services. Furthermore, a CORBA interface is provided to application objects using the AQuA gateway. The gateway provides a mechanism to translate between process-level communication, as supported by Ensemble, and IIOP messages, understood by Object Request Brokers. Both active and passive replication are supported, and the replication type to use is chosen based on the performance and dependability requirements of particular distributed applications.
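The configuration decision described above (meeting a requested availability level with the resources currently available, and choosing between active and passive replication) can be illustrated with a small sketch. This is not the Proteus algorithm or the QuO API. The names `AvailabilityRequest`, `Configuration`, `replicas_needed`, and `configure` are hypothetical, and the model assumes that replicas fail independently with a known per-replica availability and that passive replication is preferred whenever its failover time fits the application's tolerance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AvailabilityRequest:
    # Hypothetical request shape, not the QuO or AQuA interface.
    min_availability: float      # requested availability, e.g. 0.995
    max_recovery_seconds: float  # longest interruption the application tolerates


@dataclass(frozen=True)
class Configuration:
    style: str     # "active" or "passive"
    replicas: int  # number of replicas to run


def replicas_needed(p: float, target: float, limit: int) -> Optional[int]:
    """Smallest n <= limit with 1 - (1 - p)**n >= target.

    Assumes each replica is available with probability p, independently.
    Returns None if the target cannot be met within the limit.
    """
    for n in range(1, limit + 1):
        if 1.0 - (1.0 - p) ** n >= target:
            return n
    return None


def configure(request: AvailabilityRequest,
              per_replica_availability: float,
              hosts_available: int,
              passive_failover_seconds: float) -> Optional[Configuration]:
    """Pick a replication style and replica count, or None if infeasible."""
    n = replicas_needed(per_replica_availability,
                        request.min_availability,
                        hosts_available)
    if n is None:
        return None  # the request cannot be met with current resources
    # Assumed policy: use passive replication when its failover delay is
    # tolerable, otherwise fall back to active replication.
    if passive_failover_seconds <= request.max_recovery_seconds:
        return Configuration("passive", n)
    return Configuration("active", n)
```

In this sketch, a manager would call `configure` again whenever a host fails or a new request arrives, which mirrors the reconfiguration in response to faults and availability requests described in the abstract. A result of `None` corresponds to a request that cannot currently be satisfied.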