Journal of Systems Architecture: the EUROMICRO Journal
Orphan requests are a significant problem for multi-tier distributed systems: they compromise system correctness by violating the exactly-once semantics of applications, and they may waste resources. Orphan requests stem from the failure(s) of non-deterministic components involved in nested invocations of replicated components. Resolving this problem in resource-constrained, component-based, distributed real-time and embedded (DRE) systems that form end-to-end task chains is challenging because conventional transaction-based solutions cannot assure the real-time properties of DRE applications. To address these challenges, this paper presents a group-failover protocol that comprises three key capabilities: real-time failure detection and client failover, timely mitigation of orphan requests, and two novel application state consistency strategies that ensure the correctness of DRE systems by maintaining exactly-once semantics even during failures. Our solution is implemented in the context of the CIAO real-time CORBA Component Model middleware. Empirical evaluations of the group-failover protocol in both fault-free and failure recovery scenarios for DRE task chains of different sizes demonstrate low overhead and predictable performance.