Distributed real-time specification for Java: a status report (digest)
JTRES '06 Proceedings of the 4th international workshop on Java technologies for real-time and embedded systems
On Scheduling Exception Handlers in Dynamic, Embedded Real-Time Systems
ICESS '07 Proceedings of the 3rd international conference on Embedded Software and Systems
On distributed real-time scheduling in networked embedded systems in the presence of crash failures
SEUS'07 Proceedings of the 5th IFIP WG 10.2 international conference on Software technologies for embedded and ubiquitous systems
Consensus-driven distributable thread scheduling in networked embedded systems
EUC'07 Proceedings of the 2007 international conference on Embedded and ubiquitous computing
EUC'07 Proceedings of the 2007 conference on Emerging direction in embedded and ubiquitous computing
Recovering from distributable thread failures in distributed real-time Java
ACM Transactions on Embedded Computing Systems (TECS)
Resource management policies for real-time Java remote invocations
Journal of Parallel and Distributed Computing
We consider the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, it creates orphans, i.e., thread segments that are disconnected from the thread's root. We consider a termination model for recovering from such failures, in which the orphans must be detected and aborted, and a failure-exception notification must be delivered to the farthest contiguous surviving thread segment so that thread execution can resume. We present a real-time scheduling algorithm called AUA and a distributable thread integrity protocol called TP-TR. We show that AUA and TP-TR bound the orphan cleanup and recovery time, thereby bounding thread starvation durations, and maximize the total timeliness utility accrued by threads. We implement AUA and TP-TR in a real-time middleware that supports distributable threads. Our experimental studies with the implementation validate the time-bounded recovery property of the algorithm and protocol and confirm their effectiveness.
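The termination model described above can be illustrated with a minimal Java sketch. It is not the TP-TR protocol or the AUA scheduler, and it ignores failure detection and timing bounds entirely. It assumes a distributable thread can be represented as an ordered list of the node names hosting its segments, with the root first; the class name `OrphanSketch` and both method names are hypothetical. Given a set of failed nodes, it identifies the farthest contiguous surviving segment (where the failure exception would be delivered) and the surviving downstream segments that have become orphans.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the paper's TP-TR protocol.
class OrphanSketch {

    /**
     * Returns the index of the farthest segment that is still connected to
     * the root through surviving segments, or -1 if the root's node failed.
     */
    static int recoveryIndex(List<String> segmentNodes, Set<String> failedNodes) {
        int last = -1;
        for (int i = 0; i < segmentNodes.size(); i++) {
            if (failedNodes.contains(segmentNodes.get(i))) {
                break; // the chain is cut at the first failed node
            }
            last = i;
        }
        return last;
    }

    /**
     * Returns the indices of segments whose nodes survived but which lie
     * beyond the first failure: these are the orphans to be aborted.
     */
    static List<Integer> orphans(List<String> segmentNodes, Set<String> failedNodes) {
        List<Integer> result = new ArrayList<>();
        int start = recoveryIndex(segmentNodes, failedNodes) + 1;
        for (int i = start; i < segmentNodes.size(); i++) {
            if (!failedNodes.contains(segmentNodes.get(i))) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Segments hosted on nodes A -> B -> C -> D; node C crashes.
        List<String> chain = List.of("A", "B", "C", "D");
        Set<String> failed = Set.of("C");
        System.out.println("deliver exception to segment " + recoveryIndex(chain, failed));
        System.out.println("abort orphans " + orphans(chain, failed));
    }
}
```

In this example the segment on node B (index 1) would receive the failure exception, and the segment on node D (index 3) is an orphan. In a real system these decisions would be made by the nodes themselves from local observations, which is the part a thread integrity protocol has to solve within bounded time.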