Workflow management systems are widely used in wide-area network environments, especially in e-Science applications, to coordinate the operation of different functional components and to provide more powerful functions. Because wide-area networks are error-prone, fault tolerance has become an increasingly urgent requirement for workflow management. In this paper, we propose Cesar-FD, a stateful fault detection mechanism that builds up state about the runtime and external environments of a workflow management system by aggregating multiple messages, and that delivers more accurate notifications asynchronously. We demonstrate the mechanism in the Drug Discovery Grid environment through two use cases, and we show that it detects faulty situations more accurately.
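
To make the idea of stateful detection concrete, the following is a minimal sketch and not the Cesar-FD implementation. It assumes a simple rule that the abstract does not specify: a fault is reported only after several error messages for the same component arrive within a time window, and a success message resets the state. The class name `StatefulDetector`, the parameters `threshold` and `window`, and the component names are all hypothetical. Notifications are queued in a list to stand in for asynchronous delivery.

```python
from collections import defaultdict, deque


class StatefulDetector:
    """Hypothetical detector: aggregates messages per component and
    reports a fault only when `threshold` errors fall within `window` seconds."""

    def __init__(self, threshold=3, window=10.0):
        self.threshold = threshold
        self.window = window
        # Per-component state: timestamps of recent error messages.
        self.errors = defaultdict(deque)
        # Notifications queued for later (asynchronous) delivery.
        self.notifications = []

    def observe(self, component, timestamp, ok):
        recent = self.errors[component]
        if ok:
            # A success message resets the accumulated state (assumed rule).
            recent.clear()
            return
        recent.append(timestamp)
        # Discard errors that have fallen out of the time window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.threshold:
            self.notifications.append((component, timestamp))
            recent.clear()


detector = StatefulDetector(threshold=3, window=10.0)
messages = [
    ("node-a", 0.0, False),  # isolated error, later cleared by a success
    ("node-a", 1.0, True),
    ("node-a", 2.0, False),
    ("node-a", 3.0, False),
    ("node-a", 4.0, False),  # third error in the window: fault reported
    ("node-b", 4.5, False),  # single error: no notification
]
for component, timestamp, ok in messages:
    detector.observe(component, timestamp, ok)

print(detector.notifications)
```

In this sketch a stateless detector would have raised an alert on every error message, whereas the aggregated state suppresses the isolated errors on `node-a` and `node-b` and reports only the sustained failure.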