Supporting fault-tolerance for time-critical events in distributed environments
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Supporting fault-tolerance for time-critical events in distributed environments
Scientific Programming
Resource provisioning with budget constraints for adaptive applications in cloud environments
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
There are many applications where a timely response to an important event is needed. Often, such a response requires significant computation and possibly communication, and it can be very challenging to complete it within the time frame in which the response is needed. At the same time, there may be application-specific flexibility in the computation that is performed. This paper presents the design, implementation, and evaluation of a middleware that can support such applications. Each service in our target applications can have one or more service parameters, which the middleware can modify within pre-specified ranges. The middleware enables time-critical event handling to achieve the maximum benefit, as defined by a user-specified benefit function, while satisfying the time constraint. Our middleware is based on existing Grid infrastructure and Service-Oriented Architecture (SOA) concepts. We have evaluated our middleware and its support for adaptation using a volume rendering application and a Great Lake forecasting application. The evaluation shows that our adaptation is effective and has very low overhead.
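The adaptation problem described in the abstract can be illustrated with a small sketch: pick values for the service parameters, within their allowed ranges, that maximize a benefit function while keeping the estimated completion time within the deadline. The abstract does not describe the middleware's actual algorithm, so the code below simply uses an exhaustive search over discretized ranges. The function name `choose_parameters`, the `estimated_time` model, and the discretized ranges are all assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, Optional, Sequence, Tuple

Params = Dict[str, float]


def choose_parameters(
    ranges: Dict[str, Sequence[float]],
    benefit: Callable[[Params], float],
    estimated_time: Callable[[Params], float],
    deadline: float,
) -> Optional[Tuple[Params, float]]:
    """Return the (parameters, benefit) pair with the highest benefit among
    all combinations whose estimated time does not exceed the deadline.

    Hypothetical helper: a brute-force search, not the paper's method.
    `ranges` maps each service parameter to its candidate values.
    Returns None when no combination meets the deadline.
    """
    names = list(ranges)
    best: Optional[Tuple[Params, float]] = None
    for values in product(*(ranges[name] for name in names)):
        params = dict(zip(names, values))
        # Skip combinations that would miss the time constraint.
        if estimated_time(params) > deadline:
            continue
        score = benefit(params)
        if best is None or score > best[1]:
            best = (params, score)
    return best
```

Exhaustive search is only practical for a handful of parameters with coarse ranges. A real system would also need a way to predict execution time and to re-adjust parameters while the event is being processed, neither of which this sketch attempts.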