Monitoring and fault tolerance for real-time online interactive applications

  • Authors:
  • Vlad Nae; Radu Prodan; Thomas Fahringer

  • Affiliations:
  • Institute of Computer Science, University of Innsbruck, Innsbruck, Austria (all three authors)

  • Venue:
  • Euro-Par'09 Proceedings of the 2009 international conference on Parallel processing
  • Year:
  • 2009

Abstract

The edutain@grid European project [1] is developing a support platform for the deployment, management, and execution of Real-Time Online Interactive Applications (ROIA) on the Grid. In this paper, we present a monitoring system that collects data from all the resources in a distributed environment and from the ROIA managed by our platform. We also describe a fault tolerance service that addresses not only the faults commonly encountered in distributed systems, but also faults that manifest at the service level, within the platform's management services. Finally, we present a use case in which the platform runs a massively multiplayer online game as a concrete ROIA, in order to demonstrate the roles of the monitoring and fault tolerance services.