Central to the overall performance delivered by a Grid environment is the quality of the middleware on which distributed Grid applications run. Because of its complexity, this middleware can be difficult to investigate in full detail and problematic to tune efficiently, especially in a production-type environment. Using the BSC Monitoring Framework (BSC-MF), a set of tools that can instrument and analyze Java applications as well as the entire system, we undertook both a global and a fine-grained investigation of one of the most popular Grid middleware systems currently available, Globus Toolkit 4 (GT4). The investigation produced some interesting findings and led to the detection of job management problems in this middleware. The main issue was that jobs could be lost on the node when the system was overloaded by the number of jobs being processed. The BSC-MF was then used to investigate this issue further and helped derive a possible solution that prevents the node from becoming a point of contention in the architecture. A simple but effective policy, which prioritizes the finishing and acceptance of jobs over response time and throughput, was formulated and evaluated as a solution to the problem. Because of the dynamic nature of the problem, we determined that it could best be resolved by adding self-managing capabilities to the middleware. Using the new policy, a prototype of an autonomous system was built, and it succeeded in allowing more jobs to be accepted and finished correctly. The improvement over the original GT4 middleware was significant: performance improved by 30%. The path from investigation to development described in this paper might serve as a guide to others in the field who are interested in extracting knowledge about a Grid node, extending the Grid middleware, or adding self-managing behaviour to their applications.
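The policy is described here only in outline, so the following is a minimal sketch of the general idea rather than the prototype itself. It assumes a hypothetical `AdmissionController` placed in front of the job manager: every submitted job is accepted and queued rather than dropped, only a bounded number of jobs are processed at once, and that bound is adjusted from an assumed node-load metric between 0 and 1. The class name, the thresholds, and the load metric are all illustrative assumptions and are not part of GT4 or the BSC-MF.

```python
from collections import deque


class AdmissionController:
    """Hypothetical self-managing gate in front of a job manager.

    Favors accepting and finishing jobs over response time: jobs may
    wait longer, but they are never dropped under overload.
    """

    def __init__(self, max_active=4, load_high=0.9, load_low=0.6):
        self.max_active = max_active  # assumed cap on concurrently processed jobs
        self.load_high = load_high    # assumed threshold: node is overloaded
        self.load_low = load_low      # assumed threshold: node has spare capacity
        self.active = set()           # jobs currently being processed
        self.waiting = deque()        # accepted jobs held back, in FIFO order

    def submit(self, job_id):
        # Always accept the job; it waits in the queue if the node is busy.
        self.waiting.append(job_id)
        self._dispatch()

    def finish(self, job_id):
        # A finished job frees a slot for the next waiting job.
        self.active.discard(job_id)
        self._dispatch()

    def adapt(self, load):
        # Self-management step: shrink the cap under high load and
        # grow it again when the node has spare capacity.
        if load > self.load_high and self.max_active > 1:
            self.max_active -= 1
        elif load < self.load_low:
            self.max_active += 1
        self._dispatch()

    def _dispatch(self):
        # Start waiting jobs while there is room under the current cap.
        while self.waiting and len(self.active) < self.max_active:
            self.active.add(self.waiting.popleft())


controller = AdmissionController(max_active=2)
for job in range(5):
    controller.submit(job)
# Jobs 0 and 1 are running; jobs 2, 3 and 4 are queued, not lost.
```

In this sketch, lowering the cap does not interrupt jobs that are already running; it only delays the start of queued jobs, which trades response time for a higher chance that every accepted job finishes.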