As Grids grow in size and complexity, manual diagnosis of individual application faults becomes impractical and time-consuming. Quick and accurate identification of the root cause of failures is an important prerequisite for building reliable systems. We describe a pragmatic model-based technique for application-specific fault diagnosis based on indicators, symptoms, and rules. Customized wrapper services then apply this knowledge to reason about the root causes of failures. In addition to user-provided diagnosis models, we show that, given a set of past classified fault events, new models can be extracted through learning and used to diagnose new faults. We investigated and compared supervised classification learning algorithms and cluster analysis. Our approach was implemented as part of the Otho Toolkit, which 'service-enables' legacy applications by synthesizing wrapper services.
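To make the indicator, symptom, and rule layering concrete, here is a minimal sketch in Python. It is not the Otho Toolkit's implementation or model format. The indicator keys (`exit_code`, `stderr`, `output_file_exists`), the symptom names, the two rules, and the `diagnose` function are all hypothetical and chosen only for illustration. The assumption is that indicators are raw observations gathered by a wrapper, symptoms are boolean predicates over those indicators, and rules map sets of symptoms to a root cause.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

# Indicators: raw observations a wrapper might collect (hypothetical keys).
Indicators = Dict[str, object]

# Symptoms: named boolean predicates evaluated over the indicators.
SYMPTOMS: Dict[str, Callable[[Indicators], bool]] = {
    "nonzero_exit": lambda ind: ind.get("exit_code", 0) != 0,
    "missing_output": lambda ind: not ind.get("output_file_exists", True),
    "disk_full_message": lambda ind: "No space left" in str(ind.get("stderr", "")),
}


@dataclass(frozen=True)
class Rule:
    """A rule fires when all of its required symptoms are present."""
    required: FrozenSet[str]
    root_cause: str


# Rules are checked in order; more specific rules are listed first.
# These example rules are illustrative assumptions, not taken from the paper.
RULES: List[Rule] = [
    Rule(frozenset({"nonzero_exit", "disk_full_message"}),
         "disk quota exceeded"),
    Rule(frozenset({"nonzero_exit", "missing_output"}),
         "application crashed before writing output"),
]


def diagnose(indicators: Indicators) -> str:
    """Return the root cause of the first rule whose symptoms all hold."""
    active = {name for name, test in SYMPTOMS.items() if test(indicators)}
    for rule in RULES:
        if rule.required <= active:
            return rule.root_cause
    return "unknown"


if __name__ == "__main__":
    observed = {
        "exit_code": 1,
        "stderr": "write failed: No space left on device",
        "output_file_exists": False,
    }
    print(diagnose(observed))
```

In this sketch a user-provided diagnosis model corresponds to the hand-written `SYMPTOMS` and `RULES` tables. A learned model, as discussed in the abstract, would instead derive the mapping from symptoms to root causes from past classified fault events.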