The goal of online failure prediction is to forecast imminent failures while the system is running. This paper compares Similar Events Prediction (SEP) with two other well-known techniques for online failure prediction: a straightforward method based on a reliability model, and the Dispersion Frame Technique (DFT). SEP recognizes failure-prone patterns using a semi-Markov chain in combination with clustering. We applied all three approaches to real data from a commercial telecommunication system. Results are presented in terms of precision, recall, F-measure, and accumulated runtime cost. The results suggest that SEP achieves significantly better forecasting performance than the two other techniques.
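The first three evaluation metrics named above can be illustrated with a minimal sketch. It assumes each prediction window is reduced to a boolean pair (failure warning raised, failure actually occurred); this encoding, the function name `evaluate`, and the `Metrics` type are hypothetical and not taken from the paper, whose exact evaluation protocol is not described here. Accumulated runtime cost is left out because its definition is not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    precision: float
    recall: float
    f_measure: float


def evaluate(predicted: list[bool], actual: list[bool]) -> Metrics:
    """Compute precision, recall and F-measure for failure warnings.

    predicted[i] is True if a failure warning was raised in window i;
    actual[i] is True if a failure really occurred in window i.
    (Illustrative encoding; an assumption, not the paper's protocol.)
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # correct warnings
    fp = sum(1 for p, a in pairs if p and not a)    # false alarms
    fn = sum(1 for p, a in pairs if not p and a)    # missed failures

    # Guard against division by zero when no warnings or no failures exist.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(precision, recall, f_measure)


if __name__ == "__main__":
    predicted = [True, True, False, False, True]
    actual = [True, False, True, False, True]
    print(evaluate(predicted, actual))
```

Precision penalizes false alarms and recall penalizes missed failures; the F-measure combines both into a single number, which is convenient when comparing predictors that trade one off against the other.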