Error logs are a fruitful source of information both for diagnosis and for proactive fault handling; however, elaborate data preparation is necessary to filter out the valuable pieces of information. In addition to using well-known techniques, we propose three algorithms: (a) assignment of error IDs to error messages based on Levenshtein's edit distance, (b) a clustering approach to group similar error sequences, and (c) a statistical noise filtering algorithm. In experiments using data from a commercial telecommunication system, we show that data preparation is an important step toward accurate error-based online failure prediction.
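To make step (a) more concrete, the following is a minimal sketch of how error IDs might be assigned using Levenshtein's edit distance. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `levenshtein` and `assign_error_ids`, the greedy first-match strategy, the `threshold` parameter, and the sample log messages are all hypothetical.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein edit distance between strings a and b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def assign_error_ids(messages: List[str], threshold: int = 5) -> List[int]:
    """Assign the same ID to messages within `threshold` edits of a prototype.

    Assumption: the first message of each group serves as its prototype,
    and each message joins the first prototype that is close enough.
    """
    prototypes: List[str] = []
    ids: List[int] = []
    for msg in messages:
        for error_id, proto in enumerate(prototypes):
            if levenshtein(msg, proto) <= threshold:
                ids.append(error_id)
                break
        else:
            # No sufficiently similar prototype: open a new error ID.
            prototypes.append(msg)
            ids.append(len(prototypes) - 1)
    return ids


# Hypothetical log messages, made up for illustration only.
sample = [
    "disk 12 read timeout",
    "disk 47 read timeout",
    "link down on port 3",
]
ids = assign_error_ids(sample, threshold=3)
```

In this sketch the first two messages differ only in their numeric field, so they fall within the threshold and share an ID, while the third message opens a new one. The threshold would have to be tuned to the log format at hand; the value here is arbitrary.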