Research problem: Failures can have a substantial impact on software systems, since recovery can require unexpected amounts of time and resources. Accurate failure predictions help mitigate this impact: resources, applications, and services can be scheduled to limit the effects of a failure. However, providing accurate predictions sufficiently far in advance is challenging. Log files contain messages that represent changes of system state, and a sequence or pattern of messages may be used to predict failures.

Contribution: We describe an approach to predict failures based on log files using Random Indexing (RI) and Support Vector Machines (SVMs).

Method: RI is applied to represent sequences: each operation is characterized in terms of its context. SVMs assign each sequence to one of two classes, failure or non-failure. Weighted SVMs are applied to deal with imbalanced datasets and to improve the true positive rate. We apply our approach to log files collected during approximately three months of work in a large European manufacturing company.

Results: Weighted SVMs sacrifice some specificity to improve sensitivity. Specificity remains higher than 0.80 in four out of six analyzed applications.

Conclusions: Overall, our approach reliably predicts both failures and non-failures.
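
The RI step of the method can be illustrated with a minimal sketch. It is not the authors' implementation: the vector dimensionality, the number of non-zero entries, the window size, the summation used to build a sequence vector, and the operation names in the toy logs are all assumptions chosen for illustration. Each operation receives a sparse random index vector, and its context vector is the sum of the index vectors of the operations that occur near it.

```python
import random
from collections import defaultdict

DIM = 64      # vector dimensionality (assumed value)
NONZERO = 4   # non-zero entries per index vector (assumed value)
WINDOW = 2    # neighbours considered on each side (assumed value)


def make_index_vector(rng):
    """Sparse ternary vector: NONZERO random positions, half +1 and half -1."""
    vec = [0] * DIM
    for i, pos in enumerate(rng.sample(range(DIM), NONZERO)):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def build_context_vectors(sequences, seed=0):
    """Characterize each operation by the operations that surround it."""
    rng = random.Random(seed)
    index = {}
    for seq in sequences:
        for op in seq:
            if op not in index:
                index[op] = make_index_vector(rng)

    context = defaultdict(lambda: [0] * DIM)
    for seq in sequences:
        for i, op in enumerate(seq):
            lo, hi = max(0, i - WINDOW), min(len(seq), i + WINDOW + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                ctx = context[op]
                # Add the neighbour's index vector to this operation's context.
                for k, value in enumerate(index[seq[j]]):
                    ctx[k] += value
    return dict(context)


def sequence_vector(seq, context):
    """Represent a sequence as the sum of its operations' context vectors."""
    total = [0] * DIM
    for op in seq:
        for k, value in enumerate(context.get(op, [0] * DIM)):
            total[k] += value
    return total


# Hypothetical toy logs; the operation names are invented for illustration.
logs = [
    ["login", "open", "save", "close"],
    ["login", "open", "error", "crash"],
]
ctx = build_context_vectors(logs)
features = [sequence_vector(seq, ctx) for seq in logs]
```

The resulting `features` are fixed-length vectors, one per sequence, which is the form an SVM expects as input. For the weighted variant, one option would be a classifier that accepts per-class penalties, such as scikit-learn's `SVC` with its `class_weight` parameter, so that errors on the rare failure class cost more than errors on the non-failure class.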