Machine Learning Methods for Predicting Failures in Hard Drives: A Multiple-Instance Application

Authors:
Joseph F. Murray;Gordon F. Hughes;Kenneth Kreutz-Delgado
Venue:
The Journal of Machine Learning Research
Year:
2005


Abstract

We compare machine learning methods applied to a difficult real-world problem: predicting computer hard-drive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and nonparametrically distributed data. We develop a new algorithm based on the multiple-instance learning framework and the naive Bayesian classifier (mi-NB), which is specifically designed for the low false-alarm case and is shown to have promising performance. Other methods compared are support vector machines (SVMs), unsupervised clustering, and non-parametric statistical tests (rank-sum and reverse arrangements). The failure-prediction performance of the SVM, rank-sum and mi-NB algorithms is considerably better than the threshold method currently implemented in drives, while maintaining low false alarm rates. Our results suggest that nonparametric statistical tests should be considered for learning problems involving detecting rare events in time series data. An appendix details the calculation of rank-sum significance probabilities in the case of discrete, tied observations, and we give new recommendations about when the exact calculation should be used instead of the commonly used normal approximation. These normal approximations may be particularly inaccurate for rare event problems like hard drive failures.
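
The gap between the exact rank-sum calculation and the normal approximation on discrete, tied data can be illustrated with a small sketch. This is not the procedure from the paper's appendix: it is a brute-force enumeration that is only practical for small samples, and the function names (`midranks`, `rank_sum_pvalues`) and the sample values are hypothetical. It assumes a one-sided test, average ranks for ties, and no continuity correction.

```python
from itertools import combinations
from math import erfc, sqrt


def midranks(values):
    """Assign ranks 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_pvalues(test, ref):
    """One-sided p-values that `test` ranks above `ref`: (exact, normal)."""
    n, m = len(test), len(ref)
    total_n = n + m
    ranks = midranks(list(test) + list(ref))
    observed = sum(ranks[:n])

    # Exact: enumerate every way the n test-set ranks could have been drawn
    # from the pooled ranks, and count rank sums at least as large as observed.
    count = total = 0
    for combo in combinations(ranks, n):
        total += 1
        if sum(combo) >= observed - 1e-9:
            count += 1
    exact = count / total

    # Normal approximation; the variance uses the actual midranks, so it
    # already accounts for ties.
    mean = n * (total_n + 1) / 2
    centre = (total_n + 1) / 2
    var = n * m / (total_n * (total_n - 1)) * sum((r - centre) ** 2 for r in ranks)
    if var == 0:  # every observation tied: the rank sum cannot vary
        return exact, 1.0
    z = (observed - mean) / sqrt(var)
    normal = 0.5 * erfc(z / sqrt(2))  # upper-tail probability of N(0, 1)
    return exact, normal


# Hypothetical heavily tied data: a count attribute that is almost always zero.
exact_p, normal_p = rank_sum_pvalues([0, 0, 5], [0, 0, 0, 0, 0])
print(f"exact = {exact_p:.3f}, normal approximation = {normal_p:.3f}")
```

On this toy input the exact probability is 0.375 while the normal approximation gives roughly 0.098, so the approximation would overstate the significance considerably. The enumeration grows combinatorially with sample size, which is why it serves only as an illustration here.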