We use data mining and machine learning techniques to predict upcoming periods of high utilization or poor performance in enterprise systems. The abundance of available data and the complexity of these systems defy human characterization and static models, which makes the task well suited to data mining techniques. We formulate the problem as one of classification: given current and past information about the system's behavior, can we forecast whether the system will meet its performance targets over the next hour? Using real data gathered from several enterprise systems at Hewlett-Packard, we compare several approaches ranging from time series models to Bayesian networks. Besides establishing the predictive power of these approaches, our study analyzes three dimensions that are important for their application as a standalone tool. First, it quantifies the gain in accuracy of multivariate prediction methods over simple statistical univariate methods. Second, it quantifies the variations in accuracy when using different classes of system and workload features. Third, it establishes that models induced using combined data from various systems generalize well and are applicable to new systems, enabling accurate predictions on systems with insufficient historical data. Together, these analyses offer a promising outlook on the development of tools that automate the assignment of resources to stabilize performance (e.g., adding servers to a cluster) and allow opportunistic job scheduling (e.g., backups or virus scans).
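The classification formulation can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the lag window, the horizon expressed in samples, and the single utilization threshold are all assumptions made for illustration. Each example pairs the most recent observations with a binary label that says whether the metric exceeds its target at any point in the forecast window. A naive univariate baseline is included only as a point of comparison.

```python
from typing import List, Tuple


def make_dataset(
    series: List[float], n_lags: int, horizon: int, threshold: float
) -> Tuple[List[List[float]], List[int]]:
    """Turn one metric's time series into (features, labels).

    Hypothetical helper: features are the last n_lags samples up to time t;
    the label is 1 if any of the next `horizon` samples exceeds `threshold`
    (a performance target is missed), else 0.
    """
    X: List[List[float]] = []
    y: List[int] = []
    for t in range(n_lags - 1, len(series) - horizon):
        features = series[t - n_lags + 1 : t + 1]  # current and past values
        future = series[t + 1 : t + 1 + horizon]   # the forecast window
        X.append(features)
        y.append(int(max(future) > threshold))
    return X, y


def persistence_baseline(features: List[float], threshold: float) -> int:
    """Naive univariate baseline (assumed): predict a violation in the next
    window if and only if the current sample already exceeds the threshold."""
    return int(features[-1] > threshold)


# Toy utilization samples, invented for illustration only.
utilization = [0.2, 0.3, 0.9, 0.4, 0.5, 0.95, 0.1]
X, y = make_dataset(utilization, n_lags=2, horizon=2, threshold=0.8)
predictions = [persistence_baseline(row, 0.8) for row in X]
```

Any classifier can then be trained on `X` and `y`. A multivariate variant would concatenate the lagged values of several system and workload metrics into each feature row.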