Recent research [2, 12, 27, 10, 1] has shown encouraging results for performance debugging and for failure diagnosis and detection in systems, using approaches that automatically induce models and derive correlations from observed data. We believe that realizing the full potential of this line of research will require surmounting some fundamental challenges that arise not from the modeling techniques themselves, but from applying those techniques to real-world systems. We formulate three such challenges. First, as new data is collected from a system, previously induced models must be continuously assessed and validated, with the ultimate aim of achieving online adaptation to system changes. Second, human operators must be able to interact effectively with the models: interpreting model findings to generate explanations, providing feedback that improves the models, and identifying false positives and missed detections. Third, it should be possible to formally manipulate "signatures" of system state as represented by these models, so that we can query the system's past to identify recurring problems and manually annotate them with additional information. We contend that the specifics of this problem domain not only raise these challenges, but also provide the knowledge base from which to derive well-engineered solutions to them. We suggest some possible strategies for addressing each challenge and show how they arise in the context of a real example.