Pinpoint: Problem Determination in Large, Dynamic Internet Services
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Performance debugging for distributed systems of black boxes
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support
LISA '03 Proceedings of the 17th USENIX conference on System administration
Combining statistical monitoring and predictable recovery for self-management
WOSS '04 Proceedings of the 1st ACM SIGSOFT workshop on Self-managed systems
Ensembles of Models for Automated Diagnosis of System Performance Problems
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
ICAC '05 Proceedings of the Second International Conference on Automatic Computing
Statistical debugging: simultaneous identification of multiple bugs
ICML '06 Proceedings of the 23rd international conference on Machine learning
Dynamic instrumentation of production systems
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Using runtime paths for macroanalysis
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
HOTOS'05 Proceedings of the 10th conference on Hot Topics in Operating Systems - Volume 10
Path-based faliure and evolution management
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Configuration debugging as search: finding the needle in the haystack
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Automatic misconfiguration troubleshooting with peerpressure
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Using magpie for request extraction and workload modelling
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Detecting application-level failures in component-based Internet services
IEEE Transactions on Neural Networks
Hang analysis: fighting responsiveness bugs
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
DIADS: addressing the "my-problem-or-yours" syndrome with integrated SAN and database diagnosis
FAST '09 Proccedings of the 7th conference on File and storage technologies
Towards versatile performance models for complex, popular applications
ACM SIGMETRICS Performance Evaluation Review
Practical performance models for complex, popular applications
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
CLUEBOX: a performance log analyzer for automated troubleshooting
WASL'08 Proceedings of the First USENIX conference on Analysis of system logs
Hi-index | 0.00 |
Users are often frustrated when they encounter a sudden decrease in the responsiveness of their personal computers. However, it is often difficult to pinpoint a particular offending process and the resource it is overconsuming, even when such a simple explanation does exist. We present preliminary results from several weeks of PC usage showing that user-perceived unresponsiveness often has such a simple explanation and that simple statistical models often suffice to pinpoint the problem. The statistical models we build use all the performance counters for all running processes. When the user expresses frustration at a given time point, we can use these models to determine which processes are acting most anomalously, and in turn which features of those processes are most anomalous. We present an investigative tool that ranks processes and features according to their degree of anomaly, and allows the user to interactively examine the relevant time series.