The visual display of quantitative information
The visual display of quantitative information
Gprof: A call graph execution profiler
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Performance debugging for distributed systems of black boxes
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
prefuse: a toolkit for interactive information visualization
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
WAP5: black-box performance debugging for wide-area systems
Proceedings of the 15th international conference on World Wide Web
Stardust: tracking activity in a distributed storage system
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
Pattern Recognition and Machine Learning (Information Science and Statistics)
Pattern Recognition and Machine Learning (Information Science and Statistics)
Emergent (mis)behavior vs. complex software systems
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Dynamic instrumentation of production systems
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Ursa minor: versatile cluster-based storage
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Path-based faliure and evolution management
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Using magpie for request extraction and workload modelling
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7
Pip: detecting the unexpected in distributed systems
NSDI'06 Proceedings of the 3rd conference on Networked Systems Design & Implementation - Volume 3
Whodunit: transactional profiling for multi-tier applications
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Categorizing and differencing system behaviours
HotAC II Hot Topics in Autonomic Computing on Hot Topics in Autonomic Computing
Triage: diagnosing production run failures at the user's site
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
DARC: dynamic analysis of root causes of latency distributions
SIGMETRICS '08 Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Informed data distribution selection in a self-predicting storage system
ICAC '06 Proceedings of the 2006 IEEE International Conference on Autonomic Computing
OptiScope: Performance Accountability for Optimizing Compilers
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Detecting large-scale system problems by mining console logs
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Black-box problem diagnosis in parallel file systems
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Experiences with tracing causality in networked services
INM/WREN'10 Proceedings of the 2010 internet network management conference on Research on enterprise networking
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
X-trace: a pervasive network tracing framework
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Otus: resource attribution in data-intensive clusters
Proceedings of the second international workshop on MapReduce and its applications
Practical experiences with chronics discovery in large telecommunications systems
SLAML '11 Managing Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques
Practical experiences with chronics discovery in large telecommunications systems
ACM SIGOPS Operating Systems Review
Modeling the parallel execution of black-box services
HotCloud'11 Proceedings of the 3rd USENIX conference on Hot topics in cloud computing
Understanding performance modeling for modular mobile-cloud applications
ICPE '12 Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering
Structured comparative analysis of systems logs to diagnose performance problems
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Automated diagnosis without predictability is a recipe for failure
HotCloud'12 Proceedings of the 4th USENIX conference on Hot Topics in Cloud Ccomputing
3-Dimensional root cause diagnosis via co-analysis
Proceedings of the 9th international conference on Autonomic computing
X-ray: automating root-cause diagnosis of performance anomalies in production software
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Theia: visual signatures for problem diagnosis in large hadoop clusters
lisa'12 Proceedings of the 26th international conference on Large Installation System Administration: strategies, tools, and techniques
Evaluating MapReduce for profiling application traffic
Proceedings of the first edition workshop on High performance and programmable networking
Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering
Understanding latency variations of black box services
Proceedings of the 22nd international conference on World Wide Web
An online service-oriented performance profiling tool for cloud computing systems
Frontiers of Computer Science: Selected Publications from Chinese Universities
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
IOFlow: a software-defined storage architecture
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Limplock: understanding the impact of limpware on scale-out cloud systems
Proceedings of the 4th annual Symposium on Cloud Computing
Robust assessment of changes in cellular networks
Proceedings of the ninth ACM conference on Emerging networking experiments and technologies
Performance troubleshooting in data centers: an annotated bibliography?
ACM SIGOPS Operating Systems Review
NetCheck: network diagnoses from blackbox traces
NSDI'14 Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation
Hi-index | 0.00 |
The causes of performance changes in a distributed system often elude even its developers. This paper develops a new technique for gaining insight into such changes: comparing request flows from two executions (e.g., of two system versions or time periods). Building on end-to-end request-flow tracing within and across components, algorithms are described for identifying and ranking changes in the flow and/or timing of request processing. The implementation of these algorithms in a tool called Spectroscope is evaluated. Six case studies are presented of using Spectroscope to diagnose performance changes in a distributed storage service caused by code changes, configuration modifications, and component degradations, demonstrating the value and efficacy of comparing request flows. Preliminary experiences of using Spectroscope to diagnose performance changes within select Google services are also presented.