P-Tracer: Path-Based Performance Profiling in Cloud Computing Systems
COMPSAC '12 Proceedings of the 2012 IEEE 36th Annual Computer Software and Applications Conference
The growing scale and complexity of component interactions in cloud computing systems pose great challenges for operators trying to understand system performance characteristics. Profiling has long proven to be an effective approach to performance analysis; however, existing approaches face new challenges in cloud computing systems. First, profiling efficiency becomes a critical concern; second, profiling should be service-oriented to support separation-of-concerns performance analysis. To address these issues, we present P-Tracer, an online performance profiling tool specifically tailored for cloud computing systems. First, P-Tracer constructs a dedicated search engine that proactively processes performance logs and generates an index for fast queries; second, for each service, P-Tracer derives statistical insight into performance characteristics along multiple dimensions and provides operators with a suite of web-based interfaces for querying critical information. We evaluate P-Tracer in terms of tracing overhead, data preprocessing scalability, and query efficiency. Three real-world case studies from the Alibaba cloud computing platform demonstrate that P-Tracer helps operators understand software behavior and localize the primary causes of performance anomalies effectively and efficiently.
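To make the idea of proactively indexing performance logs for fast per-service queries more concrete, the following is a minimal sketch and not P-Tracer's actual implementation. The record layout (request ID, service name, latency in milliseconds), the sample data, and the function names `build_index` and `service_summary` are all assumptions introduced for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log records: (request_id, service, latency_ms).
# The abstract does not describe P-Tracer's real log format.
LOGS = [
    ("r1", "frontend", 12.0),
    ("r1", "storage", 30.0),
    ("r2", "frontend", 8.0),
    ("r2", "storage", 50.0),
    ("r3", "frontend", 10.0),
]


def build_index(records):
    """Group latencies by service once, ahead of query time."""
    index = defaultdict(list)
    for _request_id, service, latency_ms in records:
        index[service].append(latency_ms)
    return index


def service_summary(index, service):
    """Answer a per-service query from the prebuilt index."""
    latencies = index.get(service, [])
    if not latencies:
        return None  # no records were indexed for this service
    return {
        "count": len(latencies),
        "mean_ms": mean(latencies),
        "max_ms": max(latencies),
    }


INDEX = build_index(LOGS)
print(service_summary(INDEX, "storage"))
```

Because the grouping is done once up front, each query only touches the records of the service being asked about rather than rescanning the full log, which is the general motivation for preprocessing that the abstract points to.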