As modern multi-tier systems grow larger and more complex, it becomes harder for system analysts to understand overall system behavior and to diagnose performance problems. To help analysts inspect performance behavior, we introduce SelfTalk, a novel declarative language that allows analysts to query and understand the status of a large-scale system. SelfTalk is expressive enough to encode an analyst's high-level hypotheses about system invariants, normal correlations between system metrics, or other performance models derived a priori, such as "I expect that the throughputs of interconnected system components are linearly correlated". Given a hypothesis, Dena, our runtime support system, instantiates it for a specific system configuration and validates it against actual monitoring data. We evaluate SelfTalk/Dena by posing several hypotheses about system behavior and querying Dena to validate them in a multi-tier dynamic content server. We find that Dena automatically validates system performance against the pre-existing hypotheses and helps diagnose system misbehavior.
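To make the example hypothesis concrete, the following minimal sketch checks whether the throughputs of two interconnected components are linearly correlated in a window of monitoring data. It is not SelfTalk syntax and not Dena's implementation: the function names, the use of the Pearson coefficient as the validation test, the 0.9 threshold, and the sample values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def validate_linear_hypothesis(upstream, downstream, threshold=0.9):
    """Return True if the two throughput series look linearly correlated.

    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    return abs(pearson(upstream, downstream)) >= threshold


# Made-up monitoring samples (requests per second) for two tiers.
web_tier = [100, 200, 300, 400, 500]
db_tier_normal = [52, 98, 151, 203, 248]   # tracks the web tier
db_tier_anomalous = [52, 98, 60, 55, 58]   # stops tracking the web tier

hypothesis_holds = validate_linear_hypothesis(web_tier, db_tier_normal)
hypothesis_broken = not validate_linear_hypothesis(web_tier, db_tier_anomalous)
```

In this sketch a failed check only flags that the expected relation no longer holds in the sampled window. Deciding which component is misbehaving would need further hypotheses or context.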