Causality: models, reasoning, and inference
Database Systems: The Complete Book
A characterization of the sensitivity of query optimization to storage access cost parameters
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
A File System Tracing Package for Berkeley UNIX
Performance debugging for distributed systems of black boxes
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
SQLCM: A Continuous Monitoring Framework for Relational Database Engines
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Ensembles of Models for Automated Diagnosis of System Performance Problems
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
Capturing, indexing, clustering, and retrieving system history
Proceedings of the twentieth ACM symposium on Operating systems principles
Tracefs: A File System to Trace Them All
FAST '04 Proceedings of the 3rd USENIX Conference on File and Storage Technologies
Stardust: tracking activity in a distributed storage system
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
Genesis: A Scalable Self-Evolving Performance Management Framework for Storage Systems
ICDCS '06 Proceedings of the 26th IEEE International Conference on Distributed Computing Systems
I/O system performance debugging using model-driven anomaly characterization
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Path-based failure and evolution management
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Automatic misconfiguration troubleshooting with peerpressure
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Modeling the relative fitness of storage
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Operating system profiling via latency analysis
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Automatic SQL tuning in oracle 10g
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Why did my pc suddenly slow down?
SYSML'07 Proceedings of the 2nd USENIX workshop on Tackling computer systems problems with machine learning techniques
Fa: A System for Automating Failure Diagnosis
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Towards Adaptive Costing of Database Access Methods
ICDEW '07 Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop
High speed and robust event correlation
IEEE Communications Magazine
DIADS: a problem diagnosis tool for databases and storage area networks
Proceedings of the VLDB Endowment
Proceedings of the 3rd Annual Haifa Experimental Systems Conference
Towards I/O analysis of HPC systems and a generic architecture to collect access patterns
Computer Science - Research and Development
We present DIADS, an integrated DIAgnosis tool for Databases and Storage area networks (SANs). Existing diagnosis tools in this domain focus on either the database alone (e.g., [11]) or the SAN alone (e.g., [28]). DIADS is a first-of-a-kind framework built on a careful integration of information from the database and SAN subsystems; it is not a simple concatenation of database-only and SAN-only modules. This approach increases the accuracy of diagnosis and also significantly improves its efficiency. DIADS uses a novel combination of non-intrusive machine learning techniques (e.g., Kernel Density Estimation) and domain knowledge encoded in a new symptoms database design. The machine learning component provides the core techniques for diagnosing problems from monitoring data, while the domain knowledge acts as a check that guides the diagnosis in the right direction. This design enables DIADS to function effectively even when multiple problems occur concurrently and when the monitoring data is noisy, as is common in production environments. We demonstrate the efficacy of our approach through a detailed experimental evaluation of DIADS, implemented on a real data center testbed with PostgreSQL databases and an enterprise SAN.
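The abstract names Kernel Density Estimation as one of the machine learning techniques but does not describe how it is applied. As a rough illustration only, and not the DIADS implementation, the sketch below shows one common way KDE can flag an anomalous monitoring value: fit a Gaussian-kernel density to samples from healthy runs, then score a new observation by the estimated probability of seeing a value that low or lower. The function names, the latency metric, the sample values, and the bandwidth rule are all assumptions made for this example.

```python
import math


def gaussian_kde_cdf(samples, x, bandwidth=None):
    """Estimate P(X <= x) from samples using a Gaussian-kernel KDE.

    Hypothetical helper for illustration; not taken from DIADS.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        mean = sum(samples) / n
        var = sum((s - mean) ** 2 for s in samples) / (n - 1)
        std = math.sqrt(var)
        # Normal-reference rule of thumb (an assumption for this sketch);
        # fall back to 1.0 when all samples are identical.
        bandwidth = 1.06 * std * n ** (-0.2) if std > 0 else 1.0
    # The CDF of a Gaussian mixture is the mean of the component CDFs.
    scale = bandwidth * math.sqrt(2.0)
    total = sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples)
    return total / n


def anomaly_score(healthy_samples, observed):
    """Score in [0, 1]; values near 1 mean `observed` is unusually high."""
    return gaussian_kde_cdf(healthy_samples, observed)


# Made-up latency readings (milliseconds) from runs assumed to be healthy.
healthy_latency_ms = [10.2, 11.0, 9.8, 10.5, 10.9, 10.1, 9.7, 10.4]

score_normal = anomaly_score(healthy_latency_ms, 10.3)  # typical reading
score_slow = anomaly_score(healthy_latency_ms, 25.0)    # much slower reading
```

A score close to 1 indicates that the observed value lies far above anything seen in the healthy runs; a threshold on this score (for example 0.8 or higher, again an assumption) could then mark the metric as a candidate symptom for further checks against domain knowledge.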