ADMiRe: An Algebraic Data Mining Approach to System Performance Analysis

Authors:
Ning Jiang;Roy Villafane;Kien A. Hua;Abhijit Sawant;Kiran Prabhakara
Affiliations:
IEEE;IEEE;IEEE;-;-
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2005

Citing 22
Cited 1

A new approach to I/O performance evaluation: self-scaling I/O benchmarks, predicted I/O performance

SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Web server workload characterization: the search for invariants

Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Mining quantitative association rules in large relational tables

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Database management systems

Database management systems
Self-similarity in file systems

SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Workfile Disk Management for Concurrent Mergesorts in a Multiprocessor Database System

Distributed and Parallel Databases
Maintaining knowledge about temporal intervals

Communications of the ACM
Fast-Start: quick fault recovery in Oracle

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Infominer: mining surprising periodic patterns

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering by pattern similarity in large data sets

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
What Makes Patterns Interesting in Knowledge Discovery Systems

IEEE Transactions on Knowledge and Data Engineering
Performance Analysis of Dynamic Finite Versioning Schemes: Storage Cost vs. Obsolescence

IEEE Transactions on Knowledge and Data Engineering
Efficient Similarity Search In Sequence Databases

FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
Set-Oriented Mining for Association Rules in Relational Databases

ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Checkpointing in Oracle

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
An Interval Classifier for Database Mining Applications

VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Incremental Meta-Mining from Large Temporal Data Sets

ER '98 Proceedings of the Workshops on Data Warehousing and Data Mining: Advances in Database Technologies
Efficient Mining of Partial Periodic Patterns in Time Series Database

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Load Balancing and Hot Spot Relief for Hash Routing among a Collection of Proxy Caches

ICDCS '99 Proceedings of the 19th IEEE International Conference on Distributed Computing Systems
ADMiRe: an algebraic approach to system performance analysis using data mining techniques

Proceedings of the 2003 ACM symposium on Applied computing

Efficient and Scalable Algorithms for Inferring Likely Invariants in Distributed Systems

IEEE Transactions on Knowledge and Data Engineering

Abstract

Performance analysis of computing systems is an increasingly difficult task due to growing system complexity. Traditional tools rely on ad hoc procedures, so determining which of the many system and workload parameters to examine is often a lengthy and highly speculative process. The resulting analysis is often incomplete and therefore prone to producing faulty conclusions and missing useful tuning knowledge. We address this problem by introducing a data mining approach called ADMiRe (Analyzer for Data Mining Results). In this scheme, regression analysis is first applied to performance data to discover correlations between various system and workload parameters. The results of this analysis are summarized in sets of regression rules. The user can then formulate intuitive algebraic expressions that manipulate these sets of rules to capture critical information. To demonstrate this approach, we use ADMiRe to analyze an Oracle database system running the Transaction Processing Performance Council's TPC-C benchmark. The results generated by ADMiRe were confirmed by Oracle experts. We also show that by applying ADMiRe to Microsoft Internet Information Server performance data, we can improve system performance by 20 percent.
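
The two-stage idea in the abstract (mine regression rules, then combine rule sets algebraically) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify ADMiRe's rule format or operators, so the `Rule` class, the `mine_rules` function, the `min_r2` threshold, the parameter names, and the sample measurements are all hypothetical. Plain set intersection and difference stand in for the algebraic expressions.

```python
from dataclasses import dataclass
from itertools import permutations
from statistics import mean


@dataclass(frozen=True)
class Rule:
    """Hypothetical regression rule: `target` varies linearly with `predictor`."""
    predictor: str
    target: str
    direction: str  # "+" for a positive slope, "-" for a negative one


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (slope, r_squared)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # a constant series carries no linear relationship
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


def mine_rules(samples, min_r2=0.9):
    """Emit one rule per ordered parameter pair whose linear fit is strong enough."""
    rules = set()
    for predictor, target in permutations(samples, 2):
        slope, r2 = fit_line(samples[predictor], samples[target])
        if r2 >= min_r2:
            rules.add(Rule(predictor, target, "+" if slope > 0 else "-"))
    return rules


# Made-up measurements from two runs of the same system (illustrative only).
run_a = {
    "users": [10, 20, 30, 40],
    "cpu": [12, 22, 31, 42],
    "disk_wait": [5.0, 5.1, 4.9, 5.0],
}
run_b = {
    "users": [10, 20, 30, 40],
    "cpu": [12, 22, 31, 42],
    "disk_wait": [5.0, 10.0, 15.0, 20.0],
}

rules_a = mine_rules(run_a)
rules_b = mine_rules(run_b)

stable = rules_a & rules_b       # correlations that hold in both runs
new_in_b = rules_b - rules_a     # correlations that appear only in run B

for rule in sorted(new_in_b, key=lambda r: (r.predictor, r.target)):
    print(f"{rule.target} ~ {rule.predictor} ({rule.direction})")
```

In this toy data the difference `rules_b - rules_a` isolates the rules involving `disk_wait`, which is the kind of focused question a rule-set expression is meant to answer. A real analysis would need many more samples and a more careful significance test than a fixed R-squared cutoff.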