Large and small sample comparisons of various variance estimators
WSC '86 Proceedings of the 18th conference on Winter simulation
An Approach to Selecting Metrics for Detecting Performance Problems in Information Systems
SMW '96 Proceedings of the 2nd IEEE International Workshop on Systems Management (SMW'96)
Resolving intermittent performance problems in computer systems is made easier by pinpointing when a change occurs in the system's performance-determining factors (e.g., workload composition, configuration). Since we often lack direct measurements of performance factors, this paper presents a procedure for indirectly detecting such changes by analyzing performance characteristics (e.g., response times, queue lengths). Our procedure employs a widely used clustering algorithm to identify candidate change points (the times at which performance factors change), and a newly developed statistical test (based on an AR(1) time series model) to determine the significance of candidate change points. We evaluate our procedure by using simulations of M/M/1 FCFS queueing systems and by applying it to measurements of a mainframe computer system at a large telephone company. These evaluations suggest that our procedure is effective in practice, especially for larger sample sizes and smaller utilizations. We further conclude that indirectly detecting changes in performance factors appears to be inherently difficult, in that the sensitivity of a detection procedure depends on the magnitude of the change in performance characteristics, which often has a nonlinear relationship with the change in performance factors. Thus, a change in performance factors (e.g., increased service times) may be more readily detected in some situations (e.g., very low or very high utilizations) than in others (e.g., moderate utilizations). A key insight here is that the sensitivity of the detection procedure can be improved by choosing appropriate measures of performance characteristics. For example, our experience and analysis suggest that queue lengths can be more sensitive than response times to changes in arrival rates.
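The nonlinear relationship between performance factors and performance characteristics can be illustrated with the standard steady-state M/M/1 formulas. The following is a minimal sketch, not the paper's procedure or analysis: it compares only the mean number in system, L = ρ/(1 − ρ), with the mean response time, W = 1/(μ − λ), before and after a change in arrival rate. The function names, the 10% increase, and the unit service rate are assumptions chosen for illustration, and the sketch ignores the variance and autocorrelation that a real detection test would have to handle.

```python
def mm1_mean_queue_length(arrival_rate, service_rate):
    """Steady-state mean number in an M/M/1 system: L = rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return rho / (1 - rho)


def mm1_mean_response_time(arrival_rate, service_rate):
    """Steady-state mean response time of an M/M/1 system: W = 1 / (mu - lambda)."""
    if not 0 <= arrival_rate < service_rate:
        raise ValueError("arrival rate must be below the service rate")
    return 1 / (service_rate - arrival_rate)


def relative_change(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


def sensitivity_to_arrival_rate(rho, increase=0.10, service_rate=1.0):
    """Relative change in mean queue length and mean response time when the
    arrival rate grows by `increase` (hypothetical 10% default), starting
    from utilization `rho`. Returns (queue_length_change, response_time_change)."""
    lam_before = rho * service_rate
    lam_after = lam_before * (1 + increase)
    queue_change = relative_change(
        mm1_mean_queue_length(lam_before, service_rate),
        mm1_mean_queue_length(lam_after, service_rate),
    )
    response_change = relative_change(
        mm1_mean_response_time(lam_before, service_rate),
        mm1_mean_response_time(lam_after, service_rate),
    )
    return queue_change, response_change


if __name__ == "__main__":
    # The same 10% change in the factor produces very different changes
    # in the characteristics, depending on the starting utilization.
    for rho in (0.2, 0.5, 0.8, 0.9):
        dq, dw = sensitivity_to_arrival_rate(rho)
        print(f"rho={rho:.1f}  queue length: {dq:+.1%}  response time: {dw:+.1%}")
```

Because L = λW (Little's law), an increase in arrival rate scales the mean queue length by both the higher λ and the higher W, so in this simplified view its relative change always exceeds that of the mean response time. This is consistent with the observation about queue lengths, although the sketch compares means only and does not reproduce the paper's statistical test.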