Inferring service time from workload and utilization data is important for predicting the performance of computer systems. Although the utilization law expresses a linear relationship between the workload submitted to a computing system and its utilization, the automated analysis of real-world datasets is far from trivial. Hardware and software upgrades modify the service time, and periodic activities affect the utilization law. Therefore, multiple regression lines must be found in the datasets to explain the different behaviours of the system. In this paper, we propose a new methodology that works in three main phases: clustering based on the density of points; splitting of clusters and estimation of regression lines, obtained from our extension of a clusterwise regression algorithm; and a refinement procedure that removes and merges clusters. The cumulative effect of these phases is to determine the number of clusters while simultaneously identifying the point-to-cluster membership, the underlying regression lines and the outliers. A novel feature of our approach is that the selection of the number of clusters exploits the structure of the data rather than relying on model complexity, as most previous methods do. A computational comparison of our method with suitable existing approaches on real-world data and on challenging synthetic "realistic" data shows the efficiency of our algorithm.
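To make the motivation concrete, the sketch below illustrates why one regression line is not enough when the service time changes. It is not the clusterwise regression method proposed in the paper: it uses plain ordinary least squares and assumes the two regimes are already labelled. The data, the function name `fit_line` and the numeric values are hypothetical. It relies on the utilization law in the form U = S * X + U0, where X is the throughput, S the service time (the slope) and U0 an assumed constant background utilization.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical, noise-free measurements (throughput in requests per second).
throughput = [10.0, 20.0, 30.0, 40.0, 50.0]

# Assumed regimes: service time 0.010 s before an upgrade, 0.005 s after it,
# with a constant background utilization of 0.05 in both cases.
util_before = [0.05 + 0.010 * x for x in throughput]
util_after = [0.05 + 0.005 * x for x in throughput]

# Fitting each regime separately recovers its own service time (the slope).
s_before, u0_before = fit_line(throughput, util_before)
s_after, u0_after = fit_line(throughput, util_after)

# Fitting all points with a single line blends the two regimes.
s_pooled, u0_pooled = fit_line(throughput * 2, util_before + util_after)

print(f"before upgrade: S = {s_before:.4f} s, U0 = {u0_before:.3f}")
print(f"after upgrade:  S = {s_after:.4f} s, U0 = {u0_after:.3f}")
print(f"pooled fit:     S = {s_pooled:.4f} s, U0 = {u0_pooled:.3f}")
```

The pooled slope falls between the two true service times and describes neither regime. In real data the regime labels are unknown and outliers are present, which is the problem the three-phase methodology addresses.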