Applying statistical methodology to optimize and simplify software metric models with missing data
Proceedings of the 2006 ACM symposium on Applied computing
A software metric model can be used to predict a target metric (e.g., the development work effort) for a future release of a software system from the project's predictor metrics (e.g., the project team size). However, the data samples used to construct the model often contain missing or incomplete data. To date, the least biased, and therefore the most recommended, software metric models for handling missing or incomplete data are those constructed using maximum likelihood methods. A particular predictor metric is initially included in model construction on the intuitive or experience-based assumption that it significantly impacts the target metric. Nevertheless, this assumption has to be verified. Previous research on metric models constructed using maximum likelihood methods simply took the assumption for granted, without verifying it. This can lead to the inclusion of superfluous predictor metrics and/or unnecessary predictor metric complexity. In this paper, we propose a methodology to optimize and simplify such models based on the results of appropriate hypothesis tests. We also report an experiment that demonstrates the use of our methodology in trimming redundant predictor metrics and/or unnecessary predictor metric complexity.
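As an illustration only, the idea of testing whether a predictor metric earns its place in the model can be sketched with a likelihood-ratio test on an ordinary Gaussian linear model. This is not the methodology of the paper: the sketch assumes complete data (no missing values), a linear model fitted by least squares, and a chi-square approximation with one degree of freedom. The function names, the synthetic data, and the `irrelevant` predictor are all hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(rows, y):
    """Residual sum of squares of the least-squares fit of y on rows (intercept added)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2 for x, yi in zip(X, y))

def lr_test_drop(rows, y, index):
    """Likelihood-ratio test of H0: the coefficient of predictor `index` is zero.

    For a Gaussian linear model the statistic is n * log(RSS_reduced / RSS_full),
    compared here against a chi-square distribution with 1 degree of freedom.
    """
    n = len(y)
    reduced = [[v for j, v in enumerate(r) if j != index] for r in rows]
    stat = max(n * math.log(rss(reduced, y) / rss(rows, y)), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value

# Synthetic, made-up data: effort depends on team size but not on `irrelevant`.
random.seed(0)
rows, effort = [], []
for _ in range(200):
    team_size = random.uniform(1, 20)
    irrelevant = random.uniform(0, 1)
    rows.append([team_size, irrelevant])
    effort.append(5.0 + 2.0 * team_size + random.gauss(0, 1))

stat_team, p_team = lr_test_drop(rows, effort, 0)
stat_irrelevant, p_irrelevant = lr_test_drop(rows, effort, 1)
print("team_size:  stat=%.2f p=%.4g" % (stat_team, p_team))
print("irrelevant: stat=%.2f p=%.4g" % (stat_irrelevant, p_irrelevant))
```

A predictor whose p-value stays above the chosen significance level is a candidate for removal. With missing data, the same comparison would have to be made between models fitted by a maximum likelihood method that accounts for the missing values, which this sketch does not attempt.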