Optimizing and Simplifying Software Metric Models Constructed Using Maximum Likelihood Methods
COMPSAC '05 Proceedings of the 29th Annual International Computer Software and Applications Conference - Volume 01
During the construction of a software metric model, the decision on whether to include a particular predictor metric is most likely based on an intuitive or experience-based assumption that the predictor metric has a statistically significant impact on the target metric. However, a model constructed on such an assumption may contain redundant predictor metrics and/or unnecessary predictor-metric complexity, because the assumption made before model construction is never verified after the model is built. To address the first problem (possible redundant predictor metrics), we propose a statistical hypothesis-testing methodology that "retrospectively" verifies the statistical significance of each predictor metric's impact on the target metric; if the variation of a predictor metric does not correlate sufficiently with the variation of the target metric, the predictor metric should be removed from the model. For the second problem (unnecessary predictor-metric complexity), we use a "goodness-of-fit" measure to determine whether certain categories of a categorical predictor metric should be combined. In addition, missing data often appear in the sample used to construct the model; we handle this with a modified k-nearest neighbors (k-NN) imputation method. A study using data from the "Repository Data Disk - Release 6" is reported. The results indicate that our methodology can be useful in trimming redundant predictor metrics and identifying unnecessary categories initially assumed for a categorical predictor metric in the model.
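To illustrate the imputation step, the following is a minimal sketch of plain k-NN imputation: for each row with missing values, distances to the fully observed rows are computed over the columns that row has observed, and each missing entry is filled with the mean of its k nearest neighbors. The paper's "modified" k-NN variant may differ in details (distance weighting, neighbor selection); the function name `knn_impute` and the Euclidean-distance/mean-aggregation choices here are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def knn_impute(data, k=3):
    """Fill NaN entries using the mean of the k nearest complete rows.

    Plain (unmodified) k-NN imputation sketch: distances are Euclidean,
    computed only over the columns observed in the incomplete row.
    """
    data = np.asarray(data, dtype=float)
    # Rows with no missing values serve as the donor pool.
    complete = data[~np.isnan(data).any(axis=1)]
    result = data.copy()
    for i, row in enumerate(data):
        missing = np.isnan(row)
        if not missing.any():
            continue
        # Distance over the columns this row has actually observed.
        d = np.sqrt(((complete[:, ~missing] - row[~missing]) ** 2).sum(axis=1))
        nearest = complete[np.argsort(d)[:k]]
        # Impute each missing column with the neighbors' mean.
        result[i, missing] = nearest[:, missing].mean(axis=0)
    return result
```

For example, imputing the missing second column of `[1, NaN]` from neighbors `[1, 2]` and `[1, 4]` (with `k=2`) yields `3.0`, the mean of the two nearest donors.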