Background: Developing and maintaining a software effort estimation (SEE) data set within a company (within data) is costly. Often, parts of the data are missing or too difficult to collect, e.g. effort values. However, information about past projects, although incomplete, may be helpful when combined with SEE data sets from other companies (cross data).

Aim: To utilize cross data to aid within-company estimates and local experts, and to propose a synergy between semi-supervised, active, and cross-company learning for software effort estimation.

Method: The proposed method: 1) summarizes the existing unlabeled within data; 2) uses cross data to provide pseudo-labels for the summarized within data; and 3) uses steps 1 and 2 to provide an estimate for the within test data as an input for the local company experts. We use 21 data sets and compare the proposed method to existing state-of-the-art within- and cross-company effort estimation methods, evaluated with 7 different error measures.

Results: In 132 out of 147 settings (21 data sets × 7 error measures = 147 settings), the proposed method performs as well as the state-of-the-art methods. Also, the proposed method summarizes the past within data down to at most 15% of the original data.

Conclusion: It is important to look for synergies between cross-company and within-company effort estimation data, even when the latter is imperfect or sparse. In this research, we provide experts with a method that: 1) is competent (performs as well as prior within and cross data estimation methods); 2) reflects on local data (estimates come from the within data); 3) is succinct (summarizes the within data down to 15% or less); and 4) is cheap (easy to build).
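To make the three-step idea in the abstract concrete, the sketch below walks through it in Python. It is only an illustration under assumed design choices (Euclidean 1-NN analogies, a popularity-based summarization that keeps roughly 15% of the within projects, and synthetic data); it is not the authors' actual implementation, and the function names are hypothetical.

```python
# Hedged sketch of the abstract's method: (1) summarize the unlabeled within
# data, (2) pseudo-label the summary using cross-company projects with known
# effort, (3) estimate a new within project from the pseudo-labeled summary.
# The 1-NN analogy and popularity-based summarization are assumptions.
import numpy as np

def nearest(row, pool):
    """Index of the Euclidean nearest neighbor of `row` within `pool`."""
    return int(np.argmin(np.linalg.norm(pool - row, axis=1)))

def summarize_within(X_within, keep_fraction=0.15):
    """Step 1: keep a small set of representative (unlabeled) within projects,
    here the ones most often picked as another project's nearest neighbor."""
    votes = np.zeros(len(X_within), dtype=int)
    for i, row in enumerate(X_within):
        others = np.delete(X_within, i, axis=0)
        j = nearest(row, others)
        votes[j + (j >= i)] += 1          # map back to the original index
    k = max(1, int(np.ceil(keep_fraction * len(X_within))))
    return np.argsort(-votes)[:k]

def pseudo_label(X_summary, X_cross, y_cross):
    """Step 2: label each summarized within project with the effort of its
    nearest cross-company project."""
    return np.array([y_cross[nearest(row, X_cross)] for row in X_summary])

def estimate(x_test, X_summary, y_pseudo):
    """Step 3: estimate a new within project from the pseudo-labeled summary."""
    return y_pseudo[nearest(x_test, X_summary)]

# Tiny synthetic demo; in practice the features would be size/complexity metrics.
rng = np.random.default_rng(0)
X_within = rng.random((40, 3))            # within-company projects, no effort labels
X_cross = rng.random((200, 3))            # cross-company projects ...
y_cross = 100 * X_cross.sum(axis=1)       # ... with known effort values

idx = summarize_within(X_within)                            # step 1
y_pseudo = pseudo_label(X_within[idx], X_cross, y_cross)    # step 2
print(estimate(rng.random(3), X_within[idx], y_pseudo))     # step 3
```

The summary in step 1 deliberately needs no effort values, which matches the abstract's claim that the within data can be unlabeled; only the cross data contributes labels, and the final estimate still comes from the (pseudo-labeled) within projects.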