Using Decision Trees to Predict the Certification Result of a Build

Authors:
Ahmed E. Hassan; Ken Zhang
Affiliations:
University of Victoria, Canada; Toronto Software Lab, IBM Canada
Venue:
ASE '06 Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering
Year:
2006

Cited by:
Predicting build failures using social network analysis on developer communication. ICSE '09 Proceedings of the 31st International Conference on Software Engineering
Predicting build outcome with developer interaction in Jazz. Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2
Studying the fix-time for bugs in large open source projects. Proceedings of the 7th International Conference on Predictive Models in Software Engineering

Abstract

Large teams of practitioners (developers, testers, etc.) usually work in parallel on the same code base. A major concern when working in parallel is the introduction of integration bugs into the latest shared code. These latent bugs are likely to slow down the project unless they are discovered as soon as possible. Many companies have adopted daily or weekly processes which build the latest source code and certify it by executing simple manual smoke/sanity tests or extensive automated integration test suites. Other members of a team can then use the certified build to develop new features or to perform additional analysis, such as performance or usability testing. For large projects, the certification process may take a few days. This long certification process forces team members to use either outdated or uncertified (possibly buggy) versions of the code. In this paper, we create decision trees to predict ahead of time the certification result of a build. By accurately predicting the outcome of the certification process, members of large software teams can work more effectively in parallel: they can start using the latest code without waiting for the certification process to be completed. To perform our study, we mine historical information (code changes and certification results) for a large software project which is being developed at the IBM Toronto Labs. Our study shows that, using a combination of project attributes (such as the number of modified subsystems in a build and the certification results of previous builds), we can correctly predict 69% of the time that a build will fail certification. We can also correctly predict 95% of the time that a build will pass certification.
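
The idea of learning a decision tree from build history can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the Gini-based splitting rule, and all names (HISTORY, best_split, build_tree, predict, per_class_recall) are assumptions made for illustration. Only the two attributes, the number of modified subsystems and the result of the previous build, come from the abstract. The per-class score at the end is one possible reading of how "correct 69% of the time for failing builds" could be measured; the abstract does not say which metric was used.

```python
from collections import Counter

# Hypothetical build history, invented for illustration only.
# Each row: (modified_subsystems, previous_build_passed (1/0), certification_result)
HISTORY = [
    (1, 1, "pass"), (2, 1, "pass"), (1, 1, "pass"), (3, 1, "pass"),
    (2, 0, "pass"), (2, 1, "pass"), (6, 1, "fail"), (7, 0, "fail"),
    (5, 0, "fail"), (4, 0, "fail"),
]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini([r[-1] for r in rows])
    for f in range(len(rows[0]) - 1):
        for t in sorted({r[f] for r in rows}):
            left = [r[-1] for r in rows if r[f] <= t]
            right = [r[-1] for r in rows if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, depth=0, max_depth=3):
    """Grow a tree: a leaf is a label, an inner node is (feature, threshold, left, right)."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (f, t,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))


def predict(tree, features):
    """Walk from the root to a leaf and return the predicted certification result."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if features[f] <= t else right
    return tree


def per_class_recall(tree, rows):
    """Fraction of builds of each actual class that the tree predicts correctly."""
    totals, hits = Counter(), Counter()
    for *features, actual in rows:
        totals[actual] += 1
        hits[actual] += predict(tree, features) == actual
    return {label: hits[label] / totals[label] for label in totals}


tree = build_tree(HISTORY)
print(predict(tree, (2, 1)))   # a small change after a passing build
print(predict(tree, (6, 0)))   # a wide change after a failing build
print(per_class_recall(tree, HISTORY))
```

In practice the scores would be computed on builds held out from training rather than on the training history itself, and a real study would use far more attributes and builds than this toy table.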