Most empirical disciplines promote the reuse and sharing of datasets, as this increases the possibility of replication. While this is increasingly the case in Empirical Software Engineering (ESE), some of the most popular bug-fix datasets are now known to be biased. This raises two significant concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of studies based on biased datasets may be suspect. The issue has caused considerable consternation in the ESE literature in recent years. However, one confounding factor of these datasets has not been examined carefully: size. Biased datasets sample only some of the data that could be sampled, and do so in a biased fashion; but biased samples can be smaller or larger. Smaller datasets generally provide a less reliable basis for estimating models, and thus can lead to inferior model performance. In this setting, we ask: which affects performance more, bias or size? We conduct a detailed, large-scale meta-analysis using simulated datasets sampled with bias from a high-quality dataset that is relatively free of bias. Our results suggest that size always matters just as much as bias direction, and in fact matters much more than bias direction when considering information-retrieval measures such as AUCROC and F-score. This indicates that, at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue and raises further questions to be explored in future work.
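The simulation design sketched in the abstract can be made concrete with a small experiment. The following is a minimal, hypothetical Python sketch, not the authors' actual protocol: the synthetic dataset, the biased_sample helper, and the random-forest learner are all illustrative assumptions. It draws training samples of independently controlled size and class bias from a stand-in for a relatively bias-free dataset, then reports AUCROC and F-score on a fixed, unbiased test set so the two effects can be compared.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a high-quality, relatively bias-free bug-fix
# dataset (about 20% defective modules).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

def biased_sample(X, y, size, bias, rng):
    """Draw `size` rows without replacement. `bias` > 1 over-represents
    defective rows (y == 1), `bias` < 1 under-represents them, and
    `bias` == 1 samples uniformly (no bias)."""
    w = np.where(y == 1, bias, 1.0)
    w = w / w.sum()
    idx = rng.choice(len(y), size=size, replace=False, p=w)
    return X[idx], y[idx]

# Vary sample size and bias direction independently, then measure how
# each affects AUCROC and F-score on the fixed, unbiased test set.
for size in (100, 500, 2000):
    for bias in (0.25, 1.0, 4.0):
        X_s, y_s = biased_sample(X_pool, y_pool, size, bias, rng)
        if len(np.unique(y_s)) < 2:      # degenerate sample: skip it
            continue
        model = RandomForestClassifier(random_state=0).fit(X_s, y_s)
        prob = model.predict_proba(X_test)[:, 1]
        pred = (prob >= 0.5).astype(int)
        print(f"size={size:4d}  bias={bias:4.2f}  "
              f"AUCROC={roc_auc_score(y_test, prob):.3f}  "
              f"F={f1_score(y_test, pred):.3f}")
```

Holding the test set fixed while varying only the training sample isolates the two factors: comparing rows of equal bias across sizes shows the size effect, while comparing rows of equal size across bias values shows the effect of bias direction.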