A case study of open source software development: the Apache server
Proceedings of the 22nd international conference on Software engineering
Object-Oriented and Classical Software Engineering
Two case studies of open source software development: Apache and Mozilla
ACM Transactions on Software Engineering and Methodology (TOSEM)
Editorial: Open Source and Empirical Software Engineering
Empirical Software Engineering
SEEWeb: making experimental artifacts available
ACM SIGSOFT Software Engineering Notes
Predicting the Probability of Change in Object-Oriented Systems
IEEE Transactions on Software Engineering
Replaying development history to assess the effectiveness of change propagation tools
Empirical Software Engineering
Which warnings should I fix first?
Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Self-organization process in open-source software: An empirical study
Information and Software Technology
Journal of Software Maintenance and Evolution: Research and Practice
Automated classification of change messages in open source projects
Proceedings of the 2008 ACM symposium on Applied computing
Proceedings of the joint international and annual ERCIM workshops on Principles of software evolution (IWPSE) and software evolution (Evol) workshops
The Linux kernel as a case study in software evolution
Journal of Systems and Software
Automatic construction of an effective training set for prioritizing static analysis warnings
Proceedings of the IEEE/ACM international conference on Automated software engineering
The missing links: bugs and bug-fix commits
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
Using hierarchal change mining to manage network security policy evolution
Hot-ICE'11 Proceedings of the 11th USENIX conference on Hot topics in management of internet, cloud, and enterprise networks and services
Dealing with noise in defect prediction
Proceedings of the 33rd International Conference on Software Engineering
ReLink: recovering links between bugs and changes
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
Free/Libre open-source software development: What we know and what we do not know
ACM Computing Surveys (CSUR)
Are Developers Fixing Their Own Bugs?: Tracing Bug-Fixing and Bug-Seeding Committers
International Journal of Open Source Software and Processes
A comparison of identity merge algorithms for software repositories
Science of Computer Programming
A recent editorial in Empirical Software Engineering suggested that open-source software projects offer a great deal of data that can be used for experimentation. These data include not only source code but also artifacts such as defect reports and update logs. A common type of update log that experimenters may wish to investigate is the ChangeLog, which lists changes and the reasons for which they were made. ChangeLog files are created to support the development of software rather than the needs of researchers, so questions need to be asked about the limitations of using them to support research. This paper presents evidence that the ChangeLog files provided at three open-source web sites were incomplete. We examined at least three ChangeLog files for each of three different open-source software products, namely, GNUJSP, GCC-g++, and Jikes. We developed a method for counting changes that ensures that, as far as possible, each individual ChangeLog entry is treated as a single change. For each ChangeLog file, we compared the actual changes in the source code to the entries in the ChangeLog file and discovered significant omissions. For example, using our change-counting method, only 35 of the 93 changes in version 1.11 of Jikes appear in the ChangeLog file; that is, over 62% of the changes were not recorded there. The percentage of omissions we found ranged from 3.7% to 78.6%. These omissions should be taken into account when using ChangeLog files for research. Before using ChangeLog files as a basis for research into the development and maintenance of open-source software, experimenters should carefully check for omissions and inaccuracies.
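The omission percentage quoted for Jikes 1.11 can be reproduced with a short calculation. The sketch below is only an illustration of that arithmetic, not the paper's change-counting method; the function name `omission_rate` and its parameters are hypothetical, and the only data used are the two counts given in the abstract (93 changes in the source code, 35 of them recorded in the ChangeLog).

```python
def omission_rate(total_changes: int, recorded_changes: int) -> float:
    """Return the percentage of source changes missing from the ChangeLog.

    Hypothetical helper: it assumes the changes have already been counted
    by some method and only performs the final percentage calculation.
    """
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    if not 0 <= recorded_changes <= total_changes:
        raise ValueError("recorded_changes must lie between 0 and total_changes")
    omitted = total_changes - recorded_changes
    return 100.0 * omitted / total_changes


# Counts quoted in the abstract for Jikes version 1.11.
jikes_rate = omission_rate(total_changes=93, recorded_changes=35)
print(f"{jikes_rate:.1f}% of changes were not recorded")
```

With these counts, 58 of the 93 changes are unrecorded, which is consistent with the "over 62%" stated above.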