Supporting the analysis of clones in software systems: Research Articles

  • Authors:
  • Cory J. Kapser; Michael W. Godfrey

  • Affiliations:
  • David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1; David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

  • Venue:
  • Journal of Software Maintenance and Evolution: Research and Practice - IEEE International Conference on Software Maintenance (ICSM2005)
  • Year:
  • 2006

Abstract

Code duplication is a well-documented problem in industrial software systems. There has been considerable research into techniques for detecting duplication in software, and there are several effective tools to perform this task. However, there have been few detailed qualitative studies into how cloning actually manifests itself within software systems. This is primarily due to the large result sets that many clone-detection tools return; these result sets are very difficult to manage without complementary tool support that can scale to the size of the problem, and this kind of support does not currently exist. In this paper we present an in-depth case study of cloning in a large software system that is in wide use, the Apache Web server; we provide insights into cloning as it exists in this system, and we demonstrate techniques to manage and make effective use of the large result sets of clone-detection tools. In our case study, we found several interesting types of cloning occurrences, such as ‘cloning hotspots’, where a single subsystem comprising only 17% of the system code contained 38.8% of the clones. We also found several examples of cloning behavior that were beneficial to the development of the system, in particular cloning as a way to add experimental functionality. Copyright © 2006 John Wiley & Sons, Ltd.