Clone region descriptors: Representing and tracking duplication in source code

  • Authors: Ekwa Duala-Ekoko; Martin P. Robillard
  • Affiliations: McGill University, Montreal, Canada (both authors)
  • Venue: ACM Transactions on Software Engineering and Methodology (TOSEM)
  • Year: 2010

Abstract

Source code duplication, commonly known as code cloning, is considered an obstacle to software maintenance because changes to a cloned region often require consistent changes to other regions of the source code. Research has provided evidence that the elimination of clones may not always be practical, feasible, or cost-effective. We present a clone management approach that describes clone regions in a robust way that is independent from the exact text of clone regions or their location in a file, and that provides support for tracking clones in evolving software. Our technique relies on the concept of abstract clone region descriptors (CRDs), which describe clone regions using a combination of their syntactic, structural, and lexical information. We present our definition of CRDs, and describe a clone tracking system capable of producing CRDs from the output of different clone detection tools, notifying developers of modifications to clone regions, and supporting updates to the documented clone relationships. We evaluated the performance and usefulness of our approach across three clone detection tools and five subject systems, and the results indicate that CRDs are a practical and robust representation for tracking code clones in evolving software.
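
To make the idea of a location-independent descriptor concrete, the following is a minimal sketch in Python. The abstract does not give the concrete CRD format, so every name here (`BlockDescriptor`, `CloneRegionDescriptor`, `Region`, `normalize`, `find_region`) and the choice of fields are assumptions made for illustration, not the authors' definition or implementation. The sketch only shows how a region described by its enclosing structure and a lexical anchor, rather than by line numbers, can be found again after unrelated code has shifted it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class BlockDescriptor:
    """One nested block on the path to the clone region (hypothetical fields)."""
    kind: str    # syntactic information, e.g. "for", "if", "try"
    anchor: str  # lexical information, e.g. the normalized loop condition


@dataclass(frozen=True)
class CloneRegionDescriptor:
    """Illustrative descriptor: no line numbers and no raw clone text."""
    file_name: str
    class_name: str
    method_signature: str
    blocks: Tuple[BlockDescriptor, ...] = ()


@dataclass(frozen=True)
class Region:
    """A concrete region in one version of the code, as a detector might report it."""
    descriptor: CloneRegionDescriptor
    start_line: int
    end_line: int


def normalize(text: str) -> str:
    """Collapse whitespace so formatting changes do not affect the anchor."""
    return " ".join(text.split())


def find_region(crd: CloneRegionDescriptor,
                regions: Iterable[Region]) -> Optional[Region]:
    """Return the first region whose descriptor equals the CRD, if any."""
    for region in regions:
        if region.descriptor == crd:
            return region
    return None


# Version 1: the clone region sits at lines 40-52.
crd = CloneRegionDescriptor(
    file_name="Parser.java",
    class_name="Parser",
    method_signature="parse(String)",
    blocks=(BlockDescriptor("for", normalize("i < items.size()")),),
)

# Version 2: code inserted above moved the region to lines 75-87 and the
# loop condition was reformatted, but the descriptor still matches.
moved = Region(
    CloneRegionDescriptor(
        "Parser.java", "Parser", "parse(String)",
        (BlockDescriptor("for", normalize("i  <  items.size()")),),
    ),
    start_line=75,
    end_line=87,
)
unrelated = Region(
    CloneRegionDescriptor("Parser.java", "Parser", "reset()"),
    start_line=10,
    end_line=20,
)

found = find_region(crd, [unrelated, moved])
```

In this sketch `found` is the region that now spans lines 75 to 87, located through its structural path and anchor alone. Exact equality is the simplest possible matching rule and is used here only for brevity; a practical tracker would presumably need to tolerate renamed methods or edited conditions, which this example does not attempt.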