Tree-pattern-based duplicate code detection

  • Authors: Hyo-Sub Lee; Kyung-Goo Doh

  • Affiliations: Hanyang University, Ansan, South Korea (both authors)

  • Venue: Proceedings of the ACM first international workshop on Data-intensive software management and mining
  • Year: 2009

Abstract

This paper presents a tree-pattern-based method for automatically and accurately finding code clones in program files. Duplicate tree patterns are first collected by an anti-unification algorithm combined with redundancy-free exhaustive comparison, and are then clustered. For speed, the algorithm is designed so that no comparison is repeated; for accuracy, it still examines every possible pair of tree patterns. The method preserves the syntactic structure of the code in its tree-pattern clusters, which gives it the flexibility to find different types of clones while maintaining precision.
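
To make the idea of anti-unification concrete, the following is a minimal sketch of first-order anti-unification on syntax trees. It is not the authors' algorithm: the tree encoding (nested tuples with string leaves), the `?n` hole notation, the names `anti_unify`, `pattern_size`, and `duplicate_patterns`, and the `min_size` threshold are all assumptions made for illustration. Visiting each unordered pair exactly once via `itertools.combinations` stands in for the redundancy-free exhaustive comparison described in the abstract, and the clustering step is omitted.

```python
from itertools import combinations


def anti_unify(a, b, holes=None):
    """Return the most specific pattern that generalizes trees a and b.

    Trees are nested tuples (label, child, ...) or leaf strings.
    Subtrees that differ are replaced by holes named "?0", "?1", ...
    The same pair of differing subtrees always maps to the same hole.
    """
    if holes is None:
        holes = {}
    if a == b:
        return a
    same_shape = (
        isinstance(a, tuple) and isinstance(b, tuple)
        and a[0] == b[0] and len(a) == len(b)
    )
    if same_shape:
        children = tuple(anti_unify(x, y, holes) for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    if (a, b) not in holes:
        holes[(a, b)] = f"?{len(holes)}"
    return holes[(a, b)]


def pattern_size(tree):
    """Count the nodes of a pattern that are not holes."""
    if isinstance(tree, tuple):
        return 1 + sum(pattern_size(child) for child in tree[1:])
    return 0 if tree.startswith("?") else 1


def duplicate_patterns(trees, min_size=3):
    """Compare every unordered pair of trees exactly once.

    Returns (i, j, pattern) for each pair whose shared pattern has
    at least min_size non-hole nodes (an assumed threshold).
    """
    found = []
    for (i, a), (j, b) in combinations(enumerate(trees), 2):
        pattern = anti_unify(a, b)
        if pattern_size(pattern) >= min_size:
            found.append((i, j, pattern))
    return found


# Hypothetical trees for: x = a + 1, y = b + 1, and f(x)
t1 = ("assign", "x", ("add", "a", "1"))
t2 = ("assign", "y", ("add", "b", "1"))
t3 = ("call", "f", "x")

print(anti_unify(t1, t2))
print(duplicate_patterns([t1, t2, t3]))
```

In this sketch the first two trees generalize to a pattern that keeps the assignment and addition structure and replaces only the differing identifiers with holes, while the third tree shares nothing with them beyond a single hole. Keeping the pattern as a tree, rather than flattening it to tokens, is what lets the shared syntactic structure remain visible.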