NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization

  • Authors:
  • Chanchal K. Roy;James R. Cordy

  • Venue:
  • ICPC '08: Proceedings of the 2008 16th IEEE International Conference on Program Comprehension
  • Year:
  • 2008

Abstract

This paper examines the effectiveness of a new language-specific, parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The transformation system assists the method in three ways. First, using agile parsing, it provides user-specified flexible pretty-printing to remove noise, standardize formatting and break program statements into parts, so that potential changes can be detected as simple linewise text differences. Second, it provides efficient, flexible extraction of the potential clones to be compared, using island grammars and agile parsing to select granularities and enumerate potential clones. Third, using transformation rules, it provides flexible code normalization to allow for local editing differences between similar code segments and to filter out uninteresting parts of potential clones. In this paper we introduce the theory and practice of the framework and demonstrate its use in finding function clones in C code. Early experiments indicate that the method is capable of finding near-miss clones with high precision and recall, and with reasonable performance.
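
The pipeline described in the abstract (pretty-print, normalize, then compare line by line) can be illustrated with a small Python sketch. This is not the authors' implementation: regular expressions stand in for the parser-based transformation system, `difflib.SequenceMatcher` stands in for the paper's line comparison, and the names `pretty_print`, `normalize`, `similarity`, `is_near_miss`, the keyword set and the 0.7 threshold are all hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: a tiny hand-picked keyword set; a real tool would take
# this from the language grammar.
C_KEYWORDS = {"int", "char", "void", "float", "double",
              "return", "if", "else", "for", "while"}

# Identifiers, integer literals, or any single punctuation character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")
IDENT = re.compile(r"[A-Za-z_]\w*")


def pretty_print(code):
    """Remove comments and layout noise; emit one line per statement part.

    Lines end at every ';' and braces get a line of their own, so a
    'for' header is split into several parts (a simplification).
    """
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.S)  # block comments
    code = re.sub(r"//[^\n]*", "", code)               # line comments
    lines, current = [], []
    for tok in TOKEN.findall(code):
        if tok in ("{", "}"):
            if current:
                lines.append(" ".join(current))
                current = []
            lines.append(tok)
        else:
            current.append(tok)
            if tok == ";":
                lines.append(" ".join(current))
                current = []
    if current:
        lines.append(" ".join(current))
    return lines


def normalize(lines):
    """Replace every non-keyword identifier with the placeholder 'id'."""
    def norm(tok):
        return "id" if IDENT.fullmatch(tok) and tok not in C_KEYWORDS else tok
    return [" ".join(norm(t) for t in line.split()) for line in lines]


def similarity(code_a, code_b):
    """Linewise similarity in [0, 1] of two normalized code fragments."""
    a = normalize(pretty_print(code_a))
    b = normalize(pretty_print(code_b))
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_near_miss(code_a, code_b, threshold=0.7):
    """Hypothetical decision rule: similar enough lines means a clone pair."""
    return similarity(code_a, code_b) >= threshold
```

Under these assumptions, two C functions that differ only in identifier names, comments and line breaks reduce to identical line sequences and are reported as clones, while a function with a different body shares fewer lines and falls below the threshold. The threshold trades precision against recall and would need tuning on real code.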