Boreas: an accurate and scalable token-based approach to code clone detection

  • Authors:
  • Yang Yuan;Yao Guo

  • Affiliations:
  • Peking University, China;Peking University, China

  • Venue:
  • Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Detecting code clones in a program has many applications in software engineering and other related fields. In this paper, we present Boreas, an accurate and scalable token-based approach for code clone detection. Boreas introduces a novel counting-based method to define the characteristic matrices, which are able to describe the program segments distinctly and effectively for the purpose of clone detection. We conducted experiments on JDK 7 and Linux kernel 2.6.38.6 source code. Experimental results show that Boreas is able to match the detecting accuracy of a recently proposed syntactic-based tool Deckard, with the execution time reduced by more than an order of magnitude.