CCFinder: a multilinguistic token-based code clone detection system for large scale source code
IEEE Transactions on Software Engineering
Identifying Similar Code with Program Dependence Graphs
WCRE '01 Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE'01)
Clone Detection Using Abstract Syntax Trees
ICSM '98 Proceedings of the International Conference on Software Maintenance
CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code
IEEE Transactions on Software Engineering
GPLAG: detection of software plagiarism by program dependence graph analysis
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Scalable detection of semantic clones
Proceedings of the 30th international conference on Software engineering
ICPC '08 Proceedings of the 2008 The 16th IEEE International Conference on Program Comprehension
Comparison and evaluation of code clone detection techniques and tools: A qualitative approach
Science of Computer Programming
CMCD: Count Matrix Based Code Clone Detection
APSEC '11 Proceedings of the 2011 18th Asia-Pacific Software Engineering Conference
Hi-index | 0.00 |
Detecting code clones in a program has many applications in software engineering and other related fields. In this paper, we present Boreas, an accurate and scalable token-based approach for code clone detection. Boreas introduces a novel counting-based method to define the characteristic matrices, which are able to describe the program segments distinctly and effectively for the purpose of clone detection. We conducted experiments on JDK 7 and Linux kernel 2.6.38.6 source code. Experimental results show that Boreas is able to match the detecting accuracy of a recently proposed syntactic-based tool Deckard, with the execution time reduced by more than an order of magnitude.