CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code
IEEE Transactions on Software Engineering
DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones
ICSE '07 Proceedings of the 29th International Conference on Software Engineering
Comparison and evaluation of code clone detection techniques and tools: A qualitative approach
Science of Computer Programming
ICSE '09 Proceedings of the 31st International Conference on Software Engineering
CMCD: Count Matrix Based Code Clone Detection
APSEC '11 Proceedings of the 2011 18th Asia-Pacific Software Engineering Conference
In this paper, we introduce a new token-based algorithm for code clone detection. A Count Environment (CE) is a specific scenario related to variables. The Count Vector (CV) of a variable consists of the numbers of occurrences of that variable in the different CEs. The Count Matrix (CM) of a code fragment consists of the CVs of all variables in the fragment. We use CVs to characterize variables and a CM to represent a code fragment. Two code fragments are compared through their corresponding CMs, and two heuristics are applied during the comparison. Experimental results show that our algorithm is significantly faster than Deckard, a state-of-the-art syntactic technique for detecting code clones.
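
As an illustration of the CE / CV / CM idea, the following is a minimal sketch and not the authors' implementation. It assumes four hypothetical counting environments (`used`, `defined`, `in_loop`, `in_condition`), works on Python source through the standard `ast` module rather than on a token stream, and replaces the two unspecified heuristics with a naive row-by-row comparison of sorted matrices. The names `CES`, `CountVisitor`, `count_matrix`, and `similarity` are invented for this example.

```python
import ast
from collections import defaultdict

# Hypothetical counting environments (CEs); the abstract does not list the real ones.
CES = ("used", "defined", "in_loop", "in_condition")


class CountVisitor(ast.NodeVisitor):
    """Builds one count vector (CV) per variable name."""

    def __init__(self):
        self.cvs = defaultdict(lambda: [0] * len(CES))
        self.loop_depth = 0
        self.cond_depth = 0

    def visit_Name(self, node):
        cv = self.cvs[node.id]
        if isinstance(node.ctx, ast.Store):
            cv[1] += 1  # defined
        else:
            cv[0] += 1  # used
        if self.loop_depth:
            cv[2] += 1  # occurs anywhere inside a loop statement
        if self.cond_depth:
            cv[3] += 1  # occurs inside an if-condition

    def _visit_loop(self, node):
        self.loop_depth += 1
        self.generic_visit(node)
        self.loop_depth -= 1

    visit_For = visit_While = _visit_loop

    def visit_If(self, node):
        # Only the test expression counts as "in_condition".
        self.cond_depth += 1
        self.visit(node.test)
        self.cond_depth -= 1
        for stmt in node.body + node.orelse:
            self.visit(stmt)


def count_matrix(source):
    """Return the count matrix (CM): the CVs of all variables, names dropped."""
    visitor = CountVisitor()
    visitor.visit(ast.parse(source))
    return sorted(tuple(cv) for cv in visitor.cvs.values())


def similarity(cm_a, cm_b):
    """Naive stand-in for the paper's comparison: 1.0 means identical CMs."""
    rows = max(len(cm_a), len(cm_b))
    zero = (0,) * len(CES)
    # Pad the shorter matrix with empty CVs, then pair rows in sorted order.
    a = sorted(cm_a + [zero] * (rows - len(cm_a)))
    b = sorted(cm_b + [zero] * (rows - len(cm_b)))
    diff = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = sum(sum(r) for r in a) + sum(sum(r) for r in b)
    return 1.0 if total == 0 else 1.0 - diff / total
```

Because the matrix rows carry only counts, two fragments that differ solely in variable names produce the same CM under these assumptions, which is the property that makes a count-based representation useful for detecting renamed clones.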