Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM (JACM)
CCFinder: a multilinguistic token-based code clone detection system for large scale source code
IEEE Transactions on Software Engineering
Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics
ICSM '96 Proceedings of the 1996 International Conference on Software Maintenance
Evaluating Clone Detection Tools for Use during Preventative Maintenance
SCAM '02 Proceedings of the Second IEEE International Workshop on Source Code Analysis and Manipulation
Identifying Similar Code with Program Dependence Graphs
WCRE '01 Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE'01)
Assessing the Benefits of Incorporating Function Clone Detection in a Development Process
ICSM '97 Proceedings of the International Conference on Software Maintenance
Clone Detection Using Abstract Syntax Trees
ICSM '98 Proceedings of the International Conference on Software Maintenance
A Language Independent Approach for Detecting Duplicated Code
ICSM '99 Proceedings of the IEEE International Conference on Software Maintenance
Eliminating redundancies with a "composition with adaptation" meta-programming technique
Proceedings of the 9th European software engineering conference held jointly with 11th ACM SIGSOFT international symposium on Foundations of software engineering
Identifying redundancy in source code using fingerprints
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: software engineering - Volume 1
Replacing suffix trees with enhanced suffix arrays
Journal of Discrete Algorithms - SPIRE 2002
Beyond templates: a study of clones in the STL and some general implications
Proceedings of the 27th international conference on Software engineering
Detecting higher-level similarity patterns in programs
Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code
IEEE Transactions on Software Engineering
Clone Detection Using Abstract Syntax Suffix Trees
WCRE '06 Proceedings of the 13th Working Conference on Reverse Engineering
Hi-index | 0.00 |
Code clones are similar code fragments that occur at multiple locations in a software system. Detection of code clones provides useful information for maintenance, reengineering, program understanding and reuse. Several techniques have been proposed to detect code clones. These techniques differ in the code representation used for analysis of clones, ranging from plain text to parse trees and program dependence graphs. Clone detection based on lexical tokens involves minimal code transformation and gives good results, but is computationally expensive because of the large number of tokens that need to be compared. We explored string algorithms to find suitable data structures and algorithms for efficient token based clone detection and implemented them in our tool Repeated Tokens Finder (RTF). Instead of using suffix tree for string matching, we use more memory efficient suffix array. RTF incorporates a suffix array based linear time algorithm to detect string matches. It also provides a simple and customizable tokenization mechanism. Initial analysis and experiments show that our clone detection is simple, scalable, and performs better than the previous well-known tools.