A code clone is a sequence of statements that is duplicated in multiple locations of a program. Clones often arise from repeated cut-and-paste operations on the source, or from the emergence of crosscutting concerns. Programs containing code clones can cause problems during the maintenance phase: when a fault is found or an update is needed in the original copy of a code section, all similar clones must also be found so that they can be fixed or updated accordingly. The ability to detect clones is therefore a necessity in maintenance tasks. Done manually, however, clone detection is slow, tedious, and error-prone. A tool that detects clones automatically offers a significant advantage during software evolution, because clones can be found and updated in less time. Moreover, restructuring or refactoring these clones can yield better performance and modularity in the program.

This paper describes an investigation into an automatic clone detection technique developed as a plug-in for Microsoft's new Phoenix framework. Our investigation finds function-level clones in a program using abstract syntax trees (ASTs) and suffix trees. An AST provides the structural representation of the code after the lexical analysis process. The AST nodes are used to generate a suffix tree, which allows the nodes to be analyzed rapidly. We search the generated suffix tree for matches using the same methods that have been successfully applied to find duplicate sections in biological sequences; these matches in turn reveal matches in the code.
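The pipeline described above (serialize each function's AST, index the node sequence, look for long repeated runs) can be sketched in a few lines. This is only an illustration under stated assumptions, not the Phoenix plug-in itself: it uses Python's standard `ast` module in place of the Phoenix framework, and it swaps the suffix tree for a naively sorted list of suffixes, which finds the same longest shared runs but without a suffix tree's linear-time guarantees. The helper names (`preorder`, `function_tokens`, `longest_shared_run`, `find_clones`), the `<sep>` marker, and the 0.8 threshold are all hypothetical choices made for this example.

```python
import ast
from itertools import combinations


def preorder(node):
    """Yield AST node type names in preorder; identifiers and literals are dropped."""
    yield type(node).__name__
    for child in ast.iter_child_nodes(node):
        yield from preorder(child)


def function_tokens(source):
    """Map each function name to the node-type sequence of its AST."""
    tree = ast.parse(source)
    return {
        node.name: list(preorder(node))
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


def longest_shared_run(a, b):
    """Length of the longest contiguous run common to sequences a and b.

    Assumption: sorted suffixes stand in for the suffix tree. The longest
    common run always appears as a shared prefix of two suffixes that are
    adjacent in sorted order and start in different sequences.
    """
    seq = a + ["<sep>"] + b  # "<sep>" never occurs as a node type name
    order = sorted(range(len(seq)), key=lambda i: seq[i:])
    best = 0
    for i, j in zip(order, order[1:]):
        if (i < len(a)) == (j < len(a)):
            continue  # both suffixes start in the same function
        k = 0
        while i + k < len(seq) and j + k < len(seq) and seq[i + k] == seq[j + k]:
            k += 1
        best = max(best, k)
    return best


def find_clones(source, threshold=0.8):
    """Report function pairs whose shared run covers most of the shorter function."""
    funcs = function_tokens(source)
    clones = []
    for (name1, toks1), (name2, toks2) in combinations(funcs.items(), 2):
        shared = longest_shared_run(toks1, toks2)
        if shared / min(len(toks1), len(toks2)) >= threshold:
            clones.append((name1, name2))
    return clones


SAMPLE = '''
def total_price(items):
    total = 0
    for item in items:
        total += item.price
    return total

def total_weight(boxes):
    acc = 0
    for box in boxes:
        acc += box.weight
    return acc

def greet(name):
    print("hello", name)
'''
```

Calling `find_clones(SAMPLE)` pairs `total_price` with `total_weight`: the two functions differ only in identifier names, which the node-type serialization ignores, while `greet` shares only short runs with either of them. Comparing every pair of functions and re-sorting suffixes each time is quadratic or worse; a single generalized suffix tree over all functions, as the abstract describes, is what makes the search fast on real programs.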