Identifying syntactic differences between two programs
Software—Practice & Experience
Automated assistance for program restructuring
ACM Transactions on Software Engineering and Methodology (TOSEM)
Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance
SIAM Journal on Computing
Pattern matching for clone and concept detection
Reverse engineering
Enhanced code compression for embedded RISC processors
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
JavaML: a markup language for Java source code
Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking
Compiler techniques for code compaction
ACM Transactions on Programming Languages and Systems (TOPLAS)
Analyzing and compressing assembly code
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
ARM Architecture Reference Manual
ARM Architecture Reference Manual
The ICON Programming Language
Sifting out the mud: low level C++ code reuse
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
CCFinder: a multilinguistic token-based code clone detection system for large scale source code
IEEE Transactions on Software Engineering
Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics
ICSM '96 Proceedings of the 1996 International Conference on Software Maintenance
Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Rapid identification of repeated patterns in strings, trees and arrays
STOC '72 Proceedings of the fourth annual ACM symposium on Theory of computing
On finding duplication and near-duplication in large software systems
WCRE '95 Proceedings of the Second Working Conference on Reverse Engineering
Clone Detection Using Abstract Syntax Trees
ICSM '98 Proceedings of the International Conference on Software Maintenance
A Language Independent Approach for Detecting Duplicated Code
ICSM '99 Proceedings of the IEEE International Conference on Software Maintenance
Software—Practice & Experience
CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code
IEEE Transactions on Software Engineering
Clone Detection Using Abstract Syntax Suffix Trees
WCRE '06 Proceedings of the 13th Working Conference on Reverse Engineering
Frequent Subtree Mining - An Overview
Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences
DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Deducing similarities in Java sources from bytecodes
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Comparison and Evaluation of Clone Detection Tools
IEEE Transactions on Software Engineering
Finding Clones with Dup: Analysis of an Experiment
IEEE Transactions on Software Engineering
Clone Detection via Structural Abstraction
WCRE '07 Proceedings of the 14th Working Conference on Reverse Engineering
Anti-unification for Unranked Terms and Hedges
Journal of Automated Reasoning
Hi-index | 0.00 |
This paper describes the design, implementation, and application of a new algorithm to detect cloned code. It operates on the abstract syntax trees formed by many compilers as an intermediate representation. It extends prior work by identifying clones even when arbitrary subtrees have been changed. These subtrees may represent structural rather than simply lexical code differences. In several hundred thousand lines of Java and C# code, 20---50% of the clones that we find involve these structural changes, which are not accounted for by previous methods. Our method also identifies cloning in declarations, so it is somewhat more general than conventional procedural abstraction.