Reusing software through copying and pasting is a continuous plague in software development despite the fact that it creates serious maintenance problems. Various techniques have been proposed to find duplicated redundant code (also known as software clones). A recent study has compared these techniques and shown that token-based clone detection based on suffix trees is fast but yields clone candidates that are often not syntactic units. Current techniques based on abstract syntax trees, on the other hand, find syntactic clones but are considerably less efficient. This paper describes how we can make use of suffix trees to find syntactic clones in abstract syntax trees. This new approach is able to find syntactic clones in linear time and space. The paper reports the results of a large case study in which we empirically compare the new technique to other techniques using the Bellon benchmark for clone detectors. The Bellon benchmark consists of clone pairs validated by humans for eight software systems written in C or Java from different application domains. The new contributions of this paper over the conference paper are the additional analysis of Java programs, the exploration of an alternative path that uses parse trees instead of abstract syntax trees, and the investigation of the impact on recall and precision when clone analyses insist on consistent parameter renaming.
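The core idea, finding repeated sequences in a serialized abstract syntax tree and keeping only those that form complete subtrees, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a toy tree encoded as nested tuples, abstracts every identifier to the label "id", and replaces the linear-time suffix tree with a naively sorted list of suffixes, which is far slower but easier to read. The names `serialize`, `find_syntactic_clones`, `min_nodes`, and `example` are hypothetical.

```python
def serialize(node, tokens, sizes):
    """Append the subtree rooted at `node` in preorder.

    A node is a tuple (label, child, child, ...). Each token is
    (label, number_of_children), so a token run determines the tree shape.
    sizes[i] is the node count of the subtree starting at position i.
    """
    pos = len(tokens)
    tokens.append((node[0], len(node) - 1))
    sizes.append(0)  # placeholder, filled in after the children
    for child in node[1:]:
        serialize(child, tokens, sizes)
    sizes[pos] = len(tokens) - pos


def find_syntactic_clones(root, min_nodes=3):
    """Return sorted (first_pos, second_pos, node_count) clone pairs.

    Assumption: a naive suffix array (sorted suffixes) stands in for the
    suffix tree; only neighbouring suffixes are compared.
    """
    tokens, sizes = [], []
    serialize(root, tokens, sizes)
    n = len(tokens)
    order = sorted(range(n), key=lambda i: tokens[i:])
    clones = set()
    for a, b in zip(order, order[1:]):
        # Length of the longest common prefix of the two suffixes.
        lcp = 0
        while a + lcp < n and b + lcp < n and tokens[a + lcp] == tokens[b + lcp]:
            lcp += 1
        # Keep the match only if it covers the complete subtree at `a`
        # (and therefore also the identical subtree at `b`).
        if min_nodes <= sizes[a] <= lcp:
            clones.add((min(a, b), max(a, b), sizes[a]))
    return sorted(clones)


# Toy tree for: x = y + z; f(w); u = v + t  (identifiers abstracted to "id")
example = (
    "block",
    ("assign", ("id",), ("add", ("id",), ("id",))),
    ("call", ("id",)),
    ("assign", ("id",), ("add", ("id",), ("id",))),
)

print(find_syntactic_clones(example, min_nodes=4))  # [(1, 8, 5)]
```

In this toy input the two assignment statements start at preorder positions 1 and 8 and each span 5 nodes, so they are reported as one clone pair. With a smaller `min_nodes`, nested sub-clones such as the two `add` expressions may be reported as well.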