Detecting code clones has many software engineering applications. Existing approaches either do not scale to large code bases or are not robust against minor code modifications. In this paper, we present an efficient algorithm for identifying similar subtrees and apply it to tree representations of source code. Our algorithm is based on a novel characterization of subtrees as numerical vectors in the Euclidean space $\mathbb{R}^n$ and an efficient algorithm for clustering these vectors with respect to the Euclidean distance metric. Subtrees whose vectors fall in the same cluster are considered similar. We have implemented our tree similarity algorithm as a clone detection tool called DECKARD and evaluated it on large code bases written in C and Java, including the Linux kernel and the JDK. Our experiments show that DECKARD is both scalable and accurate. It is also language independent: it applies to any language with a formally specified grammar.
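The idea of characterizing subtrees as vectors and comparing them by Euclidean distance can be illustrated with a small sketch. This is not DECKARD's implementation. It assumes, purely for illustration, that a subtree's vector counts the occurrences of each node type, that Python's standard `ast` module stands in for a grammar-derived parse tree, and that a naive pairwise comparison with a hypothetical `threshold` parameter stands in for the efficient clustering step mentioned in the abstract.

```python
import ast
import math
from collections import Counter


def characteristic_vector(node):
    """Count how often each AST node type occurs in the subtree at `node`.

    Assumption for illustration: node-type counts serve as the vector.
    """
    return Counter(type(n).__name__ for n in ast.walk(node))


def euclidean(u, v):
    """Euclidean distance between two sparse count vectors."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u[k] - v[k]) ** 2 for k in keys))


def similar_functions(source, threshold=1.0):
    """Return pairs of function names whose vectors lie within `threshold`.

    Naive O(n^2) comparison; a scalable tool would cluster the vectors
    instead of comparing every pair.
    """
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    vectors = [(f.name, characteristic_vector(f)) for f in funcs]
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            (a, u), (b, v) = vectors[i], vectors[j]
            if euclidean(u, v) <= threshold:
                pairs.append((a, b))
    return pairs


SAMPLE = '''
def total(xs):
    acc = 0
    for x in xs:
        acc = acc + x
    return acc

def sum_all(values):
    result = 0
    for v in values:
        result = result + v
    return result

def greet(name):
    print("hello", name)
'''

if __name__ == "__main__":
    # total and sum_all differ only in identifier names, so their
    # node-type vectors are identical and the distance is 0.
    print(similar_functions(SAMPLE))  # [('total', 'sum_all')]
```

Because the vectors ignore identifier names, renamed copies map to the same point, and small edits move a vector only a short distance. That is the sense in which a vector-based characterization can be robust against minor code modifications.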