Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
Design patterns: elements of reusable object-oriented software
Design patterns: elements of reusable object-oriented software
ICSE '94 Proceedings of the 16th international conference on Software engineering
Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance
SIAM Journal on Computing
Analysis patterns: reusable objects models
Analysis patterns: reusable objects models
Data mining: concepts and techniques
Data mining: concepts and techniques
Design and implementation of generics for the .NET Common language runtime
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Software Engineering
CCFinder: a multilinguistic token-based code clone detection system for large scale source code
IEEE Transactions on Software Engineering
Substring Matching for Clone Detection and Change Tracking
ICSM '94 Proceedings of the International Conference on Software Maintenance
Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics
ICSM '96 Proceedings of the 1996 International Conference on Software Maintenance
The Enhanced Suffix Array and Its Applications to Genome Analysis
WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics
Optimal Exact Strring Matching Based on Suffix Arrays
SPIRE 2002 Proceedings of the 9th International Symposium on String Processing and Information Retrieval
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
On finding duplication and near-duplication in large software systems
WCRE '95 Proceedings of the Second Working Conference on Reverse Engineering
Identifying Similar Code with Program Dependence Graphs
WCRE '01 Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE'01)
Clone Detection Using Abstract Syntax Trees
ICSM '98 Proceedings of the International Conference on Software Maintenance
A Language Independent Approach for Detecting Duplicated Code
ICSM '99 Proceedings of the IEEE International Conference on Software Maintenance
A Fast Algorithms for Making Suffix Arrays and for Burrows-Wheeler Transformation
DCC '98 Proceedings of the Conference on Data Compression
Eliminating redundancies with a "composition with adaptation" meta-programming technique
Proceedings of the 9th European software engineering conference held jointly with 11th ACM SIGSOFT international symposium on Foundations of software engineering
Identifying redundancy in source code using fingerprints
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: software engineering - Volume 1
The Performance of Linear Time Suffix Sorting Algorithms
DCC '05 Proceedings of the Data Compression Conference
Beyond templates: a study of clones in the STL and some general implications
Proceedings of the 27th international conference on Software engineering
Linear-time construction of suffix arrays
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Space efficient linear time construction of suffix arrays
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Simple linear work suffix array construction
ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming
Research journey towards industrial application of reuse technique
Proceedings of the 28th international conference on Software engineering
Proceedings of the 28th international conference on Software engineering
Proceedings of the 2006 international workshop on Summit on software engineering education
Approximate Structural Context Matching: An Approach to Recommend Relevant Examples
IEEE Transactions on Software Engineering
DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Refactoring--Does It Improve Software Quality?
WoSQ '07 Proceedings of the 5th International Workshop on Software Quality
Method and implementation for investigating code clones in a software system
Information and Software Technology
Mining specifications of malicious behavior
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Context-based detection of clone-related bugs
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Determining detailed structural correspondence for generalization tasks
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Efficient token based clone detection with flexible tokenization
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Efficient token based clone detection with flexible tokenization
The 6th Joint Meeting on European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering: companion papers
Assisting potentially-repetitive small-scale changes via semi-automated heuristic search
Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering
Mining specifications of malicious behavior
ISEC '08 Proceedings of the 1st India software engineering conference
Scalable detection of semantic clones
Proceedings of the 30th international conference on Software engineering
Conquering Fine-Grained Blends of Design Patterns
ICSR '08 Proceedings of the 10th international conference on Software Reuse: High Confidence Software Reuse in Large Systems
Software Reuse beyond Components with XVCL (Tutorial)
Generative and Transformational Techniques in Software Engineering II
Clone detection and removal for Erlang/OTP within a refactoring environment
Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation
Detecting code clones in binary executables
Proceedings of the eighteenth international symposium on Software testing and analysis
Algorithm recognition by static analysis and its application in students' submissions assessment
Koli '08 Proceedings of the 8th International Conference on Computing Education Research
Proceedings of the 33rd International Conference on Software Engineering
Combining clustering and pattern detection for the reengineering of component-based software systems
Proceedings of the joint ACM SIGSOFT conference -- QoSA and ACM SIGSOFT symposium -- ISARCS on Quality of software architectures -- QoSA and architecting critical systems -- ISARCS
University-industry collaboration journey towards product lines
ICSR'11 Proceedings of the 12th international conference on Top productivity through software reuse
Applying a generative technique for enhanced genericity and maintainability on the J2EE platform
GPCE'05 Proceedings of the 4th international conference on Generative Programming and Component Engineering
SPLC'05 Proceedings of the 9th international conference on Software Product Lines
Hi-index | 0.00 |
Cloning in software systems is known to create problems during software maintenance. Several techniques have been proposed to detect the same or similar code fragments in software, so-called simple clones. While the knowledge of simple clones is useful, detecting design-level similarities in software could ease maintenance even further, and also help us identify reuse opportunities. We observed that recurring patterns of simple clones - so-called structural clones - often indicate the presence of interesting design-level similarities. An example would be patterns of collaborating classes or components. Finding structural clones that signify potentially useful design information requires efficient techniques to analyze the bulk of simple clone data and making non-trivial inferences based on the abstracted information. In this paper, we describe a practical solution to the problem of detecting some basic, but useful, types of design-level similarities such as groups of highly similar classes or files. First, we detect simple clones by applying conventional token-based techniques. Then we find the patterns of co-occurring clones in different files using the Frequent Itemset Mining (FIM) technique. Finally, we perform file clustering to detect those clusters of highly similar files that are likely to contribute to a design-level similarity pattern. The novelty of our approach is application of data mining techniques to detect design level similarities. Experiments confirmed that our method finds many useful structural clones and scales up to big programs. The paper describes our method for structural clone detection, a prototype tool called Clone Miner that implements the method and experimental results.