Similarity analysis of source code is helpful during development, for instance to provide better support for code reuse. Consider a development environment that analyzes code as it is typed and suggests similar code examples or existing implementations from a source code repository. Mining software repositories by means of similarity measures enables and enforces the reuse of existing code and reduces the development effort by creating a shared knowledge base of code fragments. In information retrieval, similarity measures are often used to find documents similar to a given query document. This paper extends this idea to source code repositories. It introduces our approach to detecting similar Java classes in software projects using tree similarity algorithms. We show how our approach finds similar Java classes, based on an evaluation of three tree-based similarity measures in the context of five user-defined test cases as well as a preliminary software evolution analysis of a medium-sized Java project. Initial results indicate that our technique (1) is indeed useful for identifying similar Java classes, (2) successfully identifies the ex ante and ex post versions of refactored classes, and (3) provides some interesting insights into within-version and between-version dependencies of classes within a Java project.
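To make the idea of a tree-based similarity measure concrete, the following is a minimal sketch in Python. It is an illustration only: the abstract does not say which three measures the paper evaluates, so the top-down ordered matching used here is an assumption, as are the names `Node`, `top_down_match`, and `similarity` and the toy labels standing in for a parsed Java class.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A labeled, ordered tree node (hypothetical stand-in for an AST node)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the tree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def top_down_match(a: Node, b: Node) -> int:
    """Size of the largest common top-down ordered subtree of a and b.

    Roots must carry the same label; children are then aligned in order
    with an LCS-style dynamic program that maximizes matched nodes.
    """
    if a.label != b.label:
        return 0
    m, n = len(a.children), len(b.children)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = dp[i - 1][j - 1] + top_down_match(
                a.children[i - 1], b.children[j - 1]
            )
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], paired)
    return 1 + dp[m][n]


def similarity(a: Node, b: Node) -> float:
    """Normalized similarity in [0, 1]; 1.0 for identical trees."""
    return 2 * top_down_match(a, b) / (size(a) + size(b))


# Two toy "classes" with made-up node labels (assumed, not from the paper).
class_a = Node("class", [Node("field"), Node("method", [Node("return")])])
class_b = Node("class", [Node("method", [Node("return")]),
                         Node("method", [Node("if")])])

score = similarity(class_a, class_b)
```

In a repository-mining setting, a score like this would be computed between a query class and every candidate class, and the candidates ranked by it. The normalization by combined tree size keeps large classes from dominating the ranking simply because they contain more nodes.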