Although popular text search engines allow users to retrieve similar web pages, source code search engines do not have this feature. Detecting similar applications is a notoriously difficult problem, since it requires that similar high-level requirements and their low-level implementations be detected and matched automatically across different applications. We created a novel approach for automatically detecting Closely reLated ApplicatioNs (CLAN) that helps users find applications similar to a given Java application. Our main contributions are an extension to a framework of relevance and a novel algorithm that computes a similarity index between Java applications using the notion of semantic layers that correspond to packages and class hierarchies. We built CLAN and conducted an experiment with 33 participants to evaluate it and compare it with the closest competitive approach, MUDABlue. The results show with strong statistical significance that CLAN automatically detects similar applications from a large repository of 8,310 Java applications with higher precision than MUDABlue.
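To make the idea of a layered similarity index concrete, here is a minimal sketch in Python. The abstract does not describe the algorithm itself, so everything here is an assumption made for illustration: each application is reduced to a list of fully qualified API class names it uses, each layer (package, class) becomes a usage-count vector, the layers are compared with plain cosine similarity, and the two scores are combined with equal weights. The names `cosine`, `layer_counts`, `similarity_index`, the weights, and the sample applications are all hypothetical and are not taken from CLAN.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def layer_counts(api_calls, layer):
    """Count API usage at one layer (assumed representation).

    'class' keeps the fully qualified class name;
    'package' drops the last dotted component.
    """
    if layer == "class":
        return Counter(api_calls)
    return Counter(name.rsplit(".", 1)[0] for name in api_calls)


def similarity_index(app_a, app_b, w_package=0.5, w_class=0.5):
    """Weighted sum of per-layer cosine similarities (weights are assumed)."""
    pkg = cosine(layer_counts(app_a, "package"), layer_counts(app_b, "package"))
    cls = cosine(layer_counts(app_a, "class"), layer_counts(app_b, "class"))
    return w_package * pkg + w_class * cls


# Hypothetical applications, described by the API classes they use.
editor = ["javax.swing.JFrame", "javax.swing.JTextArea", "java.io.File"]
viewer = ["javax.swing.JFrame", "javax.swing.JLabel", "java.io.File"]
server = ["java.net.ServerSocket", "java.net.Socket"]

print(similarity_index(editor, viewer))
print(similarity_index(editor, server))
```

In this toy setup the two GUI applications share all of their packages but only some of their classes, so the package layer pulls their score up while the class layer keeps it below a perfect match; the networking application shares nothing with the editor at either layer. A real system would need to extract API usage from source or bytecode and would likely use a more robust vector model than raw counts.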