A large amount of open source code is now available online, presenting a potentially valuable resource for software developers. This has motivated software engineering researchers to develop tools and techniques that let developers reap the benefits of these billions of lines of source code. However, collecting and analyzing such a large quantity of source code presents a number of challenges. Although the current generation of open source code search engines provides access to source code in an aggregated repository, these engines generally fail to take advantage of the rich structural information contained in the code they index. This makes them significantly less useful for building state-of-the-art software engineering tools, which often require access to both the structural and the textual information available in source code. We have developed Sourcerer, an infrastructure for large-scale collection and analysis of open source code. By taking full advantage of the structural information extracted from the source code in its repository, Sourcerer provides a foundation upon which state-of-the-art search engines and related tools can easily be built. We describe the Sourcerer infrastructure, present the applications that we have built on top of it, and discuss how existing tools could benefit from using Sourcerer.