WCRE '00 Proceedings of the Seventh Working Conference on Reverse Engineering (WCRE'00)
TA-RE: an exchange language for mining software repositories
Proceedings of the 2006 international workshop on Mining software repositories
Sourcerer: An internet-scale software repository
SUITE '09 Proceedings of the 2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation
Towards sharing source code facts using linked data
Proceedings of the 3rd International Workshop on Search-Driven Development: Users, Infrastructure, Tools, and Evaluation
Hi-index | 0.00 |
The mining of software repository involves the extraction of both basic and value-added information from existing software repositories. Depending on stakeholders (e.g., researchers, management), these repositories are mined several times for different application purposes. To avoid unnecessary pre-processing steps and improve productivity, sharing, and integration of extracted facts and results are needed. The motivation of this research is to introduce a novel collaborative sharing platform for software datasets that supports on-the-fly inter-datasets integration. We want to facilitate and promote a paradigm shift in the source code analysis domain, similar to the one by Wikipedia in the knowledge-sharing domain. In this paper, we present the SeCold project, which is the first online, publicly available software ecosystem Linked Data dataset. As part of this research, not only theoretical background on how to publish such datasets is provided, but also the actual dataset. SeCold contains about two billion facts, such as source code statements, software licenses, and code clones from over 18.000 software projects. SeCold is also an official member of the Linked Data cloud and one of the eight largest online Linked Data datasets available on the cloud.