Harvesting Large-Scale Grids for Software Resources
CCGRID '09 Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid
Minersoft: Software retrieval in grid and cloud computing infrastructures
ACM Transactions on Internet Technology (TOIT)
In this paper, we investigate the problem of supporting keyword-based search for discovering software resources installed on the nodes of large-scale, federated Grid computing infrastructures. We address a number of challenges that arise from the unstructured nature of software and the unavailability of software-related metadata on Grid sites. We present Minersoft, a Grid harvester that visits Grid sites, crawls their file systems, identifies and classifies software resources, and discovers implicit associations between them. The results of Minersoft harvesting are encoded in a weighted, typed graph called the Software Graph. A number of IR algorithms are used to enrich this graph with structural and content associations, to annotate software resources with keywords, and to build inverted indexes that support keyword-based searching for software. We evaluate our approach on a real testbed, using data extracted from a production-quality Grid infrastructure. Experimental results show that our approach achieves high search efficiency.
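The pipeline described above (a weighted, typed graph whose nodes are annotated with keywords and then indexed) can be illustrated with a minimal sketch. This is not Minersoft's implementation: the class name `SoftwareGraph`, the node and edge types, the file paths, and the `min_weight` threshold are all hypothetical, and the one-step keyword propagation stands in for whatever enrichment algorithms the system actually uses.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class SoftwareGraph:
    # node id -> node type (illustrative types: "binary", "doc", ...)
    node_types: dict = field(default_factory=dict)
    # node id -> keywords annotating that node
    keywords: dict = field(default_factory=lambda: defaultdict(set))
    # (source, target) -> (edge type, weight in [0, 1])
    edges: dict = field(default_factory=dict)

    def add_node(self, node_id, node_type, text=""):
        self.node_types[node_id] = node_type
        self.keywords[node_id].update(tokenize(text))

    def add_edge(self, src, dst, edge_type, weight):
        self.edges[(src, dst)] = (edge_type, weight)

    def propagate_keywords(self, min_weight=0.5):
        """Copy keywords one hop along sufficiently strong edges.

        Assumption: a single-step propagation with a fixed threshold,
        chosen only to keep the sketch small.
        """
        snapshot = {n: set(k) for n, k in self.keywords.items()}
        for (src, dst), (_edge_type, weight) in self.edges.items():
            if weight >= min_weight:
                self.keywords[dst].update(snapshot.get(src, set()))

    def build_inverted_index(self):
        """Map each keyword to the set of nodes annotated with it."""
        index = defaultdict(set)
        for node_id, words in self.keywords.items():
            for word in words:
                index[word].add(node_id)
        return index


def search(index, query):
    """Return the nodes that match every term of the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


# Hypothetical example: a binary with no descriptive text of its own
# inherits keywords from a documentation file associated with it.
graph = SoftwareGraph()
graph.add_node("/usr/bin/blastall", "binary", "blastall")
graph.add_node("/usr/share/doc/blast/README", "doc",
               "BLAST sequence alignment search tool")
graph.add_edge("/usr/share/doc/blast/README", "/usr/bin/blastall",
               "describes", 0.8)
graph.propagate_keywords()
index = graph.build_inverted_index()
hits = search(index, "sequence alignment")
```

In this sketch the binary becomes findable by the query "sequence alignment" only because of the association with its documentation file, which is the kind of benefit the abstract attributes to enriching the graph before indexing.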