One of the main goals of Cloud and Grid infrastructures is to make their services easily accessible and attractive to end-users. In this article we investigate the problem of supporting keyword-based search for discovering software files installed on the nodes of large-scale, federated Grid and Cloud computing infrastructures. We address a number of challenges that arise from the unstructured nature of software and the unavailability of software-related metadata in large-scale networked environments. We present Minersoft, a harvester that visits Grid/Cloud infrastructures, crawls their file systems, identifies and classifies software files, and discovers implicit associations between them. The results of Minersoft harvesting are encoded in a weighted, typed graph called the Software Graph. A number of information retrieval (IR) algorithms are used to enrich this graph with structural and content associations, to annotate software files with keywords, and to build inverted indexes that support keyword-based searching for software. We evaluate our approach on a real testbed, using data extracted from production-quality Grid and Cloud computing infrastructures. Experimental results show that Minersoft is a powerful tool for software search and discovery.
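The pipeline described above (a weighted, typed graph over software files, keyword annotation, and an inverted index for keyword search) can be illustrated with a minimal sketch. This is not Minersoft's implementation: the class `SoftwareGraph`, the edge type `"documents"`, the 0.5 weight threshold, the file paths, and the AND-style query semantics are all assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SoftwareGraph:
    """Toy weighted, typed graph over file paths (hypothetical structure)."""
    keywords: dict = field(default_factory=dict)  # path -> set of keywords
    edges: list = field(default_factory=list)     # (src, dst, edge_type, weight)

    def add_file(self, path, keywords):
        # Annotate a file node with lowercase keywords.
        self.keywords[path] = {k.lower() for k in keywords}

    def add_edge(self, src, dst, edge_type, weight):
        # Record a typed, weighted association between two files.
        self.edges.append((src, dst, edge_type, weight))

    def enrich(self, min_weight=0.5):
        # Assumed enrichment rule: copy keywords from src to dst along
        # every edge whose weight reaches the (arbitrary) threshold.
        for src, dst, _edge_type, weight in self.edges:
            if weight >= min_weight:
                source_keywords = self.keywords.get(src, set())
                self.keywords.setdefault(dst, set()).update(source_keywords)

    def build_inverted_index(self):
        # Map each keyword to the set of files annotated with it.
        index = defaultdict(set)
        for path, kws in self.keywords.items():
            for kw in kws:
                index[kw].add(path)
        return index


def search(index, query):
    """Return the files that match every term in the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical example: a README that documents a shared library.
graph = SoftwareGraph()
graph.add_file("/usr/lib/libfft.so", ["fft", "library"])
graph.add_file("/usr/share/doc/fft/README", ["fourier", "transform"])
graph.add_edge("/usr/share/doc/fft/README", "/usr/lib/libfft.so", "documents", 0.8)
graph.enrich()
index = graph.build_inverted_index()
hits = search(index, "fourier fft")
```

In this sketch the binary file has no textual description of its own, so the query term "fourier" only matches it after keywords have been propagated from the associated documentation file. That is the general idea behind enriching the graph before indexing, under the assumptions stated above.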