Effective Keyword Search for Software Resources Installed in Large-Scale Grid Infrastructures

  • Authors:
  • George Pallis;Asterios Katsifodimos;Marios D. Dikaiakos

  • Affiliations:
  • -;-;-

  • Venue:
  • WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we investigate the problem of supporting keyword-based searching for the discovery of software resources that are installed on the nodes of large-scale, federated Grid computing infrastructures. We address a number of challenges that arise from the unstructured nature of software and the unavailability of software-related metadata on Grid sites. We present Minersoft, a Grid harvester that visits Grid sites, crawls their file-systems, identifies and classifies software resources, and discovers implicit associations between them. The results of Minersoft harvesting are encoded in a weighted, typed graph, named the Software Graph. A number of IR algorithms are used to enrich this graph with structural and content associations, to annotate software resources with keywords, and build inverted indexes to support keyword-based searching for software. Using a real testbed, we present an evaluation study of our approach, using data extracted from a production-quality Grid infrastructure. Experimental results show that our approach achieves high search efficiency.