Building Data-Intensive Grid Applications with Globus Toolkit --- An Evaluation Based on Web Crawling

  • Authors:
  • Andreas Walter; Klemens Böhm; Stephan Schosser

  • Affiliations:
  • IPE, FZI Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe; IPD, Universität Karlsruhe, Am Fasanengarten 5, 76131 Karlsruhe; IPD, Universität Karlsruhe, Am Fasanengarten 5, 76131 Karlsruhe

  • Venue:
  • ICSOC '07 Proceedings of the 5th international conference on Service-Oriented Computing
  • Year:
  • 2007

Abstract

There is a trend toward building resource-intensive applications not on large dedicated computing centers, but on the resources of computer systems distributed over the Internet. Grid middleware provides a framework for accessing these resources. This paper evaluates a specific grid middleware, Globus Toolkit, for data-intensive applications. As a test case, we have designed and implemented a service-based distributed web crawler on top of this middleware. A web crawler is a complex application consisting of many nodes, and it places significantly higher demands on the administrative flexibility of grid middleware than grid applications that allocate the computing power of grid nodes. We have observed that some components of Globus Toolkit are flexible enough to provide the control functionality a web crawler needs, while others are not. For the latter, we propose possible extensions. Since we expect this combination of characteristics to occur in many other grid applications as well, our study is of broader interest beyond web crawling.