Framework for querying distributed objects managed by a grid infrastructure

  • Authors:
  • Ruslan Fomkin; Tore Risch

  • Affiliations:
  • Department of Information Technology, Uppsala University, Uppsala, Sweden; Department of Information Technology, Uppsala University, Uppsala, Sweden

  • Venue:
  • DMG 2005 Proceedings of the First VLDB conference on Data Management in Grids
  • Year:
  • 2005

Abstract

Queries over scientific data often involve expensive analyses that require the large computational resources available in Grids. We are developing a customizable query processor built on top of an established Grid infrastructure, the NorduGrid middleware, and have implemented a framework for managing long-running queries in a Grid environment. With the framework, the user does not specify the detailed job and parallelization descriptions required by NorduGrid. Instead, the user specifies queries in terms of an application-oriented schema that describes the contents of files managed by the Grid and accessed through wrappers. When the system receives a query, it generates NorduGrid job descriptions and submits them to NorduGrid for execution. The framework takes the limitations of NorduGrid into account. It includes a submission mechanism, a job babysitter, and a generic data exchange mechanism. The submission mechanism generates a number of jobs for parallel execution of a user query over wrapped data files. The babysitter submits the generated jobs to NorduGrid for execution, monitors their execution status, and downloads the results. The generic data exchange mechanism provides a way to exchange objects through files between Grid execution nodes and user applications.
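
The division of labor between the submission mechanism and the babysitter can be illustrated with a minimal sketch. It is not the authors' implementation and does not use any NorduGrid API: the names `Job`, `generate_jobs`, `babysit`, and `FakeBackend` are hypothetical, the round-robin split of files is an assumed partitioning strategy, and the in-memory backend merely stands in for the Grid middleware so that the example runs on its own.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass
class Job:
    """One unit of parallel work: the same query over a subset of files."""
    job_id: int
    query: str
    files: List[str]
    status: Status = Status.PENDING


def generate_jobs(query: str, files: List[str], n_jobs: int) -> List[Job]:
    """Submission step (assumed strategy): split files round-robin into jobs."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    n = min(n_jobs, len(files))  # never create jobs without files
    return [Job(i, query, files[i::n]) for i in range(n)]


class FakeBackend:
    """Hypothetical in-memory stand-in for a Grid middleware client."""

    def __init__(self, polls_until_done: int = 2) -> None:
        self.polls_until_done = polls_until_done
        self.polls: Dict[int, int] = {}

    def submit(self, job: Job) -> None:
        self.polls[job.job_id] = 0

    def poll(self, job: Job) -> Status:
        # Report RUNNING until the job has been polled enough times.
        self.polls[job.job_id] += 1
        done = self.polls[job.job_id] >= self.polls_until_done
        return Status.FINISHED if done else Status.RUNNING

    def download(self, job: Job) -> List[str]:
        # Pretend each file yields one result object.
        return [f"{job.query}@{name}" for name in job.files]


def babysit(jobs: List[Job], backend, poll_interval: float = 0.0,
            max_polls: int = 100) -> Dict[int, List[str]]:
    """Babysitter step: submit jobs, poll their status, download results."""
    for job in jobs:
        backend.submit(job)
        job.status = Status.RUNNING

    results: Dict[int, List[str]] = {}
    active = list(jobs)
    for _ in range(max_polls):
        if not active:
            break
        still_active = []
        for job in active:
            job.status = backend.poll(job)
            if job.status is Status.FINISHED:
                results[job.job_id] = backend.download(job)
            elif job.status is not Status.FAILED:
                still_active.append(job)  # keep watching unfinished jobs
        active = still_active
        if active:
            time.sleep(poll_interval)
    return results


if __name__ == "__main__":
    jobs = generate_jobs("q", ["a", "b", "c", "d", "e"], n_jobs=2)
    print(babysit(jobs, FakeBackend()))
```

In this sketch a failed job is simply dropped from the watch list with its status recorded; a real babysitter would likely resubmit or report it, which the abstract does not specify.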