Support for Data-Intensive, Variable-Granularity Grid Applications via Distributed File System Virtualization - A Case Study of Light Scattering Spectroscopy

  • Venue: CLADE '04 Proceedings of the 2nd International Workshop on Challenges of Large Applications in Distributed Environments
  • Year: 2004

Abstract

A key challenge faced by large-scale, distributed applications in Grid environments is efficient, seamless data management. In particular, for applications that can benefit from access to data at variable granularities, data management can pose additional programming burdens to an application developer. This paper presents a case for the use of virtualized distributed file systems as a basis for data management for data-intensive, variable-granularity applications. The approach leverages on-demand transfer mechanisms of existing, de facto network file system clients and servers that support transfers of partial data sets in an application-transparent fashion, and complements them with user-level performance and functionality enhancements such as caching and encrypted communication channels. The paper uses a nascent application from the medical imaging field (Light Scattering Spectroscopy - LSS) as a motivation for the approach and as a basis for evaluating its performance. Results from performance experiments that consider the 16-processor parallel execution of LSS analysis and database generation programs show that, in the presence of data locality, a virtualized wide-area distributed file system set up and configured by Grid middleware can achieve performance levels close (13% overhead or less) to that of a local disk, and superior (up to 680% speedup) to non-virtualized distributed file systems.