Distributed Management of Massive Data: An Efficient Fine-Grain Data Access Scheme

  • Authors: Bogdan Nicolae; Gabriel Antoniu; Luc Bougé
  • Affiliations: University of Rennes 1/IRISA, Rennes Cedex, France 35042; INRIA/IRISA, Rennes Cedex, France 35042; ENS Cachan Brittany/IRISA, Rennes Cedex, France 35042
  • Venue: High Performance Computing for Computational Science - VECPAR 2008
  • Year: 2008

Abstract

This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation on the Grid'5000 testbed provides promising results.
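
The access model summarized in the abstract can be illustrated with a small single-process sketch. This is not the authors' prototype: the class name BlobStore, the page size, the round-robin striping, and the hash-based choice of metadata node are all assumptions made for illustration. The sketch only shows the general idea of reading a byte range of a blob through a global identifier, with page locations looked up in a hashed metadata map and only the pages covering the range being fetched from in-memory providers.

```python
import hashlib

PAGE_SIZE = 4  # bytes per page; deliberately tiny so the example is easy to trace


class BlobStore:
    """Toy, single-process stand-in for a distributed blob store (hypothetical)."""

    def __init__(self, num_providers=3, num_metadata_nodes=2):
        # Each provider is a dict standing in for the RAM of one storage node.
        self.providers = [{} for _ in range(num_providers)]
        # Each metadata node holds part of the (blob_id, page) -> provider map.
        self.metadata_nodes = [{} for _ in range(num_metadata_nodes)]

    def _metadata_node(self, key):
        # DHT-style placement (assumed): hash the key to pick the responsible node.
        digest = hashlib.sha256(repr(key).encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.metadata_nodes)
        return self.metadata_nodes[index]

    def write(self, blob_id, data):
        # Split the blob into fixed-size pages and stripe them over the providers.
        for start in range(0, len(data), PAGE_SIZE):
            page = start // PAGE_SIZE
            key = (blob_id, page)
            provider = page % len(self.providers)  # round-robin striping (assumed)
            self.providers[provider][key] = data[start:start + PAGE_SIZE]
            self._metadata_node(key)[key] = provider

    def read(self, blob_id, offset, size):
        # Fine-grain read: touch only the pages that overlap [offset, offset + size).
        if size <= 0:
            return b""
        first = offset // PAGE_SIZE
        last = (offset + size - 1) // PAGE_SIZE
        chunks = []
        for page in range(first, last + 1):
            key = (blob_id, page)
            provider = self._metadata_node(key)[key]  # metadata lookup
            chunks.append(self.providers[provider][key])  # page fetch
        joined = b"".join(chunks)
        skip = offset - first * PAGE_SIZE
        return joined[skip:skip + size]
```

With this sketch, writing b"abcdefghijklmnop" under the identifier "blob-1" and then calling read("blob-1", 5, 6) returns b"fghijk" after fetching only two of the four pages. The caller never names a storage node, which is the sense in which access is transparent.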