Scalable community-driven data sharing in e-science grids

  • Authors:
  • Tobias Scholl; Bernhard Bauer; Benjamin Gufler; Richard Kuntschke; Angelika Reiser; Alfons Kemper

  • Affiliations:
  • Institut für Informatik, Technische Universität München, 85748 Garching bei München, Germany (all authors)

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2009

Abstract

E-science projects across many disciplines face a fundamental challenge: thousands of users want to obtain new scientific results through application-specific, dynamic correlation of data from globally distributed sources. Given the enormous and exponentially growing data volumes involved, centralized data management reaches its limits. Because scientific data are often highly skewed and exploration tasks exhibit a large degree of spatial locality, we propose the locality-aware allocation of data objects onto a distributed network of interoperating databases. HiSbase is an approach to data management in federated scientific Data Grids that addresses the scalability issue by combining established techniques from database research, namely spatial data structures (quadtrees), histograms, and parallel databases, with the scalable resource sharing and load balancing capabilities of decentralized Peer-to-Peer (P2P) networks. The proposed combination constitutes a complementary e-science infrastructure that enables load balancing and increased query throughput.
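
The idea of locality-aware allocation over skewed data can be illustrated with a minimal sketch. It is not the HiSbase implementation: the names `Region`, `build_regions`, and `locate` are hypothetical, the data are assumed to be 2D points, and the rule "always split the region holding the most sample points" is an assumed stand-in for a histogram-driven quadtree. Each resulting region would then be assigned to one database node, so dense areas of the data space are covered by many small regions and sparse areas by a few large ones.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Region:
    """Half-open box [x0, x1) x [y0, y1) plus the sample points inside it."""
    x0: float
    y0: float
    x1: float
    y1: float
    points: List[Point] = field(default_factory=list)

    def contains(self, p: Point) -> bool:
        return self.x0 <= p[0] < self.x1 and self.y0 <= p[1] < self.y1

    def split(self) -> List["Region"]:
        """Quadtree step: cut the box into four equal quadrants."""
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        quads = [
            Region(self.x0, self.y0, mx, my),
            Region(mx, self.y0, self.x1, my),
            Region(self.x0, my, mx, self.y1),
            Region(mx, my, self.x1, self.y1),
        ]
        # Hand every sample point to the one quadrant that contains it.
        for p in self.points:
            next(q for q in quads if q.contains(p)).points.append(p)
        return quads


def build_regions(sample: List[Point],
                  bounds: Tuple[float, float, float, float],
                  max_regions: int) -> List[Region]:
    """Assumed heuristic: keep splitting the most populated leaf region."""
    root = Region(*bounds)
    root.points = [p for p in sample if root.contains(p)]
    leaves = [root]
    # Each split replaces one leaf with four, a net gain of three regions.
    while len(leaves) + 3 <= max_regions:
        i = max(range(len(leaves)), key=lambda k: len(leaves[k].points))
        if len(leaves[i].points) < 2:
            break  # nothing left worth splitting
        leaves.extend(leaves.pop(i).split())
    return leaves


def locate(leaves: List[Region], p: Point) -> int:
    """Index of the region holding p, i.e. the partition a node would serve."""
    for i, region in enumerate(leaves):
        if region.contains(p):
            return i
    raise ValueError("point lies outside the partitioned domain")
```

With a sample that is concentrated in one corner of the unit square, `build_regions(sample, (0.0, 0.0, 1.0, 1.0), 7)` refines only the quadrant holding the cluster, and `locate` routes a query point to the region (and hence the node) responsible for it. A real system would additionally have to map regions onto peers and rebalance as data and load change, which this sketch leaves out.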