DIMM: a distributed metadata management for data-intensive HPC environments

  • Authors:
  • Brandon Szeliga, John Cavicchio, Weisong Shi

  • Affiliations:
  • Wayne State University, Detroit, MI, USA (all authors)

  • Venue:
  • DADC '08: Proceedings of the 2008 International Workshop on Data-Aware Distributed Computing
  • Year:
  • 2008

Abstract

Within the scientific community, many high-performance applications are used to run experiments on data sets. These data sets can be very large in size or in number, and either situation can burden the centralized manager that schedules the system. In our approach we minimize the role of the manager by using a distributed hash table, so that every file has a designated "home" location where it resides when in use, which reduces location maintenance. We further reduce the strain on the central manager by using a counter-based Bloom filter, which lets the manager quickly check whether a given data set exists in the system without the large storage footprint of a database. By adding false-positive detection to the Bloom filter in the form of locality checks, we reduce the probability that a false positive causes a problem in our system. In this fashion we move away from a centralized manager toward a more distributed one while reducing the amount of additional metadata the manager must maintain. With our approach we show that the workload of the centralized manager directory is reduced significantly. We reduce the number of files that must be migrated into the system by up to nearly 90% compared to centralized storage-aware scheduling, and we can achieve higher hit rates than a standard cache when job placement is influenced by data location. We also show that a significant fraction of false positives (at least 25%) can be detected by the Bloom filter, at the cost of allowing false negatives to occur at a small but increased rate.
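
The abstract describes two mechanisms: hashing each file to a fixed "home" node, and a counter-based Bloom filter whose positive answers are verified with a locality check at that home node. The paper does not publish code, so the Python sketch below is only an illustration of those ideas under assumed names (HOME_NODES, CountingBloomFilter, and dataset_in_system are all hypothetical); it is not the authors' implementation.

```python
import hashlib

# Hypothetical list of storage nodes; in DIMM these would be the nodes of the
# distributed hash table (an assumption for illustration, not the paper's setup).
HOME_NODES = ["node-0", "node-1", "node-2", "node-3"]

def home_node(filename: str) -> str:
    """Map a file name to its fixed 'home' node by hashing the name."""
    digest = int(hashlib.sha1(filename.encode()).hexdigest(), 16)
    return HOME_NODES[digest % len(HOME_NODES)]

class CountingBloomFilter:
    """Counter-based Bloom filter: counters instead of bits, so entries can be removed."""

    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, key: str):
        # Derive the k hash positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            h = hashlib.sha1(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.size

    def add(self, key: str) -> None:
        for idx in self._indexes(key):
            self.counters[idx] += 1

    def remove(self, key: str) -> None:
        for idx in self._indexes(key):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def might_contain(self, key: str) -> bool:
        # True means "possibly present" (false positives are possible);
        # False means "definitely absent" for a standard Bloom filter.
        return all(self.counters[idx] > 0 for idx in self._indexes(key))

def dataset_in_system(filename, bloom, node_contents):
    """Membership test with a locality check to catch Bloom-filter false positives.

    node_contents maps a node name to the set of files actually stored there,
    a stand-in for asking the file's home node directly.
    """
    if not bloom.might_contain(filename):
        return False                        # filter says "definitely not here"
    node = home_node(filename)
    return filename in node_contents.get(node, set())  # locality check

if __name__ == "__main__":
    bloom = CountingBloomFilter()
    node_contents = {n: set() for n in HOME_NODES}

    f = "sim_output_001.dat"
    bloom.add(f)
    node_contents[home_node(f)].add(f)

    print(dataset_in_system(f, bloom, node_contents))              # True
    print(dataset_in_system("missing.dat", bloom, node_contents))  # False
```

Using counters rather than single bits is what allows a file's entry to be removed from the filter when it leaves the system, and the locality check converts a Bloom-filter "maybe" into a definite answer at the cost of a single query to the file's home node.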