Network-aware data caching and prefetching for cloud-hosted metadata retrieval

  • Authors:
  • Bing Zhang; Brandon Ross; Sanatkumar Tripathi; Sonali Batra; Tevfik Kosar

  • Affiliations:
  • University at Buffalo (SUNY), Buffalo, New York (all authors)

  • Venue:
  • NDM '13 Proceedings of the Third International Workshop on Network-Aware Data Management
  • Year:
  • 2013

Abstract

With the overwhelming emergence of data-intensive applications in the Cloud, the wide-area transfer of metadata and other descriptive information about remote data is critically important for searching, indexing, and enumerating remote file system hierarchies, as well as for purposes of data transfer estimation and reservation. In this paper, we present a highly efficient network-aware caching and prefetching mechanism tailored to reduce metadata access latency and improve responsiveness in wide-area data transfers. To improve the maximum requests per second (RPS) handled by the system, we designed and implemented a network-aware prefetching service using dynamically provisioned parallel TCP streams. To improve the performance of accessing local metadata, we designed and implemented a non-blocking concurrent in-memory cache to handle unexpected bursts of requests. We have implemented the proposed mechanisms in the Directory Listing Service (DLS), a Cloud-hosted metadata retrieval, caching, and prefetching system, and have evaluated its performance on Amazon EC2 and XSEDE.