I/O streaming evaluation of batch queries for data-intensive computational turbulence

  • Authors:
  • Kalin Kanov; Eric Perlman; Randal Burns; Yanif Ahmad; Alexander Szalay

  • Affiliations:
  • Johns Hopkins University, Baltimore, Maryland (all authors)

  • Venue:
  • Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2011

Abstract

We describe a method for evaluating computational turbulence queries, including Lagrange polynomial interpolation, based on partial sums, which allows the underlying data to be accessed in any order and in parts. We exploit these properties to stream data from disk in a single pass and evaluate batch queries concurrently. The combination of sequential I/O and data sharing improves performance by an order of magnitude compared with direct evaluation of each query. The technique also supports distributed evaluation of queries in a database cluster, with the partial sums from each node assembled at the query mediator. Interpolation is fundamental to computational turbulence: over 95% of queries use these routines. The partial sums method allows the JHU Turbulence Database Cluster to deliver the scale and throughput needed for our scientists' data-intensive workloads.
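
The partial-sums idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional, unit-spaced grid and a 4-point Lagrange stencil, and the names lagrange_weights, stencil, and stream_batch are hypothetical, chosen only for this example. Each interpolated value is a weighted sum of grid values, so the terms can be accumulated chunk by chunk, in whatever order the chunks arrive, while one pass over the data serves every query in the batch.

```python
from collections import defaultdict


def lagrange_weights(x, nodes):
    """Lagrange basis weights l_i(x) for the given stencil node positions."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights


def stencil(x, n_points, order):
    """Indices of the `order` grid points around x (assumed unit-spaced grid)."""
    start = int(x) - order // 2 + 1
    start = max(0, min(start, n_points - order))  # clamp at the grid edges
    return list(range(start, start + order))


def stream_batch(chunks, queries, n_points, order=4):
    """Evaluate all queries in one pass over (offset, values) chunks."""
    # For each grid index, record which queries need it and with what weight.
    needs = defaultdict(list)
    for q, x in enumerate(queries):
        idx = stencil(x, n_points, order)
        for i, w in zip(idx, lagrange_weights(x, idx)):
            needs[i].append((q, w))

    # Accumulate partial sums; chunks may arrive in any order.
    partial = [0.0] * len(queries)
    for offset, values in chunks:
        for k, v in enumerate(values):
            for q, w in needs.get(offset + k, ()):
                partial[q] += w * v
    return partial


# Sample data: f(i) = i^3 on 16 grid points, split into out-of-order chunks.
data = [float(i ** 3) for i in range(16)]
chunks = [(8, data[8:12]), (0, data[0:4]), (12, data[12:16]), (4, data[4:8])]
queries = [2.5, 7.25, 7.5, 12.9]
result = stream_batch(chunks, queries, n_points=16)
```

In this sketch the queries at 7.25 and 7.5 share grid points, and their stencils straddle two chunks, so their values are complete only after both chunks have been read. Because addition is associative, the same accumulation could in principle be split across nodes, with each node returning its partial sums to be added at a mediator.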