A parallel input-output system for resolving spatial data challenges: an agent-based model case study

  • Authors: Eric Shook; Shaowen Wang
  • Affiliations: University of Illinois at Urbana-Champaign, Urbana, IL; University of Illinois at Urbana-Champaign, Urbana, IL
  • Venue: Proceedings of the ACM SIGSPATIAL Second International Workshop on High Performance and Distributed Geographic Information Systems
  • Year: 2011

Abstract

With recent advances in data collection technologies such as remote sensing and global positioning systems, the amount of spatial data being produced has been increasing at a staggering rate. Simultaneously, computing is shifting from single-core to multi-core processors. To effectively utilize the computational power afforded by this new generation of processors for data-intensive geospatial applications, parallel computing techniques need to be employed. Parallel computing, however, raises new challenges associated with handling the input and output of spatial data in parallel. This paper describes a Parallel Input/Output System (PIOS) that addresses the challenges of handling large amounts of diverse spatial data. PIOS is based on a hierarchical structure that uses a scalable file partitioning strategy and combines data and metadata to enable efficient handling of terabyte-scale data sets in parallel. A spatially explicit agent-based model is developed as a case study. Computational experiments were conducted on a supercomputer supported by the National Science Foundation. PIOS achieved a tenfold speedup in parallel input/output time, and was demonstrated to scale efficiently to over one thousand processing cores and to handle multiple terabytes of data.
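
The abstract does not give the details of PIOS, so the following is only an illustrative sketch of the general idea it names: partitioning a spatial data set across many files arranged in a directory hierarchy, with each file carrying its own metadata alongside its data. Everything here is an assumption made for illustration, including the row-band partitioning, the `root/dNNNN/partNNNNNN.bin` layout, the JSON header, and all function names; none of it is taken from the paper. In a real parallel run each process would write and read its own part, whereas this sketch loops over the parts serially.

```python
import json
import os
import struct
import tempfile


def partition_rows(n_rows, n_parts):
    """Split n_rows into n_parts contiguous (start, stop) row bands."""
    base, extra = divmod(n_rows, n_parts)
    bands, start = [], 0
    for p in range(n_parts):
        # The first `extra` bands each take one leftover row.
        stop = start + base + (1 if p < extra else 0)
        bands.append((start, stop))
        start = stop
    return bands


def part_path(root, part_id, files_per_dir=2):
    """Hypothetical two-level layout that caps the number of files per directory."""
    subdir = f"d{part_id // files_per_dir:04d}"
    return os.path.join(root, subdir, f"part{part_id:06d}.bin")


def write_part(path, meta, values):
    """Write a self-describing file: header length, JSON metadata, float64 data."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    header = json.dumps(meta).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))
        f.write(header)
        f.write(struct.pack(f"<{len(values)}d", *values))


def read_part(path):
    """Read the metadata first, then use it to decode the data that follows."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<I", f.read(4))
        meta = json.loads(f.read(header_len).decode("utf-8"))
        count = meta["rows"] * meta["cols"]
        values = list(struct.unpack(f"<{count}d", f.read(8 * count)))
    return meta, values


def demo(n_rows=5, n_cols=3, n_parts=4):
    """Partition a small grid, write the parts, read them back, and compare."""
    grid = [[float(r * n_cols + c) for c in range(n_cols)] for r in range(n_rows)]
    with tempfile.TemporaryDirectory() as root:
        for part_id, (start, stop) in enumerate(partition_rows(n_rows, n_parts)):
            meta = {"row_start": start, "rows": stop - start, "cols": n_cols}
            flat = [v for row in grid[start:stop] for v in row]
            write_part(part_path(root, part_id), meta, flat)
        rebuilt = []
        for part_id in range(n_parts):
            meta, values = read_part(part_path(root, part_id))
            for i in range(meta["rows"]):
                rebuilt.append(values[i * n_cols:(i + 1) * n_cols])
    return rebuilt == grid
```

Calling `demo()` writes four part files into a temporary directory and checks that reading them back reproduces the original grid. Because each file stores its own extent in the header, a reader can locate and decode its part without consulting a central index, which is one plausible reason to combine data and metadata in this way.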