With recent advances in data collection technologies such as remote sensing and global positioning systems, the amount of spatial data being produced has been increasing at a staggering rate. At the same time, computing is shifting from single-core to multi-core processors. To use the computational power of this new generation of processors effectively for data-intensive geospatial applications, parallel computing techniques need to be employed. Parallel computing, however, raises new challenges in handling the input and output of spatial data in parallel. This paper describes a Parallel Input/Output System (PIOS) that addresses the challenges of handling large amounts of diverse spatial data. PIOS is based on a hierarchical structure that uses a scalable file partitioning strategy and combines data and metadata to enable efficient parallel handling of terabyte-scale data sets. A spatially explicit agent-based model is developed as a case study. Computational experiments were conducted on a supercomputer supported by the National Science Foundation. PIOS achieved a tenfold speedup in parallel input/output time and was shown to scale efficiently to over one thousand processing cores and to handle multiple terabytes of data.
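The abstract does not spell out the partitioning scheme, so the following is only a minimal sketch of the general idea of pairing a file partitioning strategy with metadata. It assumes a regular raster grid split into rectangular blocks, one file per block, with a small metadata record describing each block's extent. The names `Block`, `partition_grid`, and `assign_blocks`, the file naming pattern, and the round-robin assignment are all hypothetical and are not taken from PIOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Metadata for one partition of the grid (hypothetical layout)."""
    block_id: int
    row_start: int
    row_stop: int   # exclusive
    col_start: int
    col_stop: int   # exclusive
    filename: str   # file that would hold this block's data


def partition_grid(n_rows, n_cols, block_rows, block_cols):
    """Split an n_rows x n_cols grid into rectangular blocks.

    Blocks on the bottom and right edges are clipped to the grid extent.
    """
    blocks = []
    for r in range(0, n_rows, block_rows):
        for c in range(0, n_cols, block_cols):
            block_id = len(blocks)
            blocks.append(Block(
                block_id=block_id,
                row_start=r,
                row_stop=min(r + block_rows, n_rows),
                col_start=c,
                col_stop=min(c + block_cols, n_cols),
                filename=f"block_{block_id:05d}.bin",
            ))
    return blocks


def assign_blocks(blocks, n_ranks):
    """Assign blocks to processes round-robin (an assumed policy)."""
    return {rank: blocks[rank::n_ranks] for rank in range(n_ranks)}


if __name__ == "__main__":
    blocks = partition_grid(10, 10, 4, 4)
    owners = assign_blocks(blocks, 4)
    for rank, owned in owners.items():
        print(rank, [b.filename for b in owned])
```

In a layout like this, each process can consult the metadata to find the files covering its region and read only those, instead of every process scanning one large shared file.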