High-performance parallel computing infrastructures, such as computing clusters, have recently become freely available to scientific researchers, who can use them to solve problems of unprecedented scale through data parallelization. However, scientists are not necessarily skilled in writing efficient parallel code, especially when dealing with spatial datasets. Two important performance issues are heavy I/O costs and communication overhead. To address these issues, we are developing a scheme that helps scientists realize I/O-friendly and scalable data parallelization for spatial computation. Built upon our iteration-aware spatial prefetching and caching techniques, this data parallelization scheme takes an explicit specification of data dependency, identifies the best feasible access patterns while applying a set of I/O efficiency rules, and then wraps them in separate spatial data iterators for efficient cache loading and data partitioning, respectively. The scheme prioritizes, yet reconciles, the I/O costs in the different stages of a data-intensive cluster application to achieve the best overall I/O performance while maintaining fair computational scalability.
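As a rough illustration of the idea of separate iterators for data partitioning and cache loading, the following minimal Python sketch may help. It is not the scheme described above: it assumes a row-major 2D grid, a data dependency that reaches a fixed number of neighboring rows (`halo`), and contiguous row blocks as the access pattern. All names (`Block`, `partition_rows`, `load_range`, `iter_chunks`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Block:
    """Half-open row range [start, stop) of a row-major 2D grid (hypothetical type)."""
    start: int
    stop: int


def partition_rows(n_rows: int, n_workers: int) -> List[Block]:
    """Partitioning step: split rows into contiguous, near-equal blocks, one per worker."""
    base, extra = divmod(n_rows, n_workers)
    blocks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)  # first `extra` workers get one more row
        blocks.append(Block(start, start + size))
        start += size
    return blocks


def load_range(block: Block, halo: int, n_rows: int) -> Block:
    """Rows a worker must read: its block plus `halo` dependency rows, clipped to the grid."""
    return Block(max(0, block.start - halo), min(n_rows, block.stop + halo))


def iter_chunks(block: Block, chunk_rows: int) -> Iterator[Block]:
    """Cache-loading step: walk a block in chunks of consecutive rows (assumed cache size)."""
    for s in range(block.start, block.stop, chunk_rows):
        yield Block(s, min(s + chunk_rows, block.stop))


if __name__ == "__main__":
    n_rows, halo = 10, 1
    for worker, owned in enumerate(partition_rows(n_rows, 3)):
        needed = load_range(owned, halo, n_rows)
        print(worker, owned, list(iter_chunks(needed, chunk_rows=2)))
```

Contiguous row blocks are used here only because they map to sequential reads in a row-major file layout; the actual scheme selects access patterns from the declared data dependency and its I/O efficiency rules, which this sketch does not model.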