An accurate cost model that accounts for dataset size and structure can help optimize geoscience data analysis. We develop and apply a computational model to estimate data analysis costs for arithmetic operations on gridded datasets typical of satellite or climate-model origin. For these dataset geometries, our model predicts data reduction scalings that agree with measurements of widely used geoscience data processing software, the netCDF Operators (NCO). I/O performance and library design dominate throughput for simple analysis (e.g., dataset differencing). Dataset structure can reduce analysis throughput ten-fold relative to same-sized unstructured datasets. We demonstrate algorithmic optimizations that substantially increase throughput for more complex, arithmetic-dominated analysis such as weighted averaging of multi-dimensional data. These scaling properties can help to estimate costs of distribution strategies for data reduction in cluster and grid environments.
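As an illustration of the kind of arithmetic-dominated operation mentioned above, the following is a minimal sketch of weighted averaging over one dimension of a small gridded field. It is not NCO's implementation and does not reproduce the optimizations described in the paper. The function name `weighted_average`, the 3 x 2 grid, and the cosine-of-latitude weights are all assumptions chosen for illustration.

```python
import math


def weighted_average(field, weights):
    """Average a 2-D field over its first (row) dimension.

    field   : list of rows, each a list of floats (rows = latitude here)
    weights : one non-negative weight per row
    Returns one weighted mean per column.
    """
    if len(field) != len(weights):
        raise ValueError("need exactly one weight per row")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")

    n_cols = len(field[0])
    sums = [0.0] * n_cols
    # One multiply and one add per grid point, so the arithmetic cost
    # grows with the number of points in the field.
    for row, w in zip(field, weights):
        for j, value in enumerate(row):
            sums[j] += w * value
    return [s / total_weight for s in sums]


# Hypothetical 3 x 2 grid: three latitude rows, two longitude columns.
latitudes_deg = [-60.0, 0.0, 60.0]
# Cosine-of-latitude weights are an assumed, illustrative choice.
weights = [math.cos(math.radians(lat)) for lat in latitudes_deg]
field = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

column_means = weighted_average(field, weights)
```

Each grid point contributes one multiplication and one addition, which is why this kind of analysis becomes arithmetic-dominated as the field grows, in contrast to simple differencing, where reading and writing the data accounts for most of the time.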