Algebraic manipulation of scientific datasets

  • Authors:
  • Bill Howe;David Maier

  • Affiliations:
  • OGI School of Science & Engineering at, Oregon Health & Science University, Beaverton, Oregon;OGI School of Science & Engineering at, Oregon Health & Science University, Beaverton, Oregon

  • Venue:
  • VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We investigate algebraic processing strategies for large numeric datasets equipped with a possibly irregular grid structure. Such datasets arise, for example, in computational simulations, observation networks, medical imaging, and 2-D and 3-D rendering. Existing approaches for manipulating these datasets are incomplete: The performance of SQL queries for manipulating large numeric datasets is not competitive with specialized tools. Database extensions for processing multidimensional discrete data can only model regular, rectilinear grids. Visualization software libraries are designed to process gridded datasets efficiently, but no algebra has been developed to simplify their use and afford optimization. Further, these libraries are data dependent - physical changes to data representation or organization break user programs. In this paper, we present an algebra of grid-fields for manipulating both regular and irregular gridded datasets, algebraic optimization techniques, and an implementation backed by experimental results. We compare our techniques to those of spatial databases and visualization software libraries, using real examples from an Environmental Observation and Forecasting System. We find that our approach can express optimized plans inaccessible to other techniques, resulting in improved performance with reduced programming effort.