Towards scalable ad-hoc climate anomalies search

  • Authors:
  • Peter Baumann;Dimitar Misev

  • Affiliations:
  • rasdaman GmbH, Bremen, Germany;rasdaman GmbH, Bremen, Germany

  • Venue:
  • Proceedings of the 1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data
  • Year:
  • 2012

Abstract

Meteorological data contribute significantly to "Big Data". Their volume, which reaches petabyte sizes for single objects, is one challenge; the number of dimensions is another, since such general 4-D spatio-temporal data cannot be handled with traditional GIS methods and tools. Climate data in fact tend to go beyond these four dimensions by adding a second time dimension for the simulation run time, resulting in 5-D data cubes. Traditional databases, known for their flexibility and scalability, have proven inadequate because they lack support for multi-dimensional rasters. Consequently, file-based implementations rather than databases are used to serve such data to the community. This limitation has recently been overcome by Array Databases, which provide storage and query support for multi-dimensional rasters and thereby bring the scalability and flexibility advantages of databases to climate data management. In this contribution, we present a case study in which non-trivial analytics functionality on n-D climate data cubes has been established. Storage optimization techniques not found in standard databases allow the system to be tuned for interactive response in many cases. We briefly introduce the rasdaman database system used, present the database schema and practically important query use cases, and report preliminary performance observations. To the best of our knowledge, this is the first non-academic, real-life deployment of an array database for data sets of up to 5-D.
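
To make the notion of a 5-D data cube and an anomaly search over it concrete, here is a minimal sketch in Python with NumPy. It is not the rasdaman query language and not the authors' method. The axis order (simulation run, time, height level, latitude, longitude), the function name find_anomalies, the constant 280.0 baseline, and the threshold are all assumptions made for illustration. An anomaly is taken here to be a cell that deviates from the temporal mean of its own series by more than a given threshold.

```python
import numpy as np

# Hypothetical axis order for the 5-D cube: (run, time, level, lat, lon).
# The shape and the constant baseline value are illustrative assumptions only.
cube = np.full((2, 12, 3, 4, 5), 280.0)

# Inject one artificial warm spike so that there is something to find.
cube[1, 7, 0, 2, 3] += 25.0


def find_anomalies(data, threshold):
    """Return the index tuples of cells whose value deviates from the
    temporal mean of their own series by more than `threshold`."""
    # Average over the time axis (axis 1); keepdims allows broadcasting back.
    baseline = data.mean(axis=1, keepdims=True)
    deviation = data - baseline
    # Each row of the result is one (run, time, level, lat, lon) index.
    return np.argwhere(np.abs(deviation) > threshold)


hits = find_anomalies(cube, threshold=15.0)
print(hits)
```

An array database would evaluate a comparable expression server-side over stored, tiled arrays, whereas this sketch holds the whole cube in memory.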