Time and space optimization for processing groups of multi-dimensional scientific queries

  • Authors:
  • Suresh Aryangat;Henrique Andrade;Alan Sussman

  • Affiliations:
  • University of Maryland, College Park, MD;University of Maryland, College Park, MD;University of Maryland, College Park, MD

  • Venue:
  • Proceedings of the 18th annual international conference on Supercomputing
  • Year:
  • 2004

Abstract

Data analysis applications in areas as diverse as remote sensing and telepathology require operating on and processing very large datasets. For such applications to execute efficiently, careful attention must be paid to the storage, retrieval, and manipulation of the datasets. This paper addresses the optimizations performed by a high performance database system that processes groups of data analysis requests, which we call queries, for these applications. The system performs end-to-end processing of the requests, formulated as PostgreSQL declarative queries. The queries are converted into imperative descriptions, multiple imperative descriptions are merged into a single execution plan, the plan is optimized to decrease execution time via common compiler optimization techniques, and, finally, the plan is optimized to decrease memory consumption. The last two steps are experimentally shown to effectively reduce the amount of time required while conserving memory space as a group of queries is processed by the database.