dQUOB: Managing Large Data Flows Using Dynamic Embedded Queries

  • Authors:
  • Beth Plale; Karsten Schwan

  • Venue:
  • HPDC '00 Proceedings of the 9th IEEE International Symposium on High Performance Distributed Computing
  • Year:
  • 2000

Abstract

The dQUOB system satisfies clients' need for specific information from high-volume data streams. The data streams we consider are the flows of data that arise during large-scale visualizations, video streaming to large numbers of distributed users, and high-volume business transactions. We introduce the notion of conceptualizing a data stream as a set of relational database tables, so that a scientist can request information with an SQL-like query. Transformation or computation that often needs to be performed on the data en route can be conceptualized as computation performed on consecutive views of the data, with computation associated with each view. The dQUOB system moves the query into the data stream as compiled code, called a quoblet. The relational data model has the significant advantage of presenting opportunities for efficient re-optimization of queries and sets of queries.

Using examples from global atmospheric modeling, we illustrate the usefulness of the dQUOB system. We carry the examples through the experiments to establish, with a baseline benchmark, the viability of the approach for high-performance computing. We define a cost metric of end-to-end latency that can be used to determine realistic cases where optimization should be applied. Finally, we show that end-to-end latency can be controlled through a probability, assigned to each query, that the query will evaluate to true.
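The idea of treating a data stream as relational rows and filtering it with an SQL-like predicate can be illustrated with a minimal Python sketch. This is not the dQUOB implementation or its query language: the `Event` fields, the `select_where` and `selectivity` helpers, and the sample values are all hypothetical, chosen only to show a predicate embedded in the stream and the fraction of events for which it evaluates to true.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Event:
    """One stream record, viewed as a row of a hypothetical table."""
    layer: int        # hypothetical atmospheric layer index
    latitude: float   # hypothetical coordinate
    ozone: float      # hypothetical measured quantity


def select_where(stream: Iterable[Event],
                 predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Lazily forward only the rows that satisfy the predicate,
    roughly like SELECT * FROM stream WHERE predicate."""
    for event in stream:
        if predicate(event):
            yield event


def selectivity(events: List[Event],
                predicate: Callable[[Event], bool]) -> float:
    """Fraction of rows for which the predicate evaluates to true."""
    if not events:
        return 0.0
    return sum(1 for e in events if predicate(e)) / len(events)


def high_ozone_layer2(e: Event) -> bool:
    """Hypothetical query: layer 2 rows with ozone above 0.5."""
    return e.layer == 2 and e.ozone > 0.5


# Made-up sample data, for illustration only.
sample = [
    Event(layer=1, latitude=10.0, ozone=0.2),
    Event(layer=2, latitude=45.0, ozone=0.7),
    Event(layer=2, latitude=60.0, ozone=0.9),
    Event(layer=3, latitude=-30.0, ozone=0.4),
]

forwarded = list(select_where(sample, high_ozone_layer2))
fraction_true = selectivity(sample, high_ozone_layer2)
```

In this sketch, a predicate that is true for fewer events forwards less data downstream, which gives an intuition for why the probability that a query evaluates to true would bear on end-to-end latency. The actual cost metric is the one defined in the paper, not this fraction.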