Cooperating SQL Dataflow Processes for In-DB Analytics

  • Authors: Qiming Chen; Meichun Hsu
  • Affiliations: HP Labs, Hewlett Packard Co., Palo Alto, USA; HP Labs, Hewlett Packard Co., Palo Alto, USA
  • Venue: OTM '09 Proceedings of the Confederated International Conferences, CoopIS, DOA, IS, and ODBASE 2009 on On the Move to Meaningful Internet Systems: Part I
  • Year: 2009

Abstract

Pushing data-intensive analytics down to the database engine is key to high-performance and secure execution. However, the existing SQL framework cannot express general graph-based dataflow processes, nor can it orchestrate multiple dataflow processes whose operations have data dependencies on one another. In this work we extend SQL to Functional Form-SQL (FF-SQL), based on a calculus of queries, to declaratively express complex dataflow graphs. An FF-SQL query is constructed from conventional queries using Function Forms (FFs). While a conventional SQL query represents a dataflow tree, an FF-SQL query represents a more general dataflow graph. Further, with FF-SQL, a group of SQL dataflow processes with data dependencies among their operations can be specified as a single, integrated FF-SQL definition and executed cooperatively inside the database engine, without repeated data retrieval, duplicated computation, or unnecessary data copying. We present a novel extension to the PostgreSQL query engine that supports FF-SQL dataflow processes.
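
To make the tree-versus-graph distinction concrete, the following is a minimal sketch in Python using the standard sqlite3 module. It is not FF-SQL, whose syntax the abstract does not give, and it does not use the authors' PostgreSQL extension. The table `readings`, the temporary table `per_sensor`, and the function `run_dataflow_graph` are hypothetical names chosen for illustration. The sketch emulates a graph-shaped dataflow in which one upstream query result is consumed by two downstream queries. It does so by materializing the shared result in a temporary table and issuing two separate queries from the client, which is the kind of data copying and repeated retrieval that the abstract says FF-SQL is designed to avoid.

```python
import sqlite3


def run_dataflow_graph(rows):
    """Emulate a dataflow graph: one upstream node feeding two consumers.

    `rows` is a list of (sensor, value) tuples (hypothetical sample data).
    Returns (sensor with the highest average, mean of per-sensor averages).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

    # Upstream node: computed once and materialized so that both
    # downstream queries can read it without recomputing the aggregate.
    conn.execute(
        "CREATE TEMP TABLE per_sensor AS "
        "SELECT sensor, AVG(value) AS avg_value "
        "FROM readings GROUP BY sensor"
    )

    # Downstream node 1: the sensor with the highest average.
    top_sensor = conn.execute(
        "SELECT sensor FROM per_sensor ORDER BY avg_value DESC LIMIT 1"
    ).fetchone()[0]

    # Downstream node 2: the mean of the per-sensor averages.
    overall = conn.execute(
        "SELECT AVG(avg_value) FROM per_sensor"
    ).fetchone()[0]

    conn.close()
    return top_sensor, overall
```

Calling `run_dataflow_graph([("a", 1.0), ("a", 3.0), ("b", 10.0)])` runs the shared aggregate once and feeds it to both consumers. The fan-out from `per_sensor` to two results is what makes the flow a graph rather than a tree.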