Data-Continuous SQL Process Model

Authors:
Qiming Chen;Meichun Hsu
Affiliations:
HP Labs, Hewlett Packard Co, Palo Alto, USA;HP Labs, Hewlett Packard Co, Palo Alto, USA
Venue:
OTM '08 Proceedings of the OTM 2008 Confederated International Conferences, CoopIS, DOA, GADA, IS, and ODBASE 2008. Part I on On the Move to Meaningful Internet Systems:
Year:
2008

Abstract

Motivated by automating enterprise information derivation processes, we propose a new kind of business process, the Data-Continuous SQL Process (DCSP), which is data-stream driven and continuously running. The basic operators of a DCSP are database User Defined Functions (UDFs). However, we introduce a special kind of UDF, the Relation Valued Function (RVF), with both input and return values specified as relations. An RVF represents a relational transformation and can be composed with other relational operators. We allow an RVF to be triggered repeatedly by stream inputs, timers or event-conditions. The sequence of executions generates a data stream. To capture such data continuation semantics we introduce the notion of station for hosting a continuously-executed RVF, and the notion of pipe as the FIFO stream container for asynchronous communication between stations. A station is specified with the triggering factors and the outgoing pipes. A pipe is strongly typed by a relation schema with a stream key for identifying its elements. As an abstract object, a pipe can be implemented as a queue or stream table. To allow a DCSP to be constructed from stations and pipes recursively, we introduce the notion of Data Continuous Query (DCQ): a query, with well defined data continuation semantics, applied to a stream data source, which may be a stream table, a station (via pipe) or, recursively, a DCQ. A DCQ itself can be treated as a station, meaning that stations can be constructed from existing ones recursively in terms of SQL. Based on these notions a DCSP is modeled as a graph of stations (nodes) and pipes (links) and represented by a set of correlated DCQs. Specifying DCSP in SQL allows us to take advantage of SQL in expressing relational transformations on the stream elements, and potentially in pushing DCSP execution down to the database layer for performance and scalability. The implementation issues based on parallel database technology are discussed.
The proposed approach represents a major shift in process management from one-time execution to data stream driven, open-ended execution, and an initial step in bringing BPM technology and database technology together under the data-continuation semantics.
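
The station and pipe notions from the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation, which is expressed in SQL and targets a parallel database engine. The names `Pipe`, `Station`, `run_pending` and `total_by_region`, and the example schemas, are hypothetical and chosen only for illustration. The sketch assumes a pipe backed by a queue and a station triggered by stream input alone; timers and event-conditions are left out.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple

Row = Dict[str, object]
Relation = List[Row]  # a relation is modeled here as a list of rows


@dataclass
class Pipe:
    """FIFO stream container, typed by a relation schema with a stream key."""
    schema: Tuple[str, ...]
    stream_key: str
    _queue: Deque[Relation] = field(default_factory=deque, init=False)

    def __post_init__(self) -> None:
        if self.stream_key not in self.schema:
            raise ValueError("stream key must be a column of the schema")

    def put(self, relation: Relation) -> None:
        # Enforce the schema so that the pipe stays strongly typed.
        for row in relation:
            if set(row) != set(self.schema):
                raise TypeError(f"row {row} does not match schema {self.schema}")
        self._queue.append(relation)

    def get(self) -> Relation:
        return self._queue.popleft()

    def __len__(self) -> int:
        return len(self._queue)


@dataclass
class Station:
    """Hosts an RVF that is re-executed for each element arriving on `source`."""
    rvf: Callable[[Relation], Relation]
    source: Pipe
    outgoing: List[Pipe]

    def run_pending(self) -> int:
        """Run the RVF once per queued input element; return the number of runs."""
        runs = 0
        while len(self.source) > 0:
            result = self.rvf(self.source.get())
            for pipe in self.outgoing:
                pipe.put(result)
            runs += 1
        return runs


def total_by_region(orders: Relation) -> Relation:
    """Hypothetical RVF: relation in, relation out (sum of amount per region)."""
    totals: Dict[Tuple[object, object], float] = {}
    for row in orders:
        key = (row["batch"], row["region"])
        totals[key] = totals.get(key, 0.0) + float(row["amount"])
    return [
        {"batch": batch, "region": region, "total": total}
        for (batch, region), total in sorted(totals.items())
    ]


if __name__ == "__main__":
    orders = Pipe(("batch", "region", "amount"), "batch")
    totals = Pipe(("batch", "region", "total"), "batch")
    station = Station(total_by_region, orders, [totals])

    orders.put([
        {"batch": 1, "region": "east", "amount": 10.0},
        {"batch": 1, "region": "west", "amount": 5.0},
    ])
    orders.put([{"batch": 2, "region": "west", "amount": 7.0}])

    station.run_pending()
    while len(totals) > 0:
        print(totals.get())
```

Because the output pipe of one station can serve as the source of another, stations in this sketch compose into a graph of nodes and links, which loosely mirrors how the abstract describes a DCSP.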