A recent trend in data stream processing is the use of advanced continuous queries (CQs) that reference non-streaming resources, such as relational data in databases and machine learning models. Because non-streaming resources can be shared among multiple systems, those systems may update the resources while a CQ is executing. As a consequence, CQs may reference resources inconsistently, leading to problems that range from inappropriate results to fatal system failures. In this paper, we address this inconsistency problem by introducing the concept of transaction processing into data stream processing. In the first part of the paper, we introduce the CQ-derived transaction, a concept that derives read-only transactions from CQs, and show that the inconsistency problem is solved by ensuring serializability of the derived transactions and the resource-updating transactions. To ensure serializability, we propose three CQ-processing strategies based on concurrency control techniques: a two-phase lock strategy, a snapshot strategy, and an optimistic strategy. An experimental study shows that our CQ-processing strategies guarantee proper results and that their performance is comparable to that of a conventional strategy, which can produce improper results. In the second part of the paper, we improve the performance of the proposed strategies from the viewpoint of operator scheduling. We observe a characteristic of the proposed strategies: operators may be re-evaluated to prevent non-serializable schedules, which causes performance degradation. We find that the number of operator re-evaluations depends on operator scheduling, and we propose a scheduling constraint that reduces re-evaluation. An experimental study shows the constraint's effectiveness: when the proposed constraint is added to operator scheduling, throughput increases up to 5.2 times compared with naïve scheduling without the constraint.
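To make the optimistic idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that each shared resource carries a version counter, and that a CQ evaluation is treated as a read-only transaction that is re-evaluated when validation finds that any resource it read has changed. The names `VersionedStore` and `run_optimistic` are hypothetical, and the sketch is single-threaded, so the concurrent update is simulated inside the evaluation callback.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class VersionedStore:
    """Hypothetical shared resource store: key -> (value, version)."""
    data: Dict[str, Tuple[object, int]] = field(default_factory=dict)

    def write(self, key: str, value: object) -> None:
        # Every update bumps the version so that readers can detect it.
        _, version = self.data.get(key, (None, 0))
        self.data[key] = (value, version + 1)

    def read(self, key: str) -> Tuple[object, int]:
        return self.data[key]


def run_optimistic(store: VersionedStore,
                   keys: List[str],
                   evaluate: Callable[[Dict[str, object]], object],
                   max_retries: int = 10):
    """Run `evaluate` over `keys` as one read-only transaction (assumed design).

    Read phase: record the value and version of every referenced resource.
    Validation phase: if any version changed in the meantime, discard the
    result and re-evaluate; otherwise return the result.
    """
    for attempt in range(1, max_retries + 1):
        read_set = {k: store.read(k) for k in keys}
        result = evaluate({k: value for k, (value, _) in read_set.items()})
        unchanged = all(store.read(k)[1] == version
                        for k, (_, version) in read_set.items())
        if unchanged:
            return result, attempt
    raise RuntimeError("validation failed on every attempt")


if __name__ == "__main__":
    store = VersionedStore()
    store.write("rate", 2)
    store.write("offset", 10)
    calls = {"n": 0}

    def evaluate(snapshot):
        calls["n"] += 1
        if calls["n"] == 1:
            store.write("rate", 3)  # simulated update from another system
        return snapshot["rate"] * 5 + snapshot["offset"]

    print(run_optimistic(store, ["rate", "offset"], evaluate))
```

In this sketch the first evaluation is discarded because `rate` is updated while it runs, which illustrates the kind of re-evaluation cost described above. A real multi-threaded system would additionally need the validation step to be atomic with respect to writers.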