Multi-way stream joins with expensive join predicates pose a significant challenge for real-time (or near-real-time) stream processing. Because such join queries are both memory- and CPU-intensive, they must be processed scalably on a cluster. This paper proposes a novel scheme for distributed processing of generic multi-way joins with window constraints, called Pipelined State Partitioning (PSP). We target generic joins with arbitrary join conditions, which are used in non-trivial stream applications such as image matching and biometric recognition. The PSP scheme partitions the join states into disjoint slices in the time domain and then distributes these fine-grained states across the cluster, forming a virtual computation ring. Compared to replication-based distribution of non-equi-joins, the PSP scheme offers three advantages: (1) zero state duplication and thus no repeated computations, (2) pipelined processing of every input tuple on multiple nodes to achieve low response time, and (3) cost-based adaptive workload distribution. We have implemented the proposed PSP scheme within the CAPE DSMS. Our experimental study demonstrates significant performance improvements over state-of-the-art generic distributed stream join algorithms.
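The time-domain slicing idea can be illustrated with a minimal sketch. This is not the PSP implementation from the paper: it assumes a two-way join rather than a multi-way one, runs the nodes sequentially in one process rather than pipelined on a cluster, models a simple chain rather than a ring, and omits cost-based load distribution. All names (`SliceNode`, `make_chain`, `process`, `naive_join`) are hypothetical. Each node keeps only the state tuples whose age falls in its slice of the window, so no tuple is stored twice, and every arriving tuple visits the nodes in order to probe each slice.

```python
from collections import deque


class SliceNode:
    """Hypothetical node holding state tuples whose age is below `hi`."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        # Per-stream state as (timestamp, value), oldest tuple on the left.
        self.state = {"A": deque(), "B": deque()}


def make_chain(window, num_nodes):
    """Split the window [0, window) into equal, disjoint age slices."""
    step = window / num_nodes
    return [SliceNode(i * step, (i + 1) * step) for i in range(num_nodes)]


def process(chain, stream, ts, value, predicate):
    """Pass one arriving tuple through every slice and collect join results.

    Assumes timestamps arrive in non-decreasing order.
    """
    other = "B" if stream == "A" else "A"
    results = []
    carried = {"A": [], "B": []}  # state handed over by the previous node
    for node in chain:
        handed_over = {"A": [], "B": []}
        for s in ("A", "B"):
            # Tuples from the younger slice are newer than anything stored here.
            node.state[s].extend(carried[s])
            # Tuples that outgrew this slice move on to the next node.
            while node.state[s] and ts - node.state[s][0][0] >= node.hi:
                handed_over[s].append(node.state[s].popleft())
        # Probe the opposite stream's slice with an arbitrary predicate.
        for _, v in node.state[other]:
            pair = (value, v) if stream == "A" else (v, value)
            if predicate(*pair):
                results.append(pair)
        carried = handed_over
    # Tuples leaving the last slice have expired and are dropped.
    chain[0].state[stream].append((ts, value))
    return results


def naive_join(arrivals, window, predicate):
    """Reference nested-loop window join, used only to check the sketch."""
    a = [(t, v) for s, t, v in arrivals if s == "A"]
    b = [(t, v) for s, t, v in arrivals if s == "B"]
    return [(va, vb) for ta, va in a for tb, vb in b
            if abs(ta - tb) < window and predicate(va, vb)]


# Illustrative input: (stream, timestamp, value) with a non-equi predicate.
arrivals = [("A", 0, 5), ("B", 1, 3), ("A", 2, 9), ("B", 4, 7),
            ("B", 7, 1), ("A", 8, 4), ("B", 13, 2)]
close = lambda a, b: abs(a - b) <= 3

chain = make_chain(window=6, num_nodes=3)
out = []
for stream, ts, value in arrivals:
    out.extend(process(chain, stream, ts, value, close))
stored = sum(len(n.state[s]) for n in chain for s in ("A", "B"))
```

On this input the sliced join produces the same pairs as `naive_join`, and `stored` counts each live tuple exactly once. In an actual deployment the loop over `chain` would correspond to separate machines working on different tuples concurrently, which is where the pipelining benefit described in the abstract would come from.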