Scientific applications require processing high-volume on-line streams of numerical data from instruments and simulations. We present an extensible stream database system that allows scalable and flexible continuous queries on such streams. Application-dependent streams and query functions are defined through an object-relational model. Distributed execution plans for continuous queries are described as high-level data flow distribution templates. Using a generic template, we define two partitioning strategies for scalable parallel execution of expensive stream queries: window split and window distribute. Window split provides operators that execute query functions in parallel by reducing the size of stream data units, with application-dependent functions supplied as parameters. By contrast, window distribute provides operators for customized distribution of entire data units without reducing their size. We evaluate these strategies for a typical high-volume scientific stream application and show that window split is favorable when expensive queries are executed on limited resources, while window distribute is better otherwise.
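The difference between the two strategies can be illustrated with a minimal Python sketch. It is not the system's actual operator interface, which the abstract does not specify. All names here (`window_split`, `window_distribute`, `split_fn`, `merge_fn`, `route_fn`) are hypothetical, the "workers" are simulated sequentially rather than run on separate nodes, and the example query (sum of squares) is assumed to be decomposable so that partial results can be merged.

```python
from typing import Callable, List, Sequence

Window = Sequence[int]  # one stream data unit (hypothetical representation)


def window_split(
    windows: Sequence[Window],
    n: int,
    split_fn: Callable[[Window, int], List[Window]],
    query_fn: Callable[[Window], int],
    merge_fn: Callable[[List[int]], int],
) -> List[int]:
    """Split every window into n smaller parts, query each part, merge the partial results."""
    results = []
    for w in windows:
        parts = split_fn(w, n)                  # application-dependent split
        partials = [query_fn(p) for p in parts]  # each part would go to its own worker
        results.append(merge_fn(partials))       # application-dependent merge
    return results


def window_distribute(
    windows: Sequence[Window],
    n: int,
    query_fn: Callable[[Window], int],
    route_fn: Callable[[int, Window, int], int],
) -> List[int]:
    """Send each entire window to one of n workers, then restore arrival order."""
    queues: List[list] = [[] for _ in range(n)]
    for i, w in enumerate(windows):
        queues[route_fn(i, w, n)].append((i, w))  # whole window, size unchanged
    tagged = [(i, query_fn(w)) for q in queues for i, w in q]
    return [r for _, r in sorted(tagged, key=lambda t: t[0])]


# Hypothetical application-dependent functions for a decomposable query.
def strided_split(w: Window, n: int) -> List[Window]:
    return [w[k::n] for k in range(n)]


def sum_of_squares(w: Window) -> int:
    return sum(x * x for x in w)


def round_robin(i: int, w: Window, n: int) -> int:
    return i % n


stream = [[1, 2, 3, 4], [5, 6, 7, 8]]
split_result = window_split(stream, 2, strided_split, sum_of_squares, sum)
distribute_result = window_distribute(stream, 2, sum_of_squares, round_robin)
```

In this sketch both strategies return the same per-window results. They differ only in what a worker receives: a fraction of every window under window split, or complete windows under window distribute.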