Stream processing applications have recently gained significant attention in the networking and database communities. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically distributed and rapidly updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the bottlenecks that keep streaming data applications, in particular the Linear Road stream data management benchmark, from achieving good performance in large-scale distributed environments. We use the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks that limit the scalability and query response latency of the Linear Road application. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe the evaluation also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.