S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data. Keyed data events are routed with affinity to Processing Elements (PEs), which consume the events and do one or both of the following: (1) emit one or more events that may be consumed by other PEs, (2) publish results. The architecture resembles the Actors model, providing semantics of encapsulation and location transparency, thus allowing applications to be massively concurrent while exposing a simple programming interface to application developers. In this paper, we outline the S4 architecture in detail and describe various applications, including real-life deployments. Our design is primarily driven by large-scale applications for data mining and machine learning in a production environment. We show that the S4 design is surprisingly flexible and lends itself to running on large clusters built with commodity hardware.
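
The routing model described above can be illustrated with a minimal single-process sketch. This is not the S4 API: the names Event, WordCountPE, Router, and run are hypothetical, and the word-count task is an assumed example. It only shows the idea of key affinity, where every event carrying a given key reaches the same PE instance, which may emit new events and whose output may be published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical keyed event (not an S4 class)."""
    stream: str  # name of the stream the event belongs to
    key: str     # keyed attribute used for routing
    value: int


class WordCountPE:
    """Hypothetical PE: one instance per distinct key, holding that key's state."""

    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        self.count += event.value
        # (1) Emit an event that another PE could consume downstream.
        return [Event("counts", self.key, self.count)]


class Router:
    """Sends each event to the PE instance that owns its key (key affinity)."""

    def __init__(self, pe_class):
        self.pe_class = pe_class
        self.instances = {}

    def dispatch(self, event):
        pe = self.instances.get(event.key)
        if pe is None:
            # First event for this key: create the PE instance lazily.
            pe = self.instances[event.key] = self.pe_class(event.key)
        return pe.process(event)


def run(words):
    """Feed words through the router and return the latest count per key."""
    router = Router(WordCountPE)
    published = {}
    for word in words:
        for out in router.dispatch(Event("words", word, 1)):
            # (2) Publish results; here, keep only the latest count per key.
            published[out.key] = out.value
    return published
```

Calling run(["a", "b", "a"]) creates two PE instances, one per distinct key. In a distributed setting the same key-to-instance mapping would also decide which node hosts each instance; the sketch omits distribution and fault tolerance.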