Tools and strategies for debugging distributed stream processing applications
Software—Practice & Experience
A probabilistic filter protocol for continuous queries
QuaCon'09 Proceedings of the 1st international conference on Quality of context
Scalable performance of system S for extract-transform-load processing
Proceedings of the 3rd Annual Haifa Experimental Systems Conference
Evaluating continuous probabilistic queries over imprecise sensor data
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Probabilistic filters: A stream protocol for continuous probabilistic queries
Information Systems
Tutorial: stream processing optimizations
Proceedings of the 7th ACM international conference on Distributed event-based systems
A catalog of stream processing optimizations
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
High-volume source streams, coupled with fluctuating rates, necessitate adaptive load shedding in data stream processing. When ignored, a continual query (CQ) server may randomly drop items, when its capacity is inadequate to handle the arriving data, and degrade the quality of the query results. To alleviate this problem, filters can be used at the source nodes. However, regular source filtering in itself is not sufficient to prevent random dropping, because the amount of data passing through the filters can still surpass the server's capacity. In this case, intelligent load shedding can be applied by the source filters to minimize the degradation in result quality. In this paper, we introduce a novel type of load-shedding source filters, called Non-uniformly Regulated (NR) sifters. An NR sifter judiciously applies varying amounts of load shedding to different regions of the data space within the sifter. We formulate the problem of constructing NR sifters as an optimization one. NR sifters are compact and quickly configurable, allowing frequent adaptations, and provide fast lookup for deciding if a data item should be dropped. We structure NR sifters as a set of (sifter region, drop threshold) pairs to achieve compactness, develop query consolidation techniques to enable quick construction, and introduce flexible space partitioning mechanisms to realize fast lookup.