In a distributed stream processing system, streaming data are continuously disseminated from the sources to the distributed processing servers. To make dissemination more efficient, these servers are typically organized into one or more dissemination trees. In this paper, we focus on the problem of constructing dissemination trees that minimize the average loss of fidelity of the system. We observe that existing heuristic-based approaches explore only a limited solution space and hence may lead to sub-optimal solutions. In contrast, we propose an adaptive, cost-based approach whose cost model takes into account both the processing cost and the communication cost. Furthermore, because a distributed stream processing system is vulnerable to inaccurate statistics and to runtime fluctuations in data characteristics, server workloads, and network conditions, we have designed our scheme to adapt to these situations: an operational dissemination tree can be incrementally transformed into a more cost-effective one. Our adaptive strategy relies on decisions that the distributed servers make independently, each based on localized statistics it collects at runtime. For a relatively static environment, we also propose two static tree construction algorithms that rely on a priori system statistics. These static trees can also serve as initial trees in a dynamic environment. We apply our schemes to both single-object and multi-object dissemination. Our extensive performance study shows that the adaptive mechanisms are effective in a dynamic context and that the proposed static tree construction algorithms perform close to optimal in a static environment.
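To make the idea of a cost-based, incrementally transformed dissemination tree concrete, the following is a minimal sketch and not the cost model or algorithm of the paper. It assumes a toy model in which each server adds a fixed processing delay before forwarding and each link adds a fixed communication delay, and it uses the average source-to-server delay as a stand-in for the average loss of fidelity. The names `Server`, `average_delay`, `try_reparent`, and `build_example`, as well as all delay values, are hypothetical. Unlike the distributed, locally informed decisions described above, this sketch evaluates the global cost centrally for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    proc_delay: float  # assumed per-update processing delay at this server
    children: list = field(default_factory=list)


def average_delay(root, comm_delay):
    """Mean source-to-server arrival delay over all non-root servers.

    Assumed proxy for "average loss of fidelity": the later an update
    arrives, the staler that server's copy is taken to be.
    """
    total, count = 0.0, 0
    stack = [(root, 0.0)]
    while stack:
        node, arrival = stack.pop()
        # Simplification: a node forwards to all children after one processing step.
        depart = arrival + node.proc_delay
        for child in node.children:
            child_arrival = depart + comm_delay[(node.name, child.name)]
            total += child_arrival
            count += 1
            stack.append((child, child_arrival))
    return total / count if count else 0.0


def try_reparent(root, child, old_parent, new_parent, comm_delay):
    """Move child under new_parent, keeping the move only if the cost drops.

    The caller must ensure new_parent is not inside child's subtree.
    """
    before = average_delay(root, comm_delay)
    old_parent.children.remove(child)
    new_parent.children.append(child)
    if average_delay(root, comm_delay) < before:
        return True
    # The move did not help, so restore the original tree.
    new_parent.children.remove(child)
    old_parent.children.append(child)
    return False


def build_example():
    """Hypothetical 4-server tree: S -> A, S -> B, B -> C, with B overloaded."""
    s, a = Server("S", 1.0), Server("A", 1.0)
    b, c = Server("B", 5.0), Server("C", 1.0)
    s.children += [a, b]
    b.children.append(c)
    comm_delay = {("S", "A"): 1.0, ("S", "B"): 1.0,
                  ("B", "C"): 1.0, ("A", "C"): 1.0}
    return s, a, b, c, comm_delay
```

In this example, server `C` initially sits below the heavily loaded server `B`. Calling `try_reparent(s, c, b, a, comm_delay)` moves `C` under the lightly loaded `A` because the modeled average delay decreases, which mirrors the kind of incremental transformation toward a more cost-effective tree that the abstract describes.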