This paper addresses the efficient processing of distributed top-k monitoring, that is, continuously reporting the k largest values according to a user-specified ranking function over distributed data streams. To minimize communication, the data to be transmitted must be selected carefully. We study the optimization problem of determining which objects must be transmitted and present a new distributed top-k monitoring algorithm that reduces communication cost. In our approach, few objects are transmitted to maintain the top-k set, and the communication cost is independent of k. We verify the effectiveness of our approach empirically using both real-world and synthetic data sets. We show that, when k is no less than 10, our approach reduces overall communication cost by a factor ranging from 2 to over an order of magnitude compared with the previous approach.
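
To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the algorithm proposed in this paper. It assumes the ranking function is the sum of per-node increments for each object, and that every update is forwarded to a central coordinator, so the message count grows with the stream length. The names `NaiveCoordinator`, `receive`, and `top_k` are hypothetical and chosen only for illustration.

```python
import heapq
from collections import defaultdict


def top_k(totals, k):
    """Return the k (object, score) pairs with the largest scores.

    Ties are broken by object id so the result is deterministic.
    """
    return heapq.nsmallest(k, totals.items(), key=lambda kv: (-kv[1], kv[0]))


class NaiveCoordinator:
    """Hypothetical baseline: every node forwards every update.

    Assumption: the ranking function is the sum of all increments
    reported for an object across nodes.
    """

    def __init__(self, k):
        self.k = k
        self.totals = defaultdict(float)  # object id -> aggregated score
        self.messages = 0                 # number of messages received

    def receive(self, node_id, obj, delta):
        # node_id is unused here; it only marks where the update came from.
        self.totals[obj] += delta
        self.messages += 1
        return top_k(self.totals, self.k)


# Usage: two nodes report increments for objects "a", "b", and "c".
coord = NaiveCoordinator(k=2)
updates = [("n1", "a", 5), ("n2", "b", 3), ("n1", "c", 4), ("n2", "b", 3)]
for node_id, obj, delta in updates:
    answer = coord.receive(node_id, obj, delta)
```

In this baseline the coordinator always holds an exact answer, but it pays one message per update. Reducing that count by transmitting only the objects that can affect the top-k set is the kind of saving the abstract refers to.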