Cluster based rank query over multidimensional data streams

Authors:
Dengcheng He;Yongluan Zhou;Lidan Shou;Gang Chen
Affiliations:
Zhejiang University, Hangzhou, China;University of Southern Denmark, Copenhagen, Denmark;Zhejiang University, Hangzhou, China;Zhejiang University, Hangzhou, China
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 7
Cited 0

Scalability for clustering algorithms revisited
ACM SIGKDD Explorations Newsletter

Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data

Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream
ICDE '04 Proceedings of the 20th International Conference on Data Engineering

Effective Computation of Biased Quantiles over Data Streams
ICDE '05 Proceedings of the 21st International Conference on Data Engineering

Power-conserving computation of order-statistics over sensor networks
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Supporting ranking and clustering as generalized order-by and group-by
Proceedings of the 2007 ACM SIGMOD international conference on Management of data

A framework for clustering evolving data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

Quantified Score

Hi-index	0.00

Abstract

Many data stream monitoring applications involve rank queries, and a number of efficient evaluation algorithms have therefore been proposed recently. Most of these techniques assume that rank queries are executed directly over the whole data space. However, we observe that many applications need to cluster the data streams first and then run rank queries on each cluster. To address this problem, we propose a novel algorithm that integrates clustering and ranking, and we refer to such integrated queries as cluster-based rank queries. The algorithm has two phases: an online phase, which maintains the required data structures and statistics, and a query phase, which uses these data structures to process queries. Extensive experiments indicate that the proposed algorithm is efficient in both space consumption and query processing.
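
To make the two-phase idea concrete, here is a minimal Python sketch of what a cluster-based rank query could look like. It is not the authors' algorithm: the abstract does not specify the data structures, so everything here is an assumption for illustration, including the class name `ClusterRankSketch`, the fixed `radius` threshold used to open new clusters, and the choice to keep every score in an exact sorted list per cluster.

```python
import bisect
import math


class ClusterRankSketch:
    """Toy two-phase structure: online updates, then per-cluster rank queries.

    Hypothetical illustration only; not the method described in the paper.
    """

    def __init__(self, radius):
        self.radius = radius   # assumed distance threshold for joining a cluster
        self.centroids = []    # running mean of each cluster
        self.counts = []       # number of points absorbed by each cluster
        self.scores = []       # sorted ranking scores, one list per cluster

    def insert(self, point, score):
        """Online phase: assign the point to the nearest cluster or open a new one."""
        best, best_dist = None, math.inf
        for i, centroid in enumerate(self.centroids):
            d = math.dist(point, centroid)
            if d < best_dist:
                best, best_dist = i, d
        if best is None or best_dist > self.radius:
            # No cluster is close enough: start a new one at this point.
            self.centroids.append(list(point))
            self.counts.append(1)
            self.scores.append([score])
            return len(self.centroids) - 1
        # Update the running mean of the chosen cluster incrementally.
        self.counts[best] += 1
        n = self.counts[best]
        centroid = self.centroids[best]
        for j, x in enumerate(point):
            centroid[j] += (x - centroid[j]) / n
        # Keep the cluster's scores sorted so rank lookups are direct.
        bisect.insort(self.scores[best], score)
        return best

    def rank_query(self, cluster_id, r):
        """Query phase: return the r-th smallest score (1-based) within one cluster."""
        scores = self.scores[cluster_id]
        if not 1 <= r <= len(scores):
            raise IndexError("rank out of range")
        return scores[r - 1]
```

For example, after inserting a few two-dimensional points with `sketch.insert((x, y), score)`, a call such as `sketch.rank_query(0, 2)` returns the second-smallest score among the points assigned to cluster 0. Because this sketch stores every score exactly, its memory grows with the stream; a space-bounded variant would presumably swap the sorted lists for approximate quantile summaries.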