Continuously maintaining order statistics over data streams: extended abstract

Authors:
Xuemin Lin
Affiliations:
University of New South Wales, Australia
Venue:
ADC '07 Proceedings of the eighteenth conference on Australasian database - Volume 63
Year:
2007

Citing 25
Cited 2

Probabilistic counting algorithms for data base applications

Journal of Computer and System Sciences
Approximate medians and other quantiles in one pass and with limited memory

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Random sampling techniques for space efficient online computation of order statistics of large datasets

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Space-efficient online computation of quantile summaries

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Data-streams and histograms

STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Approximate counting of inversions in a data stream

STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Models and issues in data stream systems

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Maintaining stream statistics over sliding windows: (extended abstract)

SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Counting inversions in lists

SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Counting Distinct Elements in a Data Stream

RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Medians and beyond: new aggregation techniques for sensor networks

SenSys '04 Proceedings of the 2nd international conference on Embedded networked sensor systems
Synopsis diffusion for robust aggregation in sensor networks

SenSys '04 Proceedings of the 2nd international conference on Embedded networked sensor systems
Effective Computation of Biased Quantiles over Data Streams

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Power-conserving computation of order-statistics over sensor networks

PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximate counts and quantiles over sliding windows

PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Space efficient mining of multigraph streams

Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Holistic aggregates in a networked world: distributed tracking of approximate quantiles

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Tributaries and deltas: efficient and robust aggregation in sensor network streams

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Approximate Processing of Massive Continuous Quantile Queries over High-Speed Data Streams

IEEE Transactions on Knowledge and Data Engineering
Space-efficient Relative Error Order Sketch over Data Streams

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
What's Different: Distributed, Continuous Monitoring of Duplicate-Resilient Aggregates on Data Streams

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Space- and time-efficient deterministic algorithms for biased quantiles over data streams

Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient quantile retrieval on multi-dimensional data

EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Adaptive spatial partitioning for multidimensional data streams

ISAAC'04 Proceedings of the 15th international conference on Algorithms and Computation

A Streaming Parallel Decision Tree Algorithm

The Journal of Machine Learning Research
An Ω(1/ε log 1/ε) space lower bound for finding ε-approximate quantiles in a data stream

FAW'10 Proceedings of the 4th international conference on Frontiers in algorithmics

Abstract

A rank query finds a data element with a given rank under a monotonic order specified on the data elements. It has several equivalent variations [8, 17, 30]. Rank queries over data streams have been investigated in the form of quantile computation. A φ-quantile (φ ∈ (0, 1]) of a collection of N data elements is the element with rank ⌈φN⌉ under that order. Rank and quantile queries have many applications [1, 3, 6, 7, 10, 14-16, 26, 27], including monitoring high-speed networks, detecting trends and fleeting opportunities in the stock market, sensor data analysis, Web ranking aggregation, and log mining. In these applications, they not only play important roles in decision making but are also used to summarize the data distributions of data streams. The following example shows a popular tool for comparing the distributions of two data sets (data streams).
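
Separately from the distribution-comparison example the abstract refers to (which is not reproduced in this record), the quantile definition itself can be illustrated with a minimal Python sketch. It assumes the rank is the ceiling of the quantile fraction times N, counted from 1 in ascending order. The function name `exact_quantile` and the sample data are hypothetical, and the sketch stores and sorts every element, so it is an exact offline baseline rather than the streaming technique the paper studies.

```python
import math


def exact_quantile(elements, phi):
    """Return the element whose 1-based rank is ceil(phi * N) in ascending order.

    Hypothetical helper for illustration only: it keeps all N elements in
    memory and sorts them, which a streaming algorithm would avoid.
    """
    if not 0 < phi <= 1:
        raise ValueError("phi must lie in (0, 1]")
    if not elements:
        raise ValueError("elements must be non-empty")
    ordered = sorted(elements)             # the monotonic order on the elements
    rank = math.ceil(phi * len(ordered))   # rank = ceil(phi * N), at least 1
    return ordered[rank - 1]               # convert the 1-based rank to an index


if __name__ == "__main__":
    data = [9, 1, 7, 3, 5, 8, 2, 6]        # N = 8, arbitrary sample values
    print(exact_quantile(data, 0.5))       # element of rank 4
    print(exact_quantile(data, 1.0))       # element of rank 8, the maximum
```

Because `phi * N` is computed in floating point, a product that should be an exact integer can round slightly above it and push the rank up by one; exact rational arithmetic (for example `fractions.Fraction`) avoids this if it matters.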