Tracking quantiles of network data streams with dynamic operations
INFOCOM'10 Proceedings of the 29th conference on Information communications
Network monitoring in cellular networks requires tracking quantiles of the data distributions of many evolving network measurements (e.g., the number of high-signaling subscribers per minute). Most quantile estimation algorithms are based on a summary of the empirical data distribution, using either a representative sample or a global approximation of the entire distribution. In contrast, by treating each data point as a draw from a random distribution, stochastic approximation (SA) for quantile estimation keeps no global approximation, only local approximations at the quantiles of interest, and therefore uses negligible memory even when estimating tail quantiles. However, the current stochastic approximation algorithm for quantile estimation tracks each quantile separately, which may violate the monotone property of quantiles: the estimate for a higher quantile can fall below the estimate for a lower one. In this paper, we propose a stochastic approximation technique that tracks multiple quantiles simultaneously. Our technique maintains the monotone property across the different quantiles and adapts to changes in the data distribution. We evaluate its performance using real cellular provider datasets. Our results show that the technique is very efficient.
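To make the idea of local, per-quantile updates concrete, the following is a minimal sketch of a textbook Robbins-Monro style quantile tracker, not the technique proposed in the paper. The function name `track_quantiles`, the constant `step`, the starting value `init`, and the running-maximum repair used to restore monotonicity are all assumptions chosen for illustration; the paper's actual simultaneous update is not described in this abstract.

```python
import numpy as np

def track_quantiles(stream, probs, step=0.01, init=0.0):
    """Track several quantiles of a stream with O(len(probs)) memory.

    Illustrative sketch only: `probs` must be sorted in ascending order.
    """
    probs = np.asarray(probs, dtype=float)
    est = np.full(probs.shape, float(init))
    for x in stream:
        # Indicator of whether the new observation is at or below each estimate.
        below = (x <= est).astype(float)
        # Stochastic approximation step: an estimate rises by step * p when x
        # lies above it and falls by step * (1 - p) otherwise, so it settles
        # where P(X <= est) is approximately p.
        est = est + step * (probs - below)
        # Assumed repair (not from the paper): independent updates can cross,
        # so a running maximum forces est[0] <= est[1] <= ...
        est = np.maximum.accumulate(est)
    return est
```

As a usage example, calling `track_quantiles(np.random.default_rng(0).uniform(size=20000), [0.5, 0.9, 0.99])` should return three non-decreasing values close to 0.5, 0.9, and 0.99. A constant step size is used here so that the estimates can keep following a distribution that drifts over time; a decreasing step would reduce noise on stationary data at the cost of that adaptivity.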