Statistics over the most recently observed data elements are often required in applications involving data streams, such as intrusion detection in network monitoring, stock price prediction in financial markets, web log mining for access prediction, and user click stream mining for personalization. Among the various statistics, computing a quantile summary is probably the most challenging because of its complexity. In this paper, we study the problem of continuously maintaining a quantile summary of the most recently observed $N$ elements over a stream so that quantile queries can be answered with a guaranteed precision of $\epsilon N$. We developed a space-efficient algorithm for a pre-defined $N$ that requires only one scan of the input data stream and $O\left(\frac{\log(\epsilon^2 N)}{\epsilon} + \frac{1}{\epsilon^2}\right)$ space in the worst case. We also developed an algorithm that maintains quantile summaries for the most recent $N$ elements so that quantile queries on any most recent $n$ elements ($n \le N$) can be answered with a guaranteed precision of $\epsilon n$. The worst-case space requirement for this algorithm is only $O\left(\frac{\log^2(\epsilon N)}{\epsilon^2}\right)$. Our performance study indicated that not only is the actual quantile estimation error far below the guaranteed precision, but the space requirement is also much less than the given theoretical bound.
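To make the precision guarantee concrete, the following Python sketch checks whether an answer to a quantile query is within the stated error over the most recent N elements. It is not the paper's algorithm: it keeps the whole window in memory as a brute-force reference, which is exactly what a space-efficient summary avoids. The function names, the use of `ceil(phi * N)` as the target rank, and the reading of "precision of εN" as "the returned element's rank is within εN of the target rank" are assumptions made for illustration.

```python
import math
from collections import deque


def rank_range(window, value):
    """Return (lo, hi), the 1-based ranks `value` occupies in sorted(window).

    If `value` is absent from the window, hi < lo.
    """
    less = sum(1 for x in window if x < value)
    equal = sum(1 for x in window if x == value)
    return less + 1, less + equal


def is_eps_approximate(window, phi, value, eps):
    """Check an answer to a phi-quantile query against the eps*N guarantee.

    Assumed definition: `value` must appear in the window at some rank r
    with |r - ceil(phi * N)| <= eps * N, where N = len(window).
    """
    n = len(window)
    target = max(1, math.ceil(phi * n))
    lo, hi = rank_range(window, value)
    if hi < lo:  # value does not occur in the window
        return False
    if lo <= target <= hi:
        dist = 0
    else:
        dist = min(abs(target - lo), abs(target - hi))
    return dist <= eps * n


# Brute-force sliding window holding only the most recent N elements.
N = 10
window = deque(maxlen=N)
for x in range(1, 26):  # stream 1, 2, ..., 25
    window.append(x)    # elements older than the last N are dropped

# The window now holds 16..25; the exact median (rank 5) is 20.
print(is_eps_approximate(window, 0.5, 21, 0.1))  # rank 6, off by 1 <= 0.1 * 10
print(is_eps_approximate(window, 0.5, 23, 0.1))  # rank 8, off by 3 >  0.1 * 10
```

A reference checker like this is mainly useful for testing: a summary structure that uses far less than N space can be validated against it on small streams.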