The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms
The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms
An optimality proof of the LRU-K page replacement algorithm
Journal of the ACM (JACM)
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Principles of Optimal Page Replacement
Journal of the ACM (JACM)
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Competitive Analysis of Paging
Developments from a June 1996 seminar on Online algorithms: the state of the art
Maintaining variance and k-medians over data stream windows
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximate join processing over data streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Exploiting k-constraints to reduce memory overhead in continuous queries over data streams
ACM Transactions on Database Systems (TODS)
Multi-dimensional regression analysis of time-series data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Load shedding in a data stream manager
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Memory-limited execution of windowed stream joins
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
GrubJoin: An Adaptive, Multi-Way, Windowed Stream Join with Time Correlation-Aware CPU Load Shedding
IEEE Transactions on Knowledge and Data Engineering
Load shedding for window joins over streams
Journal of Computer Science and Technology
Data-driven memory management for stream join
Information Systems
Combining Multiple Interrelated Streams for Incremental Clustering
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Evaluating top-k queries over incomplete data streams
Proceedings of the 18th ACM conference on Information and knowledge management
Buffer-preposed qos adaptation framework and load shedding techniques over streams
WISE'06 Proceedings of the 7th international conference on Web Information Systems
Hi-index | 0.00 |
We consider the problem of joining data streams using limited cache memory, with the goal of producing as many result tuples as possible from the cache. Many cache replacement heuristics have been proposed in the past. Their performance often relies on implicit assumptions about the input streams, e.g., that the join attribute values follow a relatively stationary distribution. However, in general and in practice, streams often exhibit more complex behaviors, such as increasing trends and random walks, rendering these "hardwired" heuristics inadequate.In this paper, we propose a framework that is able to exploit known or observed statistical properties of input streams to make cache replacement decisions aimed at maximizing the expected number of result tuples. To illustrate the complexity of the solution space, we show that even an algorithm that considers, at every time step, all possible sequences of future replacement decisions may not be optimal. We then identify a condition between two candidate tuples under which an optimal algorithm would always choose one tuple over the other to replace. We develop a heuristic that behaves consistently with an optimal algorithm whenever this condition is satisfied. We show through experiments that our heuristic outperforms previous ones.As another evidence of the generality of our framework, we show that the classic caching/paging problem for static objects can be reduced to a stream join problem and analyzed under our framework, yielding results that agree with or extend classic ones.