We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join state. One approximation scenario is to provide a maximum subset of the result, with the objective of losing as few result tuples as possible. An alternative scenario is to provide a random sample of the join result, e.g., if the output of the join is being aggregated. We show formally that neither approximation can be addressed effectively for a sliding-window join of arbitrary input streams. Previous work has addressed only the maximum-subset problem, and has implicitly used a frequency-based model of stream arrival. We address the sampling problem for this model. More importantly, we point out a broad class of applications for which an age-based model of stream arrival is more appropriate, and we address both approximation scenarios under this new model. Finally, for the case of multiple joins being executed with an overall memory constraint, we provide an algorithm for memory allocation across the joins that optimizes a combined measure of approximation in all scenarios considered. All of our algorithms are implemented and experimental results demonstrate their effectiveness.
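To make the maximum-subset scenario concrete, the following is a minimal sketch of a sliding-window join in which only a bounded number of tuples from one stream can be retained. It is an illustration under stated assumptions, not the algorithm of this work: the function name `window_join`, the lockstep arrival of one tuple per stream per time step, the one-sided state (only R-tuples are stored, S-tuples only probe), and the eviction rule (drop the stored tuple whose key has been seen least often on S so far, oldest first on ties) are all hypothetical choices made for brevity.

```python
from collections import Counter


def window_join(r_keys, s_keys, window, memory=None):
    """Join R and S on key equality over a time-based sliding window.

    r_keys, s_keys: equal-length sequences; element t arrives at time t.
    window: an R-tuple arriving at time ts can join S-tuples at times
            t with 0 <= t - ts < window.
    memory: maximum number of R-tuples retained (None means unbounded,
            which yields the exact join result).
    Returns a list of (r_time, s_time, key) result tuples.
    """
    stored = []         # retained R-tuples as (arrival_time, key)
    s_freq = Counter()  # key frequencies observed on S so far
    results = []
    for t, (r, s) in enumerate(zip(r_keys, s_keys)):
        # Expire R-tuples that have slid out of the window.
        stored = [(ts, k) for ts, k in stored if t - ts < window]
        # Admit the new R-tuple, then shed load if over budget.
        stored.append((t, r))
        if memory is not None and len(stored) > memory:
            # Illustrative frequency-based heuristic (an assumption):
            # evict the tuple least likely to match, oldest on ties.
            victim = min(stored, key=lambda e: (s_freq[e[1]], e[0]))
            stored.remove(victim)
        # Probe the retained state with the new S-tuple.
        results.extend((ts, t, k) for ts, k in stored if k == s)
        s_freq[s] += 1
    return results


if __name__ == "__main__":
    r = [1, 2, 1, 3, 1, 2, 1, 4, 1, 2]
    s = [1, 1, 2, 1, 1, 3, 1, 1, 2, 1]
    exact = window_join(r, s, window=5)
    approx = window_join(r, s, window=5, memory=2)
    print(len(approx), "of", len(exact), "result tuples retained")
```

Because the bounded run only ever stores a subset of the tuples the unbounded run stores, its output is always a subset of the exact join result; the quality of the eviction rule determines how large that subset is. A sampling-oriented variant would instead need an eviction rule that keeps the retained output statistically representative, which this sketch does not attempt.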