This paper studies Data Stream Management Systems that combine real-time data streams with historical data, and hence access incoming streams and archived data simultaneously. A significant problem for these systems is the I/O cost of fetching historical data, which inhibits processing of the live data streams. Our solution is to reduce the I/O cost of accessing the archive by retrieving only a reduced (summarized or sampled) version of the historical data. This paper does not propose new summarization or sampling techniques, but rather a framework in which multiple resolutions of summarization/sampling can be generated efficiently. The query engine can then select the appropriate level of summarization depending on the resources currently available. The central research problem is whether to generate the multiple representations of archived data eagerly upon data arrival, lazily at query time, or in a hybrid fashion. Concrete techniques are presented for each approach, each tied to a specific data reduction technique (random sampling). The tradeoffs among the three approaches are studied both analytically and experimentally.
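To make the eager/lazy/hybrid distinction concrete, the following is a minimal sketch and not the paper's actual technique. It assumes random sampling is implemented by giving every archived tuple a random priority once and keeping a tuple at sampling rate r when its priority is below r, so that coarser samples are subsets of finer ones. The names `MultiResolutionArchive`, `eager_rates`, `read`, and `choose_rate`, and the tuple-count budget, are all hypothetical and introduced only for illustration.

```python
import random


class MultiResolutionArchive:
    """Illustrative in-memory archive with nested random samples.

    Assumption: a tuple belongs to the sample at rate r when its random
    priority is below r, so lower-rate samples are subsets of higher-rate ones.
    """

    def __init__(self, eager_rates, seed=0):
        # Rates listed here are materialized at arrival time (eager).
        # An empty list gives a purely lazy archive; a partial list is hybrid.
        self.eager = {r: [] for r in eager_rates}
        self.full = []  # complete archive as (priority, tuple) pairs
        self.rng = random.Random(seed)

    def append(self, item):
        """Archive one arriving tuple and update every eager sample."""
        p = self.rng.random()
        self.full.append((p, item))
        for rate, sample in self.eager.items():
            if p < rate:
                sample.append((p, item))

    def read(self, rate):
        """Return the sample at `rate`, deriving it lazily if not materialized."""
        if rate in self.eager:
            return [x for _, x in self.eager[rate]]
        # Lazy path: filter the smallest materialized superset, falling back
        # to the full archive when no eager sample covers this rate.
        covering = [r for r in self.eager if r >= rate]
        source = self.eager[min(covering)] if covering else self.full
        return [x for p, x in source if p < rate]


def choose_rate(rates, n_items, budget):
    """Pick the highest rate whose expected sample size fits the budget.

    `budget` is a hypothetical stand-in for available resources, expressed
    as the number of archived tuples the engine can afford to read.
    """
    for r in sorted(rates, reverse=True):
        if r * n_items <= budget:
            return r
    return min(rates)


if __name__ == "__main__":
    rates = [0.5, 0.25, 0.1]
    archive = MultiResolutionArchive(eager_rates=[0.5], seed=7)  # hybrid
    for i in range(1000):
        archive.append(i)
    rate = choose_rate(rates, n_items=1000, budget=300)
    print(rate, len(archive.read(rate)))
```

In this sketch the eager path pays a per-arrival cost for each materialized rate, while the lazy path pays at query time by scanning a larger source; the hybrid configuration materializes only some rates and filters those to obtain the rest. All three return the same sample for a given rate and seed.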