Update propagation in a streaming warehouse

Authors:
Theodore Johnson; Vladislav Shkapenyuk
Affiliations:
AT&T Labs-Research; AT&T Labs-Research
Venue:
SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Year:
2011

Abstract

Streaming warehouses are used to monitor complex systems such as data centers, web site complexes, and world-wide networks, gathering and correlating rich collections of events and measurements. Ideally, a streaming warehouse provides both historical data, for deep analysis, and real-time data, for rapid response to emerging opportunities or problems. The highly temporal nature of the data and the need to support parallel processing naturally lead to extensive use of horizontal partitioning to manage base tables and layers of materialized views. In this paper, we consider the problem of determining when to propagate updates from base tables to dependent views on a partition-wise basis using autonomous updates. We provide a correctness theory for propagating updates to materialized views, simple algorithms which correctly propagate updates, and examples of algorithms which do not. We extend these results to accommodate the needs of production warehouses: repartitioning of tables, mutual consistency, and merge tables. We measure the update propagation delays incurred by two different update propagation algorithms in test and production DataDepot warehouses, and find that only those update propagation algorithms which impose no scheduling restrictions are acceptable for use in a real-time streaming warehouse.
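
To make the idea of partition-wise propagation concrete, here is a minimal Python sketch. It is not the algorithm from the paper and not DataDepot code: it assumes each partition carries an integer logical update time (`version`) and that a derived partition is stale whenever one of its source partitions is newer than it. The names `Partition`, `stale_partitions`, and `propagate` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One horizontal partition of a base table or materialized view (hypothetical model)."""
    name: str
    sources: list = field(default_factory=list)  # names of partitions this one is derived from
    version: int = 0                             # assumed logical time of the last update


def stale_partitions(partitions):
    """Return names of derived partitions that have a source newer than themselves."""
    return [
        p.name
        for p in partitions.values()
        if any(partitions[s].version > p.version for s in p.sources)
    ]


def propagate(partitions):
    """Refresh stale partitions until none remain; return the order of refreshes.

    Assumes the dependency graph is acyclic, so the loop terminates.
    """
    clock = max(p.version for p in partitions.values())
    order = []
    while True:
        stale = stale_partitions(partitions)
        if not stale:
            return order
        for name in stale:
            clock += 1                        # a refresh stamps the partition with a new time
            partitions[name].version = clock
            order.append(name)


# Two base partitions, two first-layer view partitions, one second-layer rollup.
warehouse = {
    p.name: p
    for p in [
        Partition("base_day1", version=5),    # received a late update
        Partition("base_day2", version=2),
        Partition("hourly_day1", ["base_day1"], version=3),
        Partition("hourly_day2", ["base_day2"], version=3),
        Partition("rollup", ["hourly_day1", "hourly_day2"], version=4),
    ]
}
refreshed = propagate(warehouse)
```

In this example only `hourly_day1` and then `rollup` are refreshed; `hourly_day2` is left alone because its source partition did not change. That is the practical appeal of working per partition rather than per table, although the correctness conditions the paper develops are more subtle than this timestamp comparison.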