TRAC: toward recency and consistency reporting in a database with distributed data sources

Authors:
Jiansheng Huang;Jeffrey F. Naughton;Miron Livny
Affiliations:
University of Wisconsin at Madison, Madison, WI;University of Wisconsin at Madison, Madison, WI;University of Wisconsin at Madison, Madison, WI
Venue:
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Year:
2006

Abstract

Distributed computing environments, including workflows in computational grids, present challenges for monitoring, as the state of the system may be captured only in logs distributed throughout the system. One approach to monitoring such systems is to "sniff" these distributed logs and to store their transformed content in a DBMS. This centralizes the state and exposes it for querying; unfortunately, it also creates uncertainty with respect to the recency and consistency of the data. Previous related work has focused on allowing queries to express currency and consistency constraints, which are then enforced by "pulling" data from the distributed sources on demand, or by requiring synchronous updates of a centralized data store. In some instances this is impossible due to legacy system issues or inefficient as the system scales to large numbers of processors. Accordingly, we propose that instead of enforcing consistency and recency, such monitoring systems should report these properties along with query results, with the hope that this will allow the data to be appropriately interpreted. We present techniques for reporting consistency and recency for queries and evaluate them with respect to efficiency and precision. Finally, we describe our prototype implementation and present experimental results of our techniques.
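
The idea of reporting recency and consistency alongside a query result, rather than enforcing them, can be illustrated with a minimal sketch. This is not the paper's technique: it assumes that each data source records the timestamp of the newest log record loaded into the central store, and that the set of sources relevant to a query is already known. The names `SourceState` and `report_recency` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SourceState:
    """Hypothetical per-source bookkeeping kept next to the loaded data."""
    name: str
    last_loaded: datetime  # timestamp of the newest log record loaded from this source


def report_recency(sources, relevant):
    """Return (oldest, newest, spread) over the sources relevant to a query.

    Assumption: 'oldest' bounds how stale the answer may be, and 'spread'
    (newest - oldest) indicates how far apart in time the relevant sources
    are, as a rough stand-in for their mutual consistency.
    """
    stamps = [s.last_loaded for s in sources if s.name in relevant]
    if not stamps:
        raise ValueError("no relevant sources for this query")
    oldest, newest = min(stamps), max(stamps)
    return oldest, newest, newest - oldest


if __name__ == "__main__":
    base = datetime(2006, 9, 12, 12, 0, 0)
    sources = [
        SourceState("node-a", base),
        SourceState("node-b", base - timedelta(minutes=5)),
        SourceState("node-c", base - timedelta(hours=1)),
    ]
    # A query that touches only node-a and node-b ignores the stale node-c.
    oldest, newest, spread = report_recency(sources, {"node-a", "node-b"})
    print("data at least as recent as:", oldest)
    print("sources differ by at most:", spread)
```

In this sketch the query itself is never blocked or delayed. The caller receives the two figures with the result and decides how to interpret it, which is the trade-off the abstract argues for.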