Consistency-aware evaluation of OLAP queries in replicated data warehouses

  • Authors:
  • Javier García-García; Carlos Ordonez

  • Affiliations:
  • Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico; University of Houston, Houston, TX, USA

  • Venue:
  • Proceedings of the ACM twelfth international workshop on Data warehousing and OLAP
  • Year:
  • 2009

Abstract

OLAP tools for distributed data warehouses generally assume that the underlying replicated tables are up to date. Unfortunately, keeping replicas current is difficult because of the inherent tradeoff between consistency and availability. In this paper, we propose techniques to evaluate OLAP queries in distributed data warehouses under a lazy replication model. Since it may be admissible to evaluate OLAP queries on slightly outdated replicated tables, our technique first efficiently computes the degree of obsolescence of the local replicated tables. When the result is acceptable with respect to a given error threshold, the query is evaluated locally, avoiding the transmission of large tables over the network. Otherwise, the query can be evaluated remotely, less efficiently, against the master copy of the tables, provided they are stored at a single site. Inconsistency is measured by adapting distributed set reconciliation algorithms to efficiently compute the symmetric difference between the master and replicated tables. Our improved distributed database algorithm has linear communication complexity and cubic time complexity in the size of the symmetric difference, which is expected to be small in a replicated data warehouse. Our technique is independent of the method used to propagate data warehouse insertions, deletions and updates. We present experiments simulating distributed databases with different CPU and transmission speeds, showing that our method is effective in deciding whether a query should be evaluated locally or remotely.
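
The local-versus-remote decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the paper adapts distributed set reconciliation so that the tables themselves are never shipped, whereas this sketch simply holds both row sets in one process to show the decision rule. The function names (`obsolescence`, `choose_site`), the representation of rows as hashable tuples, and the normalization of the symmetric difference by the size of the union are all assumptions made for illustration.

```python
from typing import Hashable, Iterable


def obsolescence(master_rows: Iterable[Hashable],
                 replica_rows: Iterable[Hashable]) -> float:
    """Return the share of distinct rows on which master and replica disagree.

    Assumption: the degree of obsolescence is taken as
    |master symmetric-difference replica| / |master union replica|.
    The paper may define its measure differently.
    """
    master, replica = set(master_rows), set(replica_rows)
    union = master | replica
    if not union:
        # Two empty tables are trivially consistent.
        return 0.0
    # Rows present in exactly one of the two copies.
    diff = master ^ replica
    return len(diff) / len(union)


def choose_site(master_rows: Iterable[Hashable],
                replica_rows: Iterable[Hashable],
                threshold: float) -> str:
    """Return "local" if the replica is fresh enough, otherwise "remote"."""
    score = obsolescence(master_rows, replica_rows)
    return "local" if score <= threshold else "remote"


if __name__ == "__main__":
    # Hypothetical tables: the replica missed one insert and one delete.
    master = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    replica = [(1, "a"), (2, "b"), (3, "c"), (5, "e")]
    print(obsolescence(master, replica))
    print(choose_site(master, replica, threshold=0.5))
    print(choose_site(master, replica, threshold=0.1))
```

In this toy setting, an update to a row counts as two differing rows (the old version in the replica and the new version in the master), which is one simple way to treat insertions, deletions and updates uniformly.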