Approximate Query Answering Using Data Warehouse Striping

  • Authors:
  • Jorge R. Bernardino;Pedro S. Furtado;Henrique C. Madeira

  • Affiliations:
  • Polytechnic of Coimbra, ISEC, DEIS, Apt. 10057, P-3030-601 Coimbra, Portugal. jorge@isec.pt;University of Coimbra, DEI, Pólo II, P-3030-290 Coimbra, Portugal. pnf@dei.uc.pt;University of Coimbra, DEI, Pólo II, P-3030-290 Coimbra, Portugal. henrique@dei.uc.pt

  • Venue:
  • Journal of Intelligent Information Systems - Special issue on data warehousing and knowledge discovery
  • Year:
  • 2002

Abstract

This paper presents and evaluates a simple but very effective method to implement large data warehouses on an arbitrary number of computers, achieving very high query execution performance and scalability. The data is distributed and processed across a potentially large number of autonomous computers using our technique called data warehouse striping (DWS). The major problem of the DWS technique is that it would require a very expensive cluster of computers with fault-tolerant capabilities to prevent a fault in a single computer from stopping the whole system. In this paper, we propose a radically different approach to deal with the problem of the unavailability of one or more computers in the cluster, allowing the use of DWS with a very large number of inexpensive computers. The proposed approach is based on approximate query answering techniques that make it possible to deliver an approximate answer to the user even when one or more computers in the cluster are not available. The evaluation presented in the paper shows both analytically and experimentally that the approximate results obtained this way have a very small error that can be negligible in most cases.
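
The idea of answering from the surviving computers can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the striping scheme or the estimators, so the round-robin placement, the scale-up factor (total nodes divided by available nodes), and the names `stripe` and `approximate_aggregates` are all assumptions made for illustration.

```python
import random


def stripe(rows, n_nodes):
    """Distribute rows round-robin over n_nodes (an assumed striping scheme)."""
    nodes = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        nodes[i % n_nodes].append(row)
    return nodes


def approximate_aggregates(nodes, unavailable):
    """Estimate SUM, COUNT and AVG using only the nodes that are available.

    Assumption: because rows are spread evenly, each node holds a roughly
    uniform sample of the data, so partial SUM and COUNT can be scaled up
    by (total nodes / available nodes). AVG needs no scaling.
    """
    available = [data for idx, data in enumerate(nodes) if idx not in unavailable]
    if not available:
        raise ValueError("no nodes available")

    # Partial aggregates computed independently on each available node.
    partial_sum = sum(sum(data) for data in available)
    partial_count = sum(len(data) for data in available)

    scale = len(nodes) / len(available)
    avg = partial_sum / partial_count if partial_count else float("nan")
    return {
        "sum": partial_sum * scale,
        "count": partial_count * scale,
        "avg": avg,
    }


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical fact-table measure values, for demonstration only.
    facts = [random.uniform(0, 100) for _ in range(100_000)]
    nodes = stripe(facts, 10)

    exact = approximate_aggregates(nodes, unavailable=set())
    approx = approximate_aggregates(nodes, unavailable={3})
    rel_err = abs(approx["sum"] - exact["sum"]) / exact["sum"]
    print(f"relative error of SUM with one node down: {rel_err:.4%}")
```

With no nodes marked unavailable the scale factor is 1 and the function returns the exact aggregates, so the same code path serves both the normal and the degraded case. The size of the error in the degraded case depends on how evenly the data is spread, which is the question the paper evaluates analytically and experimentally.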