P2P OLAP: Data model, implementation and case study

  • Authors:
  • Alejandro A. Vaisman;Mauricio Minuto Espil;Martín Paradela

  • Affiliations:
  • Universidad de Buenos Aires, 1428 Buenos Aires, Argentina;Universidad Católica Argentina, Argentina;Universidad de Buenos Aires, 1428 Buenos Aires, Argentina

  • Venue:
  • Information Systems
  • Year:
  • 2009

Abstract

Business groups today commonly own several companies that operate autonomously. Nevertheless, these companies are required to provide headquarters with summarized information for decision-making. An architecture for the cooperative interchange of decision-making information is a natural solution to this problem. We propose a peer-to-peer (P2P) architecture for processing OLAP data in a distributed environment, such that every company involved retains full autonomy over the use of its own data resources. In this scenario, data exchange between peers occurs when one of them, acting as a local peer, receives a query and, in order to answer it, requests data available at other nodes, called acquaintances. No global schema is assumed to exist for any data under this computing paradigm. Hence, data provided by an acquaintance of a local peer must be adapted so that answers to queries posed by local peer users conform to the view those users have of their data. Because multidimensional data normally consist of a collection of views of aggregated data, a careful translation process is needed to transform any summary concept that appears in an acquaintance into a summary concept meaningful to the requesting peer. We first present a model for multidimensional data distributed in a P2P network, together with a query rewriting technique that allows a local peer to propagate OLAP queries among its acquaintances and obtain a meaningful and correct answer. Mappings are performed using a novel technique called revise and map, based on belief revision concepts. Revising a dimension instance makes it possible to produce consistent aggregations when an OLAP query is answered at more than one node. We then describe an implementation of a P2P system for answering OLAP queries over a network of data warehouses.
We apply our proposal to a real-world case study of an insurance group. Finally, we report the results of an experimental evaluation of our implementation, and discuss the issues that must be accounted for in this setting.
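
To make the translation step concrete, the following is a minimal sketch of how a local peer might fold an acquaintance's aggregated data into its own view. It is not the paper's revise and map technique: it assumes a simple, hand-written member mapping, and every name in it (`MEMBER_MAP`, `translate_and_aggregate`, the insurance categories and figures) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from an acquaintance's dimension members to the
# local peer's members. In the abstract's setting this would be derived
# by the mapping technique; here it is simply written by hand.
MEMBER_MAP = {"auto": "vehicle", "moto": "vehicle", "home": "property"}


def translate_and_aggregate(local_facts, remote_facts, member_map):
    """Sum a measure per local member, translating remote members first.

    local_facts and remote_facts are iterables of (member, value) pairs.
    Remote members with no mapping are skipped, since the local peer has
    no summary concept to attach them to.
    """
    totals = defaultdict(float)
    for member, value in local_facts:
        totals[member] += value
    for member, value in remote_facts:
        local_member = member_map.get(member)
        if local_member is None:
            continue  # unmapped remote member: not meaningful locally
        totals[local_member] += value
    return dict(totals)


# Illustrative data only: premiums per product line at each peer.
local = [("vehicle", 100.0), ("property", 50.0)]
remote = [("auto", 30.0), ("moto", 20.0), ("home", 10.0), ("pet", 5.0)]
result = translate_and_aggregate(local, remote, MEMBER_MAP)
```

In this sketch the unmapped remote member `pet` is dropped rather than guessed at, which keeps the answer expressed entirely in the local peer's own terms.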