Benefit and cost of query answering in PDMS

  • Authors:
  • Armin Roth;Felix Naumann

  • Affiliations:
  • Humboldt-Universität zu Berlin, Berlin, Germany;Humboldt-Universität zu Berlin, Berlin, Germany

  • Venue:
  • DBISP2P'05/06 Proceedings of the 2005/2006 international conference on Databases, information systems, and peer-to-peer computing
  • Year:
  • 2005

Abstract

Peer data management systems (PDMS) are a natural extension of integrated information systems. They consist of a dynamic set of autonomous peers, each of which can mediate between heterogeneous schemas of other peers. A new data source joins a PDMS by defining a semantic mapping to one or more other peers, thus forming a network of peers. Queries submitted to a peer are answered with data residing at that peer and with data that is reached along paths of mappings through the network of peers. However, without optimization methods, query reformulation in PDMS is very inefficient due to redundancy in mapping paths. We present a decentralized strategy that guides peers in deciding along which further mappings a query should be sent. The strategy uses statistics of the peer's own data and statistics of mappings to neighboring peers to predict whether it is worthwhile to send the query to a given neighbor, or whether the query plan should be pruned at this point. These decisions are guided by a benefit and cost model that trades off the amount of data a neighbor will pass back against the execution cost of that step. Thus, we enable PDMS to scale up in the number of participating peers.
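The pruning decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not give its formulas, so the statistics (`estimated_tuples`, `mapping_selectivity`, `execution_cost`), the benefit estimate, and the ratio threshold `min_ratio` are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NeighborStats:
    """Hypothetical statistics a peer keeps about one outgoing mapping."""
    name: str
    estimated_tuples: int       # assumed: tuples the neighbor could contribute
    mapping_selectivity: float  # assumed: fraction that survives the mapping (0..1)
    execution_cost: float       # assumed: cost of sending the query along this mapping


def expected_benefit(stats: NeighborStats) -> float:
    """Assumed benefit estimate: amount of data expected to be passed back."""
    return stats.estimated_tuples * stats.mapping_selectivity


def should_forward(stats: NeighborStats, min_ratio: float) -> bool:
    """Forward only if benefit per unit of cost reaches the assumed threshold."""
    benefit = expected_benefit(stats)
    if stats.execution_cost <= 0:
        # A free step is worth taking whenever it returns any data at all.
        return benefit > 0
    return benefit / stats.execution_cost >= min_ratio


def select_neighbors(neighbors: List[NeighborStats], min_ratio: float) -> List[str]:
    """Local, decentralized choice: keep promising mappings, prune the rest."""
    return [n.name for n in neighbors if should_forward(n, min_ratio)]


if __name__ == "__main__":
    neighbors = [
        NeighborStats("peer_a", estimated_tuples=1000, mapping_selectivity=0.5, execution_cost=10.0),
        NeighborStats("peer_b", estimated_tuples=100, mapping_selectivity=0.25, execution_cost=50.0),
    ]
    print(select_neighbors(neighbors, min_ratio=1.0))
```

Each peer evaluates only its own neighbors, which mirrors the decentralized nature of the strategy: no global view of the network is needed to decide where the query plan is pruned.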