Processing and Optimization of Complex Queries in Schema-Based P2P-Networks

  • Authors:
  • Hadhami Dhraief; Alfons Kemper; Wolfgang Nejdl; Christian Wiesner

  • Affiliations:
  • Information Systems Institute, University of Hannover, Germany; Computer Science Department, Technical University of Munich, Germany; L3S Research Center, University of Hannover, Germany; Computer Science Department, University of Passau, Germany

  • Venue:
  • DBISP2P'04: Proceedings of the Second International Conference on Databases, Information Systems, and Peer-to-Peer Computing
  • Year:
  • 2004

Abstract

Peer-to-peer (P2P) infrastructures are emerging as one of the important data management infrastructures on the World Wide Web. So far, however, most work has focused on simple P2P networks that tackle efficient query distribution to a large set of peers but assume that each query can be answered completely at each peer. For queries that need data from more than one peer to be executed, this is clearly insufficient. Unfortunately, although quite a few database techniques can be reused in the P2P context, P2P data management infrastructures pose additional challenges caused by the dynamic nature of these networks. In P2P networks, we can assume neither global knowledge about data distribution nor the suitability of static topologies and static query plans. Unlike in traditional distributed database systems, we cannot assume complete instances of the information schema and allocation schema; instead, we work with distributed schema information, which can only direct query processing tasks from one node to one or more neighboring nodes. In this paper, we first briefly describe our super-peer-based topology and schema-aware distributed routing indices extended with suitable statistics, and describe how this information is extracted and updated. Second, we show how these indices facilitate the distribution and dynamic expansion of query plans. Third, we propose a set of transformation rules to optimize query plans and discuss different optimization strategies in detail, enabling efficient distributed query processing in a schema-based P2P network.
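
To make the idea of a schema-aware routing index with statistics more concrete, the following minimal Python sketch shows one way a super-peer might map schema elements to neighboring nodes and rank those neighbors by an estimated result count. It is an illustration under stated assumptions, not the data structure from the paper: the class `RoutingIndex`, its methods `advertise`, `withdraw`, and `route`, the peer names, and the property names such as `dc:title` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RoutingIndex:
    """Hypothetical schema-level routing index held by one super-peer."""

    # schema element -> {neighbor id -> estimated number of matching items}
    entries: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def advertise(self, neighbor: str, schema_element: str, estimated_count: int) -> None:
        """Record (or update) that a neighbor can provide data for a schema element."""
        self.entries.setdefault(schema_element, {})[neighbor] = estimated_count

    def withdraw(self, neighbor: str) -> None:
        """Remove a neighbor that has left the network from every index entry."""
        for neighbors in self.entries.values():
            neighbors.pop(neighbor, None)

    def route(self, schema_elements: List[str]) -> Dict[str, List[str]]:
        """For each schema element in a query, list candidate neighbors,
        largest estimate first (ties broken by neighbor id)."""
        plan = {}
        for element in schema_elements:
            candidates = self.entries.get(element, {})
            plan[element] = sorted(candidates, key=lambda n: (-candidates[n], n))
        return plan


# Assumed example data: two peers advertising illustrative metadata properties.
index = RoutingIndex()
index.advertise("peerA", "dc:title", 120)
index.advertise("peerB", "dc:title", 40)
index.advertise("peerB", "lom:context", 15)

routing = index.route(["dc:title", "lom:context", "dc:creator"])
```

In this sketch, a query touching `dc:title` and `lom:context` cannot be answered by forwarding to a single neighbor with full coverage of both unless that neighbor is `peerB`, and an element with no index entry (`dc:creator`) yields an empty candidate list. The index only tells a node where to send the next part of a query, not where all the data resides.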