Query workload-aware overlay construction using histograms

  • Authors:
  • Georgia Koloniari;Yannis Petrakis;Evaggelia Pitoura;Thodoris Tsotsos

  • Affiliations:
  • University of Ioannina, Greece;University of Ioannina, Greece;University of Ioannina, Greece;University of Ioannina, Greece

  • Venue:
  • Proceedings of the 14th ACM international conference on Information and knowledge management
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Peer-to-peer(p2p) systems over an efficient means of data sharing among a dynamically changing set of a large number of a tonomous nodes.Each node in a p2p system is connected with a small number of other nodes thus creating an overlay network of nodes. A query posed at a node is routed through the overlay network towards nodes hosting data items that satisfy it. In this paper, we consider building overlays that exploit the query workload so that nodes are clustered based on their results to a given query workload. The motivation is to create overlays where nodes that match a large number of similar queries are a fewlinks apart. Query frequency is also taken into account so that popular queries have a greater effect on the formation of the overlay than unpopular ones. We focus on range selection queries and se histograms to estimate the query results of each node. Then, nodes are clustered based on the similarity of their histograms. To this end,we introd ce a workload-aware edit distance metric between histograms that takes into account the query workload. Our experimental results show that workload-aware overlays increase the percentage of query results returned for a given number of nodes visited as compared to both random (i.e., unclustered)overlays and non workload-aware clustered overlays (i.e., overlays that cluster nodes based solely on the nodes' content).