Scalable distributed aggregate computations through collaboration

  • Authors:
  • Leonidas Galanis; David J. DeWitt

  • Affiliations:
  • Oracle USA, Redwood Shores, CA; University of Wisconsin – Madison, Madison, WI

  • Venue:
  • DEXA '05: Proceedings of the 16th International Conference on Database and Expert Systems Applications
  • Year:
  • 2005

Abstract

Computing aggregates over distributed data sets constitutes an interesting class of distributed queries. Recent advances in peer-to-peer discovery of data sources and query processing techniques have made such queries feasible and potentially more frequent. The concurrent execution of multiple and often identical distributed aggregate queries can place a high burden on the data sources. This paper identifies the scalability bottlenecks that can arise in large peer-to-peer networks from the execution of large numbers of aggregate computations and proposes a solution. In our approach peers are assigned the role of aggregate computation maintainers, which leads to a substantial decrease in requests to the data sources and also avoids duplicate computation by the sites that submit identical aggregate queries. Moreover, a framework is presented that facilitates the collaboration of peers in maintaining aggregate query results. Experimental evaluation of our design demonstrates that it achieves very good performance and scales to thousands of peers.
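The core idea in the abstract, routing identical aggregate queries to a single maintainer peer so that the data sources are contacted once rather than once per requester, can be illustrated with a minimal sketch. This is not the paper's design: the class names `DataSource` and `Maintainer`, the hash-based assignment in `maintainer_for`, and the use of `SUM` as the aggregate are all assumptions made for illustration, and the sketch ignores result freshness and collaboration among maintainers.

```python
import hashlib


class DataSource:
    """Hypothetical data source: holds local values and counts incoming requests."""

    def __init__(self, values):
        self.values = list(values)
        self.requests = 0

    def partial_sum(self):
        # Each call models one request reaching the data source.
        self.requests += 1
        return sum(self.values)


class Maintainer:
    """Hypothetical maintainer peer: computes an aggregate once and caches it."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def answer(self, query, sources):
        # Contact the sources only if this query has not been computed yet.
        if query not in self.cache:
            self.cache[query] = sum(s.partial_sum() for s in sources)
        return self.cache[query]


def maintainer_for(query, maintainers):
    """Assumed assignment rule: hash the query text to pick one maintainer."""
    digest = hashlib.sha256(query.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(maintainers)
    return maintainers[index]


def run_demo():
    sources = [DataSource([1, 2, 3]), DataSource([10, 20]), DataSource([100])]
    maintainers = [Maintainer(f"m{i}") for i in range(4)]
    query = "SUM(price)"
    # Five peers submit the same query; all are routed to the same maintainer.
    results = [
        maintainer_for(query, maintainers).answer(query, sources)
        for _ in range(5)
    ]
    return results, [s.requests for s in sources]


if __name__ == "__main__":
    results, requests = run_demo()
    print(results)   # five identical answers
    print(requests)  # each source was contacted once
```

In this sketch, five identical queries cause each data source to be contacted once instead of five times, which is the kind of load reduction the abstract describes. How the paper actually assigns maintainers and keeps cached results up to date is not stated in the abstract.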