Scalable distributed aggregate computations through collaboration

Authors:
Leonidas Galanis;David J. DeWitt
Affiliations:
Oracle USA, Redwood Shores, CA;University of Wisconsin – Madison, Madison, WI
Venue:
DEXA'05 Proceedings of the 16th international conference on Database and Expert Systems Applications
Year:
2005

Citing 17
Cited 0

OceanStore: an architecture for global-scale persistent storage

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Chord: A scalable peer-to-peer lookup service for internet applications

Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
A scalable content-addressable network

Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Wide-area cooperative storage with CFS

SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
A peer-to-peer agent auction

Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 1
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems

Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
SCRIBE: The Design of a Large-Scale Event Notification Infrastructure

NGC '01 Proceedings of the Third International COST264 Workshop on Networked Group Communication
Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and

Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and
Toward network data independence

ACM SIGMOD Record
Gossip-Based Computation of Aggregate Information

FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Towards a data-centric internet

Towards a data-centric internet
XMark: a benchmark for XML data management

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
The data-centric revolution in networking

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Querying the internet with PIER

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Locating data sources in large distributed systems

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Willow: DHT, aggregation, and publish/subscribe in one protocol

IPTPS'04 Proceedings of the Third international conference on Peer-to-Peer Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Computing aggregates over distributed data sets constitutes an interesting class of distributed queries. Recent advances in peer-to-peer discovery of data sources and query processing techniques have made such queries feasible and potentially more frequent. The concurrent execution of multiple and often identical distributed aggregate queries can place a high burden on the data sources. This paper identifies the scalability bottlenecks that can arise in large peer-to-peer networks from the execution of large numbers of aggregate computations and proposes a solution. In our approach peers are assigned the role of aggregate computation maintainers, which leads to a substantial decrease in requests to the data sources and also avoids duplicate computation by the sites that submit identical aggregate queries. Moreover, a framework is presented that facilitates the collaboration of peers in maintaining aggregate query results. Experimental evaluation of our design demonstrates that it achieves very good performance and scales to thousands of peers.