Querying the internet with PIER

  • Authors: Ryan Huebsch; Joseph M. Hellerstein; Nick Lanham; Boon Thau Loo; Scott Shenker; Ion Stoica

  • Affiliations (in author order): EECS Computer Science Division, UC Berkeley; EECS Computer Science Division, UC Berkeley; EECS Computer Science Division, UC Berkeley; EECS Computer Science Division, UC Berkeley; International Computer Science Institute; EECS Computer Science Division, UC Berkeley

  • Venue: VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
  • Year: 2003

Abstract

The database research community prides itself on scalable technologies. Yet database systems traditionally do not excel on one important scalability dimension: the degree of distribution. This limitation has hampered the impact of database technologies on massively distributed systems like the Internet. In this paper, we present the initial design of PIER, a massively distributed query engine based on overlay networks, which is intended to bring database query processing facilities to new, widely distributed environments. We motivate the need for massively distributed queries, and argue for a relaxation of certain traditional database research goals in the pursuit of scalability and widespread adoption. We present simulation results showing PIER gracefully running relational queries across thousands of machines, and show results from the same software base in actual deployment on a large experimental cluster.