How best to build web-scale data managers?

  • Authors:
  • Philip A. Bernstein;Daniel J. Abadi;Michael J. Cafarella;Joseph M. Hellerstein;Donald Kossmann;Samuel Madden

  • Affiliations:
  • Microsoft;Yale;U. of Washington;U.C. Berkeley;ETH Zürich;M.I.T.

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2009

Abstract

Many of the largest database-driven web sites use custom web-scale data managers (WDMs). On the surface, these WDMs are being applied to problems that are well suited to relational database systems. Some examples are the following:

  • Map-Reduce [5], Hadoop [7], and Dryad [9] are used to process queries on large data sets using sequential scan and aggregation. Hive [8] is a data warehouse built on Hadoop.
  • Google's Bigtable [3] is used to store a replicated table of rows of semi-structured data.
  • Amazon's Dynamo [6] is used to store partitioned, replicated databases of key-value pairs. Cassandra [2] is similar.
  • Object caching systems, such as memcached [10], Oracle's Coherence, and Microsoft's Velocity project, are used instead of a persistent store.