Qserv: a distributed shared-nothing database for the LSST catalog

  • Authors:
  • Daniel L. Wang; Serge M. Monkewitz; Kian-Tat Lim; Jacek Becla

  • Affiliations:
  • SLAC National Accelerator Laboratory, Menlo Park, CA; California Institute of Technology, Pasadena, CA; SLAC National Accelerator Laboratory, Menlo Park, CA; SLAC National Accelerator Laboratory, Menlo Park, CA

  • Venue:
  • State of the Practice Reports
  • Year:
  • 2011

Abstract

The LSST project will provide public access to a database catalog that, in its final year, is estimated to include 26 billion stars and galaxies, recorded in dozens of trillions of detections and occupying multiple petabytes. Because we are not aware of an existing open-source database implementation that has been demonstrated to efficiently satisfy astronomers' spatial self-joining and cross-matching queries at this scale, we have implemented Qserv, a distributed shared-nothing SQL database query system. To speed development, Qserv relies on two successful open-source software packages: the MySQL RDBMS and the Xrootd distributed file system. We describe Qserv's design, its architecture, and its ability to scale to LSST's data requirements. We illustrate its potential with test results on a 150-node cluster using 55 billion rows and 30 terabytes of simulated data. These results demonstrate the soundness of Qserv's approach and the scale it achieves on today's hardware.
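
As a rough illustration of what "shared-nothing" means for a catalog query, the sketch below splits a toy object table into spatial chunks, gives each chunk its own private database, sends the same SQL to every chunk, and merges the partial results. It is not Qserv's implementation: in-memory SQLite stands in for the per-node MySQL instances, there is no Xrootd layer, and the names `CHUNK_WIDTH_DEG`, `chunk_id`, `build_workers`, `count_in_box`, and the `Object(id, ra, decl)` schema are all assumptions made for this example.

```python
import sqlite3

# Hypothetical chunk width in right ascension (degrees); chosen only for the demo.
CHUNK_WIDTH_DEG = 90.0


def chunk_id(ra_deg):
    """Map a right ascension in degrees to a chunk number (0..3 here)."""
    return int((ra_deg % 360.0) // CHUNK_WIDTH_DEG)


def build_workers(rows):
    """Place each row in the private database of the chunk that owns it.

    Each connection is a separate in-memory database, so no worker can see
    another worker's rows. That isolation is the shared-nothing property.
    """
    workers = {}
    for obj_id, ra, decl in rows:
        cid = chunk_id(ra)
        if cid not in workers:
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE Object (id INTEGER, ra REAL, decl REAL)")
            workers[cid] = conn
        workers[cid].execute(
            "INSERT INTO Object VALUES (?, ?, ?)", (obj_id, ra, decl)
        )
    return workers


def count_in_box(workers, ra_min, ra_max, decl_min, decl_max):
    """Run the same box query on every worker and sum the partial counts.

    For simplicity the query goes to all chunks; a real system would likely
    dispatch only to chunks that overlap the search region.
    """
    sql = (
        "SELECT COUNT(*) FROM Object "
        "WHERE ra BETWEEN ? AND ? AND decl BETWEEN ? AND ?"
    )
    params = (ra_min, ra_max, decl_min, decl_max)
    partials = [conn.execute(sql, params).fetchone()[0] for conn in workers.values()]
    return sum(partials)


# Toy catalog: (id, ra, decl) in degrees.
ROWS = [
    (1, 10.0, -5.0),
    (2, 95.0, 2.0),
    (3, 100.0, 40.0),
    (4, 200.0, 1.0),
    (5, 350.0, -1.0),
]
WORKERS = build_workers(ROWS)

if __name__ == "__main__":
    print(count_in_box(WORKERS, 0.0, 210.0, -10.0, 10.0))
```

A count is easy to merge because partial counts simply add. The spatial self-joins and cross-matches mentioned in the abstract are harder, since pairs of nearby objects can fall on opposite sides of a chunk boundary; this sketch does not attempt to handle that case.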