The sqlLoader Data-Loading Pipeline

  • Authors: Alex Szalay; Ani R. Thakar; Jim Gray

  • Affiliations: Johns Hopkins University; Johns Hopkins University; Microsoft Research

  • Venue: Computing in Science and Engineering
  • Year: 2008

Abstract

Using a database management system (DBMS) is essential to ensure the data integrity and reliability of large, multidimensional data sets. However, loading multiterabyte data into a DBMS is a time-consuming and error-prone task that the authors have tried to automate by developing the sqlLoader pipeline—a distributed workflow system for data loading.