The open connectome project data cluster: scalable analysis and vision for high-throughput neuroscience

  • Authors:
  • Randal Burns;Kunal Lillaney;Daniel R. Berger;Logan Grosenick;Karl Deisseroth;R. Clay Reid;William Gray Roncal;Priya Manavalan;Davi D. Bock;Narayanan Kasthuri;Michael Kazhdan;Stephen J. Smith;Dean Kleissas;Eric Perlman;Kwanghun Chung;Nicholas C. Weiler;Jeff Lichtman;Alexander S. Szalay;Joshua T. Vogelstein;R. Jacob Vogelstein

  • Affiliations:
  • Johns Hopkins University;Johns Hopkins University;Massachusetts Institute of Technology;Stanford University;Stanford University;Allen Institute for Brain Science;Johns Hopkins University;Johns Hopkins University;Janelia Farm Research Campus, Howard Hughes Medical Institute;Harvard University;Duke University;Stanford University;Johns Hopkins University;Janelia Farm Research Campus, Howard Hughes Medical Institute;Stanford University;Stanford University;Harvard University;Johns Hopkins University;Duke University;Johns Hopkins University

  • Venue:
  • Proceedings of the 25th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2013

Abstract

We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-D electron microscopy image stacks, and also for time-series and multi-channel data. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays, writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
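
The idea of distributing data by partitioning a spatial index can be illustrated with a small sketch. The abstract does not say which index is used, so this example assumes a Morton (Z-order) curve over fixed-size 3-D cuboids, with the key space cut into equal contiguous ranges, one per node. The names `morton3d` and `node_for_cuboid` and the `bits` parameter are hypothetical and are not taken from the system's actual interfaces.

```python
def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order key.

    Assumption: cuboids are addressed by non-negative integer grid
    coordinates; the bit order (x lowest, then y, then z) is arbitrary.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def node_for_cuboid(x: int, y: int, z: int, num_nodes: int, bits: int = 10) -> int:
    """Assign a cuboid to a node by range-partitioning the key space.

    The key space [0, 2**(3*bits)) is split into `num_nodes` equal,
    contiguous ranges, so cuboids that are close on the curve (and
    usually close in space) tend to land on the same node.
    """
    total_keys = 1 << (3 * bits)
    return morton3d(x, y, z, bits) * num_nodes // total_keys


if __name__ == "__main__":
    # A 4x4x4 grid of cuboids (bits=2) spread over 4 hypothetical nodes.
    for coord in [(0, 0, 0), (1, 1, 1), (0, 0, 2), (3, 3, 3)]:
        print(coord, "->", node_for_cuboid(*coord, num_nodes=4, bits=2))
```

Range-partitioning a space-filling curve keeps most spatially adjacent cuboids on the same node, which is one plausible reason a spatial index suits this kind of cutout workload. A production system would also have to handle skewed data and rebalancing, which this sketch ignores.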