DataONE member node pilot integration with TeraGrid

  • Authors:
  • Nicholas C. Dexter;John W. Cobb;Dave Vieglais;Matthew B. Jones;Mike Lowe

  • Affiliations:
  • University of Tennessee, Knoxville;Oak Ridge National Laboratory;University of Kansas;University of California, Santa Barbara;University Indianapolis

  • Venue:
  • Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery
  • Year:
  • 2011

Abstract

The NSF DataONE [1] DataNet project and the NSF TeraGrid [2] project have initiated a pilot collaboration to deploy and operate the DataONE Member Node software stack on TeraGrid infrastructure. The appeal of this collaboration is that it makes large-scale computing available as an adjunct to DataONE data, metadata, and workflow manipulation and analysis tools. In turn, DataONE data archive and curation services are exposed as an option for large-scale computing and storage efforts such as TeraGrid/XSEDE. With this joint effort, DataONE also brings an open, persistent, robust, and secure method for accessing Earth sciences data collected by science communities such as the National Evolutionary Synthesis Center's Dryad [3], the Ecological Society of America's Ecological Archive [4], NASA's Distributed Active Archive Center at the Oak Ridge National Laboratory [5], the USGS's National Biological Information Infrastructure [6], the Fire Research & Management Exchange System [7], the Long Term Ecological Research Network [8], and the Knowledge Network for Biocomplexity [9]. Beginning with an allocation dated April 1, 2011, the DataONE Core Cyberinfrastructure Team has been working with the IU Quarry [10] virtual hosting service, and more generally with the TeraGrid data area, on this pilot implementation. The implementation includes multiple virtual servers in order to test different reference implementations of the common DataONE Member Node RESTful web-service functions [11]. These reference implementations include a Metacat server [12] and a Python Generic Member Node developed by DataONE [13]. The implementations will also mount TeraGrid-wide global storage services (DC-WAN [14] and Albedo [15]) and thus allow the input and output of large-scale computational runs to be integrated with wide-area archival data and metadata services.