Data and Metadata Collections for Scientific Applications

  • Authors:
  • Arcot Rajasekar; Reagan Moore

  • Venue:
  • HPCN Europe 2001 Proceedings of the 9th International Conference on High-Performance Computing and Networking
  • Year:
  • 2001

Abstract

The Internet provides a means to share scientific data across groups and disciplines, enabling integrated research that extends beyond the local computing environment. But the organization and curation of these data pose challenges because of their sensitive nature (data must be protected from unauthorized use), their heterogeneity, and their large volume, in both size and number. Moreover, metadata is coming to the fore as a means not only of discovering datasets of interest but also of organizing them. SDSC has developed data management systems to facilitate the use of published digital objects. The associated infrastructure includes persistent archives for managing technology evolution, data handling systems for collection-based access to data, collection management systems for organizing information catalogs, digital library services for manipulating data sets, and data grids for federating multiple collections. Together, these components provide systems for digital object management, information management, and knowledge management. We discuss examples of the application of the technology, including distributed collections and data grids for astronomical sky surveys, high energy physics data collections, ecology, and art image digital libraries.