Data access and integration in the ISPIDER proteomics grid

  • Authors:
  • Lucas Zamboulis;Hao Fan;Khalid Belhajjame;Jennifer Siepen;Andrew Jones;Nigel Martin;Alexandra Poulovassilis;Simon Hubbard;Suzanne M. Embury;Norman W. Paton

  • Affiliations:
  • School of Computer Science and Information Systems, Univ. of London, Birkbeck;School of Computer Science and Information Systems, Univ. of London, Birkbeck;Faculty of Life Sciences, University of Manchester;Faculty of Life Sciences, University of Manchester;Faculty of Life Sciences, University of Manchester;School of Computer Science and Information Systems, Univ. of London, Birkbeck;School of Computer Science and Information Systems, Univ. of London, Birkbeck;Faculty of Life Sciences, University of Manchester;School of Computer Science, University of Manchester;School of Computer Science, University of Manchester

  • Venue:
  • DILS'06 Proceedings of the Third international conference on Data Integration in the Life Sciences
  • Year:
  • 2006

Abstract

Grid computing has great potential for supporting the integration of complex, fast-changing biological data repositories to enable distributed data analysis. One scenario with such potential is provided by proteomics resources, which are rapidly being developed with the emergence of affordable, reliable methods to study the proteome. The protein identifications arising from these methods derive from multiple repositories, which need to be integrated to enable uniform access to them. A number of technologies exist that enable these resources to be accessed in a Grid environment, but the independent development of these resources means that significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture that supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture to the integration of several autonomous proteomics data resources.