Sharing mass spectrometry data in a grid-based distributed proteomics laboratory

Authors:
P. Veltri;M. Cannataro;G. Tradigo
Affiliations:
Università Magna Græcia di Catanzaro, Italy;Università Magna Græcia di Catanzaro, Italy;Università Magna Græcia di Catanzaro, Italy
Venue:
Information Processing and Management: an International Journal
Year:
2007

Abstract

Data produced by mass spectrometry (MS) have been used in proteomics experiments to identify proteins or patterns in clinical samples that may be responsible for human diseases. MS-based proteomics is becoming a powerful, widely used technique for identifying molecular targets in different pathological contexts. MS samples contain a huge amount of data; retrieving such information requires access to large volumes of data, so an efficient organization is necessary both to reduce access time and to allow efficient knowledge extraction. Bioinformatics laboratories often use more than one mass spectrometer to improve efficiency, which greatly increases the volume of data produced by experiments. Moreover, experimental data are enriched by observations and descriptions that specialists add as metadata. Thus, retrieval of spectra data (and of the metadata describing them) within a laboratory and among different laboratories requires large, scalable storage solutions and high-performance computational platforms. We present a software system for managing, sharing, and querying MS data in a distributed laboratory. It is built on a spectra data management system called SpecDB, in which information retrieval is performed using computational grid facilities. Information retrieval can be conducted either locally, by considering portions of spectra data, or in a distributed scenario, by exploiting metadata and annotations about spectra datasets stored on the grid.
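
The two retrieval modes described in the abstract can be illustrated with a minimal Python sketch. It is not SpecDB's interface, which the abstract does not specify: the `Spectrum` class, the functions `local_portion` and `distributed_lookup`, the site names, and the sample values are all hypothetical, and the grid is stood in for by a plain in-memory dictionary of sites.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Spectrum:
    """Hypothetical record: (m/z, intensity) peaks plus specialist metadata."""
    spectrum_id: str
    peaks: List[Tuple[float, float]]
    metadata: Dict[str, str] = field(default_factory=dict)


def local_portion(spectrum: Spectrum, mz_min: float, mz_max: float) -> List[Tuple[float, float]]:
    """Local retrieval: keep only the peaks whose m/z lies in [mz_min, mz_max]."""
    return [(mz, inten) for mz, inten in spectrum.peaks if mz_min <= mz <= mz_max]


def distributed_lookup(sites: Dict[str, List[Spectrum]], **criteria: str) -> List[Tuple[str, str]]:
    """Distributed retrieval: match on metadata only, without reading the peaks.

    Returns (site name, spectrum id) pairs for every spectrum whose metadata
    equals all of the given key/value criteria.
    """
    hits = []
    for site_name, spectra in sites.items():
        for s in spectra:
            if all(s.metadata.get(key) == value for key, value in criteria.items()):
                hits.append((site_name, s.spectrum_id))
    return hits


# Assumed sample data: two laboratories, each holding a few annotated spectra.
sites = {
    "lab_a": [
        Spectrum("s1", [(1000.5, 12.0), (1500.2, 40.0), (2100.9, 7.5)],
                 {"instrument": "MALDI-TOF", "sample": "serum"}),
    ],
    "lab_b": [
        Spectrum("s2", [(900.1, 3.0), (1499.8, 22.0)],
                 {"instrument": "ESI", "sample": "serum"}),
        Spectrum("s3", [(1200.0, 5.0)],
                 {"instrument": "MALDI-TOF", "sample": "urine"}),
    ],
}

serum_hits = distributed_lookup(sites, sample="serum")
window = local_portion(sites["lab_a"][0], 1400.0, 1600.0)
```

In this sketch the distributed query touches only the metadata, so the bulky peak lists would need to be transferred only for the spectra that match; the local query then narrows a single spectrum to the m/z window of interest.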