High performance cDNA sequence analysis using grid technology

  • Authors:
  • G. A. Trombetti;I. Merelli;L. Milanesi

  • Affiliations:
  • Istituto di Tecnologie Biomediche, Consiglio Nazionale delle Ricerche, via F.lli Cervi 93, Segrate (Milano), Italy and Università degli Studi di Bologna, Facoltà di Ingegneria, v.le del ...;Istituto di Tecnologie Biomediche, Consiglio Nazionale delle Ricerche, via F.lli Cervi 93, Segrate (Milano), Italy;Istituto di Tecnologie Biomediche, Consiglio Nazionale delle Ricerche, via F.lli Cervi 93, Segrate (Milano), Italy

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2006

Abstract

Innovative DNA sequencers based on pyrosequencing are now being produced; they cut costs and speed up sequencing by an order of magnitude. The capability to handle high-throughput sequencing is therefore becoming increasingly important for bioinformatics. This study concerns the development of a high-performance pipeline for analyzing cDNA sequences produced by a high-throughput pyrosequencer. We developed the system mainly to map the sequenced cDNA strands against a cDNA database in order to study mutations that can influence gene function. The pipeline supports heterozygous organisms. To use a high-throughput pyrosequencer fruitfully, the related bioinformatics analysis must itself deliver high performance. We therefore implemented our analysis system on the European EGEE project infrastructure, a network of computational resources and storage facilities distributed across different sites. The results of the pipeline are stored in an output database directly from the grid sites using Web Services technology. By querying this database, it is possible to inspect the analysis results to detect mutations in the cDNA sequences, as well as other meaningful biological parameters and information.
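The abstract does not describe the pipeline's internals, so the following is only an illustrative sketch of the general idea: reads are split into independent batches (as one might do for separate grid jobs), each batch counts the bases observed at every reference position, and the merged counts are used to flag positions where an organism may be heterozygous. Everything here is an assumption made for illustration: the function names, the `min_fraction` threshold, and the simplification that reads arrive already placed on the reference without gaps. It is a naive ungapped comparison, not a real aligner and not the authors' implementation.

```python
from collections import Counter, defaultdict


def chunk(items, size):
    """Split a list into batches, one per hypothetical grid job."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_alleles(reference, aligned_reads):
    """Count the bases observed at each reference position for one batch.

    aligned_reads is a list of (offset, sequence) pairs. The reads are
    assumed to be already placed on the reference, with no gaps.
    """
    counts = defaultdict(Counter)
    for offset, seq in aligned_reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if pos < len(reference):
                counts[pos][base] += 1
    return counts


def merge(partials):
    """Combine per-batch counts, as a central output store might."""
    merged = defaultdict(Counter)
    for part in partials:
        for pos, counter in part.items():
            merged[pos].update(counter)
    return merged


def call_variants(reference, counts, min_fraction=0.25):
    """Report positions whose well-supported alleles differ from the reference.

    Two alleles above min_fraction at one position suggest a heterozygous
    site. The threshold is an arbitrary illustrative value.
    """
    calls = {}
    for pos, counter in sorted(counts.items()):
        total = sum(counter.values())
        alleles = sorted(b for b, n in counter.items() if n / total >= min_fraction)
        if alleles != [reference[pos]]:
            calls[pos] = alleles
    return calls


if __name__ == "__main__":
    reference = "ACGTACGT"
    reads = [(0, "ACGT"), (0, "ACTT"), (2, "GTAC"), (2, "TTAC")]
    partials = [count_alleles(reference, batch) for batch in chunk(reads, 2)]
    print(call_variants(reference, merge(partials)))
```

Because each batch only produces per-position counts, batches can be processed in any order and on any site, and merging is a simple sum. That property is what makes this kind of analysis a natural fit for a distributed infrastructure.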