The difficult interpretation of transcriptome data: the case of the GATC regulatory network

  • Authors:
  • Alessandra Riva;Marie-Odile Delorme;Tony Chevalier;Nicolas Guilhot;Corinne Hénaut;Alain Hénaut

  • Affiliations:
  • CNRS, Laboratoire Génome et Informatique, Tour Evry 2, 523 Place des Terrasses, 91034 Evry cedex, France;CNRS, Laboratoire Génome et Informatique, Tour Evry 2, 523 Place des Terrasses, 91034 Evry cedex, France;METabolic EXplorer S.A., Biopôle Clermont-Limagne, 63 360 Saint-Beauzire, France;METabolic EXplorer S.A., Biopôle Clermont-Limagne, 63 360 Saint-Beauzire, France;METabolic EXplorer S.A., Biopôle Clermont-Limagne, 63 360 Saint-Beauzire, France;CNRS, Laboratoire Génome et Informatique, Tour Evry 2, 523 Place des Terrasses, 91034 Evry cedex, France

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2004

Abstract

Genomic analyses of part of the Escherichia coli chromosome had suggested the existence of a GATC-regulated network. This has recently been confirmed through a transcriptome analysis. Two hypotheses about the molecular control mechanism have been proposed: (i) the regulation of the GATC network is caused by the presence of GATC clusters within the coding sequences, and is the direct consequence of the clusters' hemi-methylation and therefore of their elevated melting temperature; (ii) the regulation is caused by the presence of GATCs in the non-coding 500 bp regions upstream of the affected genes, and is the consequence of an interaction with a regulatory protein such as Fnr or CAP. An analysis of the transcriptome data did not allow us to decide between the two hypotheses. We therefore took a classic genomic approach, analyzing the statistical distribution of GATC along the chromosome and using a realistic model of the chromosome as the theoretical reference. We observe no particular distribution of GATC in the non-coding upstream regions; however, we confirm the presence of GATC clusters within the genes. To verify that the particular distribution observed in E. coli is not a statistical artefact but has a physiological role, we carried out the same analysis on Salmonella, on the hypothesis that the genes containing a GATC cluster should be largely the same in the two bacteria. This was indeed observed, showing that the genes containing a GATC cluster are part of a regulatory network. The present work is a case study which demonstrates that the analysis of transcriptome data does not always make it possible to identify the primary cause of an observed phenomenon; on the other hand, a classic genomic approach combined with a comparative study of related genomes may allow this identification.
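
As a rough illustration of what locating GATC clusters in a sequence can look like, the following Python sketch lists GATC sites and groups neighbouring sites into clusters. It is not the authors' method: the abstract does not give their cluster criterion or their chromosome model, so the function names `gatc_positions` and `gatc_clusters` and the parameters `max_gap` and `min_sites` are hypothetical choices made only for this example. No comparison against a theoretical reference model is attempted here.

```python
from typing import List, Tuple

SITE = "GATC"  # palindromic site, so scanning one strand finds every occurrence


def gatc_positions(seq: str) -> List[int]:
    """Return the 0-based start position of every GATC occurrence in seq."""
    seq = seq.upper()
    positions = []
    i = seq.find(SITE)
    while i != -1:
        positions.append(i)
        i = seq.find(SITE, i + 1)
    return positions


def gatc_clusters(
    seq: str, max_gap: int = 20, min_sites: int = 3
) -> List[Tuple[int, int, int]]:
    """Group GATC sites into clusters.

    Assumption (illustrative only): a cluster is a run of at least
    `min_sites` consecutive sites whose neighbouring start positions
    are at most `max_gap` bases apart.
    Returns (start, end_exclusive, number_of_sites) for each cluster.
    """
    clusters = []
    run: List[int] = []
    for pos in gatc_positions(seq):
        # A gap larger than max_gap closes the current run.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_sites:
                clusters.append((run[0], run[-1] + len(SITE), len(run)))
            run = []
        run.append(pos)
    # Flush the last run.
    if len(run) >= min_sites:
        clusters.append((run[0], run[-1] + len(SITE), len(run)))
    return clusters


if __name__ == "__main__":
    # Toy sequence: three closely spaced sites, then one isolated site.
    toy = "GATC" + "A" * 5 + "GATC" + "A" * 5 + "GATC" + "A" * 200 + "GATC"
    print(gatc_positions(toy))
    print(gatc_clusters(toy))
```

In a study like the one described, the observed cluster counts per gene would then have to be compared with the counts expected under a reference model of the chromosome, and the set of cluster-containing genes compared between E. coli and Salmonella; both steps are outside the scope of this sketch.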