Explanatory Analysis of the Metabolome Using Genetic Programming of Simple, Interpretable Rules

  • Authors:
  • Helen E. Johnson;Richard J. Gilbert;Michael K. Winson;Royston Goodacre;Aileen R. Smith;Jem J. Rowland;Michael A. Hall;Douglas B. Kell

  • Affiliations:
  • Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, hej93@aber.ac.uk;Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, rcg@aber.ac.uk;Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, mkw@aber.ac.uk;Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, rrg@aber.ac.uk;Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, ars@aber.ac.uk;Department of Computer Science, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, jjr@aber.ac.uk;Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, mzh@aber.ac.uk;Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion SY23 3DD, UK, dbk@aber.ac.uk

  • Venue:
  • Genetic Programming and Evolvable Machines
  • Year:
  • 2000

Abstract

Genetic programming, in conjunction with advanced analytical instruments, is a novel tool for the investigation of complex biological systems at the whole-tissue level. In this study, samples from tomato fruit grown hydroponically under both high- and low-salt conditions were analysed using Fourier-transform infrared spectroscopy (FTIR), with the aim of identifying spectral and biochemical features linked to salinity in the growth environment. FTIR spectra of whole tissue extracts are not amenable to direct visual analysis, so numerical modelling methods were used to generate models capable of classifying the samples based on their spectral characteristics. Genetic programming (GP) provided models with better prediction accuracy than the conventional data modelling methods used, whilst being much easier to interpret in terms of the variables used. Examination of the GP-derived models showed that a small number of spectral regions were used consistently. In particular, the spectral region containing absorbances potentially due to a cyanide/nitrile functional group was identified as discriminatory. The explanatory power of the GP models enabled a chemical interpretation of the biochemical differences to be proposed. The combination of FTIR and GP is therefore a powerful and novel analytical tool that, in this study, improves our understanding of the biochemistry of salt tolerance in tomato plants.