A multi-step approach to time series analysis and gene expression clustering

  • Authors:
  • R. Amato;A. Ciaramella;N. Deniskina;C. Del Mondo;D. Di Bernardo;C. Donalek;G. Longo;G. Mangano;G. Miele;G. Raiconi;A. Staiano;R. Tagliaferri

  • Affiliations:
  • Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Matematica e Informatica, University of Salerno Fisciano, Salerno, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Telethon Institute of Genetics and Medicine Naples, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Matematica e Informatica, University of Salerno Fisciano, Salerno, ITALY;Dipartimento di Scienze Fisiche, University of Naples 'Federico II' Naples, ITALY;Dipartimento di Matematica e Informatica, University of Salerno Fisciano, Salerno, ITALY

  • Venue:
  • Bioinformatics
  • Year:
  • 2006

Quantified Score

Hi-index 3.84

Abstract

Motivation: The rapid growth of gene expression data calls for automatic tools for data processing and interpretation.

Results: We present a new and comprehensive machine learning and data mining framework for clustering gene microarray data. It consists of a non-linear PCA neural network for feature extraction, and probabilistic principal surfaces combined with an agglomerative approach based on negentropy. The method, which provides a user-friendly visualization interface, works on noisy data with missing points and determines the number of clusters present in the data automatically, with no a priori assumptions. Application to a cell-cycle dataset, together with a detailed analysis, confirms the biological nature of the most significant clusters.

Availability: The software described here is a subpackage of the ASTRONEURAL package and is available upon request from the corresponding author.

Contact: robtag@unisa.it

Supplementary information: Supplementary data are available at Bioinformatics online.
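
The agglomerative, negentropy-based step named in the Results can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify the details. The sketch assumes that an earlier stage (probabilistic principal surfaces in the paper) has already over-segmented the data into candidate clusters. It further assumes that each pair of clusters is projected onto the line joining their centroids, and that negentropy is estimated with a standard moment-based approximation (skewness and excess kurtosis). The names `negentropy`, `merge_clusters`, and `threshold` are hypothetical.

```python
import numpy as np


def negentropy(y):
    """Moment-based negentropy approximation of a 1-D sample.

    Close to 0 for Gaussian-looking data, larger for e.g. bimodal data.
    (Assumed estimator; the abstract does not say which one is used.)
    """
    y = np.asarray(y, dtype=float)
    std = y.std()
    if std == 0.0:
        return 0.0
    z = (y - y.mean()) / std
    skew = np.mean(z ** 3)
    excess_kurt = np.mean(z ** 4) - 3.0
    return skew ** 2 / 12.0 + excess_kurt ** 2 / 48.0


def merge_clusters(X, labels, threshold=0.01):
    """Greedily merge the pair of clusters whose union looks most Gaussian.

    Merging stops when every remaining pair has negentropy above
    `threshold` (a hypothetical tuning parameter), so the final number
    of clusters is not fixed in advance.
    """
    labels = np.asarray(labels).copy()
    while True:
        ids = np.unique(labels)
        best = None  # (score, kept_label, absorbed_label)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                A, B = X[labels == a], X[labels == b]
                direction = A.mean(axis=0) - B.mean(axis=0)
                norm = np.linalg.norm(direction)
                if norm == 0.0:
                    score = 0.0  # identical centroids: treat as one cluster
                else:
                    # 1-D projection of the union onto the centroid axis
                    proj = np.vstack([A, B]) @ (direction / norm)
                    score = negentropy(proj)
                if best is None or score < best[0]:
                    best = (score, a, b)
        if best is None or best[0] > threshold:
            return labels
        labels[labels == best[2]] = best[1]
```

Calling `merge_clusters(X, labels)` with an `(n_samples, n_features)` array and integer labels from any initial clustering returns relabelled clusters. The number of distinct labels left is the estimated cluster count under these assumptions.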