Classification using functional data analysis for temporal gene expression data

  • Authors: Xiaoyan Leng; Hans-Georg Müller

  • Affiliations: Wake Forest University School of Medicine, Public Health Sciences, Section on Biostatistics, Medical Center Blvd., MRI-3, Winston-Salem, NC 27157, USA; Department of Statistics, University of California, One Shields Avenue, Davis, CA 95616, USA

  • Venue: Bioinformatics
  • Year: 2006

Abstract

Motivation: Temporal gene expression profiles provide an important characterization of gene function, as biological systems are predominantly developmental and dynamic. We propose a method for classifying collections of temporal gene expression curves in which individual expression profiles are modeled as independent realizations of a stochastic process. The method uses a recently developed functional logistic regression tool based on functional principal components, aimed at classifying gene expression curves into known gene groups. The number of eigenfunctions in the classifier can be chosen by leave-one-out cross-validation with the aim of minimizing the classification error.

Results: We demonstrate that this methodology provides low-error-rate classification for both yeast cell-cycle gene expression profiles and Dictyostelium cell-type-specific gene expression patterns. It also works well in simulations. We compare our functional principal components approach with a B-spline implementation of functional discriminant analysis on the yeast cell-cycle data and in simulations. The comparison indicates advantages of our approach, which uses fewer eigenfunctions/basis functions. The proposed methodology is promising for the analysis of temporal gene expression data and beyond.

Availability: MATLAB programs are available upon request.

Contact: ileng@wfubmc.edu

Supplementary information: Supplementary materials are available on the journal's website.
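
The pipeline described in the abstract (functional principal component scores fed into a logistic regression, with the number of eigenfunctions chosen by leave-one-out cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. It assumes curves observed without gaps on a common time grid, so the eigenfunctions are approximated by the singular vectors of the centered data matrix. The simulated two-group data, the function names (`fpca_fit`, `logistic_fit`, `loocv_error`), and the small ridge penalty that keeps the logistic fit stable when groups are well separated are all assumptions made for illustration.

```python
import numpy as np

def fpca_fit(X, K):
    """Discretized FPCA on a common grid: mean curve and first K eigenvectors."""
    mu = X.mean(axis=0)
    # Right singular vectors of the centered data are the eigenvectors
    # of the sample covariance matrix, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:K].T  # phi has shape (n_timepoints, K)

def logistic_fit(Z, y, ridge=1e-2, n_iter=100):
    """Newton-Raphson logistic regression on the scores Z.
    The ridge penalty on all coefficients is an assumption of this sketch."""
    A = np.column_stack([np.ones(len(Z)), Z])  # intercept + scores
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(A @ beta, -30, 30)))
        grad = A.T @ (y - p) - ridge * beta
        hess = A.T @ (A * (p * (1 - p))[:, None]) + ridge * np.eye(len(beta))
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def loocv_error(X, y, K):
    """Leave-one-out misclassification rate using K eigenfunctions.
    FPCA and the classifier are both refit without the held-out curve."""
    n = len(y)
    wrong = 0
    for i in range(n):
        train = np.arange(n) != i
        mu, phi = fpca_fit(X[train], K)
        beta = logistic_fit((X[train] - mu) @ phi, y[train])
        score = (X[i] - mu) @ phi
        pred = int(beta[0] + score @ beta[1:] > 0)
        wrong += int(pred != y[i])
    return wrong / n

# Hypothetical data: two groups of noisy curves with phase-shifted means.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 18)
n_per_group = 20
y = np.repeat([0, 1], n_per_group)
means = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
amplitude = rng.normal(1.0, 0.2, size=(2 * n_per_group, 1))
X = amplitude * means[y] + rng.normal(0, 0.3, size=(2 * n_per_group, t.size))

# Choose the number of eigenfunctions that minimizes the LOOCV error
# (ties resolve to the smallest K because the dict is ordered by K).
errors = {K: loocv_error(X, y, K) for K in range(1, 5)}
best_K = min(errors, key=errors.get)
print("LOOCV error by K:", errors)
print("selected K:", best_K)
```

Refitting the principal components inside each cross-validation fold keeps the held-out curve from influencing the basis it is scored against. For sparsely or irregularly sampled expression profiles, the SVD step would need to be replaced by a smoothing-based covariance estimate, which this sketch does not attempt.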