A robust unified approach to analyzing methylation and gene expression data

  • Authors:
  • Abbas Khalili; Tim Huang; Shili Lin

  • Affiliations:
  • Department of Statistics, The Ohio State University, Columbus, OH 43210, United States; Division of Human Cancer Genetics, Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, United States; Department of Statistics, The Ohio State University, Columbus, OH 43210, United States and Mathematical Biosciences Institute, The Ohio State University, Columbus, OH 43210, United States

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2009

Abstract

Microarray technology has made it possible to investigate the expression levels, and more recently the methylation signatures, of thousands of genes simultaneously in a single biological sample. Because data from different biological systems and technological platforms are being generated at a rapidly increasing rate, there is a growing need for statistical methods that are applicable to multiple data types and platforms. Motivated by this need, a flexible finite mixture model is proposed that is applicable to methylation data, gene expression data, and potentially data from other biological systems. The two major thrusts of the approach are to allow a variable number of components in the mixture, so as to capture non-biological variation and small biases, and to use a robust procedure for parameter estimation and probe classification. The method was applied to the analysis of methylation signatures of three breast cancer cell lines. It was also tested on three sets of expression microarray data to study its power and type I error rates. Comparison with a number of existing methods in the literature yielded encouraging results: in this limited study, the method achieved lower type I error rates and comparable or better power. Furthermore, the method leads to more biologically interpretable results for the three breast cancer cell lines.
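The idea of letting the data determine the number of mixture components can be illustrated with a small sketch. This is not the authors' model or their robust estimation procedure, neither of which is specified in the abstract. It assumes ordinary one-dimensional Gaussian components, plain (non-robust) EM, and BIC as the selection criterion, and every function name below (`fit_mixture`, `select_components`, `classify`) is hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(data, k, n_iter=200, min_sd=1e-3):
    """Fit a k-component 1-D Gaussian mixture by plain EM (assumption: not robust).

    Returns (weights, means, sds, log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    # Start the means at evenly spaced quantiles, with a common spread.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = math.sqrt(sum((x - centre) ** 2 for x in data) / n) or 1.0
    sds = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted updates of weights, means and standard deviations.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor avoids collapsing components
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)) or 1e-300)
        for x in data
    )
    return weights, means, sds, loglik


def bic(loglik, k, n):
    """BIC with 3k - 1 free parameters (k means, k sds, k - 1 weights)."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def select_components(data, max_k=4):
    """Fit mixtures with 1..max_k components and keep the one with the lowest BIC."""
    fits = {k: fit_mixture(data, k) for k in range(1, max_k + 1)}
    scores = {k: bic(fit[3], k, len(data)) for k, fit in fits.items()}
    best = min(scores, key=scores.get)
    return best, fits[best], scores


def classify(x, fit):
    """Assign x to the component with the highest posterior probability."""
    weights, means, sds, _ = fit
    dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    return max(range(len(dens)), key=dens.__getitem__)


if __name__ == "__main__":
    # Simulated stand-in for probe-level scores; not data from the paper.
    random.seed(0)
    scores_sim = [random.gauss(0, 1) for _ in range(300)] + [random.gauss(6, 1) for _ in range(100)]
    best_k, best_fit, bic_scores = select_components(scores_sim, max_k=3)
    print("selected components:", best_k)
    print("component of x = 6.0:", classify(6.0, best_fit))
```

A robust variant would replace the plain M-step with one that downweights outlying observations. The sketch only shows how the number of components can be chosen from the data and how a probe-level value is then assigned to a component.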