A mixture model approach for the analysis of microarray gene expression data
Computational Statistics & Data Analysis
DNA microarrays make it possible to study the expression of thousands of genes simultaneously in a single biological sample. Univariate clustering techniques have been used to discover target genes that are differentially expressed between two experimental conditions. Because reducing the data to univariate summary statistics can lose information, multivariate statistics may be more effective. We present clustering analyses based on multivariate normal mixture models to detect differential gene expression between two conditions. Departing from the general mixture model and model-based clustering, we propose mixture models with specific mean and covariance structures that account for special features of two-condition microarray experiments. Explicit updating formulas for the EM algorithm are derived for three such models. To illustrate the performance and usefulness of the approach, the methods are applied to a real dataset comparing the expression levels of 1176 genes in rats with and without pneumococcal middle-ear infection. About 10 genes are found to be differentially expressed under a six-dimensional model and about 20 under a bivariate model. Two simulation studies compare the performance of univariate and multivariate methods; neither consistently dominates the other, and which performs better depends on the data. The results suggest that multivariate normal mixture models can be useful alternatives to univariate methods for detecting differential gene expression in exploratory data analysis.