Supervised cluster analysis for microarray data based on multivariate Gaussian mixture

Authors:
Yi Qu; Shizhong Xu
Affiliations:
Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA; Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
Venue:
Bioinformatics
Year:
2004

Abstract

Motivation: Grouping genes with similar expression patterns, known as gene clustering, has proven to be a useful tool for extracting the biological information underlying gene expression data. Many clustering procedures have been applied successfully to microarray gene clustering, and most of them belong to the family of heuristic clustering algorithms. Model-based algorithms are an alternative; they assume that the whole set of microarray data is a finite mixture of distributions of a certain type with different parameters. The application of model-based algorithms to unsupervised clustering has been reported. Here, for the first time, we demonstrate the use of a model-based algorithm in supervised clustering of microarray data.

Results: We applied the proposed methods to real gene expression data and simulated data. We showed that the supervised model-based algorithm is superior to both the unsupervised method and the support vector machine (SVM) method.

Availability: The program, written in the SAS language and implementing methods I-III in this report, is available upon request. The SVM software is available at the website http://svm.sdsc.edu/cgi-bin/nph-SVMsubmit.cgi
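To make the finite-mixture assumption in the Motivation concrete, the following is a minimal Python sketch of supervised classification with one multivariate Gaussian component per known class. It is not the authors' SAS program and does not reproduce methods I-III, whose details the abstract does not give. The function names (`fit_supervised_mixture`, `posterior`, `classify`), the small `ridge` term added to each covariance matrix, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def fit_supervised_mixture(X, labels, ridge=1e-6):
    """Estimate one Gaussian component per known class from labeled genes.

    X is (genes x conditions) with at least two conditions; labels gives
    the known class of each gene. The ridge term is an illustrative
    safeguard against singular covariance estimates (an assumption).
    """
    classes = np.unique(labels)
    params = []
    for c in classes:
        Xc = X[labels == c]
        pi = Xc.shape[0] / X.shape[0]          # mixing proportion
        mu = Xc.mean(axis=0)                   # component mean
        cov = np.cov(Xc, rowvar=False) + ridge * np.eye(X.shape[1])
        params.append((pi, mu, cov))
    return classes, params


def log_gaussian(X, mu, cov):
    """Log density of a multivariate Gaussian evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def posterior(X, params):
    """Posterior probability that each row of X belongs to each component."""
    log_joint = np.column_stack(
        [np.log(pi) + log_gaussian(X, mu, cov) for pi, mu, cov in params]
    )
    log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def classify(X, classes, params):
    """Assign each row of X to the component with the highest posterior."""
    return classes[np.argmax(posterior(X, params), axis=1)]


# Simulated training data (hypothetical): two classes of 50 genes each,
# measured under three conditions, with well-separated means.
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 3)),
    rng.normal(5.0, 1.0, size=(50, 3)),
])
y_train = np.repeat([0, 1], 50)

classes, params = fit_supervised_mixture(X_train, y_train)
X_new = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
predicted = classify(X_new, classes, params)
```

In this sketch the class labels fix which genes inform each component, so the parameters are estimated directly. In the unsupervised setting they would instead have to be estimated iteratively, for example with an EM algorithm.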