A Mixed Factors Model for Dimension Reduction and Extraction of a Group Structure in Gene Expression Data

Authors:
Ryo Yoshida;Tomoyuki Higuchi;Seiya Imoto
Affiliations:
Graduate University for Advanced Studies;Institute of Statistical Mathematics;University of Tokyo
Venue:
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Year:
2004

Abstract

When we cluster tissue samples on the basis of gene expression, the number of observations to be grouped is much smaller than the dimension of the feature vector. In such cases, the applicability of conventional model-based clustering is limited, since the high dimensionality of the feature vector leads to overfitting during density estimation. To overcome this difficulty, we attempt a methodological extension of factor analysis. Our approach not only prevents overfitting but also handles clustering, data compression, and the extraction of a set of genes relevant to explaining the group structure. The potential usefulness of the approach is demonstrated by an application to the leukemia dataset.
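
To make the setting concrete, the following is a minimal sketch of the general idea of compressing the data first and clustering in the reduced space. It is not the mixed factors model itself, which the abstract describes only as an extension of factor analysis. As a plainly simpler stand-in, it uses a two-stage pipeline: a principal component projection computed by SVD, followed by k-means on the low-dimensional scores. The synthetic data, the function names (project_to_factors, kmeans, demo) and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def project_to_factors(X, q):
    """Project n x p data onto q principal directions.

    Assumption: PCA scores stand in for factor scores here.
    """
    Xc = X - X.mean(axis=0)
    # Economy SVD: with n << p this never forms a p x p covariance matrix.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :q] * s[:q]   # n x q low-dimensional coordinates
    loadings = Vt[:q].T         # p x q, one column per direction
    return scores, loadings


def kmeans(Z, k, n_init=10, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with restarts; keeps the lowest-inertia run."""
    rng = np.random.default_rng(seed)
    best_labels, best_inertia = None, np.inf
    for _restart in range(n_init):
        centers = Z[rng.choice(len(Z), size=k, replace=False)]
        for _step in range(n_iter):
            dist2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist2.argmin(axis=1)
            new_centers = centers.copy()
            for j in range(k):
                if np.any(labels == j):  # leave an empty cluster's center in place
                    new_centers[j] = Z[labels == j].mean(axis=0)
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        # Recompute the assignment for the final centers before scoring the run.
        dist2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        inertia = dist2[np.arange(len(Z)), labels].sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels


def demo():
    """Synthetic example: 40 samples, 500 'genes', 20 of which separate two groups."""
    rng = np.random.default_rng(1)
    n_per_group, p, n_informative = 20, 500, 20
    X = rng.normal(size=(2 * n_per_group, p))
    X[n_per_group:, :n_informative] += 5.0  # group shift on the informative genes only
    scores, loadings = project_to_factors(X, q=2)
    labels = kmeans(scores, k=2)
    # Rank genes by the size of their loading on the first direction.
    top_genes = np.argsort(-np.abs(loadings[:, 0]))[:n_informative]
    return labels, top_genes


if __name__ == "__main__":
    labels, top_genes = demo()
    print(labels)
    print(np.sort(top_genes))
```

In this toy setting the clustering step only ever sees a 40 x 2 score matrix rather than the 40 x 500 data, which is the sense in which dimension reduction guards against overfitting, and the loading magnitudes give a rough ranking of genes by relevance to the group structure. A model-based approach of the kind the abstract describes would estimate the compression and the grouping jointly rather than in two separate stages.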