Motivation: An important issue in stem cell biology is understanding how to direct differentiation towards a specific cell type. To elucidate the mechanism, previous studies have focused on identifying the responsible gene regulators, but they have failed to provide a systemic view of the regulatory modules. To obtain a unified description of these modules, we characterized major stem cell species using a co-clustering latent variable model (LVM). The LVM-based method allowed us to identify cell type-specific transcription factors from genomic sequences as well as expression profiles.

Results: We used a list of genes enriched in each of 21 stem cell subpopulations, together with their upstream genomic sequences. The LVM-based study uncovered the regulatory modules for each stem cell cluster, e.g. GABP and E2F for the proliferation phase, and Ap2α and Ap2γ for the quiescence phase. Furthermore, the identities of the stem cell clusters were clearly revealed by the constituent genes directly targeted by the modules. This detailed case study of stem cell differentiation demonstrates the usefulness of our analytical framework, which can be applied to problems with similar characteristics.

Contact: btzhang@bi.snu.ac.kr, rhseong@snu.ac.kr

Supplementary Information: Supplementary data are available at http://bi.snu.ac.kr/Publications/LVM_SC/.