Mining phenotypes and informative genes from gene expression data

Authors:
Chun Tang;Aidong Zhang;Jian Pei
Affiliations:
State University of New York at Buffalo, Buffalo, NY;State University of New York at Buffalo, Buffalo, NY;State University of New York at Buffalo, Buffalo, NY
Venue:
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2003

Citing 3
Cited 9

Automatic subspace clustering of high dimensional data for data mining applications

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Biclustering of Expression Data

Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
d-Clusters: Capturing Subspace Correlation in a Large Data Set

ICDE '02 Proceedings of the 18th International Conference on Data Engineering

Mining coherent gene clusters from gene-sample-time microarray data

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Redundancy based feature selection for microarray data

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Cluster Analysis for Gene Expression Data: A Survey

IEEE Transactions on Knowledge and Data Engineering
Computational aspects of mining maximal frequent patterns

Theoretical Computer Science
Multiclass Cancer Classification Using Semisupervised Ellipsoid ARTMAP and Particle Swarm Optimization with Gene Expression Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Direct integration of microarrays for selecting informative genes and phenotype classification

Information Sciences: an International Journal
Efficiently mining local conserved clusters from gene expression data

Neurocomputing
Subspace clustering of microarray data based on domain transformation

VDMB'06 Proceedings of the First international conference on Data Mining and Bioinformatics
Gene Co-Adaboost: a semi-supervised approach for classifying gene expression data

Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine

Quantified Score

Hi-index	0.00

Visualization

Abstract

Mining microarray gene expression data is an important research topic in bioinformatics with broad applications. While most of the previous studies focus on clustering either genes or samples, it is interesting to ask whether we can partition the complete set of samples into exclusive groups (called phenotypes) and find a set of informative genes that can manifest the phenotype structure. In this paper, we propose a new problem of simultaneously mining phenotypes and informative genes from gene expression data. Some statistics-based metrics are proposed to measure the quality of the mining results. Two interesting algorithms are developed: the heuristic search and the mutual reinforcing adjustment method. We present an extensive performance study on both real-world data sets and synthetic data sets. The mining results from the two proposed methods are clearly better than those from the previous methods. They are ready for the real-world applications. Between the two methods, the mutual reinforcing adjustment method is in general more scalable, more effective and with better quality of the mining results.