Mining multiple phenotype structures underlying gene expression profiles

Authors:
Chun Tang;Aidong Zhang
Affiliations:
State University of New York at Buffalo, Buffalo, NY;State University of New York at Buffalo, Buffalo, NY
Venue:
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Year:
2003

Abstract

DNA microarray technology is now widely used in basic biomedical research for mRNA expression profiling and is increasingly being used to explore patterns of gene expression in clinical research. Automatically detecting phenotype structures from gene expression profiles can provide deep insight into the nature of many diseases and can lead to the development of new drugs. While most previous studies focus only on mining the empirical phenotype structure that the experiment controls, it is also of interest to detect possible hidden phenotype structures underlying gene expression profiles. Since the number of samples is usually limited, such data sets are very sparse in the high-dimensional gene space. Furthermore, most of the genes of interest are buried in a large amount of noise. Unsupervised phenotype structure discovery in such sparse, high-dimensional data sets presents an interesting but challenging problem. In this paper, we propose a model for simultaneously mining both empirical and hidden phenotype structures from gene expression data. We demonstrate the effectiveness and efficiency of the proposed method on various real-world data sets.