Data-Fusion in Clustering Microarray Data: Balancing Discovery and Interpretability

Authors:
Rafal Kustra;Adam Zagdanski
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2010

Abstract

While clustering genes remains one of the most popular exploratory tools for expression data, it often results in highly variable and biologically uninformative clusters. This paper explores a data-fusion approach to clustering microarray data. Our method, which combines expression data and Gene Ontology (GO)-derived information, is applied to a real data set to perform genome-wide clustering. A set of novel tools is proposed to validate the clustering results and pick a fair value of the infusion coefficient. These tools measure stability, biological relevance, and distance from the expression-only clustering solution. Our results indicate that data-fusion clustering leads to more stable, biologically relevant clusters that are still representative of the experimental data.
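
The abstract does not state how the expression and GO-derived information are combined, so the following is only an illustrative sketch under a stated assumption: that each source is summarized as a gene-by-gene dissimilarity matrix and that the infusion coefficient acts as a weight in a convex combination of the two. The function name `fuse_dissimilarities`, the parameter `lam`, and the toy values are hypothetical and are not taken from the paper.

```python
def fuse_dissimilarities(d_expr, d_go, lam):
    """Blend two square dissimilarity matrices with weight lam in [0, 1].

    Assumption (not from the paper): fusion is a convex combination,
    so lam = 0 gives the expression-only matrix and lam = 1 the GO-only one.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    n = len(d_expr)
    same_shape = len(d_go) == n and all(
        len(row_e) == n and len(row_g) == n
        for row_e, row_g in zip(d_expr, d_go)
    )
    if not same_shape:
        raise ValueError("both matrices must be square and the same size")
    return [
        [(1.0 - lam) * d_expr[i][j] + lam * d_go[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy dissimilarities for three hypothetical genes (values are made up).
d_expr = [[0.0, 0.2, 0.9],
          [0.2, 0.0, 0.8],
          [0.9, 0.8, 0.0]]
d_go = [[0.0, 0.6, 0.1],
        [0.6, 0.0, 0.7],
        [0.1, 0.7, 0.0]]

# A small lam keeps the fused matrix close to the expression-only one.
fused = fuse_dissimilarities(d_expr, d_go, lam=0.25)
```

Under this assumption, the fused matrix could be passed to any distance-based clustering routine, and sweeping `lam` from 0 to 1 would trace the trade-off the abstract describes between biological relevance and fidelity to the expression-only solution.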