While clustering genes remains one of the most popular exploratory tools for expression data, it often produces highly variable and biologically uninformative clusters. This paper explores a data fusion approach to clustering microarray data. Our method, which combines expression data with Gene Ontology (GO)-derived information, is applied to a real data set to perform genome-wide clustering. A set of novel tools is proposed to validate the clustering results and to select a reasonable value for the infusion coefficient. These tools measure stability, biological relevance, and distance from the expression-only clustering solution. Our results indicate that data-fusion clustering leads to more stable, biologically relevant clusters that remain representative of the experimental data.
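The abstract does not state how the two data sources are fused, so the following is only an illustrative sketch under a stated assumption: the infusion coefficient acts as a weight `w` in a convex combination of an expression-based distance matrix and a GO-based distance matrix. The function name `fuse_distances`, the max-rescaling step, and the toy matrices are hypothetical and are not taken from the paper.

```python
import numpy as np


def fuse_distances(d_expr, d_go, w):
    """Blend two gene-by-gene distance matrices with weight w in [0, 1].

    ASSUMPTION: a convex combination is used here for illustration only;
    the paper's actual fusion rule is not given in the abstract.
    """
    d_expr = np.asarray(d_expr, dtype=float)
    d_go = np.asarray(d_go, dtype=float)
    if d_expr.shape != d_go.shape:
        raise ValueError("distance matrices must have the same shape")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")

    def rescale(d):
        # Divide by the largest entry so both sources share a [0, 1] scale.
        peak = d.max()
        return d / peak if peak > 0 else d

    # w = 0 keeps only expression distances; w = 1 keeps only GO distances.
    return (1.0 - w) * rescale(d_expr) + w * rescale(d_go)


# Toy 3-gene example (made-up numbers, purely for illustration).
d_expr = np.array([[0.0, 2.0, 4.0],
                   [2.0, 0.0, 1.0],
                   [4.0, 1.0, 0.0]])
d_go = np.array([[0.0, 0.2, 1.0],
                 [0.2, 0.0, 0.8],
                 [1.0, 0.8, 0.0]])

fused = fuse_distances(d_expr, d_go, w=0.5)
expression_only = fuse_distances(d_expr, d_go, w=0.0)
```

Under this assumption, the fused matrix could be passed to any distance-based clustering routine, and sweeping `w` from 0 to 1 would give the family of solutions whose stability, biological relevance, and distance from the expression-only solution the abstract proposes to compare.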