The availability of a great range of prior biological knowledge about the roles and functions of genes and gene-gene interactions allows us to simplify the analysis of gene expression data to make it more robust, compact, and interpretable. Here, we objectively analyze the applicability of functional clustering for the identification of groups of functionally related genes. The analysis is performed in terms of gene expression classification and uses predictive accuracy as an unbiased performance measure. Features of biological samples that originally corresponded to genes are replaced by features that correspond to the centroids of the gene clusters and are then used for classifier learning. Using 10 benchmark data sets, we demonstrate that functional clustering significantly outperforms random clustering without biological relevance. We also show that functional clustering performs comparably to gene expression clustering, which groups genes according to the similarity of their expression profiles. Finally, the suitability of functional clustering as a feature extraction technique is evaluated and discussed.
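The feature replacement step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes that a cluster centroid is the per-sample mean expression of the genes in that cluster, and that the random baseline shuffles gene-to-cluster labels while preserving cluster sizes. The abstract specifies neither detail. The function names `centroid_features` and `random_clusters`, the toy expression matrix, and the `pathway_A`/`pathway_B` labels are all hypothetical.

```python
import random
from collections import defaultdict


def centroid_features(expression, gene_clusters):
    """Replace per-gene features with per-cluster centroid features.

    expression: list of samples, each a list of per-gene expression values.
    gene_clusters: one cluster label per gene (column).
    Returns (labels, features), where features[i][k] is the mean expression
    in sample i of the genes assigned to cluster labels[k].
    Assumption: centroid = arithmetic mean over the cluster's genes.
    """
    members = defaultdict(list)
    for gene_index, label in enumerate(gene_clusters):
        members[label].append(gene_index)
    labels = sorted(members)
    features = [
        [sum(sample[g] for g in members[label]) / len(members[label])
         for label in labels]
        for sample in expression
    ]
    return labels, features


def random_clusters(gene_clusters, seed=0):
    """Baseline without biological relevance: shuffle labels across genes.

    Assumption: cluster sizes are preserved, only membership is randomized.
    """
    shuffled = list(gene_clusters)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Toy data (invented): 3 samples x 4 genes, two functional gene groups.
expression = [
    [1.0, 3.0, 10.0, 20.0],
    [2.0, 4.0, 30.0, 50.0],
    [0.0, 0.0, 5.0, 7.0],
]
functional = ["pathway_A", "pathway_A", "pathway_B", "pathway_B"]

labels, features = centroid_features(expression, functional)
baseline_labels, baseline_features = centroid_features(
    expression, random_clusters(functional)
)
```

Either feature matrix has one column per cluster instead of one per gene, so it can be passed to any classifier in place of the original gene-level matrix. Comparing the predictive accuracy obtained from `features` and `baseline_features` under cross-validation mirrors the functional-versus-random comparison described in the abstract.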