Motivation: Classification of biological samples by microarrays is a topic of much interest. A number of methods have been proposed and successfully applied to this problem. It has recently been shown that classification by nearest centroids provides an accurate predictor that may outperform much more complicated methods. The 'Prediction Analysis of Microarrays' (PAM) approach is one such example, which the authors strongly motivate by its simplicity and interpretability. In this spirit, I seek to assess the performance of classifiers simpler than even PAM.

Results: I surprisingly show that the modified t-statistics and shrunken centroids employed by PAM tend to increase misclassification error when compared with their simpler counterparts. Based on these observations, I propose a classification method called 'Classification to Nearest Centroids' (ClaNC). ClaNC ranks genes by standard t-statistics, does not shrink centroids and uses a class-specific gene-selection procedure. Because of these modifications, ClaNC is arguably simpler and easier to interpret than PAM, and it can be viewed as a traditional nearest centroid classifier that uses specially selected genes. I demonstrate that ClaNC error rates tend to be significantly less than those for PAM, for a given number of active genes.

Availability: Point-and-click software is freely available at http://students.washington.edu/adabney/clanc

Contact: adabney@u.washington.edu

Supplementary Information: http://students.washington.edu/adabney/clanc/supplement.pdf
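
The three ingredients named under Results (ranking genes by standard t-statistics, class-specific gene selection, and unshrunken centroids) can be illustrated with a minimal Python sketch. This is not the ClaNC software or its exact procedure: the one-versus-rest unequal-variance t-statistic, the use of the union of per-class top-k genes as the active set, the plain squared Euclidean distance, and the function names `select_genes_per_class`, `fit_centroids` and `predict` are all assumptions made for illustration.

```python
import numpy as np


def select_genes_per_class(X, y, k):
    """Pick the k top-ranked genes for each class; return the sorted union.

    Assumption: each class is compared with all remaining samples using an
    unequal-variance two-sample t-statistic.
    """
    active = set()
    for c in np.unique(y):
        in_c, rest = X[y == c], X[y != c]
        se = np.sqrt(in_c.var(axis=0, ddof=1) / len(in_c)
                     + rest.var(axis=0, ddof=1) / len(rest))
        t = (in_c.mean(axis=0) - rest.mean(axis=0)) / (se + 1e-12)
        top = np.argsort(-np.abs(t))[:k]  # largest |t| first
        active.update(top.tolist())
    return np.array(sorted(active))


def fit_centroids(X, y, genes):
    """Plain (unshrunken) class means, restricted to the active genes."""
    classes = np.unique(y)
    centroids = np.vstack([X[y == c][:, genes].mean(axis=0) for c in classes])
    return classes, centroids


def predict(X, genes, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    diff = X[:, genes][:, None, :] - centroids[None, :, :]
    dist = (diff ** 2).sum(axis=2)  # squared Euclidean distance (assumption)
    return classes[dist.argmin(axis=1)]


# Synthetic data for illustration only: 3 classes, 20 samples each, 50 genes.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 20)
X = rng.normal(size=(60, 50))
for c in range(3):
    X[y == c, 2 * c:2 * c + 2] += 4.0  # two informative genes per class

genes = select_genes_per_class(X, y, k=2)
classes, centroids = fit_centroids(X, y, genes)
train_accuracy = (predict(X, genes, classes, centroids) == y).mean()
print("active genes:", genes)
print("training accuracy:", train_accuracy)
```

Here `k` plays the role of the number of active genes per class. The accuracy printed is measured on the training data and is only a sanity check of the sketch, not an estimate of misclassification error.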