Wrappers for feature subset selection
Artificial Intelligence - Special issue on relevance
Linear Separability of Gene Expression Data Sets
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
A new gene subset selection approach based on linearly separating gene pairs
ICCABS '11 Proceedings of the 2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences
The linear separability of gene expression data sets with respect to two classes has recently been studied in the literature. The problem is to efficiently find all pairs of genes that induce a linear separation of the data. It has been suggested that an underlying molecular mechanism relates the two genes of a separating pair to the phenotype under study, such as a specific cancer. In this paper, we study the Containment Angle (CA), defined on the unit circle for a linearly separating gene pair (LS-pair), as an alternative to the paired t-test ranking function for gene selection. Using the CA, we also show empirically that a given classifier's error is related to the degree of linear separability of a given data set. Finally, we propose gene subset selection methods that select only among linearly separating genes (LS-genes) and LS-pairs, using the CA ranking function for LS-pairs and a ranking function for LS-genes. On many data sets, our methods give better results in terms of subset size and classification accuracy than a well-performing existing method.
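To make the LS-pair search concrete, here is a minimal sketch of one way to test whether a gene pair linearly separates two classes. It rests on a standard geometric fact: two finite point sets in the plane are strictly linearly separable exactly when the directions of all difference vectors b - a (a from one class, b from the other) fit inside an arc of less than pi on the unit circle. The function names (`min_arc`, `is_ls_pair`, `find_ls_pairs`) and the data layout are hypothetical. The abstract does not define the CA, so the arc width returned here is only an illustrative angular measure, not necessarily the paper's Containment Angle. The brute-force loop over all gene pairs is for clarity and is not the efficient search the paper refers to.

```python
import math
from itertools import combinations


def min_arc(points_a, points_b):
    """Width (radians) of the smallest arc on the unit circle that contains
    the directions of all difference vectors b - a.

    Returns None if a point of one class coincides with a point of the
    other, since such a pair can never be strictly separated.
    """
    angles = []
    for ax, ay in points_a:
        for bx, by in points_b:
            dx, dy = bx - ax, by - ay
            if dx == 0 and dy == 0:
                return None
            angles.append(math.atan2(dy, dx))
    angles.sort()
    # Gaps between consecutive directions, plus the wrap-around gap.
    gaps = [nxt - cur for cur, nxt in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])
    # The smallest containing arc is the circle minus the largest empty gap.
    return 2 * math.pi - max(gaps)


def is_ls_pair(points_a, points_b):
    """True if the two 2-D point sets are strictly linearly separable."""
    arc = min_arc(points_a, points_b)
    return arc is not None and arc < math.pi


def find_ls_pairs(expr, labels):
    """Brute-force scan over all gene pairs.

    expr   : dict mapping gene name -> list of expression values per sample
    labels : list of 0/1 class labels, one per sample (assumed layout)
    Returns a dict mapping each separating pair (g, h) -> arc width.
    """
    result = {}
    for g, h in combinations(expr, 2):
        pts = list(zip(expr[g], expr[h]))
        a = [p for p, y in zip(pts, labels) if y == 0]
        b = [p for p, y in zip(pts, labels) if y == 1]
        arc = min_arc(a, b)
        if arc is not None and arc < math.pi:
            result[(g, h)] = arc
    return result


# Toy data (made up for illustration): 3 genes, 4 samples, 2 per class.
toy_expr = {
    "g1": [0.0, 1.0, 0.0, 1.0],
    "g2": [0.0, 0.0, 2.0, 3.0],
    "g3": [0.0, 1.0, 1.0, 0.0],
}
toy_labels = [0, 0, 1, 1]
toy_pairs = find_ls_pairs(toy_expr, toy_labels)
```

In the toy data, the pair (g1, g3) forms an XOR pattern and is rejected, while the two pairs involving g2 are accepted. A narrower arc means the difference vectors are more tightly aligned, which is one intuitive way to think about a "degree" of separability.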