Differentially Expressed Gene Identification Based on Separability Index

  • Authors:
  • Meir Perez; Jonathan Featherston; David M. Rubin; Tshilidzi Marwala; Lesley E. Scott; Wendy Stevens

  • Venue:
  • ICMLA '09 Proceedings of the 2009 International Conference on Machine Learning and Applications
  • Year:
  • 2009

Abstract

The identification of differentially expressed genes is central to microarray data analysis. This paper presents an approach to differentially expressed gene identification based on a Separability Index (SI). Features are selected by finding the number of top-ranking genes that yields maximum class separability. The approach was applied to a training dataset of 400 samples from three types of cancer: colon, breast and lung. The top 4222 genes gave a maximum separability of 91%. These genes were then used to classify a testing dataset of 250 samples with a K-nearest neighbour (K-NN) classifier, achieving an accuracy of 92%. This outperformed a K-NN classifier trained on features selected based on p
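
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the Separability Index or the gene-ranking criterion, so this sketch assumes a nearest-neighbour form of SI (the fraction of samples whose nearest neighbour, by Euclidean distance, has the same class label) and takes the gene ranking as a given input. The function names `separability_index` and `select_top_genes` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def separability_index(samples, labels):
    """Assumed SI: fraction of samples whose nearest neighbour shares their label."""
    n = len(samples)
    hits = 0
    for i in range(n):
        best_j, best_d = None, math.inf
        for j in range(n):
            if i == j:
                continue
            d = math.dist(samples[i], samples[j])  # Euclidean distance
            if d < best_d:
                best_d, best_j = d, j
        hits += labels[best_j] == labels[i]
    return hits / n


def select_top_genes(samples, labels, ranking):
    """Return (k, si) for the smallest k whose top-k ranked genes maximise SI.

    `ranking` lists gene (column) indices, best first; how the genes are
    ranked is outside this sketch.
    """
    best_k, best_si = 0, -1.0
    for k in range(1, len(ranking) + 1):
        cols = ranking[:k]
        reduced = [[row[g] for g in cols] for row in samples]
        si = separability_index(reduced, labels)
        if si > best_si:  # strict comparison keeps the smallest k on ties
            best_k, best_si = k, si
    return best_k, best_si


# Toy data (hypothetical): gene 0 separates the classes, gene 1 does not.
toy_samples = [[0.0, 5.0], [0.1, 0.0], [1.0, 5.1], [1.1, 0.1]]
toy_labels = ["a", "a", "b", "b"]
best_k, best_si = select_top_genes(toy_samples, toy_labels, ranking=[0, 1])
```

On the toy data, keeping only the informative gene gives perfect separability, while adding the uninformative gene lowers it, which is the effect the top-k search is meant to detect. The exhaustive loop recomputes all pairwise distances for every k, so on thousands of genes a real implementation would need a faster search.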