Strangeness-based feature weighting and classification of gene expression profiles

  • Authors: Haifeng Shao; Bei Yu; Joseph Nadeau
  • Affiliations: Case Western Reserve Univ., Cleveland, Ohio; Northwestern University, Evanston, IL; Case Western Reserve Univ., Cleveland, Ohio
  • Venue: Proceedings of the 2008 ACM symposium on Applied computing
  • Year: 2008

Abstract

Achieving high classification accuracy is a major challenge in diagnosing cancer types from gene expression profiles. These profiles are notoriously noisy: a large number of genes may be irrelevant to, or only weakly associated with, disease phenotypes such as tumors. Assigning different weights to genes can reduce or eliminate the influence of these "noisy" signals and thereby improve classification accuracy. We propose a simple, intuitive approach to cancer classification with feature weighting. Our strangeness-based feature weighting method learns a weight for each gene based on its classification performance, and genes with large weights can be used as discriminative genes. We demonstrate that our implementation of a k-NN classifier achieves high classification accuracy on two benchmark cancer data sets. Where accuracy is relatively low, the proposed method can be used as a feature filter. By combining feature weighting with AdaBoost, we achieved better classification accuracy (100%) than with strangeness-based k-NN alone.
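
The abstract does not define strangeness or the exact weighting rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a definition of strangeness often used in the transduction literature: the ratio of a sample's summed distances to its k nearest same-class neighbors over its summed distances to its k nearest other-class neighbors. It further assumes that a gene's weight is the fraction of training samples whose single-gene strangeness falls below 1. The function names (strangeness, feature_weights, weighted_knn_predict) and the toy data are hypothetical.

```python
import math

def strangeness(i, X, y, feature, k=1):
    """Assumed definition: summed distance to the k nearest same-class
    samples divided by summed distance to the k nearest other-class
    samples, measured on a single feature (gene)."""
    same = sorted(abs(X[i][feature] - X[j][feature])
                  for j in range(len(X)) if j != i and y[j] == y[i])
    other = sorted(abs(X[i][feature] - X[j][feature])
                   for j in range(len(X)) if y[j] != y[i])
    eps = 1e-12  # guards against division by zero
    return (sum(same[:k]) + eps) / (sum(other[:k]) + eps)

def feature_weights(X, y, k=1):
    """Assumed weighting rule: fraction of samples that look 'not strange'
    (strangeness < 1) when only this feature is used."""
    n, d = len(X), len(X[0])
    return [sum(strangeness(i, X, y, f, k) < 1.0 for i in range(n)) / n
            for f in range(d)]

def weighted_knn_predict(x, X, y, w, k=3):
    """Majority vote among the k nearest samples under a
    feature-weighted Euclidean distance."""
    dists = sorted(
        (math.sqrt(sum(wf * (a - b) ** 2 for wf, a, b in zip(w, x, xi))), label)
        for xi, label in zip(X, y))
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
     [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

w = feature_weights(X, y)
query = [0.15, 9.05]
print("weights:", w)
print("weighted 3-NN:", weighted_knn_predict(query, X, y, w))
print("unweighted 3-NN:", weighted_knn_predict(query, X, y, [1.0, 1.0]))
```

On this toy set the noisy feature receives a weight of zero, so the weighted classifier ignores it, while the unweighted classifier is misled by it. Thresholding the weights would give a simple feature filter in the spirit of the one the abstract mentions.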