Boosting and Microarray Data

  • Authors:
  • Philip M. Long;Vinsensius Berlian Vega

  • Affiliations:
  • Genome Institute of Singapore. gislongp@nus.edu.sg;Genome Institute of Singapore

  • Venue:
  • Machine Learning
  • Year:
  • 2003

Abstract

We have found one reason why AdaBoost tends not to perform well on gene expression data, and identified simple modifications that improve its ability to find accurate class prediction rules. These modifications appear to be especially needed when there is a strong association between expression profiles and class designations. Cross-validation analysis of six microarray datasets with different characteristics suggests that, suitably modified, boosting provides competitive classification accuracy in general.

Sometimes the goal in a microarray analysis is to find a class prediction rule that is not only accurate, but that also depends on the expression levels of only a few genes. Because boosting makes an effort to find genes that are complementary sources of evidence of the correct classification of a tissue sample, it appears especially useful for such gene-efficient class prediction. This appears to be particularly true when there is a strong association between expression profiles and class designations, which is often the case, for example, when comparing tumor and normal samples.
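
To make the idea of gene-efficient prediction concrete, the following is a minimal sketch of textbook AdaBoost with single-gene threshold rules (decision stumps), where the number of boosting rounds caps the number of genes the final rule can depend on. It is not the modified procedure the abstract refers to, which is not described here. The function names (`stump_predict`, `best_stump`, `adaboost`, `predict`) and the six-sample, four-gene toy matrix are hypothetical and chosen only for illustration.

```python
import math


def stump_predict(row, gene, threshold, sign):
    """Predict `sign` when the gene's expression exceeds the threshold, else -sign."""
    return sign if row[gene] > threshold else -sign


def best_stump(X, y, w):
    """Return (weighted_error, gene, threshold, sign) of the best single-gene rule."""
    n, d = len(X), len(X[0])
    total = sum(w)
    best = (float("inf"), 0, 0.0, 1)
    for gene in range(d):
        values = sorted(set(row[gene] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            # Weighted error of the rule "predict +1 when expression > t".
            err = sum(w[i] for i in range(n)
                      if stump_predict(X[i], gene, t, 1) != y[i])
            # The flipped rule errs exactly where the original rule is right.
            for sign, e in ((1, err), (-1, total - err)):
                if e < best[0]:
                    best = (e, gene, t, sign)
    return best


def adaboost(X, y, rounds):
    """Standard AdaBoost; returns a list of (alpha, gene, threshold, sign)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, gene, t, sign = best_stump(X, y, w)
        # Clamp so the logarithm stays finite if a stump is perfect.
        err = min(max(err, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, gene, t, sign))
        # Up-weight misclassified samples, down-weight the rest, renormalize.
        w = [w[i] * math.exp(-alpha * y[i] * stump_predict(X[i], gene, t, sign))
             for i in range(n)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model


def predict(model, row):
    """Weighted vote of the stumps; returns +1 or -1."""
    score = sum(a * stump_predict(row, g, t, s) for a, g, t, s in model)
    return 1 if score > 0 else -1


# Hypothetical toy data: rows are tissue samples, columns are genes.
# Genes 0-2 each miss a different positive sample; gene 3 is noise.
X = [
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
y = [1, 1, 1, -1, -1, -1]

model = adaboost(X, y, rounds=3)
genes_used = sorted({gene for _, gene, _, _ in model})
train_errors = sum(predict(model, row) != label for row, label in zip(X, y))
print("genes used:", genes_used, "training errors:", train_errors)
```

In this toy setting, each round reweights the samples so that the next stump is drawn toward a gene that corrects the previous mistakes, which is one way to read "complementary sources of evidence". Limiting `rounds` is a simple, assumed way to bound how many genes the rule uses; training error on six samples says nothing about accuracy on real microarray data, which would need cross-validation as in the abstract.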