Selective voting in convex-hull ensembles improves classification accuracy

  • Authors:
  • Ralph L. Kodell;Chuanlei Zhang;Eric R. Siegel;Radhakrishnan Nagarajan

  • Affiliations:
  • Department of Biostatistics, #781, University of Arkansas for Medical Sciences, 4301 W. Markham St., Little Rock, AR 72205, United States;Department of Biostatistics, #781, University of Arkansas for Medical Sciences, 4301 W. Markham St., Little Rock, AR 72205, United States;Department of Biostatistics, #781, University of Arkansas for Medical Sciences, 4301 W. Markham St., Little Rock, AR 72205, United States;Division of Biomedical Informatics, #782, University of Arkansas for Medical Sciences, 4301 W. Markham St., Little Rock, AR 72205, United States

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2012

Abstract

Objective: Classification algorithms can be used to predict risks and responses of patients based on genomic and other high-dimensional data. While there is optimism for using these algorithms to improve the treatment of diseases, they have yet to demonstrate sufficient predictive ability for routine clinical practice. They generally classify all patients according to the same criteria, under an implicit assumption of population homogeneity. The objective here is to allow for population heterogeneity, possibly unrecognized, in order to increase classification accuracy and further the goal of tailoring therapies on an individualized basis.

Methods and materials: A new selective-voting algorithm is developed in the context of a classifier ensemble of two-dimensional convex hulls of positive and negative training samples. Individual classifiers in the ensemble are allowed to vote on test samples only if those samples are located within or behind pruned convex hulls of training samples that define the classifiers.

Results: Validation of the new algorithm's increased accuracy is carried out using two publicly available datasets having cancer as the outcome variable and expression levels of thousands of genes as predictors. Selective voting leads to statistically significant increases in accuracy from 86.0% to 89.8% (p
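
The selective-voting idea described under Methods and materials can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one classifier per pair of features, builds a two-dimensional convex hull for the positive and for the negative training samples, and lets a classifier vote only when the test sample falls inside exactly one of the two hulls. The abstract does not define hull pruning or the "behind" rule, so both are omitted here. The names `convex_hull`, `inside` and `selective_vote` are hypothetical.

```python
from itertools import combinations


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        return False  # assumption: degenerate hulls never vote
    return all(cross(hull[k], hull[(k + 1) % n], p) >= 0 for k in range(n))


def selective_vote(pos, neg, x):
    """Classify x as +1 or -1 by majority over the classifiers that vote.

    pos, neg: lists of equal-length feature tuples (training samples).
    Returns None when every classifier abstains or the vote is tied.
    """
    tally = 0
    for i, j in combinations(range(len(x)), 2):
        hull_pos = convex_hull([(s[i], s[j]) for s in pos])
        hull_neg = convex_hull([(s[i], s[j]) for s in neg])
        p = (x[i], x[j])
        in_pos, in_neg = inside(hull_pos, p), inside(hull_neg, p)
        if in_pos and not in_neg:
            tally += 1   # this classifier votes positive
        elif in_neg and not in_pos:
            tally -= 1   # this classifier votes negative
        # otherwise the classifier abstains (simplified selection rule)
    if tally > 0:
        return 1
    if tally < 0:
        return -1
    return None
```

In this simplified form, a test sample lying far from both classes in a given feature pair does not contribute a vote from that pair, which is the sense in which voting is selective. Recomputing the hulls for every test sample keeps the sketch short; a practical version would build them once.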