Ensemble classification of paired data

  • Authors:
  • Werner Adler;Alexander Brenning;Sergej Potapov;Matthias Schmid;Berthold Lausen

  • Affiliations:
  • Department of Biometry and Epidemiology, University of Erlangen-Nuremberg, Waldstrasse 6, 91054 Erlangen, Germany;Department of Geography, University of Waterloo, 200 University Ave. W, Waterloo, Ont., Canada N2L 3G1;Department of Biometry and Epidemiology, University of Erlangen-Nuremberg, Waldstrasse 6, 91054 Erlangen, Germany;Department of Biometry and Epidemiology, University of Erlangen-Nuremberg, Waldstrasse 6, 91054 Erlangen, Germany;Department of Mathematical Sciences, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2011

Abstract

In many medical applications, data are taken from paired organs or from repeated measurements of the same organ or subject. Evaluating such data on a subject basis rather than an observation basis increases the efficiency with which the misclassification rate is estimated. A subject-based approach to generating the bootstrap samples used by bagging and bundling methods is analyzed. A simulation model is used to compare the performance of different strategies for creating the bootstrap samples on which the individual trees are grown. The proposed approach is compared with linear discriminant analysis, logistic regression, random forests, and gradient boosting. Finally, the findings of the simulation study are applied to glaucoma diagnosis using both eyes of glaucoma patients and healthy controls. It is demonstrated that the proposed subject-based resampling reduces the misclassification rate.
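
To make the idea of subject-based resampling concrete, the following is a minimal sketch of one possible strategy: whole subjects are drawn with replacement, and all observations of a drawn subject (for example, both eyes) enter the bootstrap sample together. The abstract does not specify the strategies compared in the paper, so this should not be read as the authors' exact procedure. The function name `subject_bootstrap`, the toy subject labels, and the fixed seed are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def subject_bootstrap(subject_ids, rng):
    """Draw one bootstrap sample by resampling whole subjects.

    subject_ids[i] names the subject that observation i belongs to.
    Returns (in_bag, out_of_bag) as lists of observation indices.
    This is an illustrative sketch, not the paper's implementation.
    """
    # Group observation indices by subject.
    rows_by_subject = defaultdict(list)
    for row, sid in enumerate(subject_ids):
        rows_by_subject[sid].append(row)

    subjects = sorted(rows_by_subject)

    # Sample subjects (not observations) with replacement.
    drawn = [rng.choice(subjects) for _ in subjects]

    # Every observation of a drawn subject enters the sample together.
    in_bag = [row for sid in drawn for row in rows_by_subject[sid]]

    # Subjects never drawn are out-of-bag with all of their observations,
    # so no subject contributes to both sets.
    drawn_set = set(drawn)
    out_of_bag = [
        row
        for sid in subjects
        if sid not in drawn_set
        for row in rows_by_subject[sid]
    ]
    return in_bag, out_of_bag


# Hypothetical toy data: four subjects with two observations (eyes) each.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
in_bag, out_of_bag = subject_bootstrap(subject_ids, random.Random(0))
```

In a bagging setting, each tree would be grown on the `in_bag` observations of one such draw, and the `out_of_bag` observations could be used for error estimation. Because subjects are kept intact, the two observations of a pair never end up on opposite sides of the split, which is the property that distinguishes this scheme from resampling individual observations.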