Classification algorithms are frequently used on data with a natural hierarchical structure. For instance, classifiers are often trained and tested on trial-wise measurements, separately for each subject within a group. One important question is how classification outcomes observed in individual subjects can be generalized to the population from which the group was sampled. To address this question, this paper introduces novel statistical models that are guided by three desiderata. First, all models explicitly respect the hierarchical nature of the data; that is, they are mixed-effects models that simultaneously account for within-subjects (fixed-effects) and across-subjects (random-effects) variance components. Second, maximum-likelihood estimation is replaced by full Bayesian inference in order to enable natural regularization of the estimation problem and to afford conclusions in terms of posterior probability statements. Third, inference on classification accuracy is complemented by inference on the balanced accuracy, which avoids inflated accuracy estimates for imbalanced data sets. We introduce hierarchical models that satisfy these criteria and demonstrate their advantages over conventional methods using MCMC implementations for model inversion and model selection on both synthetic and empirical data. We envisage that our approach will improve the sensitivity and validity of statistical inference in future hierarchical classification studies.
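Two of the ideas above can be illustrated with a minimal sketch (hypothetical, not the paper's actual models): the balanced accuracy, defined as the mean of the per-class recalls, which stays at chance level when a classifier merely exploits class imbalance; and a conjugate Beta posterior over accuracy, the simplest case of replacing a maximum-likelihood point estimate with a posterior probability statement. All function names and the toy data are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; insensitive to class imbalance."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def posterior_mean_accuracy(k, n, a=1, b=1):
    """Posterior mean of accuracy under a Beta(a, b) prior.

    The Beta prior is conjugate to the binomial likelihood of k correct
    predictions out of n trials, so the posterior is Beta(k + a, n - k + b);
    the prior acts as a natural regularizer of the estimate.
    """
    return (k + a) / (n + a + b)

# Toy example: 90 negatives, 10 positives, and a degenerate classifier
# that always predicts the majority class.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100

print(accuracy(y_true, y_pred))            # 0.9 -- inflated by imbalance
print(balanced_accuracy(y_true, y_pred))   # 0.5 -- chance level, as it should be
print(posterior_mean_accuracy(90, 100))    # shrunk slightly toward 0.5 by the prior
```

In a hierarchical setting, the paper's models go further by tying the subject-level accuracies together through population-level random effects; the single-subject Beta posterior here is only the non-hierarchical building block.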