A New Measure of Classifier Performance for Gene Expression Data

Authors: Blaise Hanczar; Avner Bar-Hen
Affiliations: University Paris Descartes, Paris; University Paris Descartes, Paris
Venue: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year: 2012

Abstract

One of the major aims of many microarray experiments is to build discriminatory diagnosis and prognosis models. A large number of supervised methods have been proposed in the literature for microarray-based classification. Model evaluation and comparison is a critical issue and, most of the time, is based on the classification cost. This classification cost depends on the costs of false positives and false negatives, which are generally unknown in diagnostic problems. This uncertainty may strongly affect the evaluation and comparison of classifiers. We propose a new measure of classifier performance that takes into account the uncertainty of the error. We represent the available knowledge about the costs by a distribution function defined on the ratio of the costs. The performance of a classifier is therefore computed over the set of all possible costs, weighted by their probability distribution. Our method is tested on both artificial and real microarray data sets. We show that the performance of classifiers depends strongly on the ratio of the classification costs. In many cases, the best classifier can be identified by our new measure whereas the classic error measures fail.
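
The idea of averaging a classifier's cost over a distribution of possible cost ratios can be illustrated with a minimal sketch. The abstract does not give the paper's exact formula, so everything below is an assumption made for illustration: the cost ratio is reparametrized as a normalized false-positive cost c in (0, 1) with false-negative cost 1 - c, the function names `expected_cost` and `weighted_performance` are hypothetical, and the two classifiers and the weighting densities are invented numbers, not results from the paper.

```python
def expected_cost(fpr, fnr, pos_prior, c):
    """Expected cost when a false positive costs c and a false negative 1 - c.

    Assumed form: c * P(negative) * FPR + (1 - c) * P(positive) * FNR.
    """
    return c * (1.0 - pos_prior) * fpr + (1.0 - c) * pos_prior * fnr


def weighted_performance(fpr, fnr, pos_prior, density, n_grid=1000):
    """Average expected_cost over c in (0, 1), weighted by `density`.

    `density` may be unnormalized; the integral is approximated with a
    midpoint rule and divided by the total weight.
    """
    cs = [(i + 0.5) / n_grid for i in range(n_grid)]
    weights = [density(c) for c in cs]
    total = sum(weights)
    weighted = sum(
        w * expected_cost(fpr, fnr, pos_prior, c) for w, c in zip(weights, cs)
    )
    return weighted / total


def uniform(c):
    """No prior knowledge about the costs: every c is equally plausible."""
    return 1.0


def fn_heavy(c):
    """Belief that false negatives are costlier (shape of a Beta(1, 5) density)."""
    return (1.0 - c) ** 4


# Hypothetical classifiers with the same error rate on balanced classes.
clf_a = {"fpr": 0.05, "fnr": 0.30}  # few false positives
clf_b = {"fpr": 0.30, "fnr": 0.05}  # few false negatives

score_a_uniform = weighted_performance(pos_prior=0.5, density=uniform, **clf_a)
score_b_uniform = weighted_performance(pos_prior=0.5, density=uniform, **clf_b)
score_a_fn_heavy = weighted_performance(pos_prior=0.5, density=fn_heavy, **clf_a)
score_b_fn_heavy = weighted_performance(pos_prior=0.5, density=fn_heavy, **clf_b)

print(f"uniform costs:  A={score_a_uniform:.4f}  B={score_b_uniform:.4f}")
print(f"FN-heavy costs: A={score_a_fn_heavy:.4f}  B={score_b_fn_heavy:.4f}")
```

In this toy setting the two classifiers are indistinguishable under a uniform weighting, while a weighting that expresses a belief in costlier false negatives favors classifier B (lower score is better). This only illustrates why the choice of cost distribution can change the ranking; it does not reproduce the paper's measure or experiments.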