This paper presents a new loss function for neural network classification, inspired by the recently proposed similarity measure called correntropy. We show that this loss essentially behaves like the conventional square loss for samples that lie well within the decision boundary and have small errors, and like the L0 (counting) norm for samples that are outliers or are difficult to classify. Depending on the value of the kernel size parameter, the proposed loss function moves smoothly from convex to non-convex and becomes a close approximation to the misclassification loss (the ideal 0-1 loss). We show that the discriminant function obtained by training a neural network with the proposed loss in the neighborhood of the ideal 0-1 loss is resistant to overfitting, more robust to outliers, and generalizes consistently better than networks trained with other commonly used loss functions, even after prolonged training. The results also show that it is a close competitor to the SVM. Since the proposed method is compatible with simple gradient-based online learning, it offers a practical way to improve the performance of neural network classifiers.
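For concreteness, the NumPy sketch below illustrates one plausible form of such a correntropy-induced loss, L(e) = beta * (1 - exp(-e^2 / (2 * sigma^2))) with e the prediction error; the exact expression, the normalizing constant beta (chosen here so that L(1) = 1), and the default kernel size sigma = 0.5 are illustrative assumptions, not the paper's verbatim definitions.

import numpy as np

def c_loss(error, sigma=0.5):
    # Correntropy-inspired loss for prediction errors `error` (assumed form,
    # see lead-in). beta normalizes the loss so that an error of 1 costs 1.
    beta = 1.0 / (1.0 - np.exp(-1.0 / (2.0 * sigma ** 2)))
    return beta * (1.0 - np.exp(-error ** 2 / (2.0 * sigma ** 2)))

def c_loss_grad(error, sigma=0.5):
    # Derivative of the loss w.r.t. the error, as needed for simple
    # gradient-based online training of a neural network.
    beta = 1.0 / (1.0 - np.exp(-1.0 / (2.0 * sigma ** 2)))
    return beta * (error / sigma ** 2) * np.exp(-error ** 2 / (2.0 * sigma ** 2))

# Behavior described in the abstract:
# - small errors: 1 - exp(-e^2/2s^2) ~ e^2/(2s^2), i.e., square-loss-like;
# - large errors (outliers): the loss saturates at beta and the gradient
#   vanishes, so each outlier contributes like an L0/counting penalty.
errors = np.array([0.1, 0.5, 1.0, 3.0, 10.0])
print(c_loss(errors))       # saturates for large errors
print(c_loss_grad(errors))  # gradient vanishes for outliers

Varying sigma reproduces the transition described above: a large kernel size keeps the loss nearly quadratic (convex) over the working range of errors, while a small sigma sharpens it toward the ideal 0-1 loss.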