This letter analyzes the Fisher kernel from a statistical point of view. The Fisher kernel is a particularly interesting method for constructing a model of the posterior probability that makes intelligent use of unlabeled data (i.e., of the underlying data density). It is important to analyze and ultimately understand the statistical properties of the Fisher kernel. To this end, we first establish sufficient conditions under which the constructed posterior model is realizable (i.e., it contains the true distribution). Realizability immediately leads to consistency results. Subsequently, we focus on an asymptotic analysis of the generalization error, which elucidates the learning curves of the Fisher kernel and how unlabeled data contribute to learning. We also point out that when a linear classifier is used together with the Fisher kernel, the squared loss and the log loss are theoretically preferable to other losses, such as the exponential loss, because both yield consistent estimators. This letter therefore underscores that the Fisher kernel should be viewed not as a heuristic but as a powerful tool with well-controlled statistical properties.
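For readers unfamiliar with the construction, the Fisher kernel analyzed here follows the standard recipe of Jaakkola and Haussler: fit a generative model p(x | θ) to the (possibly unlabeled) data, map each input to its Fisher score, and feed the resulting kernel to a linear classifier. A minimal sketch of that construction, under the usual assumptions (θ̂ denotes a maximum-likelihood estimate and I(θ̂) the Fisher information matrix; neither symbol appears in the abstract itself):

% Standard Fisher-kernel construction (a sketch, not the letter's notation):
% each input x is mapped to its Fisher score, the gradient of the
% log-likelihood of the fitted generative model p(x | theta) at theta-hat.
\[
  \phi_{\hat{\theta}}(x) \;=\; \nabla_{\theta} \log p(x \mid \theta)\,\Big|_{\theta = \hat{\theta}},
  \qquad
  K(x, x') \;=\; \phi_{\hat{\theta}}(x)^{\top}\, I(\hat{\theta})^{-1}\, \phi_{\hat{\theta}}(x').
\]

The letter's remark about losses then concerns a linear classifier trained on these score features: with the squared or log loss the resulting estimator of the posterior probability is consistent, whereas the exponential loss carries no such guarantee.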