In this paper we consider a novel Bayesian interpretation of Fisher's discriminant analysis. We relate Rayleigh's coefficient to a noise model that minimises a cost based on the most probable class centres, abandoning the 'regression to the labels' assumption used by other algorithms. Optimisation of the noise model yields a direction of discrimination equivalent to Fisher's discriminant, and with the incorporation of a prior we can apply Bayes' rule to infer the posterior distribution over the direction of discrimination. However, we argue that an additional constraining distribution must be included to obtain sensible results. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher's discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.
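To make the regularised kernel Fisher's discriminant mentioned above concrete, here is a minimal NumPy sketch of the standard two-class formulation: the expansion coefficients solve (N + λI)α ∝ m₁ − m₀, where m_c are kernel-space class means and N is the within-class scatter in kernel space. The RBF kernel, its width `gamma`, and the regulariser `reg` are illustrative choices, not the paper's specific model; in the paper's framework these would instead be set by maximising the marginal log-likelihood.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kfd_fit(X, y, gamma=1.0, reg=1e-2):
    """Regularised kernel Fisher discriminant for binary labels in {0, 1}.

    Returns the coefficient vector alpha and a bias b placing the
    decision threshold midway between the projected class means.
    """
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    means, N = [], np.zeros((n, n))
    for c in (0, 1):
        idx = np.where(y == c)[0]
        Kc = K[:, idx]
        means.append(Kc.mean(axis=1))           # kernel-space class mean
        centering = np.eye(len(idx)) - np.full((len(idx), len(idx)), 1.0 / len(idx))
        N += Kc @ centering @ Kc.T              # within-class scatter
    # Regularised solve: alpha proportional to (N + reg*I)^{-1} (m1 - m0)
    alpha = np.linalg.solve(N + reg * np.eye(n), means[1] - means[0])
    b = -0.5 * (alpha @ means[0] + alpha @ means[1])
    return alpha, b

def kfd_predict(X_train, alpha, b, X_new, gamma=1.0):
    # Project new points onto the discriminant direction and threshold.
    return (rbf_kernel(X_new, X_train, gamma) @ alpha + b > 0).astype(int)
```

The regulariser `reg` plays the role of the regularisation coefficient whose link to the generalisation error the paper establishes; here it is simply fixed by hand.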