In machine learning and statistics, kernel density estimators are rarely applied to multivariate data because of the difficulty of choosing a kernel bandwidth that avoids overfitting. However, recent advances in information-theoretic learning have revived interest in these models. With this motivation, in this paper we revisit the classical statistical problem of data-driven bandwidth selection by cross-validated maximum likelihood for Gaussian kernels. We solve the optimization problem both in the spherical case and in the general case, where a full covariance matrix is considered for the kernel. The fixed-point algorithms proposed in this paper obtain the maximum-likelihood bandwidth in a few iterations, without performing an exhaustive bandwidth search, which is infeasible in the multivariate case. The convergence of the proposed methods is proved. A set of classification experiments demonstrates the usefulness of the obtained models in pattern recognition.
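To illustrate the idea, here is a minimal sketch (not the paper's exact algorithm) of a fixed-point iteration for the spherical case. Setting the gradient of the leave-one-out cross-validated log-likelihood of a spherical Gaussian kernel to zero yields the update h² ← (1/nd) Σᵢ Σⱼ≠ᵢ wᵢⱼ ‖xᵢ − xⱼ‖², where the weights wᵢⱼ are the normalized kernel responsibilities; the function name and defaults below are illustrative choices:

```python
import numpy as np

def loo_ml_bandwidth(X, h2=1.0, n_iter=100, tol=1e-8):
    """Fixed-point estimate of a spherical Gaussian kernel bandwidth
    by leave-one-out cross-validated maximum likelihood (sketch)."""
    n, d = X.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    for _ in range(n_iter):
        # Unnormalized kernel weights; the (2*pi*h2)^(-d/2) factor
        # cancels in the normalization below.
        W = np.exp(-D2 / (2.0 * h2))
        np.fill_diagonal(W, 0.0)            # leave-one-out: exclude x_i itself
        W /= W.sum(axis=1, keepdims=True)   # responsibilities w_ij
        h2_new = (W * D2).sum() / (n * d)   # stationarity condition of the LOO log-likelihood
        if abs(h2_new - h2) < tol * h2:
            h2 = h2_new
            break
        h2 = h2_new
    return np.sqrt(h2)
```

In practice the iteration converges in a handful of steps, which is the appeal over a grid search: each update reuses the same pairwise-distance matrix, and no exhaustive sweep over candidate bandwidths is needed.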