An asymptotic statistical theory of polynomial kernel methods
Neural Computation
The generalization properties of learning classifiers with a polynomial kernel function are examined here. We first show that the generalization error of the learning machine depends on the properties of the separating curve, that is, the intersection of the input surface and the true separating hyperplane in the feature space. When the input space is one-dimensional, the problem decomposes into as many one-dimensional problems as there are intersection points. Otherwise, the generalization error is determined by the class of the separating curve. Next, we consider how the class of the separating curve depends on the true separating function. The class is maximal when the true separating polynomial function is irreducible and smaller otherwise. In either case, the class depends only on the true function, not on the dimension of the feature space. These results imply that the generalization error does not increase even as the dimension of the feature space grows, and hence that so-called overmodeling does not occur in kernel learning.
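As a rough illustration of the claim that raising the kernel degree, and with it the feature-space dimension, need not inflate the generalization error, the following is a minimal sketch. It is not taken from the paper: scikit-learn, the synthetic data, the quadratic separating curve, and the chosen degrees are all illustrative assumptions. It trains polynomial-kernel SVMs of increasing degree on data whose true separating curve is fixed, and reports the test error, which should stay roughly flat rather than grow with the degree.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Illustrative setup (not from the paper): 2-D inputs with labels given by
# a fixed quadratic separating curve x0^2 + x1^2 = 0.5.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(1000, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Increase the polynomial degree (and hence the feature-space dimension)
# while the true separating curve stays the same; the test error should
# not blow up, consistent with the absence of overmodeling.
for degree in (2, 3, 5, 8):
    clf = SVC(kernel="poly", degree=degree, coef0=1.0, C=10.0)
    clf.fit(X_tr, y_tr)
    err = 1.0 - clf.score(X_te, y_te)
    print(f"degree={degree}: test error = {err:.3f}")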