SVM with random labels

Authors:
Bruno Apolloni;Simone Bassis;Dario Malchiodi
Affiliations:
University of Milan, Department of Computer Science, Milan, Italy;University of Milan, Department of Computer Science, Milan, Italy;University of Milan, Department of Computer Science, Milan, Italy
Venue:
KES'07/WIRN'07 Proceedings of the 11th international conference, KES 2007 and XVII Italian workshop on neural networks conference on Knowledge-based intelligent information and engineering systems: Part III
Year:
2007

Abstract

We devise an SVM for partitioning a sample space affected by random binary labels. Under the hypothesis that a smooth, possibly symmetric, conditional label distribution governs the transition from the all-0-label domain to the all-1-label domain, and under other regularity conditions, the algorithm supplies an estimate of these label probabilities. Within the Algorithmic Inference framework, the randomness of the labels preserves the main features of the binary classification problem while adding a further dimension to the search space. Namely, for each point in the original space the new dimension hosts the uniform seed accounting for the randomness of its label, so that the problem becomes that of separating the points in the augmented space. We solve it with a new kind of bootstrap technique. As for the error bounds of the proposed algorithm, we obtain confidence intervals that are up to an order of magnitude narrower than those supplied in the literature. This benefit comes from three facts: (i) we devise a special algorithm to take into account the random profile of the labels; (ii) we know the number of support vectors actually employed, as an ancillary output of the learning procedure; and (iii) we can evaluate confidence intervals for the misclassification probability exactly as a function of the number of these vectors. We numerically check these results by measuring the coverage of the confidence intervals.
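
The augmented-space idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it leaves out the SVM training and the bootstrap step, and only shows how a uniform seed turns randomly labeled points into points that are exactly separable in one extra dimension. The one-dimensional sample space, the logistic shape of the conditional label probability, the `slope` value, and the names `label_probability` and `sample_augmented` are all assumptions made for illustration.

```python
import math
import random


def label_probability(x, slope=4.0):
    """Hypothetical smooth, symmetric P(label = 1 | x); a logistic curve is assumed."""
    return 1.0 / (1.0 + math.exp(-slope * x))


def sample_augmented(n, rng):
    """Draw points x, their uniform seeds u, and the labels those seeds induce."""
    points = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)   # point in the (assumed) 1-D original space
        u = rng.random()             # uniform seed: the extra coordinate
        y = 1 if u < label_probability(x) else 0  # label is 1 when the seed falls below p(x)
        points.append((x, u, y))
    return points


rng = random.Random(0)
points = sample_augmented(5000, rng)

# In the augmented (x, u) plane the curve u = p(x) separates the two labels exactly.
separable = all((u < label_probability(x)) == (y == 1) for x, u, y in points)

# In the original space the labels overlap: both values occur close to x = 0.
labels_near_zero = {y for x, _, y in points if abs(x) < 0.25}
mixed_in_original_space = labels_near_zero == {0, 1}

print(separable, mixed_in_original_space)
```

In practice the seeds are not observed, so the separating curve has to be estimated from the points and labels alone. The sketch only shows why separation in the augmented space is a well-posed target.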