Analysis of co-training algorithm with very small training sets

  • Authors:
  • Luca Didaci, Giorgio Fumera, Fabio Roli

  • Affiliations:
  • Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy (all three authors)

  • Venue:
  • SSPR'12/SPR'12 Proceedings of the 2012 Joint IAPR International Conference on Structural, Syntactic, and Statistical Pattern Recognition
  • Year:
  • 2012

Abstract

Co-training is a well-known semi-supervised learning algorithm in which two classifiers are trained on two different views (feature sets): the initially small training set is iteratively updated with unlabelled samples that are classified with high confidence by one of the two classifiers. In this paper we address an issue that has been overlooked so far in the literature, namely how co-training performance is affected by the size of the initial training set as it decreases to the minimum value below which a given learning algorithm cannot be applied anymore. We investigate this issue empirically, testing the algorithm on 24 real datasets artificially split into two views, using two different base classifiers. Our results show that a very small training set, even one made up of a single labelled sample per class, does not adversely affect co-training performance.
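
Since the abstract only outlines the procedure, the following is a minimal sketch of the co-training loop it describes, not the authors' implementation. It mirrors the paper's extreme setting (one labelled sample per class, an artificial two-view split); the synthetic dataset, the Gaussian Naive Bayes base learner, the iteration budget, and the choice to pseudo-label a single most-confident sample per classifier per round are all illustrative assumptions.

```python
# Minimal co-training sketch (illustrative, not the paper's code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
view1, view2 = X[:, :10], X[:, 10:]  # artificial split into two views

# Extreme case studied in the paper: one labelled sample per class.
labelled = np.array([np.flatnonzero(y == c)[0] for c in np.unique(y)])
unlabelled = np.setdiff1d(np.arange(len(y)), labelled)
y_train = y[labelled].copy()

for _ in range(30):  # co-training iterations (arbitrary budget)
    clf1 = GaussianNB().fit(view1[labelled], y_train)
    clf2 = GaussianNB().fit(view2[labelled], y_train)
    # Each classifier pseudo-labels the unlabelled sample it is most
    # confident about; the sample joins the shared training set.
    for clf, view in ((clf1, view1), (clf2, view2)):
        if unlabelled.size == 0:
            break
        proba = clf.predict_proba(view[unlabelled])
        best = int(np.argmax(proba.max(axis=1)))  # most confident sample
        labelled = np.append(labelled, unlabelled[best])
        y_train = np.append(y_train, clf.classes_[proba[best].argmax()])
        unlabelled = np.delete(unlabelled, best)

# Rough sanity check: accuracy of the view-1 classifier on the full set.
final = GaussianNB().fit(view1[labelled], y_train)
print("view-1 accuracy:", final.score(view1, y))
```

Co-training variants differ in the confidence measure, the number of samples added per round, and the stopping rule; the single-sample-per-round schedule above is just one simple choice.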