Internet data sources provide large image datasets, most of which lack explicit labels. This setting is ideal for semi-supervised learning, which seeks to exploit labeled data together with a large pool of unlabeled data points to improve learning and classification. While considerable progress has been made on theory and algorithms, there has been limited success in translating that progress to the large-scale datasets that inspired these methods. We investigate the computational complexity of popular graph-based semi-supervised learning algorithms together with several possible speed-ups. Our findings lead to a new algorithm that scales to datasets up to 40 times larger than those handled by previous approaches and also improves classification performance. Our method is based on the key insight that a density-based measure can be used to select unlabeled data points, similar to an active learning scheme. This yields a compact graph that improves performance by up to 11.6% at reduced computational cost.
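The idea of selecting unlabeled points by density and then propagating labels over a smaller graph can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the density measure or the propagation scheme, so the sketch assumes an inverse mean k-nearest-neighbour distance as the density score and a standard normalized label-spreading iteration on a symmetric k-NN graph. The function names (`knn_density`, `select_dense`, `propagate`) and the parameters `k`, `alpha`, and `iters` are hypothetical, and the code assumes there are more points than `k`.

```python
import numpy as np


def pairwise_distances(X):
    """Dense Euclidean distance matrix with an infinite diagonal."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbour
    return d


def knn_density(X, k=5):
    """Assumed density score: inverse mean distance to the k nearest neighbours."""
    nearest = np.sort(pairwise_distances(X), axis=1)[:, :k]
    return 1.0 / (nearest.mean(axis=1) + 1e-12)


def select_dense(X_unlabeled, m, k=5):
    """Indices of the m unlabeled points with the highest density score."""
    scores = knn_density(X_unlabeled, k)
    return np.argsort(-scores)[:m]


def propagate(X, y, n_classes, k=5, alpha=0.9, iters=50):
    """Label spreading on a symmetric k-NN graph; y == -1 marks unlabeled points."""
    n = len(X)
    idx = np.argsort(pairwise_distances(X), axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the neighbourhood relation
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))  # D^{-1/2} W D^{-1/2}

    Y0 = np.zeros((n, n_classes))
    labeled = y >= 0
    Y0[labeled, y[labeled]] = 1.0

    F = Y0.copy()
    for _ in range(iters):
        # Mix neighbour scores with the initial label assignment.
        F = alpha * (S @ F) + (1 - alpha) * Y0
    return F.argmax(axis=1)
```

In use, one would call `select_dense` on the unlabeled pool, stack the selected points with the labeled ones, and run `propagate` on that smaller set, so the graph has far fewer nodes than one built on the full pool. The dense distance matrix here is quadratic in memory and only suitable for small illustrations.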