Recent research in image recognition has shown that combining multiple descriptors is an effective way to improve classification performance. Furthermore, spatial pyramids, which compute descriptors at multiple spatial resolution levels, generally increase the discriminative power of the descriptors. In this paper we focus on methods that combine multiple descriptors computed at multiple spatial resolution levels. A potential problem with the naive solution of concatenating all descriptors into one large input vector for a machine learning classifier, such as a support vector machine, is that the input vector becomes very high-dimensional, which can aggravate overfitting and hinder generalization performance. We therefore propose stacking support vector machines: in the first layer, each support vector machine receives the input constructed from a single descriptor and is trained to predict the correct output class. A second-layer support vector machine then takes the class probabilities produced by all trained first-layer models as its input and learns to predict the correct output class from these much smaller input vectors. We performed experiments on 20 classes from the Caltech object database with 10 different single descriptors at 3 different resolutions. The results show that our two-layer stacking approach outperforms the naive approach that combines all descriptors directly into one very large input vector.
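The two-layer scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn, uses synthetic feature blocks as stand-ins for real image descriptors, and obtains class probabilities from `SVC(probability=True)` (Platt scaling). Training the second layer on out-of-fold probabilities is also an assumption borrowed from common stacking practice; the abstract does not say how the second-layer training inputs were produced. The function names `fit_stacked_svms` and `predict_stacked` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC


def fit_stacked_svms(blocks, y, cv=3, seed=0):
    """Fit one first-layer SVM per feature block and one second-layer SVM.

    blocks: list of arrays of shape (n_samples, d_i), one per descriptor
            (and per pyramid level); y: class labels.
    """
    first_layer, meta_features = [], []
    for X in blocks:
        clf = SVC(kernel="rbf", probability=True, random_state=seed)
        # Assumption: use out-of-fold probabilities so the second layer is
        # not trained on outputs of models that already saw those samples.
        meta_features.append(
            cross_val_predict(clf, X, y, cv=cv, method="predict_proba")
        )
        # Refit on all training data for use at prediction time.
        first_layer.append(clf.fit(X, y))
    # Second-layer input: n_blocks * n_classes probabilities per sample.
    second_layer = SVC(kernel="rbf", random_state=seed)
    second_layer.fit(np.hstack(meta_features), y)
    return first_layer, second_layer


def predict_stacked(first_layer, second_layer, blocks):
    """Predict classes by feeding first-layer probabilities to layer two."""
    meta = np.hstack(
        [clf.predict_proba(X) for clf, X in zip(first_layer, blocks)]
    )
    return second_layer.predict(meta)


# Synthetic stand-in data: 3 "descriptors" of 4 features each, 3 classes.
X, y = make_classification(
    n_samples=150, n_features=12, n_informative=9, n_redundant=0,
    n_classes=3, random_state=0,
)
blocks = [X[:, 0:4], X[:, 4:8], X[:, 8:12]]
first_layer, second_layer = fit_stacked_svms(blocks, y)
predictions = predict_stacked(first_layer, second_layer, blocks)
```

In this toy setup the second-layer input has 3 blocks × 3 classes = 9 dimensions. By the same arithmetic, the setting in the abstract (10 descriptors at 3 resolutions, 20 classes) would give a second-layer input of 30 × 20 = 600 class probabilities, regardless of how large each descriptor vector is.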