This paper exploits several criteria to optimize training set construction for video annotation. Most existing learning-based semantic annotation approaches require a large training set to achieve good generalization capacity, which in turn demands a considerable amount of labor-intensive manual labeling. However, it is observed that the generalization capacity of a classifier depends largely on the geometrical distribution of the training data rather than on its size. We argue that a training set which covers most of the temporal and spatial distribution of the whole data can achieve satisfying performance even when its size is limited. To capture the geometrical distribution characteristics of a given video collection, we propose four metrics for constructing an optimal training set: Salience, Time Dispersiveness, Spatial Dispersiveness, and Diversity. Based on these metrics, we further propose a set of optimization rules that capture as much of the distribution information of the whole data as possible within a training set of a given size. Experimental results demonstrate that these rules are effective for training set construction in video annotation and significantly outperform random training set selection.
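The core idea of selecting a small training set that still covers the geometrical distribution of the whole collection can be sketched with a simple farthest-point (k-center) heuristic. This is a hypothetical illustration only, not the paper's actual four metrics or optimization rules; the function name `greedy_coverage_selection` and the toy features are assumptions for the sketch.

```python
import numpy as np

def greedy_coverage_selection(features, budget):
    """Greedily pick `budget` samples spread over the feature distribution.

    Farthest-point heuristic: repeatedly add the sample that is farthest
    from all samples selected so far. A stand-in for distribution-aware
    training set construction, not the paper's exact method.
    """
    selected = [0]  # start from an arbitrary sample
    # distance of every sample to its nearest already-selected sample
    dist = np.linalg.norm(features - features[0], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(dist))  # farthest uncovered sample
        selected.append(nxt)
        new_dist = np.linalg.norm(features - features[nxt], axis=1)
        dist = np.minimum(dist, new_dist)  # update nearest-selected distances
    return selected

# Toy usage: 100 random 2-D "shot features", pick a 10-sample training set.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 2))
picked = greedy_coverage_selection(feats, 10)
print(sorted(picked))
```

Compared with random selection, such a coverage-driven pick avoids clustering all labeled samples in one dense region of the feature space, which is the intuition behind preferring distribution coverage over sheer training set size.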