Multi-Layer Multi-Instance Learning for Video Concept Detection

  • Authors:
  • Zhiwei Gu; Tao Mei; Xian-Sheng Hua; Jinhui Tang; Xiuqing Wu

  • Affiliations:
  • Dept. of Electron. Eng. & Inf. Sci., Univ. of Sci. & Technol. of China, Hefei

  • Venue:
  • IEEE Transactions on Multimedia
  • Year:
  • 2008

Abstract

This paper presents a novel learning-based method, called "multi-layer multi-instance (MLMI) learning," for video concept detection. Most existing methods treat video as a flat data sequence and do not deeply investigate the intrinsic hierarchical structure of the video content. However, video is essentially a medium with a multi-layer (ML) structure. For example, a video can be represented by a hierarchy including, from large to small, shot, frame, and region, where each pair of contiguous layers fits the typical multi-instance (MI) setting. We refer to such an ML structure, together with the MI relations embedded in it, as the MLMI setting. In this paper, we systematically study both the ML structure and the MI relations embedded in video content by formulating video concept detection as an MLMI learning problem. Specifically, we first construct an MLMI kernel to simultaneously model the ML structure and the MI relations. To deal with the ambiguity propagation problem introduced by weak labeling and the ML structure, we then propose a regularization framework which takes hyper-bag prediction error, sub-layer prediction error, inter-layer inconsistency, and classifier complexity into consideration. We have applied the proposed MLMI learning method to the concept detection task over the TRECVid 2005 development corpus, and report better performance than both vector-based and state-of-the-art MI learning methods.
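
To make the MLMI setting concrete, the following is a minimal, hypothetical Python sketch (not the paper's actual kernel): it represents a shot as a bag of frames and each frame as a bag of region feature vectors, then compares two such hyper-bags by recursively averaging a base RBF kernel over contiguous layers. The function names, the choice of an RBF base kernel, and the gamma value are all illustrative assumptions.

  import numpy as np

  def rbf(x, y, gamma=0.5):
      """Base kernel on region-level feature vectors (assumed RBF)."""
      d = np.asarray(x) - np.asarray(y)
      return np.exp(-gamma * np.dot(d, d))

  def frame_kernel(frame_a, frame_b):
      """MI-style set kernel between two frames (bags of region vectors)."""
      return np.mean([rbf(ra, rb) for ra in frame_a for rb in frame_b])

  def mlmi_kernel(shot_a, shot_b):
      """Multi-layer kernel between two shots (bags of frames, i.e. hyper-bags)."""
      return np.mean([frame_kernel(fa, fb) for fa in shot_a for fb in shot_b])

  # Toy usage: two shots, each a list of frames, each frame a list of
  # 3-D region descriptors (placeholder values only).
  shot_1 = [[[0.1, 0.2, 0.3], [0.0, 0.1, 0.4]], [[0.2, 0.2, 0.2]]]
  shot_2 = [[[0.1, 0.3, 0.3]], [[0.5, 0.1, 0.0], [0.2, 0.3, 0.1]]]
  print(mlmi_kernel(shot_1, shot_2))

In the paper's regularization framework, a kernel of this kind would be plugged into an objective that additionally penalizes hyper-bag prediction error, sub-layer prediction error, inter-layer inconsistency, and classifier complexity; the sketch above only covers the similarity measure.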