Marginalized multi-layer multi-instance kernel for video concept detection

Authors:
Zheng-Jun Zha;Tao Mei;Richang Hong;Zhiwei Gu
Affiliations:
School of Computing, National University of Singapore, Singapore;Microsoft Research Asia, Beijing, China;Google Inc., USA;Hefei University of Technology, China
Venue:
Signal Processing
Year:
2013

Abstract

Video concept detection has been studied extensively in recent years. Most existing approaches treat video as a flat data sequence. However, video is inherently hierarchical: it comprises multiple layers (e.g., shot, frame, and region), and a multi-instance relationship is embedded in each pair of contiguous layers. In this paper, we propose a novel kernel, termed the marginalized multi-layer multi-instance (MarMLMI) kernel, for video concept detection. Unlike most existing methods, the MarMLMI kernel exploits the hierarchical structure of video, i.e., both the multi-layer structure and the multi-instance relationships. Furthermore, it addresses the ambiguity of instance labels in the multi-instance setting by means of the marginalized kernel technique. We perform video concept detection on a real-world video corpus, the TREC Video Retrieval Evaluation (TRECVID) benchmark, and compare the proposed MarMLMI kernel with representative existing approaches. The experimental results demonstrate the effectiveness of the proposed MarMLMI kernel.
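
To make the hierarchical idea concrete, the following is a minimal sketch of a recursive set kernel over nested bags (shot, frames, regions). It is not the MarMLMI kernel itself: the marginalization over hidden instance labels is omitted, and the function names (`rbf`, `set_kernel`, `layered_kernel`), the `gamma` value, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian RBF kernel between two equal-length feature vectors.

    gamma is an arbitrary illustrative value, not taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def set_kernel(bag_a, bag_b, base):
    """Size-normalized set kernel: mean of `base` over all cross pairs.

    Summing a positive semidefinite base kernel over all pairs, then
    scaling by 1 / (|A| * |B|), keeps the result positive semidefinite.
    """
    total = sum(base(a, b) for a in bag_a for b in bag_b)
    return total / (len(bag_a) * len(bag_b))


def layered_kernel(node_a, node_b, depth):
    """Recursive kernel over nested bags (hypothetical helper).

    depth == 0: nodes are feature vectors (e.g., regions).
    depth >= 1: nodes are non-empty lists of depth-1 nodes
                (e.g., a frame is a bag of regions, a shot a bag of frames).
    """
    if depth == 0:
        return rbf(node_a, node_b)
    return set_kernel(
        node_a, node_b, lambda a, b: layered_kernel(a, b, depth - 1)
    )


# Toy shots: each shot is a list of frames, each frame a list of
# 2-D region feature vectors (values are made up).
shot_a = [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0]]]
shot_b = [[[0.0, 0.0]], [[1.0, 1.0], [0.0, 1.0]]]

k_ab = layered_kernel(shot_a, shot_b, depth=2)
print(f"k(shot_a, shot_b) = {k_ab:.4f}")
```

A Gram matrix built from such a function could be passed to an SVM that accepts precomputed kernels. A marginalized variant would additionally weight the pairwise terms by the posterior probabilities of the hidden instance labels, which this sketch does not attempt.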