Learning frame relevance for video classification

Authors:
Hua Wang;Feiping Nie;Heng Huang;Yi Yang
Affiliations:
University of Texas at Arlington, Arlington, TX, USA;University of Texas at Arlington, Arlington, TX, USA;University of Texas at Arlington, Arlington, TX, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Year:
2011

Citing 6
Cited 0

Solving the multiple instance problem with axis-parallel rectangles

Artificial Intelligence
BoosTexter: A Boosting-based Systemfor Text Categorization

Machine Learning - Special issue on information retrieval
Transductive Inference for Text Classification using Support Vector Machines

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Multi-instance learning by treating instances as non-I.I.D. samples

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Supervised manifold learning for image and video classification

Proceedings of the international conference on Multimedia
Drosophila Gene Expression Pattern Annotation through Multi-Instance Multi-Label Learning

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Traditional video classification methods typically require a large number of labeled training video frames to achieve satisfactory performance. However, in the real world, we usually only have sufficient labeled video clips (such as tagged online videos) but lack labeled video frames. In this paper, we formalize the video classification problem as a Multi-Instance Learning (MIL) problem, an emerging topic in machine learning in recent years, which only needs bag (video clip) labels. To solve the problem, we propose a novel Parameterized Class-to-Bag (P-C2B) Distance method to learn the relative importance of a training instance with respect to its labeled classes, such that the instance level labeling ambiguity in MIL is tackled and the frame relevances of training video data with respect to the semantic concepts of interest are given. Promising experimental results have demonstrated the effectiveness of the proposed method.