Double fusion for multimedia event detection

Authors:
Zhen-zhong Lan; Lei Bao; Shoou-I Yu; Wei Liu; Alexander G. Hauptmann
Affiliations:
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA (all authors)
Venue:
MMM'12 Proceedings of the 18th international conference on Advances in Multimedia Modeling
Year:
2012

Abstract

Multimedia Event Detection (MED) is a multimedia retrieval task whose goal is to find videos of a particular event in an internet video archive, given example videos and descriptions. We focus on mining the example videos to learn an event's most characteristic features, which requires combining multiple complementary feature types. Early fusion and late fusion are two popular combination strategies: the former fuses features before classification, while the latter combines the outputs of classifiers trained on different features. In this paper, we introduce a fusion scheme named double fusion, which combines early fusion and late fusion to incorporate the advantages of both. Results are reported on the TRECVID MED 2010 and 2011 data sets. For MED 2010, we obtain a mean minimal normalized detection cost (MNDC) of 0.49, which improves on the state-of-the-art performance by more than 12 percent.
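
The three fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which feature subsets are fused, which classifier is used, or how scores are combined. The sketch assumes that early fusion is plain concatenation, that late fusion is an unweighted average of scores, that double fusion averages the scores of classifiers trained on every non-empty subset of early-fused features, and it uses a toy nearest-centroid scorer as a stand-in classifier. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist
from typing import Callable, Dict, List, Sequence

Vector = List[float]
FeatureSets = Dict[str, List[Vector]]  # feature name -> one vector per video


def early_fuse(features: FeatureSets, names: Sequence[str]) -> List[Vector]:
    """Early fusion (assumed here to be concatenation) of the named features."""
    n_videos = len(features[names[0]])
    return [[x for name in names for x in features[name][i]]
            for i in range(n_videos)]


def train_scorer(X: List[Vector], y: List[int]) -> Callable[[Vector], float]:
    """Toy stand-in classifier (an assumption): nearest class centroid."""
    def centroid(label: int) -> Vector:
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    pos, neg = centroid(1), centroid(0)
    # A higher score means the video lies closer to the positive centroid.
    return lambda x: dist(x, neg) - dist(x, pos)


def late_fuse(score_lists: List[List[float]]) -> List[float]:
    """Late fusion (assumed here to be an unweighted mean) of classifier scores."""
    return [sum(s) / len(s) for s in zip(*score_lists)]


def double_fuse(train: FeatureSets, y: List[int], test: FeatureSets) -> List[float]:
    """Double fusion sketch: early-fuse each feature subset, then late-fuse scores."""
    names = sorted(train)
    # Assumption: use every non-empty subset (single features and combinations).
    subsets = [c for r in range(1, len(names) + 1)
               for c in combinations(names, r)]
    score_lists = []
    for subset in subsets:
        scorer = train_scorer(early_fuse(train, subset), y)
        score_lists.append([scorer(x) for x in early_fuse(test, subset)])
    return late_fuse(score_lists)


# Hypothetical toy data: two feature types, four training and two test videos.
train = {"visual": [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]],
         "audio": [[0.1], [0.0], [0.9], [1.0]]}
labels = [0, 0, 1, 1]
test = {"visual": [[0.9, 1.0], [0.1, 0.0]],
        "audio": [[1.0], [0.1]]}
scores = double_fuse(train, labels, test)
print(scores)  # one fused score per test video; higher means more event-like
```

With these toy inputs the first test video, which resembles the positive examples in both feature types, receives a positive fused score and the second a negative one. In practice the nearest-centroid scorer would be replaced by a real classifier, and the subset choice and score weighting would need to be tuned.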