Leveraging high-level and low-level features for multimedia event detection

Authors:
Lu Jiang; Alexander G. Hauptmann; Guang Xiang
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 20th ACM international conference on Multimedia
Year:
2012

Abstract

This paper addresses the challenge of Multimedia Event Detection by proposing a novel method, based on collective classification, for fusing high-level and low-level features. The method consists of three steps: training a classifier on low-level features; encoding high-level features into graphs; and diffusing the scores over the established graphs to obtain the final prediction. The final prediction is derived from multiple graphs, each of which corresponds to one high-level feature. The paper investigates two graph construction methods, using logarithmic and exponential loss functions respectively, and two collective classification algorithms, Gibbs sampling and Markov random walk. The theoretical analysis shows that the proposed method converges and is computationally scalable, and the empirical analysis on the TRECVID 2011 Multimedia Event Detection dataset validates its outstanding performance compared to state-of-the-art methods, with the added benefit of interpretability.
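The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the loss-based graph constructions, so a Gaussian similarity graph stands in for them here, and a random walk with restart stands in for the Markov random walk variant. The names `build_graph`, `diffuse`, `fuse`, and the parameters `sigma`, `alpha`, and `iters` are all hypothetical, and the initial scores are assumed to come from a classifier already trained on low-level features.

```python
import math


def build_graph(features, sigma=1.0):
    """Row-normalized Gaussian similarity graph over videos.

    Assumption: this replaces the paper's loss-based graph construction,
    which the abstract does not describe in detail.
    """
    graph = []
    for i, fi in enumerate(features):
        row = []
        for j, fj in enumerate(features):
            if i == j:
                row.append(0.0)  # no self-loops
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(fi, fj))
                row.append(math.exp(-d2 / (2 * sigma ** 2)))
        total = sum(row)
        graph.append([w / total for w in row] if total > 0 else row)
    return graph


def diffuse(graph, initial, alpha=0.5, iters=50):
    """Random walk with restart: s <- alpha * P s + (1 - alpha) * s0."""
    scores = list(initial)
    for _ in range(iters):
        scores = [
            alpha * sum(p * s for p, s in zip(row, scores)) + (1 - alpha) * s0
            for row, s0 in zip(graph, initial)
        ]
    return scores


def fuse(initial, feature_sets, alpha=0.5):
    """Average the diffused scores over one graph per high-level feature."""
    per_graph = [diffuse(build_graph(f), initial, alpha) for f in feature_sets]
    return [sum(col) / len(col) for col in zip(*per_graph)]


# Hypothetical data: three videos, low-level classifier scores, and one
# one-dimensional high-level feature (for example, a concept detector output).
initial_scores = [0.9, 0.2, 0.1]
concept_feature = [[1.0], [0.9], [0.0]]
fused_scores = fuse(initial_scores, [concept_feature])
```

In this toy setup the second video lies close to the first in the high-level feature space, so diffusion pulls its score toward the first video's score. With `alpha` below 1 the update is a contraction, so this particular iteration converges. That property belongs to the sketch and is not a restatement of the paper's theoretical analysis.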