Special issue on Multimedia Event Detection
Machine Vision and Applications
We present a system for multimedia event detection. The system characterizes complex multimedia events using a large array of multimodal features and classifies unseen videos by fusing the resulting diverse responses. It makes three main technical contributions. First, we explore novel visual and audio features across multiple semantic granularities; in particular, we build mid-level and high-level features on top of low-level features, often in an unsupervised manner, to enable semantic understanding. Second, we present a novel Latent SVM model that learns and localizes discriminative high-level concepts in cluttered video sequences. Besides improving detection accuracy over existing approaches, the model produces a summary for every retrieved video from the high-level concepts it uses and the temporal evidence it localizes. This summary provides some transparency into why the system classified the video as it did. Finally, we present novel fusion learning algorithms and a methodology for improving fusion learning when training data is limited. A thorough evaluation on the large TRECVID MED 2011 dataset demonstrates the benefits of the system.
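The idea of localizing temporal evidence with a latent variable can be illustrated with a minimal sketch of the inference step only. It is not the model described above. It assumes that a video is represented as a sequence of segments, each with a vector of high-level concept detector responses; that a weight vector over concepts has already been learned; and that the latent variable is the start position of a fixed-length temporal window. The function name `localize_evidence`, the toy inputs, and the window-averaging rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def localize_evidence(
    concept_scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    window: int,
) -> Tuple[float, int]:
    """Score a video by its best temporal window (illustrative assumption).

    concept_scores: one vector of concept responses per video segment.
    weights: one (already learned) weight per concept.
    window: number of consecutive segments treated as one piece of evidence.
    Returns (video_score, start_index_of_best_window).
    """
    n_segments = len(concept_scores)
    if window <= 0 or window > n_segments:
        raise ValueError("window must be between 1 and the number of segments")

    # Linear response of each segment: dot product of weights and concepts.
    segment_response: List[float] = [
        sum(w * x for w, x in zip(weights, segment)) for segment in concept_scores
    ]

    # Latent-variable inference: maximize the mean response over window starts.
    best_score = float("-inf")
    best_start = 0
    for start in range(n_segments - window + 1):
        score = sum(segment_response[start:start + window]) / window
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


# Toy input: 5 segments, 3 hypothetical concept detectors.
video = [
    [0.1, 0.0, 0.2],
    [0.2, 0.1, 0.1],
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.0],
    [0.1, 0.2, 0.3],
]
w = [1.0, 1.0, -0.5]  # hypothetical learned weights
score, start = localize_evidence(video, w, window=2)
print(f"video score={score:.3f}, evidence in segments {start}..{start + 1}")
```

In this sketch the returned start index plays the role of the localized evidence: together with the concepts that carry the largest weighted responses inside the window, it is the kind of information a per-video summary could be built from.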