Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers, each based on a single data type: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as high as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as the arithmetic mean, perform as well as or better than more complex ones. SESAME's performance in the 2012 TRECVID MED evaluation was one of the best reported.
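The arithmetic-mean fusion mentioned above can be illustrated with a minimal sketch. This is not SESAME's implementation. The function name `fuse_arithmetic_mean`, the classifier labels, and the scores are all hypothetical, and the sketch assumes every classifier scores the same videos on a comparable scale (for example, probabilities in [0, 1]).

```python
from statistics import fmean


def fuse_arithmetic_mean(scores_by_classifier):
    """Average per-video detection scores across classifiers.

    scores_by_classifier maps a classifier name to a list of scores,
    one per video, with every list in the same video order.
    Assumption: scores are already on a comparable scale.
    """
    score_lists = list(scores_by_classifier.values())
    if not score_lists:
        raise ValueError("need at least one classifier")
    n_videos = len(score_lists[0])
    if any(len(s) != n_videos for s in score_lists):
        raise ValueError("every classifier must score the same videos")
    # zip(*...) regroups the scores so each tuple holds one video's scores.
    return [fmean(video_scores) for video_scores in zip(*score_lists)]


# Made-up scores for three videos from three hypothetical classifiers.
scores = {
    "visual": [0.9, 0.2, 0.5],
    "motion": [0.7, 0.4, 0.5],
    "audio": [0.5, 0.0, 0.5],
}
fused = fuse_arithmetic_mean(scores)

# Rank videos for the event from highest to lowest fused score.
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

Averaging needs no training data and no tuned weights, which is one plausible reason a simple rule like this can hold up against more complex fusion schemes. When the classifiers produce scores on different scales, a normalization step would be needed before averaging.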