This paper presents a video summarization technique for rushes that uses high-level feature fusion to identify segments for inclusion. It aims to capture distinct video events using a variety of features: k-means based weighting, speech, camera motion, significant differences in HSV color space, and a dynamic time warping (DTW) based feature that suppresses repeated scenes. These feature functions drive a weighted k-means clustering that identifies the visually distinct, important segments making up the final summary. The optimal weights for the individual features are obtained with a gradient descent algorithm that maximizes the recall of ground-truth events from representative training videos. Evaluation shows a lengthy computation time but high-quality results: 60% average recall over 42 test videos, based on manually judged inclusion of distinct shots. The summaries were judged relatively easy to view and had an average amount of redundancy.
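To make the repeated-scene suppression idea concrete, here is a minimal sketch of how a DTW distance could be used to drop retakes. It is not the paper's implementation: the abstract does not say which per-frame descriptors are compared or how the DTW score is turned into a feature, so the segment representation (a list of equal-length feature tuples per frame), the function names `dtw_distance` and `suppress_repeats`, the length normalization, and the `threshold` value are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences.

    `a` and `b` are non-empty lists of per-frame feature tuples of equal
    dimension (an assumed representation, not taken from the paper).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of `a` repeats a frame of `b`
                cost[i][j - 1],      # frame of `b` repeats a frame of `a`
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def suppress_repeats(segments, threshold=0.1):
    """Keep a segment only if it is not a near-repeat of one already kept.

    The distance is divided by the combined length so that long and short
    segments are comparable; `threshold` is a hypothetical tuning value.
    """
    kept = []
    for seg in segments:
        is_repeat = any(
            dtw_distance(seg, k) / (len(seg) + len(k)) <= threshold
            for k in kept
        )
        if not is_repeat:
            kept.append(seg)
    return kept


if __name__ == "__main__":
    take1 = [(0.0,), (1.0,), (2.0,)]
    take2 = [(0.0,), (0.0,), (1.0,), (2.0,)]  # same scene, slower start
    other = [(5.0,), (5.0,), (5.0,)]
    print(suppress_repeats([take1, take2, other]))
```

In this toy input, `take2` is a time-stretched copy of `take1`, so the warped alignment has zero cost and the second take is dropped, while `other` is kept. DTW suits rushes because retakes of a scene rarely run at exactly the same pace, which a frame-by-frame comparison would penalize.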