In this paper, we propose a generic framework for human perception analysis in video understanding based on multiple visual cues. Video features that prominently influence human perception, such as motion, contrast, special scenes, and statistical rhythm, are first extracted and modeled. A perception curve that corresponds to changes in human perception is then constructed from these individual models using a linear or priority-based fusion approach. As an important application of the perception analysis framework, a feasible video summarization scheme is implemented to demonstrate the validity, robustness, and generality of the proposed framework. The frames corresponding to the peak points of the individual model curves and the fused curve are extracted as multilevel summaries comprising video keywords, keyframes, and dynamic segments. Subjective evaluations from a supplementary volunteer study of the video summaries indicate that the analysis framework is effective and offers a promising approach to semantic video management, access, and understanding.
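The fusion-and-peak-picking idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the example feature curves, and the fixed weights are all hypothetical, and real per-frame motion or contrast scores would come from the feature models described in the paper.

```python
def linear_fusion(curves, weights):
    """Combine per-frame feature curves into one perception curve
    by weighted summation (the linear fusion option)."""
    n = len(next(iter(curves.values())))
    return [sum(weights[name] * curves[name][i] for name in curves)
            for i in range(n)]

def peak_frames(curve):
    """Indices of strict local maxima of the perception curve --
    these frames are taken as keyframe candidates."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i - 1] < curve[i] > curve[i + 1]]

# Toy per-frame scores for two of the cues mentioned in the abstract.
curves = {
    "motion":   [0.1, 0.4, 0.9, 0.3, 0.2, 0.6, 0.5],
    "contrast": [0.2, 0.3, 0.5, 0.4, 0.1, 0.8, 0.2],
}
weights = {"motion": 0.6, "contrast": 0.4}  # illustrative weights

perception = linear_fusion(curves, weights)
keyframes = peak_frames(perception)  # frames at the curve's peaks
```

A priority-based fusion, as the abstract also mentions, could replace the weighted sum with a rule that lets the highest-priority cue dominate whenever its score is salient.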