Motion Activity Based Semantic Video Similarity Retrieval
PCM '02 Proceedings of the Third IEEE Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
In this paper, we propose a novel approach to generating a table of video content from shot descriptions based on motion activity and closed captions in MPEG-2 video streams. Videos are segmented into shots by a GOP-based approach, and shot identification is then applied to the segmented shots. Specific shots of interest are selected, and the proposed closed caption detection method is used to detect captions in these shots. To speed up scene change detection, the GOP-based approach does not examine scene cuts frame by frame; it first checks the video stream GOP by GOP and then locates the actual scene boundaries at the frame level. Segmented shots containing closed captions are identified by the proposed object-based motion activity descriptor. A self-organizing map (SOM) is used to filter out noise during caption localization. Once captions are localized in the recognized shots, we create the table of video content based on a hierarchical structure of story units, consecutive shots, and captioned frames. The experimental results show the effectiveness of the proposed approach and demonstrate the feasibility of hierarchically structuring video content.
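
The coarse-to-fine idea behind GOP-based scene change detection can be illustrated with a minimal sketch. The abstract does not say which features are compared or how the thresholds are chosen, so everything concrete here is an assumption: each frame is represented by a hypothetical feature vector (for example a coarse histogram), frames are compared with an L1 distance, and the names `l1_distance`, `find_scene_cuts`, `gop_size`, and `threshold` are illustrative only, not the authors' method.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_scene_cuts(
    features: List[Sequence[float]],
    gop_size: int,
    threshold: float,
) -> List[int]:
    """Return the indices of frames that start a new shot.

    Coarse pass: compare the first frame of each GOP with the first frame
    of the next GOP (or the last frame of the stream). Fine pass: only
    when that difference exceeds the threshold, scan the frames in between
    and report the position of the largest consecutive-frame difference.
    """
    cuts: List[int] = []
    last = len(features) - 1
    for start in range(0, last, gop_size):
        end = min(start + gop_size, last)
        # Coarse check: skip the whole GOP if its endpoints look alike.
        if l1_distance(features[start], features[end]) <= threshold:
            continue
        # Fine check: locate the boundary at the frame level.
        boundary = max(
            range(start + 1, end + 1),
            key=lambda i: l1_distance(features[i - 1], features[i]),
        )
        cuts.append(boundary)
    return cuts


if __name__ == "__main__":
    # Twelve hypothetical frames: six of one scene, then six of another.
    frames = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
    print(find_scene_cuts(frames, gop_size=4, threshold=0.5))
```

Under these assumptions, most GOPs are dismissed after a single comparison, and frame-level work is done only where a change is suspected. One consequence of the simplification is that a shot that begins and ends within a single GOP, returning to similar content, would be missed by the coarse check.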