Video Browsing and Retrieval Based on Multimodal Integration

Authors:
Yingying Zhu;Dongru Zhou
Affiliations:
-;-
Venue:
WI '03 Proceedings of the 2003 IEEE/WIC International Conference on Web Intelligence
Year:
2003

Citing 0
Cited 3

The virtual tele-tASK professor: semantic search in recorded lectures
Proceedings of the 38th SIGCSE technical symposium on Computer science education

Towards to an automatic semantic annotation for multimedia learning objects
Proceedings of the international workshop on Educational multimedia and multimedia education

Video abstraction based on the visual attention model and online clustering
Image Communication

Abstract

The rapid growth of multimedia data requires more effective content-based video browsing and retrieval. In this paper, we present a system developed for video browsing and retrieval based on multimedia integration. First, a basic structure of the system is defined. Second, a robust scene segmentation method is presented, which analyzes audio and visual information and accounts for their inter-relations and coincidence to semantically identify video scenes. We then extract text from key frames with a video OCR technique and extract text transcriptions by speech recognition to classify video scenes and form the full-text indices. Finally, a natural language understanding technique is used to automatically classify video scenes on the basis of the texts obtained from closed captions, the video OCR process, and speech recognition. In this way, we have developed a content-based video database system that integrates multiple modalities to browse and retrieve video data. The experimental results show that multimodal integration is effective for video scene segmentation. Our system, built on the idea of multimodal integration, makes content-based browsing and retrieval of video data, key-frame-based video abstracts, and search by keywords practical.
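
The abstract does not describe how the full-text indices are built, so the following is only a minimal sketch of the general idea: text from closed captions, video OCR, and speech recognition is merged per scene into an inverted index that supports keyword search. The function names (`tokenize`, `build_index`, `search`), the data layout, and the sample scenes are all hypothetical and are not taken from the paper.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(scene_texts):
    """Build an inverted index: token -> set of scene ids.

    scene_texts maps a scene id to a dict of {source name: text}, where the
    sources stand in for closed captions, video OCR, and speech recognition.
    (This layout is an assumption made for illustration.)
    """
    index = defaultdict(set)
    for scene_id, sources in scene_texts.items():
        for text in sources.values():
            for token in tokenize(text):
                index[token].add(scene_id)
    return index


def search(index, query):
    """Return the sorted ids of scenes that contain every query keyword."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


# Hypothetical sample data: three scenes with text from three modalities.
scenes = {
    1: {"caption": "Weather forecast for the weekend",
        "ocr": "WEATHER",
        "speech": "rain is expected on saturday"},
    2: {"caption": "Football highlights",
        "ocr": "SCORE 2-1",
        "speech": "a late goal decided the match"},
    3: {"caption": "",
        "ocr": "WEATHER ALERT",
        "speech": "storm warning for the coast"},
}

index = build_index(scenes)
print(search(index, "weather"))        # [1, 3]
print(search(index, "weather storm"))  # [3]
```

Because every modality feeds the same index, a keyword that appears only in on-screen text (scene 3 has no caption) can still be found, which is one plausible reading of why integrating several text sources helps retrieval.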