Story segmentation in news videos using visual and text cues

  • Authors:
  • Yun Zhai; Alper Yilmaz; Mubarak Shah

  • Affiliations:
  • School of Computer Science, University of Central Florida, Orlando, Florida (all authors)

  • Venue:
  • CIVR'05: Proceedings of the 4th International Conference on Image and Video Retrieval
  • Year:
  • 2005

Abstract

In this paper, we present a framework for segmenting news programs into different story topics. The proposed method uses both the visual and text information in the video. We represent the news video by a Shot Connectivity Graph (SCG), where the nodes in the graph represent shots in the video and the edges between nodes represent transitions between shots. Cycles in the graph correspond to story segments in the news program. We first detect these cycles by finding the anchor persons in the video, which provides a coarse segmentation of the news video. The initial segmentation is later refined by detecting weather and sports news and by merging similar stories. For weather detection, the global color information of the images and the motion of the shots are considered. We use text obtained from automatic speech recognition (ASR) to detect potential sports shots and form sports stories. Adjacent stories with similar semantic meaning are further merged based on visual and text similarities. The proposed framework has been tested on a widely used data set provided by NIST, which contains ground-truth story boundaries, and competitive evaluation results have been obtained.
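The coarse segmentation step described above can be sketched in a few lines: shots form the nodes of the SCG, adjacent shots contribute edges, and each return to the anchor node closes a cycle that delimits one story. The function names, the `"anchor"` label, and the sample shot sequence below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of SCG-style coarse story segmentation.
# Shot labels and anchor detection are assumed given (the paper
# detects anchor persons visually; here they are just labels).

def build_scg(shot_labels):
    """Build the edge set of a Shot Connectivity Graph:
    nodes are shot labels, edges link temporally adjacent shots."""
    edges = set()
    for a, b in zip(shot_labels, shot_labels[1:]):
        edges.add((a, b))
    return edges

def coarse_segments(shot_labels, anchor="anchor"):
    """Each reappearance of the anchor closes a cycle in the SCG;
    cycles correspond to coarse story segments."""
    segments, current = [], []
    for label in shot_labels:
        if label == anchor and current:
            segments.append(current)  # close the previous story
            current = []
        current.append(label)
    if current:
        segments.append(current)
    return segments

# Hypothetical shot sequence from one news program:
shots = ["anchor", "field", "interview", "anchor", "weather", "anchor", "sport"]
print(coarse_segments(shots))
# -> [['anchor', 'field', 'interview'], ['anchor', 'weather'], ['anchor', 'sport']]
```

In the full framework these coarse segments would then be refined by the weather/sports detectors and by merging semantically similar neighbors, which this sketch omits.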