A detection-based approach to broadcast news video story segmentation

  • Authors:
  • Chengyuan Ma; Byungki Byun; Ilseo Kim; Chin-Hui Lee

  • Affiliations:
  • School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, 30332, USA (all four authors)

  • Venue:
  • ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

A detection-based paradigm decomposes a complex system into small pieces, solves each subproblem one by one, and combines the collected evidence to obtain a final solution. In this study of video story segmentation, a set of key events is first detected from heterogeneous multimedia signal sources, including a large-scale concept ontology for images, text generated by automatic speech recognition systems, features extracted from the audio track, and high-level video transcriptions. A discriminative evidence fusion scheme is then investigated. We use the maximum figure-of-merit learning approach to directly optimize the performance metrics used in system evaluation, such as precision, recall, and the F1 measure. Experimental evaluations conducted on the TRECVID 2003 dataset demonstrate the effectiveness of the proposed detection-based paradigm. The proposed framework facilitates flexible combination and extension of event detector design and evidence fusion, enabling other related video applications.
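
The idea of optimizing F1 directly can be illustrated with a minimal sketch. It is not the authors' implementation: the linear fusion rule, the sigmoid smoothing, the slope parameter `alpha`, and all function names are assumptions made for illustration. The sketch replaces the hard boundary/non-boundary decision with a sigmoid of the fused score, so that the true-positive, false-positive, and false-negative counts, and hence F1, become smooth functions of the fusion weights.

```python
import math


def sigmoid(x, alpha=1.0):
    """Logistic function with slope alpha, written to avoid overflow."""
    z = alpha * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fuse(weights, bias, detector_scores):
    """Hypothetical linear fusion of event-detector scores for one
    candidate story boundary (an assumption, not the paper's model)."""
    return sum(w * s for w, s in zip(weights, detector_scores)) + bias


def hard_f1(scores, labels):
    """Standard F1: a candidate is a boundary when its fused score > 0."""
    tp = sum(1 for s, y in zip(scores, labels) if s > 0 and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > 0 and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= 0 and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def smooth_f1(scores, labels, alpha=1.0):
    """Differentiable F1 surrogate: hard decisions become sigmoids."""
    soft = [sigmoid(s, alpha) for s in scores]
    tp = sum(p for p, y in zip(soft, labels) if y == 1)
    fp = sum(p for p, y in zip(soft, labels) if y == 0)
    fn = sum(1.0 - p for p, y in zip(soft, labels) if y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

As `alpha` grows, `smooth_f1` approaches `hard_f1` wherever no fused score is exactly zero. Smaller values give a smoother objective that a gradient-based optimizer could, under these assumptions, maximize with respect to the fusion weights.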