Unsupervised event segmentation of news content with multimodal cues

  • Authors:
  • Mattia Broilo;Eric Zavesky;Andrea Basso;Francesco G. B. De Natale

  • Affiliations:
  • DISI Unitn, Trento, Italy;AT&T Labs Research, Middletown, NJ, USA;AT&T Labs Research, Middletown, NJ, USA;DISI Unitn, Trento, Italy

  • Venue:
  • Proceedings of the 3rd international workshop on Automated information extraction in media production
  • Year:
  • 2010


Abstract

In the age of content snacking and mobisodes (mobile episodes), the paradigm of media consumption is changing radically. Media consumption is moving from monolithic, prepackaged, well-edited, and elaborate content presentation to a continuous feed of brief segments: singleton episodes and few-minute videos, often supported by or initiated via tweets and status updates. In these updates, attention spans are short, and the content packaging matters less than the dynamic, 'streaming' aspect of the information. This trend has a profound influence on the segmentation requirements needed to make this stream of information possible. In this paper, we present a novel method to automatically extract structured content (events) from news video in an unsupervised fashion, where events include major cast interviews, dialogs, background segments, etc. Two key ideas differentiate this unsupervised method from others: the type of information used to find events and the method used to combine that information into coherent multimedia events. The proposed system exploits audio, visual appearance, detected faces, and mid-level semantic concepts from every video shot, but instead of combining everything at once, the framework clusters each modality independently and then applies coherence rules to assemble the multimedia events. Additionally, we discuss the effect of segmentation errors in practical retrieval and content consumption tasks.
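The late-fusion idea in the abstract, clustering each modality independently and then merging shots into events via coherence rules, can be sketched as follows. This is a minimal illustration under assumed conventions, not the authors' actual implementation: the function name, the agreement threshold, and the specific rule ("a shot continues the current event if enough modalities keep the same cluster label across the boundary") are all hypothetical stand-ins for the paper's coherence rules.

```python
def segment_events(shots, min_agreeing_modalities=2):
    """Assemble temporally contiguous events from per-modality cluster labels.

    shots: list of dicts, one per shot in temporal order, each mapping a
    modality name (e.g. 'audio', 'visual', 'face', 'concept') to the
    cluster label that modality's independent clustering assigned.
    Returns a list of (start_index, end_index) event spans, inclusive.
    """
    events = []
    start = 0
    for i in range(1, len(shots)):
        prev, cur = shots[i - 1], shots[i]
        # Hypothetical coherence rule: the event continues when at least
        # `min_agreeing_modalities` modalities keep the same label across
        # the shot boundary; otherwise a new event begins here.
        agree = sum(prev[m] == cur[m] for m in cur)
        if agree < min_agreeing_modalities:
            events.append((start, i - 1))
            start = i
    events.append((start, len(shots) - 1))
    return events

# Toy example with four modality clusterings per shot.
shots = [
    {"audio": 0, "visual": 0, "face": 1, "concept": 0},  # anchor intro
    {"audio": 0, "visual": 0, "face": 1, "concept": 0},
    {"audio": 1, "visual": 2, "face": 0, "concept": 1},  # interview
    {"audio": 1, "visual": 2, "face": 0, "concept": 1},
    {"audio": 2, "visual": 3, "face": 2, "concept": 2},  # field report
]
print(segment_events(shots))  # → [(0, 1), (2, 3), (4, 4)]
```

Keeping the per-modality clusterings separate until this final assembly step is the design choice the abstract highlights: a boundary missed by one modality (say, a lighting change fooling the visual clustering) can still be recovered or rejected by agreement among the others.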