A utility framework for the automatic generation of audio-visual skims

Authors:
Hari Sundaram; Lexing Xie; Shih-Fu Chang
Affiliations:
Columbia University, New York, New York (all three authors)
Venue:
Proceedings of the tenth ACM international conference on Multimedia
Year:
2002

Abstract

In this paper, we present a novel algorithm for generating audio-visual skims from computable scenes. Skims are useful for browsing digital libraries and for on-demand summaries in set-top boxes. A computable scene is a chunk of data that exhibits consistencies with respect to chromaticity, lighting and sound. There are three key aspects to our approach: (a) visual complexity and grammar, (b) robust audio segmentation and (c) a utility model for skim generation. We define a measure of the visual complexity of a shot, and map complexity to the minimum time needed to comprehend the shot. Then, we analyze the underlying visual grammar, since it is what makes the shot sequence meaningful. We segment the audio data into four classes, and then detect significant phrases in the speech segments. The utility functions are defined in terms of the complexity and duration of a segment. The target skim is created using a general constrained utility maximization procedure that maximizes the information content and the coherence of the resulting skim. The objective function is subject to multimedia synchronization constraints, visual syntax constraints, and penalty functions on the audio and video segments. The user study results indicate that the optimal skims differ from other skims by a statistically significant margin, at compression rates of up to 90%.
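The constrained utility maximization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not give the complexity-to-time mapping, the utility function, or the optimizer, so the linear mapping in `min_comprehension_time`, the square-root `utility`, and the greedy allocation in `build_skim` are all assumptions chosen for illustration. Audio segments, visual grammar, and synchronization constraints are left out entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    duration: float    # original shot length in seconds (assumed > 0)
    complexity: float  # visual complexity, assumed normalized to [0, 1]


def min_comprehension_time(shot, base=1.0, scale=2.0):
    """Hypothetical linear map from complexity to a minimum viewing time."""
    return min(shot.duration, base + scale * shot.complexity)


def utility(shot, t):
    """Placeholder utility: rises with complexity, concave in retained time t."""
    if t <= 0:
        return 0.0
    return shot.complexity * math.sqrt(t / shot.duration)


def build_skim(shots, budget, step=0.1):
    """Greedily allocate `budget` seconds across shots; 0.0 means dropped."""
    alloc = [0.0] * len(shots)
    remaining = budget

    def density(i):
        # Utility per second when the shot is shown for its minimum time.
        t_min = min_comprehension_time(shots[i])
        return utility(shots[i], t_min) / t_min

    # Pass 1: admit shots at their minimum time, best utility density first.
    for i in sorted(range(len(shots)), key=density, reverse=True):
        t_min = min_comprehension_time(shots[i])
        if t_min <= remaining:
            alloc[i] = t_min
            remaining -= t_min

    # Pass 2: spend leftover time where the marginal utility gain is largest.
    while remaining >= step:
        best, best_gain = None, 0.0
        for i, shot in enumerate(shots):
            if alloc[i] == 0.0 or alloc[i] + step > shot.duration:
                continue  # dropped shots stay dropped; never exceed the original
            gain = utility(shot, alloc[i] + step) - utility(shot, alloc[i])
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            break
        alloc[best] += step
        remaining -= step
    return alloc


if __name__ == "__main__":
    shots = [Shot(10.0, 0.9), Shot(4.0, 0.2), Shot(8.0, 0.5)]
    for shot, t in zip(shots, build_skim(shots, budget=6.0)):
        print(f"complexity={shot.complexity:.1f} kept={t:.1f}s of {shot.duration:.1f}s")
```

The sketch treats the minimum comprehension time as a hard lower bound: a shot is either shown for at least that long or dropped. A greedy pass is used only because it is short; it is not guaranteed to find the optimal allocation.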