Audiovisual integration with Segment Models for tennis video parsing

  • Authors:
  • Manolis Delakis (IRISA/University of Rennes 1, Campus de Beaulieu, 35042 Rennes, France)
  • Guillaume Gravier (IRISA/CNRS, Campus de Beaulieu, 35042 Rennes, France)
  • Patrick Gros (IRISA/INRIA, Campus de Beaulieu, 35042 Rennes, France)

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2008

Abstract

Automatic video content analysis is an emerging research subject with numerous applications to large video databases and personal video recording systems. The aim of this study is to fuse multimodal information in order to automatically parse the underlying structure of tennis broadcasts. The frame-based observation distributions of Hidden Markov Models are too restrictive for modeling heterogeneous audiovisual data. We propose instead the use of segmental features, within the framework of Segment Models, to overcome this limitation by moving the synchronization points to the segment boundaries. Considering each segment as a video scene, the auditory and visual features collected inside the scene boundaries can thus be sampled and modeled at their native sampling rates and with their native models. Experimental results on a 15-hour corpus of tennis video show that Segment Models with synchronous audiovisual fusion outperform Hidden Markov Models. Results with asynchronous fusion, however, are less encouraging.
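To illustrate the segmental-feature idea, the toy sketch below (all names, scene labels, and the scoring function are hypothetical stand-ins, not the authors' implementation) decodes a sequence into scene segments with a segment-level Viterbi search: each modality is pooled at its own sampling rate, and the audio and video streams are synchronized only at segment boundaries rather than frame by frame.

```python
import math

# Illustrative scene labels for a tennis broadcast (hypothetical).
SCENES = ["rally", "replay", "break"]

def stream_loglik(features, scene):
    # Toy stand-in for a per-stream scene model (e.g. a GMM trained on
    # that stream): negative squared distance to a target level.
    target = {"rally": 1.0, "replay": 0.5, "break": 0.0}[scene]
    return -sum((f - target) ** 2 for f in features)

def segment_score(audio, video, scene, start, end, audio_rate, video_rate):
    # Score one candidate segment [start, end) in seconds.  Each modality
    # is sliced at its native sampling rate; the two streams meet only at
    # the segment boundaries (the segmental-feature idea).
    a = audio[int(start * audio_rate):int(end * audio_rate)]
    v = video[int(start * video_rate):int(end * video_rate)]
    return stream_loglik(a, scene) + stream_loglik(v, scene)

def parse(audio, video, duration, audio_rate, video_rate, max_len=4):
    # Segment-level Viterbi over integer-second boundaries:
    # best[t] holds the score of the best parse of [0, t).
    best = [0.0] + [-math.inf] * duration
    back = [None] * (duration + 1)
    for t in range(1, duration + 1):
        # Try longer segments first so ties prefer fewer, longer segments.
        for d in range(min(max_len, t), 0, -1):
            for scene in SCENES:
                s = best[t - d] + segment_score(
                    audio, video, scene, t - d, t, audio_rate, video_rate)
                if s > best[t]:
                    best[t], back[t] = s, (t - d, scene)
    # Trace back the best segmentation as (start, end, scene) triples.
    segments, t = [], duration
    while t > 0:
        start, scene = back[t]
        segments.append((start, t, scene))
        t = start
    return segments[::-1]
```

For example, with 6 s of synthetic data sampled at 10 Hz (audio) and 2 Hz (video), where the first 3 s look "rally"-like and the rest "break"-like, `parse(audio, video, 6, 10, 2)` recovers the two scenes without ever resampling one stream to match the other's rate.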