An audio stream classification and optimal segmentation for multimedia applications

Authors:
Konstantin Biatov;Joachim Koehler
Affiliations:
Fraunhofer Institute for Media Communication, Sankt Augustin, Germany;Fraunhofer Institute for Media Communication, Sankt Augustin, Germany
Venue:
MULTIMEDIA '03 Proceedings of the eleventh ACM international conference on Multimedia
Year:
2003

Abstract

In this paper we investigate on-line, zero-crossing-based segmentation of an audio stream and classification of its segments as speech or non-speech. The non-speech segments we consider are applause, auditorium noise, and silence. We demonstrate that features extracted from zero-crossings are stable, are valid for discriminating and classifying speech and other signals, and do not require a large amount of training data. We describe an optimal segmentation of audio signals of unlimited length that uses the results of frame classification. We demonstrate that optimal segmentation performs better than the traditional sliding-window technique.
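
As a rough illustration of the two ideas named in the abstract, the sketch below computes a per-frame zero-crossing rate and then derives segment boundaries from frame-level class costs with a dynamic-programming pass. It is a minimal sketch under stated assumptions, not the method of the paper: the abstract does not specify the exact features, the frame classifier, or the optimality criterion. The function names, the frame length and hop, the `frame_costs` input (for example, negative log-likelihoods from some frame classifier), and the `switch_penalty` parameter are all hypothetical choices made for this example.

```python
import numpy as np


def zero_crossing_rate(signal, frame_len=400, hop=160):
    """Fraction of adjacent-sample sign changes in each frame.

    frame_len and hop are illustrative defaults (assumed, not from the paper).
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    rates = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        signs = np.signbit(frame)
        rates[i] = np.count_nonzero(signs[1:] != signs[:-1]) / (frame_len - 1)
    return rates


def segment_frames(frame_costs, switch_penalty=2.0):
    """Label sequence minimizing summed frame costs plus a penalty per label change.

    frame_costs has shape (n_frames, n_classes); lower cost means a better fit.
    This Viterbi-style criterion is an assumption made for illustration.
    """
    costs = np.asarray(frame_costs, dtype=float)
    n_frames, n_classes = costs.shape
    best = costs[0].copy()
    back = np.zeros((n_frames, n_classes), dtype=int)
    change = switch_penalty * (1 - np.eye(n_classes))
    for t in range(1, n_frames):
        # trans[j, k]: best cost ending in class j at t-1, then moving to class k
        trans = best[:, None] + change
        back[t] = np.argmin(trans, axis=0)
        best = trans[back[t], np.arange(n_classes)] + costs[t]
    # Trace the cheapest path backwards from the last frame
    labels = np.empty(n_frames, dtype=int)
    labels[-1] = int(np.argmin(best))
    for t in range(n_frames - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels


def labels_to_segments(labels):
    """Collapse a label sequence into (start_frame, end_frame_exclusive, label) runs."""
    segments = []
    start = 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((start, t, int(labels[start])))
            start = t
    return segments
```

In this sketch a single misclassified frame inside a long run is absorbed into its neighbours, because relabelling it would cost two label changes. A sliding-window majority vote, by contrast, fixes the smoothing span in advance regardless of the evidence in each frame.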