Speech/music discrimination via energy density analysis

Authors:
Stanisław Kacprzak;Mariusz Ziółko
Affiliations:
Department of Electronics, AGH University of Science and Technology, Kraków, Poland;Department of Electronics, AGH University of Science and Technology, Kraków, Poland
Venue:
SLSP'13 Proceedings of the First International Conference on Statistical Language and Speech Processing
Year:
2013

Abstract

In this paper we propose a new feature, called Minimum Energy Density (MED), for discriminating between speech and music in audio signals. Our method is based on the analysis of local energy in audio segments of 1 or 2.5 seconds. An elementary analysis of the probability distribution of the signal power proves to be an effective tool for supporting the decision-making system. We compare our feature with the Percentage of Low Energy Frames (LEF) and the Modified Low Energy Ratio (MLER), and examine the efficiency of all three on two separate speech/music corpora.
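
The abstract does not define MED, so no attempt is made to reproduce it here. As a rough illustration of the kind of local-energy analysis involved, the sketch below computes the LEF baseline under its commonly used definition: the fraction of frames in a segment whose RMS value falls below half of the segment's mean RMS. The function names, the frame length, and the 0.5 threshold are assumptions made for this example, not values taken from the paper.

```python
import math


def frame_rms(signal, frame_len):
    """Return the RMS value of each non-overlapping frame of `signal`.

    Trailing samples that do not fill a whole frame are ignored.
    """
    n_frames = len(signal) // frame_len
    rms = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms


def low_energy_fraction(signal, frame_len, threshold=0.5):
    """Fraction of frames whose RMS is below `threshold` times the mean RMS.

    Assumed LEF-style definition; the 0.5 default is a conventional choice,
    not a value confirmed by the abstract.
    """
    rms = frame_rms(signal, frame_len)
    if not rms:
        raise ValueError("signal is shorter than one frame")
    mean_rms = sum(rms) / len(rms)
    low = sum(1 for r in rms if r < threshold * mean_rms)
    return low / len(rms)


# Synthetic stand-ins: a constant-level signal versus one that is half silence.
steady = [1.0] * 1000
gappy = [1.0] * 500 + [0.0] * 500

lef_steady = low_energy_fraction(steady, frame_len=100)
lef_gappy = low_energy_fraction(gappy, frame_len=100)
```

A signal with frequent pauses, as is typical of speech, tends to have more low-energy frames than a signal with a steadier energy envelope, which is the intuition such energy-based features rely on. A real system would apply this per 1 or 2.5 second segment and feed the value to a classifier.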