An Efficient Approach for Classification of Speech and Music

Authors:
Ei Mon Swe;Moe Pwint
Affiliations:
University of Computer Studies Yangon, Myanmar;University of Computer Studies Yangon, Myanmar
Venue:
PCM '08 Proceedings of the 9th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
Year:
2008

Citing 3
Cited 0

Speech/music segmentation using entropy and dynamism features in a HMM classification framework. Speech Communication.
Construction and Evaluation of a Robust Multifeature Speech/Music Discriminator. ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), Volume 2.
A speech/music discriminator based on RMS and zero-crossings. IEEE Transactions on Multimedia.

Abstract

A new method is presented for classifying audio segments as speech or music, motivated by the automatic transcription of broadcast news. Sample entropy (SampEn), a measure of time-series complexity, serves as the main discriminating feature. SampEn is a variant of approximate entropy (ApEn) that measures the regularity of a time series. The basic idea is to label a given audio segment as speech or music according to its regularity, which is measured from the sequence of SampEn values calculated over a window. The method is evaluated in experiments on broadcast news shows from BBC radio stations, WBAI news, and UN news, and on music genres with different temporal distributions. The results show that the proposed method is robust, achieving high discrimination accuracy in all experiments.
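
To make the feature concrete, the following is a minimal sketch of the standard SampEn(m, r) definition, applied window by window to produce a SampEn sequence. It is not the authors' implementation. The embedding dimension m = 2, the tolerance r = 0.2 times the standard deviation, the window and hop sizes, and the function names sample_entropy and sampen_sequence are all illustrative assumptions that do not come from the abstract. The abstract also does not say how the SampEn sequence is turned into a speech or music label, so that step is left out.

```python
import math
import random


def sample_entropy(x, m=2, r_factor=0.2):
    """Standard SampEn(m, r) of a sequence, with r = r_factor * std(x).

    m and r_factor are common default choices, assumed here for illustration.
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("sequence too short for the chosen m")
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * std

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= r.
        # The same n - m start positions are used for both template
        # lengths, so the two counts cover the same set of templates.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches that persist to length m + 1
    if a == 0 or b == 0:
        return float("inf")    # SampEn is undefined when no matches exist
    return -math.log(a / b)


def sampen_sequence(x, win=100, hop=50, m=2, r_factor=0.2):
    """SampEn of each window of `win` samples, advancing by `hop` samples.

    win and hop are hypothetical values, not taken from the paper.
    """
    return [
        sample_entropy(x[s:s + win], m, r_factor)
        for s in range(0, len(x) - win + 1, hop)
    ]


# Synthetic stand-ins for audio: a periodic signal and white noise.
sine = [math.sin(2 * math.pi * i / 25) for i in range(300)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]

print("SampEn, sine :", sample_entropy(sine))
print("SampEn, noise:", sample_entropy(noise))
print("SampEn sequence, sine:", sampen_sequence(sine))
```

A lower SampEn value indicates a more regular (more self-similar) signal, so the periodic sine scores lower than the noise. The pairwise comparison is quadratic in the window length, which is acceptable for a sketch but would likely need a vectorized or otherwise optimized implementation for real audio.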