Speech/music discrimination for multimedia applications

Authors:
K. El-Maleh;M. Klein;G. Petrucci;P. Kabal
Affiliations:
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, Que., Canada;-;-;-
Venue:
ICASSP '00 Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 04
Year:
2000

Citing 0
Cited 28

Pause concepts for audio segmentation at different semantic levels
MULTIMEDIA '01 Proceedings of the ninth ACM international conference on Multimedia

A robust audio classification and segmentation method
MULTIMEDIA '01 Proceedings of the ninth ACM international conference on Multimedia

Support Vector Machine Learning for Music Discrimination
PCM '02 Proceedings of the Third IEEE Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing

Speech/music segmentation using entropy and dynamism features in a HMM classification framework
Speech Communication

Multimodal Video Indexing: A Review of the State-of-the-art
Multimedia Tools and Applications

Automatic classification of speech and music using neural networks
Proceedings of the 2nd ACM international workshop on Multimedia databases

Adaptive network-based fuzzy inference system vs. other classification algorithms for warped LPC-based speech/music discrimination
Engineering Applications of Artificial Intelligence

A multimodal data mining framework for soccer goal detection based on decision tree logic
International Journal of Computer Applications in Technology

Speech/Music Discrimination Based on Discrete Wavelet Transform
SETN '08 Proceedings of the 5th Hellenic conference on Artificial Intelligence: Theories, Models and Applications

Automatic music boundary detection using short segmental acoustic similarity in a music piece
EURASIP Journal on Audio, Speech, and Music Processing - Intelligent Audio, Speech, and Music Processing Applications

New speech/music discrimination approach based on fundamental frequency estimation
Multimedia Tools and Applications

Speech/music discrimination based on warping transformation and fuzzy logic for intelligent audio coding
Applied Artificial Intelligence

A decision-tree-based algorithm for speech/music classification and segmentation
EURASIP Journal on Audio, Speech, and Music Processing

A combination of data mining method with decision trees building for Speech/Music discrimination
Computer Speech and Language

Environmental sound recognition with time-frequency audio features
IEEE Transactions on Audio, Speech, and Language Processing

A combination of data mining method with context-based state transfer for speech/music discrimination
WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing

Two-stage cascaded classification approach based on genetic fuzzy learning for speech/music discrimination
Engineering Applications of Artificial Intelligence

Speech/music classification using occurrence pattern of ZCR and STE
IITA'09 Proceedings of the 3rd international conference on Intelligent information technology application

Online speech/music segmentation based on the variance mean of filter bank energy
EURASIP Journal on Advances in Signal Processing

Improvement to speech-music discrimination using sinusoidal model based features
Multimedia Tools and Applications

Multimedia data mining: state of the art and challenges
Multimedia Tools and Applications

Neural network classification of gunshots using spectral characteristics
ACMOS'11 Proceedings of the 13th WSEAS international conference on Automatic control, modelling & simulation

Low-complexity F0-based speech/nonspeech discrimination approach for digital hearing aids
Multimedia Tools and Applications

Speech/music discrimination in audio podcast using structural segmentation and timbre recognition
CMMR'10 Proceedings of the 7th international conference on Exploring music contents

Audio classification for radio broadcast indexing: feature normalization and multiple classifiers decision
PCM'04 Proceedings of the 5th Pacific Rim Conference on Advances in Multimedia Information Processing - Volume Part II

Song/instrumental classification using spectrogram based contextual features
Proceedings of the CUBE International Information Technology Conference

Dictionary learning based sparse coefficients for audio classification with max and average pooling
Digital Signal Processing

Mining movie archives for song sequences
Multimedia Tools and Applications

Abstract

Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on long-term features such as differential parameters, variances, and time-averages of spectral parameters. Classifiers built on such features estimate them over windows of 0.5-5 seconds and are relatively complex. We present results from combining line spectral frequencies (LSFs) with zero-crossing-based features for frame-level narrowband speech/music discrimination. Classification results for different types of music and speech show the good discriminating power of these features. Our classification algorithms operate with a frame delay of only 20 ms, making them suitable for real-time multimedia applications.
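
To make the frame-level idea concrete, the sketch below computes a plain zero-crossing rate over non-overlapping 20 ms frames. It is only an illustration and not the authors' feature set: the abstract does not say which zero-crossing-based features are used, and the LSF part is omitted entirely. The 8 kHz sampling rate (a common reading of "narrowband"), the function names, and the synthetic sine test signals are all assumptions.

```python
import math


def frame_zero_crossing_rates(samples, sample_rate=8000, frame_ms=20):
    """Fraction of adjacent sample pairs that change sign, per frame.

    Assumption: 8 kHz sampling and non-overlapping 20 ms frames
    (160 samples each). Trailing samples that do not fill a frame
    are ignored.
    """
    frame_len = sample_rate * frame_ms // 1000
    rates = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Count sign changes between consecutive samples in this frame.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        rates.append(crossings / (frame_len - 1))
    return rates


def sine(freq_hz, n, sample_rate=8000):
    """Hypothetical test signal; the half-sample offset avoids exact zeros."""
    return [
        math.sin(2 * math.pi * freq_hz * (i + 0.5) / sample_rate)
        for i in range(n)
    ]


# 1600 samples at 8 kHz = 200 ms = ten 20 ms frames per signal.
low = frame_zero_crossing_rates(sine(200, 1600))
high = frame_zero_crossing_rates(sine(2000, 1600))
```

Each frame yields one value as soon as its 20 ms of samples are available, which is the kind of short decision delay the abstract refers to. A real discriminator would pass such per-frame values, together with LSF-derived features, to a classifier.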