Automatic classification of speech and music using neural networks

  • Authors:
  • M. Kashif Saeed Khan; Wasfi G. Al-Khatib; Muhammad Moinuddin

  • Affiliations:
  • King Fahd Univ. of Petroleum and Minerals, Dhahran, Saudi Arabia (all authors)

  • Venue:
  • Proceedings of the 2nd ACM international workshop on Multimedia databases
  • Year:
  • 2004

Abstract

Automatic discrimination between speech signals and music signals has emerged as an important research topic in recent years. The ability to classify audio into categories such as speech or music is a key component of many multimedia document retrieval systems. Several approaches have previously been used to discriminate between speech and music data. In this paper, we propose using the mean and variance of the discrete wavelet transform coefficients in addition to other features that have previously been used for audio classification. We use a Multi-Layer Perceptron (MLP) Neural Network as the classifier. Our initial tests show encouraging results that indicate the viability of our approach.
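The core feature described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the choice of the Haar wavelet, a three-level decomposition, and per-band mean/variance statistics are all assumptions, since the abstract does not specify the wavelet family or decomposition depth.

```python
import math

def haar_step(x):
    """One level of the Haar discrete wavelet transform: split the signal
    into approximation (low-pass) and detail (high-pass) coefficients."""
    approx = [(x[i] + x[i + 1]) / math.sqrt(2) for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / math.sqrt(2) for i in range(0, len(x) - 1, 2)]
    return approx, detail

def mean_var(coeffs):
    """Mean and (population) variance of a coefficient band."""
    m = sum(coeffs) / len(coeffs)
    v = sum((c - m) ** 2 for c in coeffs) / len(coeffs)
    return m, v

def dwt_features(frame, levels=3):
    """Feature vector for one audio frame: mean and variance of the detail
    coefficients at each decomposition level, plus those of the final
    approximation band. The number of levels here is a hypothetical choice."""
    feats = []
    x = list(frame)
    for _ in range(levels):
        x, detail = haar_step(x)
        feats.extend(mean_var(detail))
    feats.extend(mean_var(x))
    return feats
```

For example, a 64-sample frame with a 3-level decomposition yields an 8-dimensional vector (mean and variance for three detail bands plus the approximation band). In the paper's setup, such a vector would be concatenated with the other audio features and fed to the MLP classifier.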