Automatic discrimination of speech and music is an important tool in many multimedia applications. This article presents an evolutionary, fuzzy rule-based speech/music discrimination approach for intelligent audio coding that relies on a single simple feature, the Warped LPC-based Spectral Centroid (WLPC-SC). WLPC-SC is compared with the classical audio classification features proposed in the literature to assess its discriminatory power. The vector describing this psychoacoustic-based feature is reduced to a few statistical values (mean, variance, and skewness), which are then transformed to a new feature space by linear discriminant analysis (LDA) in order to increase classification accuracy. Classification is performed by a support vector machine (SVM) applied to the features in the transformed space. The final decision is made by a fuzzy expert system, which improves on the accuracy of the SVM by taking into account the audio labels that this classifier assigned to past audio frames. The accuracy improvement due to the fuzzy expert system is also reported. Experimental results show that our speech/music discriminator is robust and fast, making it suitable for intelligent audio coding.
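Two steps of the pipeline described above can be illustrated with a minimal sketch: reducing a per-frame feature track to its mean, variance, and skewness, and revising a frame label using the labels given to past frames. This is not the authors' implementation. The function names `summarize` and `smooth_labels`, the `history` parameter, and the sample values are hypothetical; a plain majority vote stands in for the fuzzy expert system, whose rules the abstract does not give; and the WLPC-SC computation, the LDA projection, and the SVM are omitted.

```python
from collections import deque


def summarize(track):
    """Reduce a per-frame feature track to (mean, variance, skewness).

    Uses population (divide-by-n) moments; this normalization is an assumption.
    """
    n = len(track)
    mean = sum(track) / n
    var = sum((x - mean) ** 2 for x in track) / n
    if var == 0.0:
        # A constant track has no spread, so skewness is taken as zero here.
        return mean, 0.0, 0.0
    skew = (sum((x - mean) ** 3 for x in track) / n) / var ** 1.5
    return mean, var, skew


def smooth_labels(labels, history=5):
    """Revise each raw label by majority vote over a short label history.

    Stand-in for the fuzzy expert system: the vote covers the current label
    and up to history - 1 past labels. A tie keeps the raw label.
    """
    window = deque(maxlen=history)
    smoothed = []
    for label in labels:
        window.append(label)
        speech_votes = sum(1 for past in window if past == "speech")
        music_votes = len(window) - speech_votes
        if speech_votes > music_votes:
            smoothed.append("speech")
        elif music_votes > speech_votes:
            smoothed.append("music")
        else:
            smoothed.append(label)
    return smoothed


if __name__ == "__main__":
    # Hypothetical centroid values for one analysis window.
    print(summarize([1.0, 2.0, 3.0, 4.0, 5.0]))
    # Hypothetical raw classifier labels with one isolated "music" frame.
    raw = ["speech", "speech", "music", "speech", "speech"]
    print(smooth_labels(raw, history=3))
```

In this sketch an isolated label that disagrees with its recent history is overruled, which is one simple way past decisions can correct a frame-level classifier. The trade-off is a delay of a few frames at genuine speech/music transitions.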