Environmental sounds classification based on visual features

Authors:
Sameh Souli;Zied Lachiri
Affiliations:
Signal, Image and pattern recognition research unit Dept. of Genie Electrique, ENIT, Le Belvédère, Tunisia;Signal, Image and pattern recognition research unit Dept. of Genie Electrique, ENIT, Le Belvédère, Tunisia
Venue:
CIARP'11 Proceedings of the 16th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
Year:
2011

Abstract

This paper presents a method for classifying environmental sounds in the visual domain that exploits scale and translation invariance. We present a new approach that extracts visual features from sound spectrograms, and we apply support vector machines (SVMs) to the sound classification task. The proposed method treats sound spectrograms as texture images and extracts their time-frequency structures using a translation-invariant wavelet transform and a patch transform, alternated with local-maximum and global-maximum operations, to achieve scale and translation invariance. We evaluate the method on an audio database composed of 10 sound classes. The recognition rate obtained is on the order of 91.82% with the One-Against-One multiclass decomposition method.
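
The alternation of local and global maxima described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the translation-invariant wavelet transform and the SVM stage are omitted, and the function names, the frame length, the hop size, the pooling size, and the Gaussian patch similarity are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT.
    frame_len and hop are illustrative values, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Transpose so that rows are frequency bins and columns are time frames.
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T

def local_max(image, size=2):
    """Local maximum over non-overlapping size x size blocks.
    Rows and columns that do not fill a whole block are dropped."""
    h, w = image.shape
    h, w = h - h % size, w - w % size
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def patch_responses(image, patch, sigma=1.0):
    """Similarity of one patch to every same-sized window of the image.
    The Gaussian of the squared distance is an assumed similarity measure."""
    windows = np.lib.stride_tricks.sliding_window_view(image, patch.shape)
    dist = ((windows - patch) ** 2).sum(axis=(2, 3))
    return np.exp(-dist / (2.0 * sigma ** 2))

def global_max_features(image, patches, sigma=1.0):
    """One feature per patch: the best response over all positions.
    Taking the maximum over positions discards where the match occurred."""
    return np.array([patch_responses(image, p, sigma).max() for p in patches])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(4096) / 8000.0
    signal = np.sin(2 * np.pi * (200 + 400 * t) * t)   # synthetic test tone
    pooled = local_max(spectrogram(signal))
    # Hypothetical patch dictionary: random 4x4 crops of the pooled image.
    patches = []
    for _ in range(5):
        r = rng.integers(0, pooled.shape[0] - 4)
        c = rng.integers(0, pooled.shape[1] - 4)
        patches.append(pooled[r:r + 4, c:c + 4])
    print(global_max_features(pooled, patches).shape)
```

The resulting feature vector, one value per patch, is what a classifier would consume in a pipeline of this kind. With 10 sound classes, a One-Against-One decomposition trains one binary classifier per pair of classes, that is 10 × 9 / 2 = 45 classifiers, and assigns the class that wins the most pairwise votes.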