A general audio classifier based on human perception motivated model

  • Authors:
  • Hadi Harb; Liming Chen

  • Affiliations:
  • LIRIS CNRS FRE 2672, Department of Mathématiques-Informatique, Ecole Centrale de Lyon, Ecully, France 69131

  • Venue:
  • Multimedia Tools and Applications
  • Year:
  • 2007

Abstract

The audio channel conveys rich clues for content-based multimedia indexing. Beyond the widely studied problems of speech recognition and speaker identification, interesting audio analysis tasks include speech/music segmentation, speaker gender detection, and special-effect recognition such as gunshots or car chases. All of these tasks can be cast as audio classification problems, which require generating a label from low-level audio signal analysis. While most audio analysis techniques in the literature are problem specific, in this paper we propose a general framework for audio classification. The proposed technique uses a perceptually motivated model of human perception of audio classes, in the sense that it makes judicious use of certain psychophysical results, and relies on a neural network for classification. To assess the effectiveness of the proposed approach, extensive experiments were carried out on several audio classification problems, including speech/music discrimination in radio/TV programs, gender recognition on a subset of the Switchboard database, highlight detection in sports videos, and musical genre recognition. The classification accuracies of the proposed technique are comparable to those obtained by problem-specific techniques, while offering the basis of a general approach to audio classification.
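The pipeline the abstract describes (perceptually motivated features fed to a neural network that emits a class label) can be sketched in miniature. The following is not the authors' method: the features are synthetic stand-ins for the perceptual feature vectors, and the classifier is a plain one-hidden-layer MLP trained by gradient descent, used here only to illustrate the feature-vector-to-label mapping on a two-class (speech vs. music) toy problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for perceptually motivated feature vectors:
# two synthetic 2-D clusters playing the roles of "speech" and "music".
n = 200
speech = rng.normal(loc=[-1.0, -1.0], scale=0.5, size=(n, 2))
music = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(n, 2))
X = np.vstack([speech, music])
y = np.hstack([np.zeros(n), np.ones(n)])  # 0 = speech, 1 = music

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 tanh units, one sigmoid output unit.
W1 = rng.normal(scale=0.5, size=(2, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

lr = 0.5
for _ in range(500):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2).ravel()
    # Backward pass for binary cross-entropy loss.
    grad_out = (p - y)[:, None] / len(y)
    grad_h = (grad_out @ W2.T) * (1.0 - h ** 2)
    # Gradient-descent updates.
    W2 -= lr * (h.T @ grad_out)
    b2 -= lr * grad_out.sum(axis=0)
    W1 -= lr * (X.T @ grad_h)
    b1 -= lr * grad_h.sum(axis=0)

# Label each feature vector: threshold the output probability at 0.5.
pred = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2).ravel() > 0.5).astype(int)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

On well-separated synthetic clusters like these the network fits the training set almost perfectly; the point is only the shape of the framework, in which swapping the label set (gender, sports highlights, musical genre) changes the training data but not the classifier.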