Content based audio classification: a neural network approach

  • Authors:
  • Vikramjit Mitra;Chia-Jiu Wang

  • Affiliations:
  • University of Maryland, Department of Electrical and Computer Engineering, 20742, College Park, MD, USA;University of Colorado at Colorado Springs, Department of Electrical and Computer Engineering, 80933, Colorado Springs, CO, USA

  • Venue:
  • Soft Computing - A Fusion of Foundations, Methodologies and Applications - Special issue on neural networks for pattern recognition and data mining
  • Year:
  • 2008

Abstract

Content-based music genre classification is a key component of next-generation multimedia search agents. This paper introduces an audio classification technique based on audio content analysis. Artificial neural networks (ANNs), specifically multi-layer perceptrons (MLPs), are implemented to perform the classification task. Windowed audio files of finite length are analyzed to generate multiple feature sets, which are used as input vectors to a parallel neural architecture that performs the classification. The paper examines a combination of linear predictive coding (LPC) coefficients, mel-frequency cepstral coefficients (MFCCs), and Haar wavelet, Daubechies wavelet, and Symlet coefficients as feature sets for the proposed audio classifier. Alongside the MLP, a Gaussian radial basis function (GRBF) based ANN is also implemented and analyzed. The ANN prediction values are processed by a rule-based inference engine (IE) that presents the final decision. The resulting prediction accuracy of 87.3% in determining audio genre supports the effectiveness of the proposed architecture.
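As a rough illustration of the pipeline the abstract outlines (windowing, wavelet features, parallel networks, rule-based decision), the sketch below frames a signal, computes Haar wavelet sub-band energies per frame, and combines the outputs of several classifiers by majority vote. The abstract does not give the window length, hop size, decomposition depth, feature definition, or the inference rules, so `frame_len`, `hop`, `levels`, the energy features, and the voting rule in `infer_genre` are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping, Hamming-windowed frames.
    frame_len and hop are assumed values; the abstract does not state them."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    return np.stack([signal[i * hop : i * hop + frame_len] * window
                     for i in range(n_frames)])

def haar_dwt(x):
    """One level of the orthonormal Haar transform.
    Returns (approximation, detail); len(x) must be even."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def haar_features(frame, levels=3):
    """Energy of the detail coefficients at each level, followed by the
    energy of the final approximation (an assumed feature definition)."""
    feats = []
    approx = np.asarray(frame, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        feats.append(np.sum(detail ** 2))
    feats.append(np.sum(approx ** 2))
    return np.array(feats)

def infer_genre(scores_per_network, labels, min_votes=2):
    """Hypothetical inference rule: each network votes for its top-scoring
    class; the majority class wins if it has at least min_votes votes."""
    votes = [int(np.argmax(s)) for s in scores_per_network]
    counts = np.bincount(votes, minlength=len(labels))
    winner = int(np.argmax(counts))
    return labels[winner] if counts[winner] >= min_votes else "undecided"

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(1024)           # stand-in for real audio samples
    frames = frame_signal(audio, frame_len=256, hop=128)
    features = np.array([haar_features(f) for f in frames])
    print(frames.shape, features.shape)
    # Stand-in scores from three parallel networks, one per feature set
    scores = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.3]]
    print(infer_genre(scores, ["rock", "jazz"]))
```

In a fuller version, each feature set (LPC, MFCC, and the wavelet families) would feed its own trained network, and the per-network scores would replace the stand-in `scores` list. Because the Haar transform used here is orthonormal, the sub-band energies sum to the frame energy, which gives a quick sanity check on the feature code.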