Feature clustering for instrument classification

  • Authors:
  • Uwe Ligges;Sebastian Krey

  • Affiliations:
  • Technische Universität Dortmund, Fakultät Statistik, Vogelpothsweg 87, 44221 Dortmund, Germany (both authors)

  • Venue:
  • Computational Statistics - Special Issue: Proceedings of Reisensburg 2009
  • Year:
  • 2011

Abstract

We propose a method for classifying musical instruments from a piece of sound. Features are derived from a pre-filtered time series divided into small windows. Features are then selected from the (transformed) spectrum, Perceptual Linear Prediction (PLP), and Mel Frequency Cepstral Coefficients (MFCCs), as known from speech processing. k-means clustering is then applied, yielding a reduced number of features for the classification task. An SVM classifier with a polynomial kernel yields good results. The accuracy is convincing: the misclassification error is roughly 19% for 59 different classes of instruments. As expected, the misclassification error is smaller for a problem with fewer classes. The functionality of the rastamat library (Ellis 2005, PLP and RASTA (and MFCC, and inversion) in Matlab, http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/, online web resource) has been ported from Matlab to R, so feature extraction as known from speech processing is now easily available in the statistical programming language R. This software has been used on a cluster of machines for the computationally intensive evaluation of the proposed method.
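
The windowing and clustering steps can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: that k-means is run over the window-level feature vectors of a single sound, and that the resulting cluster centres serve as the reduced, fixed-length description passed to the classifier. The per-window features here (log energy and zero-crossing rate) are simple stand-ins, not the spectral, PLP, or MFCC features of the paper. All function names and parameter values are hypothetical, and Python is used purely for illustration rather than the authors' R code.

```python
import math
import random


def frame_signal(signal, width, hop):
    """Split a 1-D signal into overlapping windows of `width` samples."""
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, hop)]


def toy_features(window):
    """Stand-in features (NOT MFCC/PLP): log energy and zero-crossing rate."""
    energy = sum(x * x for x in window) / len(window)
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    return [math.log(energy + 1e-12), crossings / (len(window) - 1)]


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def sound_level_features(signal, width=256, hop=128, k=3):
    """Assumed reduction: summarise all windows of one sound by k centroids."""
    per_window = [toy_features(w) for w in frame_signal(signal, width, hop)]
    # Sorting gives the centroids a stable order before flattening.
    centroids = sorted(kmeans(per_window, k))
    return [value for centroid in centroids for value in centroid]


# Synthetic decaying 440 Hz tone sampled at 8 kHz (illustrative data only).
tone = [math.exp(-t / 2000) * math.sin(2 * math.pi * 440 * t / 8000) for t in range(4000)]
features = sound_level_features(tone)
print(len(features))
```

Under these assumptions every sound yields a vector of the same length (k centroids times the number of per-window features), regardless of its duration. That fixed length is what makes the output usable as input to a classifier such as an SVM with a polynomial kernel.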