The influence of polyphony on the dynamical modelling of musical timbre

  • Authors:
  • Jean-Julien Aucouturier;Francois Pachet

  • Affiliations:
  • Ikegami Laboratory, Department of General Systems Studies, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan;SONY Computer Science Laboratory, 6 rue Amyot, 75005 Paris, France

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2007

Abstract

This letter addresses the pattern recognition of polyphonic musical timbre. Frame-level dynamics of audio features are particularly difficult to model, although they have been identified as crucial dimensions of timbre perception. Recent studies seem to indicate that traditional means of modelling data dynamics, such as delta-coefficients, texture windows or Markov modelling, do not provide any improvement over the best static models for real-world, complex polyphonic textures several seconds long. This contradicts experimental data on the perception of individual instrument notes. This letter describes an experiment to identify the cause of this contradiction. We propose that the difficulty of modelling the dynamics of full songs results either from the complex structure of the temporal succession of notes, or from the vertical polyphonic nature of individual notes. We discriminate between the two hypotheses by comparing the performance of static and dynamical algorithms on several specially designed datasets, namely monophonic individual notes, polyphonic individual notes, and polyphonic multiple-note textures. We conclude that the main cause of the difficulty of modelling the dynamics of real-world polyphonic musical textures is the polyphonic nature of the data.
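
For readers unfamiliar with two of the dynamical descriptors named in the abstract, the following is a minimal sketch of delta-coefficients and texture windows computed over a sequence of per-frame feature vectors. The abstract does not give formulas, so this assumes a common formulation: regression-based deltas with edge clamping, and texture windows summarised by the per-dimension mean and variance. The function names, the `width` and `size` parameters, and the toy `frames` data are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def delta_coefficients(frames, width=2):
    """Regression-based deltas for a list of per-frame feature vectors.

    Assumed formulation (not from the paper):
    d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), n = 1..width.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(n_frames):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                # Clamp indices at the sequence edges (repeat first/last frame).
                ahead = frames[min(t + n, n_frames - 1)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def texture_windows(frames, size=3):
    """Per-dimension mean and variance over sliding windows of `size` frames.

    Each output vector is [means..., variances...] for one window.
    """
    out = []
    for start in range(len(frames) - size + 1):
        window = frames[start:start + size]
        dims = list(zip(*window))  # one tuple of values per feature dimension
        means = [fmean(d) for d in dims]
        variances = [pvariance(d) for d in dims]
        out.append(means + variances)
    return out


# Toy data: dimension 0 rises linearly, dimension 1 is constant.
frames = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
deltas = delta_coefficients(frames, width=1)
textures = texture_windows(frames, size=3)
```

In this toy case the delta of the rising dimension is constant away from the edges and the delta of the constant dimension is zero, which is the kind of local temporal information a static frame-by-frame model discards.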