Fundamentals of speech recognition
Text algorithms
A Review of Automatic Rhythm Description Systems
Computer Music Journal
Sparse and structured decompositions of signals with the molecular matching pursuit
IEEE Transactions on Audio, Speech, and Language Processing
Sparse component analysis and blind source separation of underdetermined mixtures
IEEE Transactions on Neural Networks
Explicit modeling of temporal dynamics within musical signals for acoustical unit similarity
Pattern Recognition Letters
Context-Aware features for singing voice detection in polyphonic music
AMR'11 Proceedings of the 9th international conference on Adaptive Multimedia Retrieval: large-scale multimedia retrieval and evaluation
This letter addresses the problem of pattern recognition of polyphonic musical timbre. Frame-level dynamics of audio features are particularly difficult to model, although they have been identified as crucial perceptual dimensions of timbre. Recent studies seem to indicate that traditional means of modelling data dynamics, such as delta-coefficients, texture windows or Markov modelling, do not provide any improvement over the best static models for real-world, complex polyphonic textures several seconds long. This contradicts experimental data on the perception of individual instrument notes. This letter describes an experiment to identify the cause of this contradiction. We propose that the difficulty of modelling the dynamics of full songs results either from the complex structure of the temporal succession of notes, or from the vertical polyphonic nature of individual notes. We discriminate between the two hypotheses by comparing the performance of static and dynamical algorithms on several specially designed datasets, namely monophonic individual notes, polyphonic individual notes, and polyphonic multiple-note textures. We conclude that the main cause of the difficulty of modelling the dynamics of real-world polyphonic musical textures is the polyphonic nature of the data.
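To make two of the dynamic descriptors named in the abstract concrete, the following is a minimal sketch of delta-coefficients and texture windows computed over a matrix of frame-level features. It is an illustration under stated assumptions, not the setup used in the letter: the function names, the regression width, and the window and hop sizes are hypothetical, and the delta formula is the common regression-based variant, which may differ from the one the authors evaluated.

```python
import numpy as np


def delta_coefficients(features, width=2):
    """Regression-based deltas for a (n_frames, n_dims) feature matrix.

    `width` (hypothetical default) is the number of frames looked at on
    each side. Edge frames are handled by repeating the first/last frame.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    deltas = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        deltas += n * (ahead - behind)
    return deltas / denom


def texture_windows(features, size=4, hop=2):
    """Summarise consecutive frames by their per-dimension mean and variance.

    Returns an array of shape (n_windows, 2 * n_dims). `size` and `hop`
    are counted in frames and are illustrative values only.
    """
    features = np.asarray(features, dtype=float)
    starts = range(0, features.shape[0] - size + 1, hop)
    return np.array([
        np.concatenate([
            features[s : s + size].mean(axis=0),
            features[s : s + size].var(axis=0),
        ])
        for s in starts
    ])
```

Both functions return fixed-length descriptors that a static model can consume, which is how such dynamics are usually appended to frame-level features. In practice `features` would hold something like per-frame spectral coefficients; that choice is left open here.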