On the Relative Importance of Individual Components of Chord Recognition Systems

Authors: Taemin Cho; Juan P. Bello
Affiliations: Music & Audio Research Lab. (MARL), New York University, New York, NY, USA (both authors)
Venue: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Year: 2014

Abstract

Most chord recognition systems share a common architecture comprising two main stages, feature extraction and pattern matching, and two optional substages, pre-filtering and post-filtering. Understanding how these basic components interact is important not only for achieving optimal performance, but also for assessing the potential and limitations of a system. Unfortunately, no existing study sufficiently evaluates the effects of the different approaches to each processing step or the interactions between these steps. In this paper we attempt to remedy this deficiency by performing a systematic evaluation encompassing a wide variety of techniques used for each processing step. We find that filtering has a significant impact on performance, but that providing musical context information in the transition matrix is rendered moot by the need to enforce continuity in the estimations. We also find that the benefits of using complex chord models can be largely offset by an appropriate choice of features. In addition, the initial performance gap between different features was not fully compensated for by any subsequent processing stage.
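
The four-stage architecture described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: 12-bin chroma vectors as features (taken as already extracted), a moving-average pre-filter, binary major/minor triad templates scored by cosine similarity as the pattern-matching stage, and Viterbi decoding as the post-filter. The transition matrix is deliberately uniform apart from a high self-transition probability, so it enforces continuity without encoding any musical context. All function names and the `self_prob` and `width` parameters are hypothetical.

```python
import math

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_templates():
    """Assumed chord model: binary triad templates for 12 major and 12 minor chords."""
    templates = {}
    for root in range(12):
        for name, third in (("maj", 4), ("min", 3)):
            t = [0.0] * 12
            for interval in (0, third, 7):
                t[(root + interval) % 12] = 1.0
            templates[NOTES[root] + ":" + name] = t
    return templates

def prefilter(frames, width=3):
    """Pre-filtering: moving average over neighbouring chroma frames."""
    half = width // 2
    out = []
    for i in range(len(frames)):
        window = frames[max(0, i - half): i + half + 1]
        out.append([sum(f[k] for f in window) / len(window) for k in range(12)])
    return out

def match(frame, templates):
    """Pattern matching: cosine similarity between one frame and each template."""
    norm_f = math.sqrt(sum(x * x for x in frame)) or 1.0
    scores = {}
    for label, t in templates.items():
        norm_t = math.sqrt(sum(x * x for x in t))
        scores[label] = sum(a * b for a, b in zip(frame, t)) / (norm_f * norm_t)
    return scores

def postfilter(score_seq, self_prob=0.9):
    """Post-filtering: Viterbi decoding that only rewards staying on the same chord.

    Similarity scores are treated as pseudo-likelihoods (an assumption of this sketch).
    """
    labels = list(score_seq[0])
    stay = math.log(self_prob)
    move = math.log((1.0 - self_prob) / (len(labels) - 1))
    eps = 1e-9  # avoids log(0) for templates with zero similarity
    delta = {l: math.log(score_seq[0][l] + eps) for l in labels}
    back = []
    for scores in score_seq[1:]:
        new_delta, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: delta[p] + (stay if p == cur else move))
            trans = stay if prev == cur else move
            new_delta[cur] = delta[prev] + trans + math.log(scores[cur] + eps)
            pointers[cur] = prev
        delta = new_delta
        back.append(pointers)
    # Backtrack from the best final state to recover the full label sequence.
    path = [max(labels, key=lambda l: delta[l])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

def recognize(frames, width=3, self_prob=0.9):
    """Run pre-filter, pattern matching and post-filter on a list of chroma frames."""
    if not frames:
        return []
    templates = chord_templates()
    scores = [match(f, templates) for f in prefilter(frames, width)]
    return postfilter(scores, self_prob)
```

Called as `recognize(frames)` on a list of 12-element chroma vectors, the function returns one chord label per frame. Setting `width=1` disables the pre-filter, and lowering `self_prob` weakens the continuity constraint, which makes it possible to toggle each filtering stage independently on toy input.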