A System for Automatic Chord Transcription from Audio Using Genre-Specific Hidden Markov Models

  • Authors:
  • Kyogu Lee

  • Affiliations:
  • Center for Computer Research in Music and Acoustics, Stanford University, Stanford, CA 94305, USA

  • Venue:
  • Adaptive Multimedia Retrieval: Retrieval, User, and Semantics
  • Year:
  • 2007


Abstract

We describe a system for automatic chord transcription from raw audio using genre-specific hidden Markov models trained on audio synthesized from symbolic data. To avoid the enormous amount of human labor required to manually annotate chord labels for ground truth, we use symbolic data such as MIDI files to automate the labeling process. In parallel, we synthesize the same symbolic files to provide the models with a sufficient amount of observation feature vectors, paired with the automatically generated annotations, for training. In doing so, we build separate models for various musical genres, whose parameters reveal characteristics specific to the corresponding genre. The experimental results show that HMMs trained on synthesized data perform very well on real acoustic recordings. They also show that when the correct genre is chosen, a simpler, genre-specific model yields performance better than or comparable to that of a more complex, genre-independent model. Furthermore, we demonstrate a potential application of the proposed model to the genre classification task.
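
The sketch below is a rough illustration, not the paper's implementation, of how such a system can decode chords with a genre-specific HMM and pick a genre by model likelihood. The chord vocabulary (24 major/minor triads), 12-bin chroma features, diagonal-covariance Gaussian emissions, and all class and function names are assumptions made for illustration; in the paper the model parameters are learned from features extracted from audio synthesized from MIDI, with chord labels derived automatically from the same symbolic files.

```python
import numpy as np

# Illustrative constants; the paper's actual chord vocabulary and feature
# choices may differ (assumed here: 24 chord classes, 12-bin chroma).
N_CHORDS = 24
CHROMA_DIM = 12


class GenreChordHMM:
    """One HMM per genre: chord states with Gaussian emissions over chroma frames."""

    def __init__(self, init, trans, means, variances):
        self.log_init = np.log(init)      # (N_CHORDS,) initial chord probabilities
        self.log_trans = np.log(trans)    # (N_CHORDS, N_CHORDS) chord transition matrix
        self.means = means                # (N_CHORDS, CHROMA_DIM) emission means
        self.vars = variances             # (N_CHORDS, CHROMA_DIM) diagonal covariances

    def _log_emission(self, chroma):
        # Per-frame log-likelihood of each chord state under a diagonal Gaussian.
        diff = chroma[:, None, :] - self.means[None, :, :]                 # (T, N, D)
        return -0.5 * np.sum(diff ** 2 / self.vars
                             + np.log(2.0 * np.pi * self.vars), axis=2)    # (T, N)

    def viterbi(self, chroma):
        # Most likely chord sequence for a chromagram of shape (T, CHROMA_DIM).
        log_b = self._log_emission(chroma)
        T = chroma.shape[0]
        delta = np.empty((T, N_CHORDS))
        psi = np.zeros((T, N_CHORDS), dtype=int)
        delta[0] = self.log_init + log_b[0]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + self.log_trans   # previous x current state
            psi[t] = np.argmax(scores, axis=0)
            delta[t] = scores[psi[t], np.arange(N_CHORDS)] + log_b[t]
        path = np.empty(T, dtype=int)
        path[-1] = int(np.argmax(delta[-1]))
        for t in range(T - 2, -1, -1):
            path[t] = psi[t + 1, path[t + 1]]
        return path

    def log_likelihood(self, chroma):
        # Forward algorithm in the log domain; usable for genre selection.
        log_b = self._log_emission(chroma)
        alpha = self.log_init + log_b[0]
        for t in range(1, chroma.shape[0]):
            alpha = log_b[t] + np.logaddexp.reduce(
                alpha[:, None] + self.log_trans, axis=0)
        return float(np.logaddexp.reduce(alpha))


def classify_genre(genre_models, chroma):
    # Pick the genre whose HMM assigns the observed audio the highest likelihood,
    # mirroring the genre-classification application suggested in the abstract.
    return max(genre_models, key=lambda g: genre_models[g].log_likelihood(chroma))
```

Given a dictionary of trained models, e.g. `{"rock": GenreChordHMM(...), "jazz": GenreChordHMM(...)}`, chord transcription for a recording is `genre_models[classify_genre(genre_models, chroma)].viterbi(chroma)`; training of the parameters themselves is omitted here.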