A conditional random field viewpoint of symbolic audio-to-score matching

Authors:
Cyril Joder;Slim Essid;Gaël Richard
Affiliations:
Institut Telecom - Telecom ParisTech - CNRS/LTCI, Paris, France;Institut Telecom - Telecom ParisTech - CNRS/LTCI, Paris, France;Institut Telecom - Telecom ParisTech - CNRS/LTCI, Paris, France
Venue:
Proceedings of the international conference on Multimedia
Year:
2010

Abstract

We present a new approach to symbolic audio-to-score alignment based on Conditional Random Fields (CRFs). Unlike Hidden Markov Models, these graphical models allow the state conditional probabilities to be computed from several audio frames. The CRF models that we propose exploit this property to take into account the rhythmic information of the musical score. Assuming that the tempo is locally constant, they compare the neighborhood of each frame against several tempo hypotheses. Experiments on a pop-music database show that this use of contextual information leads to a significant improvement in alignment accuracy. In particular, the proportion of onsets detected within a 100-ms tolerance window increases by more than 10% when a 1-s neighborhood is considered.
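The accuracy figure quoted in the abstract (proportion of onsets detected within a tolerance window) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function name `onset_detection_rate`, the one-to-one pairing of reference and estimated onsets, and the reading of the 100-ms window as an absolute deviation of at most 0.1 s are all assumptions made for illustration, and the onset times below are made-up values, not data from the paper.

```python
from typing import Sequence


def onset_detection_rate(
    reference: Sequence[float],
    estimated: Sequence[float],
    tolerance: float = 0.1,
) -> float:
    """Fraction of onsets whose estimated time is within `tolerance` seconds
    of the reference time.

    Assumption: the alignment yields exactly one estimated time per score
    onset, so the two sequences are paired index by index.
    """
    if len(reference) != len(estimated):
        raise ValueError("reference and estimated must have the same length")
    if len(reference) == 0:
        raise ValueError("at least one onset is required")
    # Count onsets whose absolute timing error does not exceed the tolerance.
    hits = sum(
        1 for ref, est in zip(reference, estimated) if abs(ref - est) <= tolerance
    )
    return hits / len(reference)


# Hypothetical onset times in seconds (illustrative only).
reference_onsets = [0.50, 1.00, 1.50, 2.00]
estimated_onsets = [0.52, 1.15, 1.49, 2.30]
rate = onset_detection_rate(reference_onsets, estimated_onsets)
print(f"Onsets within tolerance: {rate:.0%}")
```

With these illustrative values, two of the four onsets fall within the tolerance. Comparing this rate between a frame-wise model and a model that uses a 1-s neighborhood would give the kind of difference reported in the abstract.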