A conditional random field viewpoint of symbolic audio-to-score matching

  • Authors:
  • Cyril Joder;Slim Essid;Gaël Richard

  • Affiliations:
  • Institut Telecom - Telecom ParisTech - CNRS/LTCI, Paris, France;Institut Telecom - Telecom ParisTech - CNRS/LTCI, Paris, France;Institut Telecom - Telecom ParisTech - CNRS/LTCI, Paris, France

  • Venue:
  • Proceedings of the international conference on Multimedia
  • Year:
  • 2010

Abstract

We present a new approach to symbolic audio-to-score alignment based on Conditional Random Fields (CRFs). Unlike Hidden Markov Models, these graphical models allow state conditional probabilities to be computed from several audio frames. The CRF models that we propose exploit this property to take into account the rhythmic information of the musical score. Assuming that the tempo is locally constant, they compare the neighborhood of each frame against several tempo hypotheses. Experiments on a pop-music database show that this use of contextual information leads to a significant improvement in alignment accuracy. In particular, the proportion of onsets detected inside a 100-ms tolerance window increases by more than 10% when a 1-s neighborhood is considered.
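
The accuracy measure mentioned in the abstract, the proportion of onsets detected inside a tolerance window, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function name `onset_detection_rate`, the one-to-one pairing of reference and estimated onsets, and the reading of the window as plus or minus 100 ms around each reference onset are all assumptions made for illustration.

```python
from typing import Sequence


def onset_detection_rate(
    reference: Sequence[float],
    estimated: Sequence[float],
    tolerance: float = 0.1,
) -> float:
    """Fraction of score onsets whose aligned time falls within the tolerance.

    reference: ground-truth onset times in seconds, one per score note.
    estimated: onset times produced by an alignment, in the same order.
    tolerance: allowed absolute deviation in seconds (assumed here to be
               +/- 100 ms; the abstract does not spell out the convention).
    """
    if len(reference) != len(estimated):
        raise ValueError("reference and estimated must have the same length")
    if len(reference) == 0:
        raise ValueError("at least one onset is required")

    # Count the onsets whose alignment error stays inside the window.
    hits = sum(
        1 for ref, est in zip(reference, estimated) if abs(est - ref) <= tolerance
    )
    return hits / len(reference)


# Hypothetical onset times, invented purely to show the call.
reference_onsets = [0.0, 0.5, 1.0, 1.5]
aligned_onsets = [0.02, 0.65, 1.05, 1.5]
rate = onset_detection_rate(reference_onsets, aligned_onsets)
```

In this made-up example, three of the four onsets deviate by less than 100 ms, so `rate` is 0.75. Passing a different `tolerance` shows how the score changes with the window width.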