Automatic recognition of lyrics in singing

  • Authors:
  • Annamaria Mesaros;Tuomas Virtanen

  • Affiliations:
  • Department of Signal Processing, Tampere University of Technology, Tampere, Finland;Department of Signal Processing, Tampere University of Technology, Tampere, Finland

  • Venue:
  • EURASIP Journal on Audio, Speech, and Music Processing - Special issue on atypical speech
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

The paper considers the task of recognizing phonemes and words from a singing input by using a phonetic hidden Markov model recognizer. The system is targeted to both monophonic singing and singing in polyphonic music. A vocal separation algorithm is applied to separate the singing from polyphonic music. Due to the lack of annotated singing databases, the recognizer is trained using speech and linearly adapted to singing. Global adaptation to singing is found to improve singing recognition performance. Further improvement is obtained by gender-specific adaptation. We also study adaptation with multiple base classes defined by either phonetic or acoustic similarity. We test phoneme-level and word-level n-gram language models. The phoneme language models are trained on the speech database text. The large-vocabulary word-level language model is trained on a database of textual lyrics. Two applications are presented. The recognizer is used to align textual lyrics to vocals in polyphonic music, obtaining an average error of 0.94 seconds for line-level alignment. A query-by-singing retrieval application based on the recognized words is also constructed; in 57% of the cases, the first retrieved song is the correct one.