Improving mood classification in music digital libraries by combining lyrics and audio

  • Authors:
  • Xiao Hu; J. Stephen Downie

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Champaign, IL, USA; University of Illinois at Urbana-Champaign, Champaign, IL, USA

  • Venue:
  • Proceedings of the 10th Annual Joint Conference on Digital Libraries
  • Year:
  • 2010

Abstract

Mood is an emerging metadata type and access point in music digital libraries (MDL) and online music repositories. In this study, we present a comprehensive investigation of the usefulness of lyrics in music mood classification by evaluating and comparing a wide range of lyric text features, including linguistic and text-stylistic features. We then combine the best lyric features with features extracted from music audio using two fusion methods. The results show that combining lyrics and audio significantly outperformed systems using audio features alone. In addition, an examination of learning curves shows that the hybrid lyric + audio system needed fewer training samples to achieve the same or better classification accuracy than systems using lyrics or audio alone. These experiments were conducted on a unique large-scale dataset of 5,296 songs (each with both audio and lyrics) representing 18 mood categories derived from social tags. The findings advance the state of the art in lyric sentiment analysis and automatic music mood classification and will help make mood a practical access point in music digital libraries.
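The abstract does not spell out the two fusion methods. A common pair in this line of work is early (feature-level) fusion and late (decision-level) fusion, and the sketch below illustrates both under stated assumptions: scikit-learn SVM classifiers, synthetic stand-in feature matrices, and an equal-weight posterior average. The paper's actual lyric and audio feature sets, classifier choice, and fusion weights are not given here.

```python
# Illustrative sketch of early vs. late fusion for 18-way mood classification.
# All features below are synthetic placeholders, not the paper's features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_songs, n_moods = 500, 18             # 18 mood categories, as in the paper
y = rng.integers(0, n_moods, n_songs)  # one mood label per song

# Hypothetical per-song feature matrices: one from lyrics, one from audio.
X_lyrics = rng.normal(size=(n_songs, 100)) + 0.3 * y[:, None]
X_audio = rng.normal(size=(n_songs, 60)) + 0.3 * y[:, None]

idx_train, idx_test = train_test_split(np.arange(n_songs), random_state=0)

# Early fusion: concatenate the feature vectors, train a single classifier.
X_cat = np.hstack([X_lyrics, X_audio])
early = SVC(probability=True).fit(X_cat[idx_train], y[idx_train])
print("early fusion accuracy:", early.score(X_cat[idx_test], y[idx_test]))

# Late fusion: train one classifier per modality, average their posteriors.
clf_l = SVC(probability=True).fit(X_lyrics[idx_train], y[idx_train])
clf_a = SVC(probability=True).fit(X_audio[idx_train], y[idx_train])
w = 0.5  # hypothetical modality weight; would be tuned on held-out data
proba = (w * clf_l.predict_proba(X_lyrics[idx_test])
         + (1 - w) * clf_a.predict_proba(X_audio[idx_test]))
pred = clf_l.classes_[np.argmax(proba, axis=1)]
print("late fusion accuracy:", np.mean(pred == y[idx_test]))
```

The learning-curve comparison described in the abstract would correspond to training such systems on increasing subsets of the 5,296-song dataset and plotting accuracy against training-set size.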