LAMP, A Lyrics and Audio MandoPop Dataset for Music Mood Estimation: Dataset Compilation, System Construction, and Testing

  • Authors:
  • Wei Rong Chu;Richard Tzong-Han Tsai;Ying-Sian Wu;Hui-Hsin Wu;Hung-Yi Chen;Jane Yung-jen Hsu


  • Venue:
  • TAAI '10 Proceedings of the 2010 International Conference on Technologies and Applications of Artificial Intelligence
  • Year:
  • 2010

Abstract

Music mood estimation (MME) is an emerging subfield of music information retrieval research. While most MME research focuses on audio analysis, the significance of lyrics in predicting song emotion has received growing attention in recent years. One major impediment to MME research is the lack of clearly labeled, publicly available datasets with separately annotated lyrics and audio. In the first section of this paper, we describe the creation of the LAMP dataset, which contains 492 Mandarin pop songs with separate mood annotations for the lyrics text and the audio. Our second contribution is a statistical analysis on the LAMP dataset of how lyrics and audio each contribute to a song's overall mood. The analysis suggests that lyrics are a valid signal for music mood estimation, especially for song valence, and that they provide mood information complementary to audio. Thirdly, we propose the Sentiment Score Approach for extracting affective words from lyrics text and show that it is the most effective individual method for improving MME accuracy while reducing the number of features. Lastly, we combine our best lyrical feature configuration with audio features in an MME system for estimating song valence. This configuration outperforms audio features alone by 16.517% and lyrical features alone by 1.5%, strongly suggesting that lyrical features are an important source of supplementary information for audio features when predicting song valence.
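To make the idea of extracting affective words from lyrics concrete, the following is a minimal illustrative sketch only, not the paper's actual Sentiment Score Approach: it assumes a hypothetical word-level sentiment lexicon and keeps only words whose sentiment magnitude exceeds a threshold, mirroring the abstract's goal of improving accuracy while reducing the feature count.

```python
# Illustrative sketch (not the paper's method): filter lyric tokens by
# the magnitude of a per-word sentiment score from a small lexicon.
# The lexicon entries and the 0.5 threshold are assumptions for the example.

SENTIMENT_LEXICON = {
    "love": 0.9,
    "happy": 0.8,
    "lonely": -0.7,
    "tears": -0.8,
    "night": 0.0,   # neutral word, should be filtered out
}

def affective_words(lyric_tokens, threshold=0.5):
    """Keep tokens whose absolute sentiment score meets the threshold."""
    return [t for t in lyric_tokens
            if abs(SENTIMENT_LEXICON.get(t, 0.0)) >= threshold]

print(affective_words(["lonely", "night", "tears", "love"]))
# ['lonely', 'tears', 'love']
```

Filtering by score magnitude in this way shrinks the lyrical feature space to emotionally charged words, which is the kind of feature reduction the abstract reports for the Sentiment Score Approach.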