MusicCommentator: Generating Comments Synchronized with Musical Audio Signals by a Joint Probabilistic Model of Acoustic and Textual Features

Authors:
Kazuyoshi Yoshii; Masataka Goto
Affiliations:
National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan 305-8568; National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan 305-8568
Venue:
ICEC '09 Proceedings of the 8th International Conference on Entertainment Computing
Year:
2009

Abstract

This paper presents MusicCommentator, a system that suggests possible comments at appropriate temporal positions in a musical audio clip. In an online video sharing service, many users can attach free-form text comments to temporal events occurring within clips rather than to entire clips. To emulate this commenting behavior, we propose a joint probabilistic model of audio signals and comments. The system trains the model on existing clips and the comments users have given to those clips. Given a new clip and some of its comments, the model is used to estimate which temporal positions could be commented on and which comments could be added at those positions. The system then concatenates possible words, taking language constraints into account. Our experimental results showed that using the existing comments on a new clip improved the accuracy of generating suitable comments for it.
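
The estimation step described in the abstract can be illustrated with a toy sketch. This is not the authors' model: it assumes each audio frame has already been mapped to a hypothetical discrete acoustic state (labels such as "chorus" or "intro" are invented for illustration), and it replaces the joint probabilistic model with simple co-occurrence counts. It also leaves out two things the abstract mentions: conditioning on a new clip's existing comments, and the language constraints used when concatenating words. The function names train and suggest are illustrative only.

```python
from collections import Counter, defaultdict


def train(frames):
    """Count comment statistics per acoustic state.

    frames: list of (state, words) pairs, where state is a hypothetical
    discrete acoustic label and words is the list of comment words that
    users attached to that frame (empty if the frame has no comment).
    """
    word_counts = defaultdict(Counter)  # state -> word -> count
    commented = Counter()               # state -> frames with a comment
    total = Counter()                   # state -> frames seen
    for state, words in frames:
        total[state] += 1
        if words:
            commented[state] += 1
            word_counts[state].update(words)
    return word_counts, commented, total


def suggest(model, states, top_k=2, threshold=0.5):
    """Return (frame index, comment probability, words) for frames of a
    new clip whose state was commented on often enough in training."""
    word_counts, commented, total = model
    suggestions = []
    for t, state in enumerate(states):
        if total[state] == 0:
            continue  # state never seen in training: nothing to suggest
        p_comment = commented[state] / total[state]
        if p_comment >= threshold:
            words = [w for w, _ in word_counts[state].most_common(top_k)]
            suggestions.append((t, p_comment, words))
    return suggestions


# Invented training data, for illustration only.
training = [
    ("chorus", ["great", "voice"]),
    ("chorus", ["great"]),
    ("intro", []),
    ("intro", ["hello"]),
    ("intro", []),
    ("solo", ["cool", "guitar"]),
]
model = train(training)
result = suggest(model, ["intro", "chorus", "solo", "unknown"])
```

In this toy setting the "intro" frame is skipped because only one of its three training frames carried a comment, while the "chorus" and "solo" frames receive word suggestions. A fuller system would need a real acoustic model and a word-ordering step.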