User preference-aware music video generation based on modeling scene moods

  • Authors: Rajiv Ratn Shah; Yi Yu; Roger Zimmermann
  • Affiliation: National University of Singapore, Singapore (all authors)
  • Venue: Proceedings of the 5th ACM Multimedia Systems Conference
  • Year: 2014

Abstract

Due to technical advances in mobile devices (e.g., smartphones and tablets) and wireless communications, people can now easily capture user-generated videos (UGVs) anywhere and anytime, and instantly share their real-life experiences on social websites. Watching such videos has become a popular form of entertainment. One challenge is that the audio captured along with many mobile videos is not very appealing. To overcome this issue, we demonstrate a music video generation system (an Android app with a backend service) that aims to make UGVs more attractive by generating scene-adaptive and user-preference-aware music tracks. Our system takes geographic categories, visual content, and the user's listening history into account. In particular, sequences of geographic categories and visual features are integrated into an SVMhmm model to predict video scene moods. Music genre, as a user preference, is further exploited to personalize the recommended songs. To the best of our knowledge, this is the first work that predicts scene moods from a real-world video dataset collected from users' daily outdoor recordings to facilitate user-preference-aware music video generation. Our experiments confirm that the system effectively combines objective scene moods with individual music tastes to recommend appealing soundtracks for videos. The Android app sends only recorded sensor data and a few keyframes of a UGV to the cloud service (backend system) to retrieve recommended music tracks; since no video data needs to be transmitted for analysis, the approach is bandwidth efficient.
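
As a rough illustration of the sequence-labeling step, the sketch below decodes one mood label per keyframe with a Viterbi pass over emission and transition scores. SVMhmm learns such scores discriminatively from training sequences; the mood label set, the random score values, and the decoder interface here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

MOODS = ["calm", "happy", "tense"]  # hypothetical mood label set

def viterbi(emission_scores, transition_scores):
    """Decode the best mood sequence for a video's keyframes.

    emission_scores: (T, K) score of each mood given each keyframe's
        combined visual + geographic-category feature vector.
    transition_scores: (K, K) score of moving from mood i to mood j.
    This mirrors the joint sequence scoring that SVMhmm optimizes.
    """
    T, K = emission_scores.shape
    dp = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    dp[0] = emission_scores[0]
    for t in range(1, T):
        cand = dp[t - 1][:, None] + transition_scores  # (K, K) candidates
        back[t] = cand.argmax(axis=0)                  # best predecessor per mood
        dp[t] = cand.max(axis=0) + emission_scores[t]
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):                      # backtrace
        path.append(int(back[t][path[-1]]))
    return [MOODS[i] for i in reversed(path)]

# Example: 5 keyframes, random scores standing in for learned model outputs
scores = np.random.rand(5, len(MOODS))
trans = np.random.rand(len(MOODS), len(MOODS))
print(viterbi(scores, trans))
```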
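
The personalization step blends the objective scene mood with the user's genre preference derived from listening history. The sketch below assumes a hypothetical mood-genre affinity table and a simple weighted combination (`alpha`); neither the table nor the blending rule is specified in the abstract.

```python
from collections import Counter

# Hypothetical affinities between predicted scene moods and music genres;
# in the real system such correlations would be learned, not hand-set.
MOOD_GENRE_AFFINITY = {
    "calm":  {"classical": 0.9, "jazz": 0.7, "rock": 0.2},
    "happy": {"pop": 0.9, "rock": 0.6, "jazz": 0.4},
    "tense": {"rock": 0.8, "electronic": 0.7, "classical": 0.3},
}

def rank_songs(scene_mood, listening_history, candidates, alpha=0.6):
    """Rank candidate songs by scene-mood fit blended with user taste.

    listening_history: list of genres of the user's recently played songs.
    candidates: list of (song_title, genre) pairs.
    alpha weights scene-mood fit against personal taste (hypothetical).
    """
    genre_counts = Counter(listening_history)
    total = sum(genre_counts.values()) or 1
    def score(song):
        _, genre = song
        mood_fit = MOOD_GENRE_AFFINITY.get(scene_mood, {}).get(genre, 0.0)
        preference = genre_counts.get(genre, 0) / total
        return alpha * mood_fit + (1 - alpha) * preference
    return sorted(candidates, key=score, reverse=True)

history = ["rock", "rock", "jazz", "pop"]
songs = [("Song A", "classical"), ("Song B", "rock"), ("Song C", "pop")]
print(rank_songs("happy", history, songs))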
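
On the client side, bandwidth efficiency comes from uploading only sensor logs and a few keyframes instead of the full video. Below is a minimal sketch of such a request, written in Python for brevity rather than as Android code; the endpoint URL, payload schema, and response field are hypothetical.

```python
import json
import requests

def request_soundtracks(keyframe_paths, sensor_log,
                        server="https://example.org/recommend"):
    """Send only sensor readings and a few JPEG keyframes, not the video.

    sensor_log: e.g. [{"t": 0.0, "lat": 1.29, "lon": 103.77}, ...]
    Returns the backend's recommended music tracks.
    """
    files = [("keyframes", (p, open(p, "rb"), "image/jpeg"))
             for p in keyframe_paths]
    data = {"sensors": json.dumps(sensor_log)}
    resp = requests.post(server, files=files, data=data, timeout=30)
    resp.raise_for_status()
    return resp.json()["recommended_tracks"]  # hypothetical response field
```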