In this paper, we propose a novel recommendation policy for driving scenarios. While driving a car, listening to an audio track can enrich the atmosphere, conveying emotions that give the driver a more engaging experience. We introduce a recommendation policy that, given a video sequence captured by a camera mounted onboard a car, selects the most suitable audio piece from a predetermined set of melodies. The mixing mechanism draws on a set of generic qualitative aesthetic rules for cross-modal linking, realized by associating audio and video features. The contribution of this paper is to translate such qualitative rules into quantitative terms, learning cross-modal statistical correlations from an extensive training dataset and validating them thoroughly. In this way, we can identify which audio and video features correlate best (i.e., which aesthetic rules are promoted or rejected) and how strong their correlations are. This knowledge is then employed to realize the recommendation policy. A set of user studies illustrates and validates the policy, encouraging further development toward a real implementation in an automotive application.