Despite recent advances, including sound source clustering and perceptual auditory masking, high-quality rendering of complex virtual scenes with thousands of sound sources remains a challenge. Two major bottlenecks appear as scene complexity increases: the cost of the clustering itself, and the cost of pre-mixing the source signals within each cluster. In this paper, we first propose an improved hierarchical clustering algorithm that remains efficient for large numbers of sources and clusters while supporting progressive refinement. We then present a lossy pre-mixing method based on a progressive representation of the input audio signals and on the perceptual importance of each sound source. Our quality-evaluation user tests indicate that the recently introduced audio saliency map is ill-suited to this task; we therefore propose a loudness-based "pinnacle" metric, which gives the best results across a variety of target computing budgets. We also performed a perceptual pilot study indicating that, in audio-visual environments, it is preferable to allocate more clusters to visible sound sources, and we propose a new clustering metric that exploits this result. Together, these three contributions allow our system to render thousands of 3D sound sources at high quality on a "gamer-style" PC.
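The idea of weighting each source's share of a fixed processing budget by a loudness-derived importance can be sketched as follows. This is an illustrative stand-in, not the paper's actual metric: the dB-style loudness mapping, the `96 dB` offset, and the proportional integer allocation are all assumptions chosen for clarity.

```python
import math

def loudness_importance(amplitudes):
    """Map linear signal amplitudes to a non-negative, dB-like loudness
    score (hypothetical stand-in for a perceptual loudness metric).
    The +96 dB offset simply shifts typical values into a positive range."""
    eps = 1e-9  # avoid log10(0) for silent sources
    return [max(0.0, 20.0 * math.log10(a + eps) + 96.0) for a in amplitudes]

def allocate_budget(importances, total_budget):
    """Distribute a fixed processing budget (e.g. a number of progressively
    decoded signal coefficients per frame) across sources in proportion
    to their importance scores."""
    total = sum(importances) or 1.0
    return [int(round(total_budget * imp / total)) for imp in importances]

# Example: three sources; the louder source receives a larger share.
amps = [0.5, 0.1, 0.05]
shares = allocate_budget(loudness_importance(amps), 100)
```

A real system would refresh these shares every audio frame and combine the loudness term with other factors (such as distance attenuation or visibility), but the proportional-allocation skeleton stays the same.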