Despite recent advances, including sound source clustering and perceptual auditory masking, high-quality rendering of complex virtual scenes with thousands of sound sources remains a challenge. Two major bottlenecks appear as scene complexity increases: the cost of the clustering itself, and the cost of pre-mixing the source signals within each cluster. In this paper, we first propose an improved hierarchical clustering algorithm that remains efficient for large numbers of sources and clusters while supporting progressive refinement. We then present a lossy pre-mixing method based on a progressive representation of the input audio signals and on the perceptual importance of each sound source. Our quality-evaluation user tests indicate that the recently introduced audio saliency map is ill-suited to this task; we therefore propose a loudness-based "pinnacle" metric, which gives the best results across a variety of target computing budgets. We also performed a perceptual pilot study indicating that, in audio-visual environments, it is preferable to allocate more clusters to visible sound sources, and we propose a new clustering metric that exploits this result. Together, these three contributions allow our system to render thousands of 3D sound sources at high quality on a "gamer-style" PC.
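The idea of weighting each source's share of a fixed processing budget by a loudness-derived importance can be sketched as follows. This is an illustrative stand-in, not the paper's actual metric: the dB-style loudness mapping, the `96 dB` offset, and the proportional integer allocation are all assumptions chosen for clarity.

```python
import math

def loudness_importance(amplitudes):
    """Map linear signal amplitudes to a non-negative, dB-like loudness
    score (hypothetical stand-in for a perceptual loudness metric).
    The +96 dB offset simply shifts typical values into a positive range."""
    eps = 1e-9  # avoid log10(0) for silent sources
    return [max(0.0, 20.0 * math.log10(a + eps) + 96.0) for a in amplitudes]

def allocate_budget(importances, total_budget):
    """Distribute a fixed processing budget (e.g. a number of progressively
    decoded signal coefficients per frame) across sources in proportion
    to their importance scores."""
    total = sum(importances) or 1.0
    return [int(round(total_budget * imp / total)) for imp in importances]

# Example: three sources; the louder source receives a larger share.
amps = [0.5, 0.1, 0.05]
shares = allocate_budget(loudness_importance(amps), 100)
```

A real system would refresh these shares every audio frame and combine the loudness term with other factors (such as distance attenuation or visibility), but the proportional-allocation skeleton stays the same.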