Spatialized audioconferencing: what are the benefits?

  • Authors:
  • Ryan Kilgore; Mark Chignell; Paul Smith

  • Affiliations:
  • Department of Mechanical and Industrial Engineering, University of Toronto; Department of Mechanical and Industrial Engineering, University of Toronto; IBM Centre for Advanced Studies

  • Venue:
  • CASCON '03 Proceedings of the 2003 conference of the Centre for Advanced Studies on Collaborative research
  • Year:
  • 2003

Abstract

Audioconference participants often have difficulty identifying the voices of other conferees, especially in ad hoc groups of unfamiliar members. Simultaneous presentation of multiple voices through a single, monaural channel can be discordant and difficult to comprehend. To address these shortcomings, we have developed the Vocal Village, a communications tool that allows for real-time spatialized audioconferencing across the Internet. The Vocal Village system uses binaural audio signals to present the voices of individual conference participants from different apparent positions in space by adding location cues to audio information.

This paper describes our experimental research to determine whether the real-time, "within the head" spatialization cues implemented by Vocal Village are sufficient to provide performance benefits compared to traditional, monaural audioconferencing methods. The benefits examined were memory, speaker identification, and participant preference. We also investigated whether providing users with the ability to control the location of conference participants within a virtual auditory space further enhanced any such benefits.

The "within the head" spatialization used in this experiment did not lead to a statistically significant increase in the ability to remember who said what in an audioconference. However, there was a borderline significant increase in remembering who said what when participants were given the opportunity to move the voices of two similar-sounding conferees into different apparent locations. Participants also significantly preferred spatialized audio formats over the mono audio format. Spatialization significantly improved participants' perceived confidence in their memory of conferee viewpoints. Additionally, spatialization significantly reduced both the perceived difficulty of identifying speakers during conferences and the amount of attention perceived to be dedicated to performing such voice identification. Providing subjects with the ability to control the apparent location of conference participants resulted in the greatest benefit to both of these measures.