A privacy-sensitive approach to modeling multi-person conversations

  • Authors:
  • Danny Wyatt; Tanzeem Choudhury; Jeff Bilmes; Henry Kautz

  • Affiliations:
  • Dept. of Computer Science, University of Washington; Intel Research, Seattle, WA; Dept. of Electrical Engineering, University of Washington; Dept. of Computer Science, University of Rochester

  • Venue:
  • IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence
  • Year:
  • 2007

Abstract

In this paper we introduce a new dynamic Bayesian network that separates the speakers and their speaking turns in a multi-person conversation. We protect the speakers' privacy by using only features from which intelligible speech cannot be reconstructed. The model we present combines data from multiple audio streams, segments the streams into speech and silence, separates the different speakers, and detects when other nearby individuals who are not wearing microphones are speaking. No pre-trained speaker-specific models are used, so the system can be easily applied in new and different environments. We show promising results on two very different datasets that vary in background noise, microphone placement and quality, and conversational dynamics.
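As a rough sketch of the privacy-sensitive feature idea described in the abstract, the Python snippet below computes per-frame summary features (log energy, zero-crossing rate, spectral entropy) that characterize speaking activity without retaining the spectral detail needed to reconstruct intelligible speech. The specific feature set, frame sizes, and function names are illustrative assumptions, not the paper's exact feature pipeline.

```python
import numpy as np

def privacy_sensitive_features(frame):
    """Summarize one audio frame with features from which intelligible
    speech cannot be reconstructed (no spectral envelope is retained).
    The feature choices here are illustrative, not the paper's exact set."""
    eps = 1e-10
    # Log energy: overall loudness of the frame.
    log_energy = np.log(np.sum(frame ** 2) + eps)
    # Zero-crossing rate: a coarse voicing / noisiness cue.
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    # Normalized spectral entropy: low for voiced speech, high for
    # silence or broadband noise; discards the detailed spectrum.
    spectrum = np.abs(np.fft.rfft(frame)) + eps
    p = spectrum / spectrum.sum()
    spectral_entropy = -np.sum(p * np.log(p)) / np.log(len(p))
    return np.array([log_energy, zcr, spectral_entropy])

def featurize_stream(audio, sr=8000, frame_ms=32, hop_ms=16):
    """Slice a mono waveform into overlapping frames and featurize each,
    yielding the per-frame observations a segmentation model could use."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    feats = [privacy_sensitive_features(audio[i:i + frame_len])
             for i in range(0, len(audio) - frame_len + 1, hop)]
    return np.stack(feats)

if __name__ == "__main__":
    # Example: one second of synthetic audio standing in for a microphone stream.
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(8000)
    print(featurize_stream(audio).shape)  # (number of frames, 3)
```

In a pipeline like the one the abstract describes, feature vectors of this kind, computed separately for each wearer's microphone, would serve as the observations of the dynamic Bayesian network that segments speech from silence and assigns speaking turns.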