Speaker change detection with privacy-preserving audio cues
Proceedings of the 2009 international conference on Multimodal interfaces
In this paper we introduce a new dynamic Bayesian network that separates speakers and their speaking turns in a multi-person conversation. We protect the speakers' privacy by using only features from which intelligible speech cannot be reconstructed. The model combines data from multiple audio streams, segments each stream into speech and silence, separates the individual speakers, and detects when nearby individuals who are not wearing microphones are speaking. Because no pre-trained speaker-specific models are used, the system can easily be applied in new and different environments. We show promising results on two very different datasets that vary in background noise, microphone placement and quality, and conversational dynamics.
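To make the pipeline concrete, the following is a minimal sketch of the two core steps the abstract describes: per-stream speech/silence segmentation over privacy-preserving features (here, frame log energy, from which speech cannot be reconstructed), followed by attributing each speech frame to a wearer's microphone. This is an illustrative simplification, not the paper's actual dynamic Bayesian network: the two-state HMM, its Gaussian emission parameters, and both function names are assumptions made for this example, and the handling of non-instrumented nearby speakers is omitted.

```python
# Hedged sketch -- NOT the paper's DBN. A two-state (silence/speech) HMM
# decoded with Viterbi over per-frame log energies, then naive speaker
# attribution by picking the loudest microphone per speech frame.
import math

def viterbi_speech_silence(log_energy, p_stay=0.9, mu=(0.0, 3.0), sigma=1.0):
    """Label each frame 0=silence, 1=speech with a two-state HMM.
    Emissions are Gaussians over log energy; transitions favor staying
    in the current state (p_stay). All parameters are illustrative."""
    def emit(state, x):
        # Log-likelihood of x under the state's Gaussian (constant dropped).
        return -0.5 * ((x - mu[state]) / sigma) ** 2

    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)
    score = [emit(0, log_energy[0]), emit(1, log_energy[0])]
    back = []
    for x in log_energy[1:]:
        new, ptr = [0.0, 0.0], [0, 0]
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                new[s], ptr[s] = stay + emit(s, x), s
            else:
                new[s], ptr[s] = switch + emit(s, x), 1 - s
        back.append(ptr)
        score = new
    # Backtrace from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def attribute_speaker(energies_per_mic, speech_flags):
    """Assign each speech frame to the microphone with the highest energy;
    frames where no stream is in the speech state stay unattributed (None).
    A real system would also flag frames with no dominant mic as speech
    from a nearby person who is not wearing a microphone."""
    turns = []
    for t in range(len(energies_per_mic[0])):
        if any(flags[t] for flags in speech_flags):
            turns.append(max(range(len(energies_per_mic)),
                             key=lambda m: energies_per_mic[m][t]))
        else:
            turns.append(None)
    return turns

# Toy usage: the wearer of mic 0 speaks during frames 2-4.
mic0 = [0.1, 0.2, 3.1, 3.0, 2.9, 0.1]
mic1 = [0.0, 0.1, 0.3, 0.2, 0.1, 0.0]
flags = [viterbi_speech_silence(m) for m in (mic0, mic1)]
print(attribute_speaker((mic0, mic1), flags))
# -> [None, None, 0, 0, 0, None]
```

The Viterbi decoding is what gives the segmentation its smoothness: a single loud or quiet frame will not flip the state because switching pays a transition penalty, which is the same intuition behind using a temporal model (a DBN) rather than thresholding frames independently.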