Collaborative personal speaker identification: A generalized approach

Authors:
Mirco Rossi;Oliver Amft;Gerhard Tröster
Affiliations:
Wearable Computing Lab., ETH Zurich, Switzerland (http://www.wearable.ethz.ch);Wearable Computing Lab., ETH Zurich, Switzerland (http://www.wearable.ethz.ch) and ACTLab, Signal Processing Systems, TU Eindhoven, The Netherlands (http://www.actlab.ele.tue.nl);Wearable Computing Lab., ETH Zurich, Switzerland (http://www.wearable.ethz.ch)
Venue:
Pervasive and Mobile Computing
Year:
2012

Abstract

This paper introduces a collaborative personal speaker identification system that annotates conversations and meetings using speech-independent speaker modeling and a single audio channel. The system can operate in standalone and collaborative modes, and it learns online about speakers that were detected as unknown. In collaborative mode, the system exchanges current speaker information with the personal systems of others to improve identification performance. The collaboration concept relies on distributed personal systems only, so it requires no dedicated infrastructure. We present a generalized description of collaboration situations and derive three use scenarios in which the system was subsequently evaluated. Compared to standalone operation, collaboration among four personal identification systems increased system performance by up to 9% for 4 relevant speakers and up to 21% for 24 relevant speakers. Allowing unknown speakers in a conversation did not impede the performance gains of collaboration. In a scenario where individual systems had nonidentical speaker sets, collaboration gains were 16% for 24 relevant speakers.
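
The abstract does not specify how exchanged speaker information is combined, so the following is only an illustrative sketch of one generic option: score-level fusion with an open-set threshold. It is not the authors' method. The function name `fuse_scores`, the score range, and the `threshold` value are all assumptions made for this example. Each system reports scores only for the speakers it knows, which allows nonidentical speaker sets, and a segment is labeled unknown when no fused score is high enough.

```python
from collections import defaultdict

UNKNOWN = "unknown"  # hypothetical label for speakers no system recognizes


def fuse_scores(local_scores, peer_scores, threshold=0.5):
    """Average per-speaker scores over all systems that know the speaker.

    local_scores: dict mapping speaker name to this system's score.
    peer_scores:  list of such dicts received from other personal systems
                  (an empty list corresponds to standalone operation).
    threshold:    assumed minimum fused score for accepting a speaker.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in [local_scores, *peer_scores]:
        for speaker, score in scores.items():
            totals[speaker] += score
            counts[speaker] += 1
    if not totals:
        return UNKNOWN
    # Systems may hold nonidentical speaker sets, so each speaker's score
    # is averaged only over the systems that actually reported it.
    fused = {speaker: totals[speaker] / counts[speaker] for speaker in totals}
    best = max(fused, key=fused.get)
    return best if fused[best] >= threshold else UNKNOWN


local = {"alice": 0.4, "bob": 0.45}
peers = [{"alice": 0.9}, {"alice": 0.8, "carol": 0.3}]
standalone_label = fuse_scores(local, [])
collaborative_label = fuse_scores(local, peers)
```

In this toy input, the local system alone is not confident about any speaker, while the peers' scores raise the fused score for one speaker above the assumed threshold.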