Speaker separation and tracking system

Authors:
U. Anliker;J. F. Randall;G. Tröster
Affiliations:
The Wearable Computing Lab., ETH Zurich, Zurich, Switzerland;The Wearable Computing Lab., ETH Zurich, Zurich, Switzerland;The Wearable Computing Lab., ETH Zurich, Zurich, Switzerland
Venue:
EURASIP Journal on Applied Signal Processing
Year:
2006



Abstract

Replicating human hearing in electronics is nontrivial under two constraints: using only two microphones, even when more than two speakers are present, and requiring the user to carry the device at all times (i.e., a mobile device weighing less than 100 g). Our novel contribution in this area is a two-microphone system that incorporates both blind source separation and speaker tracking. This system handles more than two speakers and overlapping speech in a mobile environment. The system also supports a feedback loop from the speaker tracking step to the blind source separation step, which can improve performance. To develop and optimize this system, we have established a novel benchmark, which we present here. Using the introduced complexity metrics, we present the tradeoffs between system performance and computational load. Our results show that, in our case, source separation was significantly more dependent on frame duration than on sampling frequency.