Replicating human hearing in electronics is nontrivial under two constraints: using only two microphones (even with more than two speakers), and having the user carry the device at all times (i.e., a mobile device weighing less than 100 g). Our novel contribution in this area is a two-microphone system that incorporates both blind source separation and speaker tracking. This system handles more than two speakers and overlapping speech in a mobile environment. It also supports a feedback loop from the speaker tracking step to the blind source separation step, which can improve performance. To develop and optimize this system, we established a novel benchmark, which we present here. Using the introduced complexity metrics, we present the tradeoffs between system performance and computational load. Our results show that, in our case, source separation depended significantly more on frame duration than on sampling frequency.