Speaker localization is an important problem in the study of human communication and underlies a variety of practical applications. When two or more speakers talk simultaneously, finding the direction of arrival of each speech signal is a complicated task. The spectral separation between different speech signals was first quantified: on average, some 40% of the spectral information in the 0-5 kHz band was found to differ significantly (by at least 10 dB) between any two speakers, even when they spoke the same utterance at the same time and with the same intensity. The signals were then analyzed in the frequency domain, which transformed the problem into a set of single-source, single-frequency problems and made it possible to apply a time delay direction finding (TDDF) algorithm (Berdugo et al., J. Acoust. Soc. Am. 105 (6) (1999) 3355). Next, a new "fusion" algorithm was developed that extended the solution to separate the speech signals of two speakers at low SNR values. Results from simulations and from actual experiments demonstrated high angular resolution between two speakers (approximately 20° for a 10 cm array extent), even at low SNR. This algorithm may be suitable for applications such as video conferencing and hearing aids.
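The two frequency-domain ideas in the abstract can be illustrated with a minimal Python sketch. It is not the TDDF algorithm of Berdugo et al. and not the "fusion" algorithm, neither of which is specified here. It assumes a single pair of microphones, a far-field source, and a sound speed of 343 m/s; the function names `fraction_differing_bins` and `per_bin_doa` and all parameter names are hypothetical. The first function measures what fraction of bins up to 5 kHz differ by at least 10 dB between two signals; the second converts the inter-microphone phase difference in each bin to a time delay and then to an arrival angle.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def fraction_differing_bins(x, y, fs, fmax=5000.0, threshold_db=10.0):
    """Fraction of bins up to fmax whose magnitudes differ by >= threshold_db."""
    n = min(len(x), len(y))
    window = np.hanning(n)
    mag_x = np.abs(np.fft.rfft(np.asarray(x[:n]) * window))
    mag_y = np.abs(np.fft.rfft(np.asarray(y[:n]) * window))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = freqs <= fmax
    eps = 1e-12  # guards against log of zero in empty bins
    diff_db = np.abs(20.0 * np.log10((mag_x[band] + eps) / (mag_y[band] + eps)))
    return float(np.mean(diff_db >= threshold_db))


def per_bin_doa(mic1, mic2, fs, spacing, min_mag=1e-6):
    """Per-bin arrival angle (degrees from broadside) for a two-microphone pair.

    Assumes mic2 receives a delayed copy of mic1 when the angle is positive.
    Returns the frequencies of the bins used and one angle per bin.
    """
    n = min(len(mic1), len(mic2))
    spec1 = np.fft.rfft(mic1[:n])
    spec2 = np.fft.rfft(mic2[:n])
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Skip DC and stay below c / (2 * spacing), where the phase is unambiguous.
    valid = (freqs > 0) & (freqs < SPEED_OF_SOUND / (2.0 * spacing))
    # Skip bins with negligible energy, whose phase is meaningless.
    valid &= (np.abs(spec1) > min_mag) & (np.abs(spec2) > min_mag)
    # Phase of the cross-spectrum equals 2*pi*f*delay for a pure delay.
    phase = np.angle(spec1[valid] * np.conj(spec2[valid]))
    delay = phase / (2.0 * np.pi * freqs[valid])
    sin_theta = np.clip(delay * SPEED_OF_SOUND / spacing, -1.0, 1.0)
    return freqs[valid], np.degrees(np.arcsin(sin_theta))
```

With two simultaneous speakers, one would expect the per-bin angles to cluster around two directions in the bins where one speaker dominates, so a histogram of the returned angles is a natural next step. Note that with a 10 cm spacing this sketch only uses bins below roughly 1.7 kHz; handling the full 0-5 kHz band would need a smaller spacing or phase unwrapping, which is outside this illustration.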