This work addresses automatic speaker localization and tracking in a real lecture scenario. It outlines the evaluation criteria recently adopted under CHIL and NIST benchmarking and describes two speaker localization systems based on Generalized Cross Correlation Phase Transform (GCC-PHAT) analysis and the Global Coherence Field (GCF). Benchmarking on a set of 13 lectures showed an average RMS speaker localization error of about 30 cm.
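As an illustration of the Generalized Cross Correlation Phase Transform analysis mentioned above, the following is a minimal sketch of a textbook time-delay estimator for one microphone pair. It is not the systems described in this work: the function name `gcc_phat`, its parameters, and the use of NumPy are assumptions made for the example. A Global Coherence Field would typically combine such pairwise functions over candidate source positions, which is not shown here.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` in seconds.

    Hypothetical helper for illustration only; a generic GCC-PHAT
    estimator, not the implementation evaluated in the paper.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep phase, drop magnitude.
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs, cc


if __name__ == "__main__":
    # Synthetic check: white noise delayed by 5 samples at an assumed 16 kHz.
    rng = np.random.default_rng(0)
    fs = 16000
    ref = rng.standard_normal(4096)
    sig = np.concatenate((np.zeros(5), ref[:-5]))
    tau, _ = gcc_phat(sig, ref, fs, max_tau=0.001)
    print(f"estimated delay: {tau * fs:.0f} samples")
```

Given the estimated delay and a known microphone spacing, the delay can be converted to a direction of arrival; the PHAT weighting is commonly chosen because it sharpens the correlation peak in reverberant rooms.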