This paper proposes methods to improve the accuracy of direction-of-arrival (DOA) estimation for speech. Previous research has proposed a high-resolution DOA estimation system for human vowels using only two microphones. However, in real environments, the conventional DOA estimation system is not robust enough to provide accurate results for human speech. To increase the robustness of the system for speech, a non-utterance frame omission method and a steering frequency selection method are proposed. Non-utterance frame omission evaluates the strength of speech in each frame and omits frames in which speech is absent or weak. Steering frequency selection uses the harmonic product spectrum to determine the frequency that matters most for DOA estimation. Finally, the proposed system is evaluated through both simulation and real-environment tests. The proposed system shows a distinct improvement in speech DOA estimation, amounting to about a 46% decrease in estimation error compared to the conventional system for sound sources located at the side of the array.
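The two proposed steps can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how speech strength is measured or how the harmonic product spectrum (HPS) is configured, so the relative energy threshold, the Hann window, the number of harmonics, and the 80 to 400 Hz search band below are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def select_utterance_frames(frames, rel_threshold_db=-20.0):
    """Return a boolean mask that is True for frames kept for DOA estimation.

    Assumption: speech strength is approximated by frame energy, and a frame
    is omitted if it is more than |rel_threshold_db| dB below the strongest
    frame. The paper may use a different measure.
    """
    energy = np.mean(frames ** 2, axis=1)
    energy_db = 10.0 * np.log10(energy + 1e-12)
    return energy_db >= energy_db.max() + rel_threshold_db


def hps_peak_frequency(frame, fs, n_harmonics=3, fmin=80.0, fmax=400.0):
    """Return the frequency (Hz) at the harmonic product spectrum peak.

    The magnitude spectrum is multiplied by copies of itself downsampled by
    2, 3, ..., n_harmonics, so that bins whose integer multiples also carry
    energy stand out. The search band [fmin, fmax] is an assumed pitch range.
    """
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    n_bins = len(spectrum) // n_harmonics
    hps = spectrum[:n_bins].copy()
    for h in range(2, n_harmonics + 1):
        hps *= spectrum[::h][:n_bins]
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)[:n_bins]
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(hps[band])]
```

In a two-microphone setting, one plausible use is to compute the mask on one channel, discard the masked-out frames from both channels, and pass the frequency returned by `hps_peak_frequency` (or one of its harmonics) to the DOA estimator as the steering frequency. How the paper maps the HPS result to the steering frequency is not stated in the abstract.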