Direction of arrival estimation improvement of speech on a two-microphone array

Authors:
Mohd Nadzrul Bin Mohd Nor;Tomoya Matsumura;Takao Onoye
Affiliations:
Osaka University, Yamadaoka, Suita, Osaka, Japan;Osaka University, Yamadaoka, Suita, Osaka, Japan;Osaka University, Yamadaoka, Suita, Osaka, Japan
Venue:
SIP '07 Proceedings of the Ninth IASTED International Conference on Signal and Image Processing
Year:
2007

Abstract

This paper proposes methods to improve the accuracy of direction of arrival (DOA) estimation for speech. Previous research proposed a high-resolution DOA estimation system for human vowels using only two microphones. However, in real environments, this conventional system is not robust enough to give accurate results for human speech. To increase its robustness for speech, a non-utterance frame omission method and a steering frequency selection method are proposed. Non-utterance frame omission evaluates the strength of speech in each frame and omits frames in which speech is absent or weak. Steering frequency selection uses the harmonic product spectrum to determine the frequency that is essential for DOA estimation. Finally, the proposed system is evaluated through both simulation and a real-environment test. The proposed system shows a distinct improvement in speech DOA estimation, amounting to about a 46% decrease in estimation error compared to the conventional system for sound sources located at the side of the array.
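
The two pre-processing ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's speech-strength measure, its thresholds, or the exact rule by which the harmonic product spectrum (HPS) selects the steering frequency, so everything below is an assumption: frame strength is approximated by a simple energy gate relative to the loudest frame, and the selected frequency is simply the HPS peak within a search band. All function names and parameters (`frame_signal`, `utterance_mask`, `hps_peak_frequency`, `rel_threshold_db`, `n_harmonics`, `f_min`, `f_max`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])


def utterance_mask(frames, rel_threshold_db=-30.0):
    """Return True for frames treated as containing speech.

    Assumption: a frame counts as an utterance frame when its mean energy
    is within rel_threshold_db of the loudest frame. The paper's actual
    speech-strength criterion is not stated in the abstract.
    """
    energy = np.mean(frames ** 2, axis=1)
    energy_db = 10.0 * np.log10(energy + 1e-12)  # small offset avoids log(0)
    return energy_db >= energy_db.max() + rel_threshold_db


def hps_peak_frequency(frame, fs, n_harmonics=3, f_min=60.0, f_max=500.0):
    """Return the frequency (Hz) at which the harmonic product spectrum peaks.

    The HPS multiplies the magnitude spectrum by copies of itself
    downsampled by 2, 3, ..., n_harmonics, so bins whose integer multiples
    also carry energy are emphasized.
    """
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    n_bins = len(spectrum) // n_harmonics  # bins valid for every harmonic
    hps = spectrum[:n_bins].copy()
    for h in range(2, n_harmonics + 1):
        hps *= spectrum[::h][:n_bins]
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)[:n_bins]
    band = (freqs >= f_min) & (freqs <= f_max)  # assumed search band
    return float(freqs[band][np.argmax(hps[band])])
```

In such a sketch, only frames where `utterance_mask` is True would be passed on, and `hps_peak_frequency` would be evaluated on each of them to pick a candidate frequency for the two-microphone DOA estimator, which is outside the scope of this illustration.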