EURASIP Journal on Applied Signal Processing
A multi-microphone time-frequency speech masking technique is proposed. The technique uses both the magnitude and the phase of each time-frequency block to estimate the masking coefficients that maximize the signal-to-noise ratio (SNR), given that the direction (or alternatively, the time delay of arrival) of the speaker of interest is known. With this masking algorithm, speech features (such as formants) arriving from the direction of interest are preserved, while features from other directions are severely degraded. Digit recognition experiments indicate that the proposed technique can substantially increase digit recognition accuracy. At 0 dB, for example, it improves the digit recognition accuracy rate by 26% over the single-microphone case and by 12% over the two-microphone superdirective beamforming case.
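To make the idea of direction-dependent time-frequency masking concrete, the following is a minimal two-microphone sketch in Python. It is not the SNR-maximizing estimator of the paper: it uses phase information only, and the mask form 1/(1 + gamma * error^2), the parameter `gamma`, the frame sizes, and the helper names `stft`, `phase_error_mask`, and `simulate_pair` are all assumptions chosen for illustration. The sketch compares the observed inter-microphone phase difference in each time-frequency block with the one expected for the known time delay of arrival, and attenuates blocks where the two disagree.

```python
import numpy as np


def stft(x, frame_len=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def phase_error_mask(X1, X2, fs, frame_len, tdoa, gamma=5.0):
    """Illustrative mask in (0, 1] for each time-frequency block.

    tdoa is the assumed delay (seconds) of microphone 2 relative to
    microphone 1 for the speaker of interest. gamma (hypothetical tuning
    parameter) controls how sharply off-direction blocks are attenuated.
    """
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    expected = 2.0 * np.pi * freqs * tdoa  # phase difference implied by tdoa
    # Observed minus expected phase difference, wrapped to [-pi, pi].
    err = np.angle(X1 * np.conj(X2) * np.exp(-1j * expected))
    return 1.0 / (1.0 + gamma * err ** 2)


def simulate_pair(delay_samples, n, seed=0):
    """Toy source: white noise reaching mic 2 delay_samples after mic 1."""
    s = np.random.default_rng(seed).standard_normal(n + delay_samples)
    return s[delay_samples:], s[:n]


fs, frame_len = 16000, 256
x1, x2 = simulate_pair(delay_samples=2, n=8000)
X1, X2 = stft(x1, frame_len), stft(x2, frame_len)

# Mask steered at the true source direction, then applied to microphone 1.
mask = phase_error_mask(X1, X2, fs, frame_len, tdoa=2 / fs)
enhanced = mask * X1
```

In this toy setup, steering the mask at the true delay leaves most blocks nearly untouched, while steering it at a different delay attenuates most of them, which mirrors the preserve-versus-degrade behavior described above. A magnitude-aware mask, as the abstract describes, would need additional modeling that this sketch does not attempt.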