Among the many speech enhancement and processing schemes studied for in-vehicle speech systems, delay-and-sum beamforming (DASB) and adaptive beamforming are two typical methods, each with its own advantages and disadvantages. In this paper, we propose a novel combined fixed/adaptive beamforming solution (CFA-BF), built on previous work, for speech enhancement and recognition in real moving-car environments; it seeks to take advantage of both methods. CFA-BF works in two steps: source location calibration and target signal enhancement. The first step pre-records the transfer functions between the speaker and the microphone array for different potential source positions, using adaptive beamforming in a quiet environment. The second step uses this pre-recorded information to enhance the desired speech while the car is running on the road. An evaluation on extensive real in-car speech data from the CU-Move Corpus shows that the method reduces the word error rate (WER) of speech recognition by up to 30% relative to a single-channel scenario and improves speech quality, as measured by segmental SNR (SEGSNR), by up to 1 dB on average.
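
To make the fixed-beamforming half of this comparison concrete, the following is a minimal sketch of plain delay-and-sum beamforming on synthetic data. It is not the CFA-BF method of the paper. The function names (`delay_and_sum`, `snr_db`), the integer sample delays, the four-microphone layout, the 8 kHz sampling rate, and the noise level are all assumptions chosen for illustration; in a real system the delays would come from a source localization step.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of the target at each microphone, in samples
              (hypothetical values for this sketch).
    """
    n = len(channels[0])
    out = [0.0] * n
    for x, d in zip(channels, delays):
        for t in range(n):
            src = t + d  # advance the channel to undo its arrival delay
            if 0 <= src < n:
                out[t] += x[src]
    return [v / len(channels) for v in out]


def snr_db(clean, noisy):
    """Global SNR in dB of `noisy` measured against the reference `clean`."""
    signal = sum(c * c for c in clean)
    noise = sum((y - c) ** 2 for c, y in zip(clean, noisy))
    return 10.0 * math.log10(signal / noise)


# Synthetic demo (assumed setup): a 200 Hz tone sampled at 8 kHz reaches
# four microphones with different delays, each with independent Gaussian noise.
rng = random.Random(0)
n = 800
clean = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(n)]
delays = [0, 3, 5, 8]
channels = []
for d in delays:
    ch = [(clean[t - d] if t - d >= 0 else 0.0) + rng.gauss(0.0, 0.5)
          for t in range(n)]
    channels.append(ch)

enhanced = delay_and_sum(channels, delays)

# Compare only the samples where every channel contributes to the sum.
valid = n - max(delays)
single_snr = snr_db(clean[:valid], channels[0][:valid])
array_snr = snr_db(clean[:valid], enhanced[:valid])
print(f"single channel: {single_snr:.1f} dB, delay-and-sum: {array_snr:.1f} dB")
```

With independent noise of equal power on each channel, averaging M aligned channels reduces the noise power by a factor of M, so four microphones give a gain of roughly 6 dB in this idealized setting. Real in-car noise is partly correlated across microphones, which is one reason the gains reported on recorded data are smaller than this ideal figure.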