Time and frequency filtering of filter-bank energies for robust HMM speech recognition
Speech Communication - Special issue on noise robust ASR
Robust automatic speech recognition with missing and unreliable acoustic data
Speech Communication
Automatic language identification
Speech Communication
Towards a robust/fast continuous speech recognition system using a voiced-unvoiced decision
ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 01
Hi-index | 0.00 |
In this paper, we propose a model for the incorporation of voicing information into a speech recognition system in noisy environments. The employed voicing information is estimated by a novel method that can provide this information for each filter-bank channel and does not require information about the fundamental frequency. The voicing information is modelled by employing the Bernoulli distribution. The voicing model is obtained for each HMM state and mixture by a Viterbi-style training procedure. The proposed voicing incorporation is evaluated both within a standard model and two other models that had compensated for the noise effect, the missing-feature and the multi-conditional training model. Experiments are first performed on noisy speech data from the Aurora 2 database. Significant performance improvements are achieved when the voicing information is incorporated within the standard model as well as the noise-compensated models. The employment of voicing information is also demonstrated on a phoneme recognition task on the noise-corrupted TIMIT database and considerable improvements are observed.