An effective way to increase noise robustness in automatic speech recognition is to label the noisy speech features as either reliable or unreliable ('missing'), and replace ('impute') the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low SNRs, frame-based imputation techniques fail because many time frames contain few, if any, reliable features. In previous work, we introduced an exemplar-based method, dubbed sparse imputation, which can impute missing features using reliable features from neighbouring frames. We achieved substantial gains in performance at low SNRs for a connected digit recognition task. In this work, we investigate whether the exemplar-based approach can be generalised to a large vocabulary task. Experiments on artificially corrupted speech show that sparse imputation substantially outperforms a conventional imputation technique when the ideal 'oracle' reliability of features is used. With error-prone estimates of feature reliability, sparse imputation performance is comparable to our baseline imputation technique in the cleanest conditions, and substantially better at lower SNRs. With noisy speech recorded in realistic noise conditions, sparse imputation performs slightly worse than our baseline imputation technique in the cleanest conditions, but substantially better in the noisier conditions.
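The exemplar-based idea can be illustrated with a minimal sketch. Here the spectrographic features of a multi-frame window are stacked into one vector, a dictionary of clean-speech exemplars is fitted to the reliable entries only, and the unreliable entries are filled in from the reconstruction. This is a simplified stand-in, not the authors' implementation: it uses non-negative least squares (`scipy.optimize.nnls`) in place of the sparse solver used in the actual sparse-imputation work, and the function and variable names are illustrative.

```python
import numpy as np
from scipy.optimize import nnls  # non-negative LS; a stand-in for a sparse solver


def sparse_impute(noisy, mask, exemplars):
    """Impute unreliable feature entries from clean-speech exemplars.

    noisy     : 1-D array of stacked noisy features for one window
    mask      : boolean array, True where the feature is deemed reliable
    exemplars : 2-D array (n_features x n_exemplars) of clean-speech windows
    """
    # Fit exemplar weights using only the reliable feature entries.
    weights, _ = nnls(exemplars[mask], noisy[mask])

    # Reconstruct the full window as a weighted sum of clean exemplars.
    reconstruction = exemplars @ weights

    # Keep reliable entries; replace unreliable ones with the reconstruction.
    imputed = noisy.copy()
    imputed[~mask] = reconstruction[~mask]
    return imputed
```

Because the window spans several frames, a frame with almost no reliable entries can still be imputed from reliable entries elsewhere in the window, which is the property the abstract credits for the gains at low SNRs.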