Efficient SNR driven SPLICE implementation for robust speech recognition

Authors:
Stefano Squartini;Emanuele Principi;Simone Cifani;Rudi Rotili;Francesco Piazza
Affiliations:
3MediaLabs, DIBET, Università Politecnica delle Marche, Ancona, Italy;3MediaLabs, DIBET, Università Politecnica delle Marche, Ancona, Italy;3MediaLabs, DIBET, Università Politecnica delle Marche, Ancona, Italy;3MediaLabs, DIBET, Università Politecnica delle Marche, Ancona, Italy;3MediaLabs, DIBET, Università Politecnica delle Marche, Ancona, Italy
Venue:
COST'10 Proceedings of the 2010 international conference on Analysis of Verbal and Nonverbal Communication and Enactment
Year:
2010

Abstract

The SPLICE algorithm was recently proposed in the literature to address robustness in Automatic Speech Recognition (ASR), and several variants have also been proposed to overcome some drawbacks of the original technique. This presentation discusses an efficient new solution: it is based on SNR estimation in the frequency or mel domain, and it investigates the use of different noise types for GMM training in order to maximize the generalization capability of the tool and, therefore, recognition performance in the presence of unknown noise sources. Computer simulations conducted on the AURORA2 database seem to confirm the effectiveness of the idea: the proposed approach yields accuracy similar to that of the reference approach while employing a simpler mismatch compensation paradigm that does not need any a priori knowledge of the noises used in the training phase.
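
For orientation, the sketch below illustrates the general SPLICE correction as it is commonly described in the literature: a GMM is fitted to noisy feature vectors, each mixture component carries a bias vector, and the clean estimate is the noisy vector plus the posterior-weighted sum of those biases. It is not the SNR-driven variant presented in this paper, whose details the abstract does not give. The function names, the diagonal-covariance assumption, and the toy parameters in the usage note are all hypothetical choices made for illustration.

```python
import math


def log_gauss_diag(y, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector y."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )


def posteriors(y, weights, means, variances):
    """Return p(k | y) for every GMM component, computed in the log domain."""
    logs = [
        math.log(w) + log_gauss_diag(y, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def splice_enhance(y, weights, means, variances, biases):
    """Clean estimate: noisy vector y plus the posterior-weighted bias vectors.

    Assumption: one bias vector per mixture component, learned beforehand
    (in the standard formulation, from paired clean and noisy training data).
    """
    post = posteriors(y, weights, means, variances)
    return [
        yi + sum(p * r[d] for p, r in zip(post, biases))
        for d, yi in enumerate(y)
    ]
```

As a usage illustration with made-up parameters, take two equally weighted components with means `[0, 0]` and `[10, 10]`, unit variances, and biases `[1, 1]` and `[-2, -2]`. A noisy vector at `[0, 0]` is assigned almost entirely to the first component and is shifted by roughly `[1, 1]`, while a vector at `[5, 5]` lies midway between the two and receives the average of the two biases.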