An FFT-based companding front end for noise-robust automatic speech recognition

Authors:
Bhiksha Raj; Lorenzo Turicchia; Bent Schmidt-Nielsen; Rahul Sarpeshkar
Affiliations:
Mitsubishi Electric Research Laboratories, Cambridge, MA; Massachusetts Institute of Technology, Cambridge, MA; Mitsubishi Electric Research Laboratories, Cambridge, MA; Massachusetts Institute of Technology, Cambridge, MA
Venue:
EURASIP Journal on Audio, Speech, and Music Processing
Year:
2007

Abstract

We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Because it is built on the FFT, it is also computationally efficient and well suited to digital implementation. On an automotive digit recognition task using the CU-Move database, which was recorded in real environmental noise, the algorithm reduces the word error rate by 12.5% relative at -5 dB signal-to-noise ratio (SNR) and by 6.2% relative across all SNRs (-5 dB to +15 dB). On the Aurora-2 database, in which noise from several environments is added artificially, the algorithm improves the relative word error rate in almost all conditions.
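
The abstract reports its gains as relative changes in word error rate. As a minimal sketch of how such a figure is commonly computed, assuming the usual definition (baseline WER minus new WER, divided by baseline WER), the snippet below uses a hypothetical helper `relative_wer_reduction` and made-up WER values. These values are not results from the paper, which does not state absolute error rates in this abstract.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative reduction in word error rate, in percent.

    Both arguments are absolute WERs in the same unit (e.g. percent).
    This is the conventional definition, assumed here for illustration.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Hypothetical values chosen only to illustrate the arithmetic;
# they are NOT taken from the paper.
baseline = 40.0   # assumed baseline WER, in percent
companded = 35.0  # assumed WER with the front end, in percent

print(relative_wer_reduction(baseline, companded))  # 12.5
```

Under this definition, a 12.5% relative improvement means that one in eight of the baseline errors is removed, whatever the absolute error rate happens to be.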