Automatic speech recognition performance in different room acoustic environments with and without dereverberation preprocessing

Authors:
Alexandros Tsilfidis;Iosif Mporas;John Mourjopoulos;Nikos Fakotakis
Affiliations:
Wire Communications Laboratory, University of Patras, Greece;Wire Communications Laboratory, University of Patras, Greece;Wire Communications Laboratory, University of Patras, Greece;Wire Communications Laboratory, University of Patras, Greece
Venue:
Computer Speech and Language
Year:
2013

Abstract

The performance of recent dereverberation methods, applied as preprocessing of reverberant speech prior to Automatic Speech Recognition (ASR), is compared over an extensive range of room and source-receiver configurations. Room acoustic parameters such as the clarity (C50) and the definition (D50) are shown to correlate well with the ASR results. When available, these parameters can therefore provide insight into ASR performance on reverberant speech and into the improvement that dereverberation preprocessing may offer. It is also shown that a recent dereverberation method based on perceptual modelling can be applied in this context and achieves significant Phone Recognition (PR) improvement, especially under highly reverberant conditions.
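
For readers unfamiliar with the two room acoustic parameters named in the abstract, the sketch below computes them from a room impulse response using the standard early-to-late energy ratio definitions (as in ISO 3382): C50 is the ratio, in dB, of the energy arriving in the first 50 ms to the energy arriving afterwards, and D50 is the fraction of the total energy arriving in the first 50 ms. This is not taken from the paper. The function names, the assumption that the response starts at the direct-sound arrival, and the synthetic exponentially decaying response are all illustrative assumptions.

```python
import math


def clarity_and_definition(rir, fs, split_ms=50.0):
    """Return (C50 in dB, D50 as a fraction) for an impulse response.

    rir: sequence of samples, ASSUMED to start at the direct-sound arrival.
    fs:  sampling rate in Hz.
    """
    split = int(round(fs * split_ms / 1000.0))  # sample index of the 50 ms boundary
    early = sum(x * x for x in rir[:split])     # energy in the first 50 ms
    late = sum(x * x for x in rir[split:])      # energy after 50 ms
    total = early + late
    if total == 0.0:
        raise ValueError("impulse response has no energy")
    d50 = early / total
    if late == 0.0:
        c50 = math.inf
    elif early == 0.0:
        c50 = -math.inf
    else:
        c50 = 10.0 * math.log10(early / late)
    return c50, d50


def synthetic_rir(fs, t60):
    """Hypothetical toy response: a pure exponential decay whose energy
    drops by 60 dB after t60 seconds. Real rooms are not this simple."""
    decay = 3.0 * math.log(10.0)  # amplitude decay constant for 60 dB of energy decay
    n_samples = int(fs * 2 * t60)
    return [math.exp(-decay * n / (fs * t60)) for n in range(n_samples)]


if __name__ == "__main__":
    fs = 16000
    for t60 in (0.3, 1.2):  # illustrative reverberation times in seconds
        c50, d50 = clarity_and_definition(synthetic_rir(fs, t60), fs)
        print(f"T60={t60:.1f} s  C50={c50:.2f} dB  D50={d50:.3f}")
```

In this toy model a longer reverberation time moves energy past the 50 ms boundary, so both C50 and D50 fall. The two values carry the same information, since D50 = 1 / (1 + 10^(-C50/10)).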