Fundamentals of speech recognition
On artificial bandwidth extension of telephone speech
Signal Processing - Special section: Hans Wilhelm Schüßler celebrates his 75th birthday
In mobile communications, transmitted speech signals are narrowband: they are sampled at 8 kHz and low-pass filtered below 4 kHz, so much of the intelligibility is lost. The goal of Artificial Bandwidth Extension (ABWE) is to recover the lost quality by reconstructing the speech spectrum between 4 and 8 kHz, bringing the superior listening quality and intelligibility of wideband speech. An algorithm based on a Hidden Markov Model (HMM) has been shown to be valid for most speech sounds, but it proved quite ineffective at reconstructing fricative consonants. We investigated the causes of this inefficient extension of the fricatives and the problems that derive from it. We developed a codebook design technique that places particular emphasis on these sounds in order to improve the fidelity of the reproduction and the dynamics of the processing. Our design noticeably improves the intelligibility of the fricatives. Log-spectral distance measures demonstrate the faithful extension, as do the subjective listening quality and intelligibility.
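For readers unfamiliar with the log-spectral distance, the following is a minimal sketch of one common formulation: the root-mean-square difference, in dB, between the power spectra of a reference and an extended signal, computed per frame over the 4 to 8 kHz band and averaged over frames. The function name, the frame and hop lengths, the Hann window, and the 16 kHz sampling rate are assumptions chosen for illustration; the abstract does not specify the exact variant used.

```python
import numpy as np


def log_spectral_distance(ref, est, fs=16000, frame_len=512, hop=256,
                          band=(4000.0, 8000.0), eps=1e-10):
    """Mean per-frame log-spectral distance (dB) between two signals.

    Hypothetical helper: all parameter defaults are illustrative
    assumptions, not values taken from the abstract.
    """
    ref = np.asarray(ref, dtype=float)
    est = np.asarray(est, dtype=float)
    if ref.shape != est.shape:
        raise ValueError("signals must have the same length")

    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    # Keep only the bins inside the extension band (4-8 kHz by default).
    in_band = (freqs >= band[0]) & (freqs <= band[1])

    distances = []
    for start in range(0, len(ref) - frame_len + 1, hop):
        frame_ref = window * ref[start:start + frame_len]
        frame_est = window * est[start:start + frame_len]
        p_ref = np.abs(np.fft.rfft(frame_ref)) ** 2
        p_est = np.abs(np.fft.rfft(frame_est)) ** 2
        # Difference of the log power spectra in dB; eps avoids log(0).
        diff_db = 10.0 * np.log10((p_ref[in_band] + eps) / (p_est[in_band] + eps))
        distances.append(np.sqrt(np.mean(diff_db ** 2)))

    if not distances:
        raise ValueError("signal is shorter than one frame")
    return float(np.mean(distances))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    wideband = rng.standard_normal(16000)  # stand-in for 1 s of wideband speech
    print(log_spectral_distance(wideband, wideband))
    print(log_spectral_distance(wideband, 0.5 * wideband))
```

Under this formulation, identical signals give a distance of zero, and a uniform gain error of a factor of two in amplitude gives about 6.02 dB in every frame. Lower values therefore indicate a more faithful reconstruction of the upper band.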