Speech enhancement based on Sparse Code Shrinkage employing multiple speech models

Authors:
Peter Jančovič; Xin Zou; Münevver Köküer
Affiliations:
School of Electronic, Electrical & Computer Engineering, University of Birmingham, Pritchatts Road, B15 2TT Birmingham, UK (all authors)
Venue:
Speech Communication
Year:
2012

Abstract

This paper presents a single-channel speech enhancement system based on the Sparse Code Shrinkage (SCS) algorithm and multiple speech models. The system operates in two stages: training and enhancement. In the training stage, Gaussian mixture modelling (GMM) is used to cluster speech signals in an ICA-based transform domain into several categories, and a super-Gaussian model is estimated for each category for use during enhancement. In the enhancement stage, the estimate of each signal frame is obtained as a weighted average of the estimates produced by the individual speech category models. The weights are the probabilities of each category given the signal enhanced by the conventional SCS algorithm. During enhancement, the individual speech category models are further adapted at each signal frame. Experiments are performed on speech signals from the TIMIT database corrupted by Gaussian noise and three real-world noises (Subway, Street, and Railway) from the NOISEX-92 database. Performance is evaluated in terms of segmental SNR, spectral distortion, and PESQ. The results show that the proposed multi-model SCS enhancement algorithm significantly outperforms the conventional Wiener filter (WF), SCS, and multi-model WF algorithms.
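
The enhancement stage described in the abstract can be illustrated with a minimal per-frame sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the ICA transform `W` is taken to be orthogonal (so white Gaussian noise keeps its variance in the transform domain), the super-Gaussian model of each category is approximated by a Laplacian density with the standard soft-threshold shrinkage, the category probabilities come from a diagonal-covariance GMM, and the per-frame model adaptation is omitted. All function and parameter names are hypothetical.

```python
import numpy as np


def laplace_shrink(y, scale, noise_var):
    """Soft-threshold shrinkage: MAP estimate for a Laplacian prior with
    standard deviation `scale`, observed in Gaussian noise of variance `noise_var`."""
    threshold = np.sqrt(2.0) * noise_var / scale
    return np.sign(y) * np.maximum(0.0, np.abs(y) - threshold)


def category_posteriors(s, gmm_weights, gmm_means, gmm_vars):
    """P(category | s) under a diagonal-covariance GMM, one component per category."""
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * gmm_vars) + (s - gmm_means) ** 2 / gmm_vars, axis=1
    )
    log_post = np.log(gmm_weights) + log_lik
    log_post -= log_post.max()  # subtract the maximum for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


def multi_model_scs_frame(x, W, global_scale, cat_scales,
                          gmm_weights, gmm_means, gmm_vars, noise_var):
    """Enhance one noisy frame x (length n) with K category models.

    W: (n, n) orthogonal transform (assumption); cat_scales: (K, n) Laplacian
    scales per category; global_scale: (n,) scales of the single-model SCS.
    """
    y = W @ x                                          # to transform domain
    s_ref = laplace_shrink(y, global_scale, noise_var) # conventional SCS estimate
    weights = category_posteriors(s_ref, gmm_weights, gmm_means, gmm_vars)
    estimates = laplace_shrink(y[None, :], cat_scales, noise_var)  # shape (K, n)
    s_hat = weights @ estimates                        # weighted average over categories
    return W.T @ s_hat, weights                        # back to the signal domain
```

The function returns both the enhanced frame and the category weights, so the weights can be inspected or reused, for example by an adaptation step like the one the abstract mentions.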