Speech enhancement using fourth-order cumulants and optimum filters in the subband domain

Authors:
Elias Nemer;Rafik Goubran;Samy Mahmoud
Affiliations:
Intel, 350 E. Plumeria Drive, San Jose, CA;Systems and Computer Engineering, Carleton University, Ottawa, Ont., Canada;Systems and Computer Engineering, Carleton University, Ottawa, Ont., Canada
Venue:
Speech Communication
Year:
2002

Abstract

A new method for speech enhancement using time-domain optimum filters and fourth-order cumulants (FOC) is proposed, based on newly established properties of the FOC of speech signals. In the exploratory part of the paper, the analytical expression of the FOC of subbanded speech is derived, assuming a sinusoidal model with up to two harmonics per band. Important properties of this cumulant are revealed, and actual speech data are used to verify the derivations and the underlying model. In the application part of the work, speech enhancement is formulated as an estimation problem, and the expression for the time-domain causal optimum filters is derived for a pth-order system. The key idea is to use the FOC of the noisy speech to estimate the parameters required for the enhancement filters, namely the second-order statistics of the speech and noise. It is shown that the kurtosis and the diagonal slice of the FOC may be used to estimate such parameters as the SNR, the speech autocorrelation, and the probability of speech presence in a given band. Subjective listening and examination of the spectrograms show that the resulting algorithm is effective on typical noises encountered in mobile telephony. Compared to the TIA-IS127 standard for noise reduction, it yields more noise reduction and better speech preservation overall in Gaussian, street, and fan noise. Its effectiveness diminishes, however, for harmonic and impulsive noise types such as office and car engine noise, where discrimination between speech and noise based on the FOC becomes more difficult.
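
To make the quantities named in the abstract concrete, here is a minimal sketch (not the authors' implementation) of a sample estimate of the diagonal slice of the fourth-order cumulant for a zero-mean real signal. It assumes the standard textbook definition c4(tau) = E[x(n) x(n+tau)^3] - 3 R(tau) R(0), where R is the autocorrelation. The function name `foc_diagonal_slice` and the simple time-average estimator are illustrative choices. The paper's subband decomposition and optimum filters are not reproduced here.

```python
import numpy as np


def foc_diagonal_slice(x, max_lag):
    """Estimate c4(tau, tau, tau) for tau = 0..max_lag by time averaging.

    Assumes a real, stationary signal; the sample mean is removed first.
    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    r0 = np.mean(x * x)  # R(0): signal power
    c4 = np.empty(max_lag + 1)
    for tau in range(max_lag + 1):
        a = x[: n - tau]  # x(n)
        b = x[tau:]       # x(n + tau)
        m4 = np.mean(a * b**3)  # fourth-order moment along the diagonal
        r_tau = np.mean(a * b)  # R(tau): autocorrelation at lag tau
        c4[tau] = m4 - 3.0 * r_tau * r0
    return c4


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(200_000)
    tone = np.cos(2 * np.pi * 0.05 * np.arange(20_000))
    print("Gaussian noise, lag 0:", foc_diagonal_slice(noise, 0)[0])
    print("Unit sinusoid, lags 0..10:", foc_diagonal_slice(tone, 10))
```

At lag zero the estimate reduces to the unnormalized kurtosis E[x^4] - 3 (E[x^2])^2. Under this definition it is close to zero for Gaussian noise, and for a single sinusoid of amplitude A it follows -(3/8) A^4 cos(omega tau). That contrast is the general reason fourth-order statistics can help separate Gaussian noise from harmonic, speech-like components.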