Enhancement of noisy speech by temporal and spectral processing

Authors:
P. Krishnamoorthy; S. R. M. Prasanna
Affiliations:
Samsung India Software Center, Noida 201301, India; Department of Electronics and Communication Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
Venue:
Speech Communication
Year:
2011

Abstract

This paper presents a method for enhancing noisy speech that combines linear prediction (LP) residual weighting in the time domain with spectral processing in the frequency domain, with the aim of providing better noise suppression as well as better enhancement in the speech regions. The noisy speech is first subjected to temporal processing based on the excitation source (LP residual), which involves identifying and enhancing speech-specific excitation source features at the gross and fine temporal levels. The gross-level features are identified by estimating three parameters from the noisy speech signal: the sum of the peaks in the discrete Fourier transform (DFT) spectrum, the smoothed Hilbert envelope of the LP residual, and modulation spectrum values. The fine-level features are identified using knowledge of the instants of significant excitation. A final weight function, derived from the gross and fine weight functions, is used to obtain the temporally processed speech signal. The temporally processed speech is then subjected to spectral processing, which involves estimating and removing degrading components and identifying and enhancing speech-specific spectral components. The proposed method is evaluated using several objective and subjective quality measures. These measures show that the combined temporal and spectral processing method provides better enhancement than either temporal or spectral processing alone.
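
The LP residual weighting step described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes the whole signal is treated as a single analysis frame, uses the autocorrelation method (Levinson-Durbin) for LP analysis, and substitutes a hypothetical weight function (`toy_weights`, a smoothed residual magnitude) for the gross and fine weight functions of the paper, which the abstract does not specify. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis via Levinson-Durbin.

    Returns a[0..order] with a[0] = 1, for A(z) = 1 + sum_j a[j] z^-j.
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate (e.g. all-zero) frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lp_residual(signal, a):
    """Inverse filter: e[n] = sum_j a[j] * s[n-j], zero initial state."""
    order = len(a) - 1
    return [sum(a[j] * signal[n - j] for j in range(order + 1) if n - j >= 0)
            for n in range(len(signal))]


def lp_synthesize(residual, a):
    """All-pole synthesis: s[n] = e[n] - sum_{j>=1} a[j] * s[n-j]."""
    order = len(a) - 1
    out = []
    for n, e in enumerate(residual):
        out.append(e - sum(a[j] * out[n - j]
                           for j in range(1, order + 1) if n - j >= 0))
    return out


def toy_weights(residual, half_width=8, floor=0.1):
    """Hypothetical stand-in weight: moving average of |e[n]| scaled to [floor, 1]."""
    mag = [abs(e) for e in residual]
    n = len(mag)
    smooth = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        smooth.append(sum(mag[lo:hi]) / (hi - lo))
    peak = max(smooth) or 1.0  # avoid division by zero on silent input
    return [floor + (1.0 - floor) * s / peak for s in smooth]


def temporal_process(signal, order=10, weights=None):
    """Weight the LP residual sample by sample, then resynthesize."""
    a = lp_coefficients(signal, order)
    res = lp_residual(signal, a)
    w = weights if weights is not None else toy_weights(res)
    weighted = [wi * ei for wi, ei in zip(w, res)]
    return lp_synthesize(weighted, a)


def demo_signal(n=200, seed=0):
    """Synthetic test input: two sinusoids plus seeded Gaussian noise."""
    rng = random.Random(seed)
    return [math.sin(0.2 * i) + 0.5 * math.sin(0.9 * i + 1.0)
            + 0.1 * rng.gauss(0.0, 1.0) for i in range(n)]
```

Calling `temporal_process(demo_signal(), order=4)` returns a signal of the same length as the input. With all weights set to 1, the analysis and synthesis filters cancel and the input is recovered, so any change in the output comes from the weight function alone. A practical system would process short overlapping frames and would replace `toy_weights` with weights derived from the gross-level and fine-level evidence described above.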