Monaural voiced speech segregation based on dynamic harmonic function

Authors:
Xueliang Zhang;Wenju Liu;Bo Xu
Affiliations:
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China and Computer Science Department, Inner Mongolia University, Huhhot, China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Venue:
EURASIP Journal on Audio, Speech, and Music Processing
Year:
2010

Abstract

The correlogram is an important representation for periodic signals and is widely used in pitch estimation and source separation. For these applications, its major problems are low resolution and redundant information. This paper proposes a voiced speech segregation system based on a newly introduced concept called the dynamic harmonic function (DHF). In the proposed system, conventional correlograms are further processed by replacing the autocorrelation function (ACF) with the DHF. The DHF has two advantages: (1) the width of its peaks is adjustable by controlling the variance of the Gaussian function, and (2) the invalid peaks of the ACF, those not at the pitch period, tend to be suppressed. Based on the DHF, algorithms for pitch detection and effective source segregation are proposed. The system is systematically evaluated and compared with a correlogram-based system. Both the signal-to-noise ratio results and the perceptual evaluation of speech quality scores show that the proposed system yields substantially better performance.
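
The abstract does not define the DHF, so the following is only a minimal sketch of the two ingredients it names, not the authors' method. It computes a normalized ACF for a synthetic harmonic signal (an assumed stand-in for one correlogram channel), picks the pitch period as the highest ACF peak, and then shows how the standard deviation of a Gaussian placed at that period controls peak width. All function names, the 8 kHz sampling rate, the 80-sample period, and the sigma values are assumptions made for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Normalized autocorrelation of x for lags 0..max_lag (value 1.0 at lag 0)."""
    energy = sum(v * v for v in x)
    acf = []
    for lag in range(max_lag + 1):
        s = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        acf.append(s / energy if energy > 0 else 0.0)
    return acf


def estimate_period(acf, min_lag):
    """Return the lag (>= min_lag) with the largest ACF value."""
    return max(range(min_lag, len(acf)), key=lambda lag: acf[lag])


def gaussian_peak(max_lag, center, sigma):
    """Gaussian bump over lags 0..max_lag; sigma (std. deviation) sets the width."""
    return [math.exp(-0.5 * ((lag - center) / sigma) ** 2)
            for lag in range(max_lag + 1)]


def half_max_width(curve):
    """Number of lags at which the curve exceeds half of its maximum."""
    half = 0.5 * max(curve)
    return sum(1 for v in curve if v > half)


# Assumed test signal: 100 Hz fundamental at 8 kHz (period of 80 samples)
# with two weaker harmonics, loosely imitating voiced speech.
true_period = 80
signal = [sum(math.sin(2 * math.pi * h * n / true_period) / h for h in (1, 2, 3))
          for n in range(800)]

acf = autocorrelation(signal, max_lag=120)
estimated_period = estimate_period(acf, min_lag=20)

# Hypothetical sigma values: a small variance gives a sharp peak,
# a large variance gives a broad one.
narrow = gaussian_peak(120, estimated_period, sigma=2.0)
wide = gaussian_peak(120, estimated_period, sigma=8.0)

print("estimated period (samples):", estimated_period)
print("lags above half maximum (narrow, wide):",
      half_max_width(narrow), half_max_width(wide))
```

In this sketch the ACF peak at the pitch period is broad because its shape is fixed by the signal, whereas the Gaussian's width is a free parameter. That is the kind of adjustability the abstract attributes to the DHF, though the actual construction in the paper may differ.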