Blind separation of speech mixtures via time-frequency masking
IEEE Transactions on Signal Processing
The Hilbert transform combined with empirical mode decomposition (EMD) produces the Hilbert spectrum (HS), a fine-resolution time-frequency (TF) representation suitable for nonlinear and non-stationary signals. This paper presents a method for separating audio sources from stereo mixtures based on the spatial location of the sources. The TF representation of each audio signal is obtained from the HS. The sources are localized in the space of time and intensity differences between the two microphone signals. Separation is performed by masking the target signal in the TF domain, under the assumption that the sources are disjoint orthogonal, that is, that at most one source is active in any given TF bin. Experimental results show that the proposed method noticeably improves separation efficiency.
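The masking step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes toy TF matrices in place of a Hilbert spectrum, uses only the inter-channel intensity difference (the time difference is omitted), and assumes the per-source level ratios are already known. The names `make_toy_mixture`, `binary_masks`, and `source_ratios` are hypothetical.

```python
import numpy as np


def make_toy_mixture(shape=(64, 32), ratios=(0.5, 2.0), seed=0):
    """Build two sources with disjoint TF support and a two-channel mixture.

    Assumption: each TF bin is owned by source 1, source 2, or nobody,
    which is the disjoint-orthogonality condition in its ideal form.
    """
    rng = np.random.default_rng(seed)
    owner = rng.integers(0, 3, size=shape)  # 0: source 1, 1: source 2, 2: empty

    def random_tf():
        # Magnitudes bounded away from zero, uniformly random phase
        mag = rng.uniform(0.5, 1.5, size=shape)
        phase = rng.uniform(-np.pi, np.pi, size=shape)
        return mag * np.exp(1j * phase)

    S1 = random_tf() * (owner == 0)
    S2 = random_tf() * (owner == 1)
    X1 = S1 + S2                          # reference microphone
    X2 = ratios[0] * S1 + ratios[1] * S2  # second microphone, scaled per source
    return S1, S2, X1, X2


def binary_masks(X1, X2, source_ratios, eps=1e-12):
    """Assign each TF bin to the source with the closest level difference."""
    # Inter-channel level difference of every TF bin, in dB
    ild = 20.0 * np.log10((np.abs(X2) + eps) / (np.abs(X1) + eps))
    # Expected level difference of each source, in dB
    targets = 20.0 * np.log10(np.asarray(source_ratios, dtype=float))
    # Distance from every bin to every source's expected level difference
    dist = np.abs(ild[None, :, :] - targets[:, None, None])
    labels = np.argmin(dist, axis=0)
    return [labels == k for k in range(len(targets))]


S1, S2, X1, X2 = make_toy_mixture()
mask1, mask2 = binary_masks(X1, X2, source_ratios=[0.5, 2.0])
est1 = mask1 * X1  # estimate of source 1 in the TF domain
est2 = mask2 * X1  # estimate of source 2 in the TF domain
```

Because the toy sources never overlap in the TF plane, each masked mixture reproduces its source exactly. With real recordings the sources overlap in some bins, so the masks give only an approximate separation.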