Statistical model of speech signals based on composite autoregressive system with application to blind source separation

Authors:
Hirokazu Kameoka; Takuya Yoshioka; Mariko Hamamura; Jonathan Le Roux; Kunio Kashino
Affiliations:
NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, Japan (all authors)
Venue:
LVA/ICA'10: Proceedings of the 9th International Conference on Latent Variable Analysis and Signal Separation
Year:
2010

Abstract

This paper presents a new statistical model for speech signals. The model consists of a time-invariant dictionary comprising a set of power spectral densities of excitation signals and a set of all-pole filters, where the gain of each excitation-filter pair is allowed to vary over time. We use this model to develop a combined blind separation and dereverberation method for speech. Reasonably good separation was obtained under a highly reverberant condition.
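
The abstract gives no formulas, so the following is only a minimal sketch of one plausible reading of the model, not the authors' implementation. It assumes that each excitation-filter pair contributes the product of the excitation power spectral density and the power response of the all-pole filter, scaled by a nonnegative time-varying gain, and that the contributions of all pairs add up. The function names (`all_pole_psd`, `composite_ar_psd`), the array shapes, and the example values are hypothetical.

```python
import numpy as np

def all_pole_psd(ar_coeffs, n_freq):
    """Power response 1/|A(e^{jw})|^2 of an all-pole filter,
    with A(z) = 1 + sum_p a_p z^{-p}, sampled on [0, pi]."""
    omega = np.linspace(0.0, np.pi, n_freq)
    p = np.arange(1, len(ar_coeffs) + 1)
    A = 1.0 + np.exp(-1j * np.outer(omega, p)) @ np.asarray(ar_coeffs, dtype=float)
    return 1.0 / np.abs(A) ** 2

def composite_ar_psd(excitation_psds, ar_filters, gains):
    """Assumed model spectrum: sum over pairs (i, j) of
    gains[i, j, t] * excitation_psds[i, f] * filter_psd[j, f].

    excitation_psds: (I, F) time-invariant excitation PSDs
    ar_filters:      list of J coefficient arrays (time-invariant filters)
    gains:           (I, J, T) nonnegative time-varying gains
    returns:         (F, T) model power spectrogram
    """
    excitation_psds = np.asarray(excitation_psds, dtype=float)
    n_freq = excitation_psds.shape[1]
    filter_psds = np.stack([all_pole_psd(a, n_freq) for a in ar_filters])  # (J, F)
    return np.einsum("if,jf,ijt->ft", excitation_psds, filter_psds, gains)

# Hypothetical toy dictionary: 2 flat excitations, 3 low-order filters, 5 frames.
rng = np.random.default_rng(0)
excitations = np.ones((2, 64))
filters = [np.array([-0.9]), np.array([0.5, 0.2]), np.array([0.0])]
gains = rng.random((2, 3, 5))
psd = composite_ar_psd(excitations, filters, gains)
```

Under these assumptions only `gains` changes from frame to frame, while the excitation and filter dictionaries stay fixed, which is how the sketch reflects the "time-invariant dictionary" with time-varying gains described above.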