The use of high-level information in source separation algorithms can greatly constrain the problem and improve results by limiting the solution space to semantically plausible outputs. The automatic speech recognition community has shown that high-level information in the form of language models is crucial to obtaining high-quality recognition results. In this paper, we apply language models to speech separation. Specifically, we use language models to constrain the recently proposed non-negative factorial hidden Markov model. We compare the proposed method to non-negative spectrogram factorization using standard source separation metrics and show improved results on all metrics.
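For readers unfamiliar with the baseline, the following is a minimal sketch of non-negative spectrogram factorization. It is not the implementation evaluated in the paper. The abstract does not say which factorization variant is used, so the sketch assumes the common KL-divergence objective with multiplicative updates. The names `nmf_kl` and `kl_divergence`, and all parameter defaults, are hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf_kl(V, n_components, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative magnitude spectrogram V (freq x frames) as W @ H.

    Assumption: KL-divergence multiplicative updates; the paper's baseline
    may use a different objective or update rule.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialization of spectral bases W and activations H.
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations with the bases held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update bases with the activations held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between V and its approximation V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))
```

In this sketch nothing ties the activations in one frame to those in the next: each column of `H` is fit independently. That lack of temporal structure is what the hidden Markov formulation, and the language model constraints placed on it, are meant to address.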