We describe the implementation of monaural audio source separation algorithms in our toolkit openBliSSART (Blind Source Separation for Audio Recognition Tasks). To our knowledge, it provides the first freely available C++ implementation of Non-Negative Matrix Factorization (NMF) supporting the Compute Unified Device Architecture (CUDA) for fast parallel processing on graphics processing units (GPUs). Besides integrating parallel processing, openBliSSART introduces several numerical optimizations of commonly used monaural source separation algorithms that reduce both computation time and memory usage. By illustrating a variety of use cases, from audio effects in music processing to speech enhancement and feature extraction, we demonstrate the wide applicability of our framework to both research and end-user applications. We evaluate the toolkit by benchmarking the NMF algorithms and discuss the influence of their parameterization on source separation quality and real-time factor. As a result, the GPU parallelization in openBliSSART yields double-digit speedups over conventional CPU computation, enabling real-time processing on a desktop PC even for high matrix dimensions.
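For readers unfamiliar with NMF, the following is a minimal sketch of the widely used multiplicative update rules for the Euclidean cost, which approximate a non-negative matrix V (for example a magnitude spectrogram) as a product W H of non-negative factors. It is not openBliSSART's implementation or API: the function names (`nmf`, `matmul`, `frobenius_error`), the parameters, and the dependency-free pure-Python matrix code are all assumptions made for illustration. The matrix products that dominate each iteration are the kind of operation that GPU parallelization can accelerate.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum(
        (x - y) ** 2 for row_v, row_wh in zip(v, wh) for x, y in zip(row_v, row_wh)
    ) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative V (m x n) into W (m x rank) and H (rank x n).

    Hypothetical helper for illustration only; uses multiplicative updates
    for the Euclidean cost with random non-negative initialization.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [
            [h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
            for i in range(rank)
        ]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [
            [w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
            for i in range(n_rows)
        ]
    return w, h


if __name__ == "__main__":
    # Toy non-negative matrix standing in for a small magnitude spectrogram.
    v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
    w, h = nmf(v, rank=2)
    print("reconstruction error:", frobenius_error(v, w, h))
```

Because every update multiplies the current factor by a ratio of non-negative quantities, W and H stay non-negative without an explicit projection step. The small constant `eps` is an assumption added here to guard against division by zero.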