Optimization and Parallelization of Monaural Source Separation Algorithms in the openBliSSART Toolkit

Authors:
Felix Weninger;Björn Schuller
Affiliations:
Institute for Human-Machine Communication, Technische Universität München, München, Germany 80290;Institute for Human-Machine Communication, Technische Universität München, München, Germany 80290
Venue:
Journal of Signal Processing Systems
Year:
2012

Abstract

We describe the implementation of monaural audio source separation algorithms in our toolkit openBliSSART (Blind Source Separation for Audio Recognition Tasks). To our knowledge, it provides the first freely available C++ implementation of Non-Negative Matrix Factorization (NMF) supporting the Compute Unified Device Architecture (CUDA) for fast parallel processing on graphics processing units (GPUs). Besides integrating parallel processing, openBliSSART introduces several numerical optimizations of commonly used monaural source separation algorithms that reduce both computation time and memory usage. By illustrating a variety of use cases, from audio effects in music processing to speech enhancement and feature extraction, we demonstrate the wide applicability of our framework to both research and end-user applications. We evaluate the toolkit with benchmarks of the NMF algorithms and discuss the influence of their parameterization on source separation quality and real-time factor. Overall, the GPU parallelization in openBliSSART yields double-digit speedups over conventional CPU computation, enabling real-time processing on a desktop PC even for high matrix dimensions.
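
For readers unfamiliar with the two quantities the abstract refers to, the following is a minimal illustrative sketch, not code from openBliSSART. It assumes the standard multiplicative-update NMF for the squared Euclidean distance, run on the CPU with NumPy, and the usual definition of the real-time factor as processing time divided by signal duration. The function names `nmf_euclidean` and `real_time_factor` and all parameters are hypothetical, chosen only for this example.

```python
import numpy as np


def nmf_euclidean(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (m x n) as W @ H.

    Uses multiplicative updates for the squared Euclidean distance.
    Illustrative sketch only; not the openBliSSART implementation.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by signal duration; values below 1 mean faster than real time."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Synthetic nonnegative "spectrogram" of exact rank 2 (an assumption for the demo).
    rng = np.random.default_rng(1)
    V = rng.random((20, 2)) @ rng.random((2, 30))
    W, H = nmf_euclidean(V, rank=2)
    print("reconstruction error:", np.linalg.norm(V - W @ H))
    print("real-time factor:", real_time_factor(0.5, 2.0))
```

The updates consist almost entirely of dense matrix products, which is the kind of workload that maps well onto GPU parallelization as described in the abstract.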