In recent years, Independent Component Analysis (ICA) has become a standard technique for identifying the relevant dimensions of data in neuroscience. ICA is a very reliable method for analyzing data, but it is computationally very costly, which makes its use for online analysis, as required by brain-computer interfaces, almost prohibitive. We show how the speed of ICA can be increased roughly 25-fold at almost no extra cost (a fast consumer video card). EEG data, which consist of many repetitions of independent signals across multiple channels, are well suited to processing on the vector processors included in graphics units. We profiled the implementation of the algorithm and identified two types of operations responsible for the processing bottleneck, accounting for almost 80% of the computing time: vector-matrix and matrix-matrix multiplications. Simply replacing the calls to basic linear algebra (BLAS) functions with the standard CUBLAS routines provided by the GPU manufacturer does not improve performance, because of the CUDA kernel launch overhead. Instead, we developed a GPU-based solution that, compared with the original BLAS and CUBLAS versions, obtains a 25x performance increase for the ICA calculation.
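The sketch below illustrates the kind of optimization the abstract alludes to: when the bottleneck consists of many small vector-matrix products, issuing one library call per product pays the kernel launch overhead every time, whereas a single custom kernel can process the whole batch in one launch. This is a minimal, hypothetical CUDA example; the kernel, the sizes, and the names (`batched_gemv`, `rows`, `cols`, `batch`) are illustrative assumptions, not the paper's actual implementation.

```cuda
// Hypothetical sketch: batch of matrix-vector products Y[b] = A * X[b]
// fused into a single kernel launch instead of one cublasSgemv call per vector.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void batched_gemv(const float* A,   // rows x cols, row-major, shared by all batches
                             const float* X,   // batch x cols, one input vector per batch
                             float*       Y,   // batch x rows, one output vector per batch
                             int rows, int cols, int batch)
{
    // One thread computes one output element Y[b][r].
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= rows * batch) return;

    int b = idx / rows;   // which vector in the batch
    int r = idx % rows;   // which row of A

    const float* x = X + (size_t)b * cols;
    float acc = 0.0f;
    for (int c = 0; c < cols; ++c)
        acc += A[(size_t)r * cols + c] * x[c];
    Y[(size_t)b * rows + r] = acc;
}

int main()
{
    // Illustrative sizes only: a 64-channel mixing matrix applied to 1024 samples.
    const int rows = 64, cols = 64, batch = 1024;

    float *dA, *dX, *dY;
    cudaMalloc(&dA, sizeof(float) * rows * cols);
    cudaMalloc(&dX, sizeof(float) * (size_t)batch * cols);
    cudaMalloc(&dY, sizeof(float) * (size_t)batch * rows);
    // (Real code would copy the unmixing matrix and EEG samples into dA and dX here.)

    int threads = 256;
    int blocks  = (rows * batch + threads - 1) / threads;
    batched_gemv<<<blocks, threads>>>(dA, dX, dY, rows, cols, batch);
    cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(dA); cudaFree(dX); cudaFree(dY);
    return 0;
}
```

The design point is that one launch over the whole batch amortizes the fixed per-kernel cost that dominates when each individual multiplication is small; more recent cuBLAS releases expose batched GEMM routines that target the same problem.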