GPU-based parallel Householder bidiagonalization

Authors:
Fangbin Liu; Frank J. Seinstra
Affiliations:
University of Amsterdam, Kruislaan, SJ, Amsterdam, The Netherlands; VU University, De Boelelaan, HV, Amsterdam, The Netherlands
Venue:
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Year:
2010

Abstract

In this paper, we discuss the GPU-based implementation and optimization of Householder bidiagonalization, a matrix factorization method that is an integral part of full Singular Value Decomposition (SVD), an important algorithm for many problems in the research domain of Multimedia Content Analysis (MMCA). On cluster computers, complex adaptive run-time techniques must often be implemented to overcome the growing negative performance impact of load imbalances and to ensure reasonable speedup. We show that the nature of the many-core platform removes the need for such complex run-time parallelization techniques in software, while achieving a performance of 64 GFLOPS in double precision on a single GPU of a GTX 295, 82% of the theoretical peak performance.
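
For readers unfamiliar with the factorization named in the abstract, the following is a minimal sequential sketch of textbook Householder (Golub-Kahan) bidiagonalization in plain Python. It is not the authors' GPU implementation and says nothing about their optimizations. The function names `householder_vector` and `bidiagonalize` are hypothetical, and the sketch assumes a dense real matrix with at least as many rows as columns. It alternates left reflectors, which zero a column below the diagonal, with right reflectors, which zero a row beyond the superdiagonal.

```python
import math


def householder_vector(x):
    """Return (v, beta) so that (I - beta * v v^T) x is a multiple of e_1.

    Hypothetical helper for illustration; beta is 0 when x is all zeros.
    """
    norm_x = math.sqrt(sum(t * t for t in x))
    v = list(x)
    if norm_x == 0.0:
        return v, 0.0
    # Pick the sign opposite to x[0] to avoid cancellation in v[0].
    alpha = -math.copysign(norm_x, x[0])
    v[0] -= alpha
    return v, 2.0 / sum(t * t for t in v)


def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n) to upper bidiagonal form.

    Returns only the bidiagonal factor B; the orthogonal factors
    are not accumulated in this sketch.
    """
    m, n = len(a), len(a[0])
    assert m >= n, "sketch assumes at least as many rows as columns"
    b = [[float(t) for t in row] for row in a]
    for k in range(n):
        # Left reflector: zero column k below the diagonal.
        v, beta = householder_vector([b[i][k] for i in range(k, m)])
        for j in range(k, n):
            s = beta * sum(v[i] * b[k + i][j] for i in range(m - k))
            for i in range(m - k):
                b[k + i][j] -= s * v[i]
        # Right reflector: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v, beta = householder_vector(b[k][k + 1:])
            for i in range(k, m):
                s = beta * sum(v[j] * b[i][k + 1 + j] for j in range(n - k - 1))
                for j in range(n - k - 1):
                    b[i][k + 1 + j] -= s * v[j]
    return b


if __name__ == "__main__":
    example = [[4.0, 1.0, 2.0],
               [2.0, 3.0, 1.0],
               [1.0, 2.0, 5.0],
               [3.0, 1.0, 1.0]]
    for row in bidiagonalize(example):
        print([round(t, 6) for t in row])
```

Each reflector is orthogonal, so the result has the same singular values as the input, which is why this reduction can serve as the first stage of a full SVD. The inner updates in each step are independent across columns (left reflector) or rows (right reflector), which is presumably the data parallelism a many-core implementation would exploit.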