Large image and video archives and high-resolution multimedia data streams are now used in many research and application areas, creating a growing need for multimedia-oriented high-performance computing. As a consequence, there is a rapidly emerging need for algorithms, methodologies, and tools that support the (automatic) parallelization of multimedia applications. This paper discusses the parallelization of Householder bidiagonalization, a matrix factorization method that is an integral part of the full Singular Value Decomposition (SVD), an important algorithm for many multimedia problems. Householder bidiagonalization is hard to parallelize efficiently because the number of matrix elements taking part in the calculations decreases as the computation proceeds. To counter the resulting, steadily growing performance penalty of load imbalance and overprovisioned compute resources, we apply two adaptive runtime techniques: periodic matrix remapping and process reduction. Results show that our adaptive parallel execution approach provides a significant improvement in efficiency, even when the set of compute resources is (initially) very large.
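To make the shrinking workload concrete, the following is a minimal serial sketch of textbook Householder bidiagonalization in plain Python. It is not the parallel implementation discussed in the paper. The function names `householder_vector` and `bidiagonalize`, the list-of-lists matrix layout, and the assumption that the matrix has at least as many rows as columns are all illustrative choices. Alongside the bidiagonal result, the sketch records the size of the active trailing submatrix at each step, which is the quantity that decreases during the computation.

```python
import math


def householder_vector(x):
    """Return a unit vector v such that (I - 2 v v^T) x is a multiple of e_1.

    Returns None when x is the zero vector (no reflection is needed).
    """
    norm_x = math.sqrt(sum(t * t for t in x))
    if norm_x == 0.0:
        return None
    v = list(x)
    # Add the norm with the sign of x[0] to avoid cancellation.
    v[0] += math.copysign(norm_x, x[0])
    norm_v = math.sqrt(sum(t * t for t in v))
    return [t / norm_v for t in v]


def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n, assumed) to upper bidiagonal form.

    Returns the bidiagonal matrix and, per step k, the number of elements
    in the active trailing submatrix, (m - k) * (n - k).
    """
    m, n = len(a), len(a[0])
    b = [row[:] for row in a]  # work on a copy
    active = []
    for k in range(n):
        active.append((m - k) * (n - k))

        # Left reflection: zero column k below the diagonal.
        v = householder_vector([b[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                dot = sum(v[i - k] * b[i][j] for i in range(k, m))
                for i in range(k, m):
                    b[i][j] -= 2.0 * dot * v[i - k]

        # Right reflection: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v = householder_vector(b[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    dot = sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        b[i][j] -= 2.0 * dot * v[j - k - 1]
    return b, active
```

For a 4 x 3 input, the recorded active sizes are 12, 6, and 2 elements. Under a static data distribution, a fixed set of processes would therefore have progressively less work to share, which is the kind of imbalance that remapping and process reduction are meant to address.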