A new cache-efficient algorithm for reduction from block Hessenberg form to Hessenberg form is presented and evaluated. The algorithm targets parallel computers with shared memory. One level of look-ahead in combination with a dynamic load-balancing scheme significantly reduces the idle time and allows the use of coarse-grained tasks. The coarse tasks lead to high-performance computations on each processor/core. Speedups close to 13 over the sequential unblocked algorithm have been observed on a dual quad-core machine using one thread per core.
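For orientation, the operation being accelerated can be sketched as follows. This is not the cache-efficient parallel algorithm described above, and it does not reproduce its look-ahead or load-balancing scheme; it is a minimal sequential, unblocked illustration that assumes a dense list-of-lists matrix and uses Givens rotations. The function names `givens` and `reduce_to_hessenberg` are hypothetical and chosen only for this example. Entries that are already zero (for instance those outside the band of a block Hessenberg matrix) are skipped, but the band structure is otherwise not exploited.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r


def reduce_to_hessenberg(A):
    """Reduce a square matrix to upper Hessenberg form by orthogonal similarity.

    Illustrative assumption: A is a list of lists of floats. A new matrix
    H = Q A Q^T is returned and the input is left unchanged.
    """
    n = len(A)
    H = [row[:] for row in A]
    for j in range(n - 2):
        # Annihilate the entries below the first subdiagonal of column j,
        # working from the bottom row upwards.
        for i in range(n - 1, j + 1, -1):
            if H[i][j] == 0.0:
                continue  # nothing to eliminate at this position
            c, s = givens(H[i - 1][j], H[i][j])
            # Left update: rotate rows i-1 and i (full rows, for simplicity).
            for k in range(n):
                x, y = H[i - 1][k], H[i][k]
                H[i - 1][k] = c * x + s * y
                H[i][k] = -s * x + c * y
            H[i][j] = 0.0  # remove rounding residue
            # Right update: apply the transposed rotation to columns i-1 and i,
            # which makes the overall transformation a similarity.
            for k in range(n):
                x, y = H[k][i - 1], H[k][i]
                H[k][i - 1] = c * x + s * y
                H[k][i] = -s * x + c * y
    return H
```

Because every step is an orthogonal similarity transformation, the result has the same eigenvalues as the input. A production code would instead restrict each update to the nonzero band and chase the resulting fill-in, which is where the blocking and task scheduling discussed in the abstract come in.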