The parallel two-sided block-Jacobi singular value decomposition (SVD) algorithm with dynamic ordering, originally proposed in [Parallel Comput. 28 (2002) 243-262], has been extended with respect to the blocking factor ℓ. Whereas the original algorithm, running on p processors, used the single blocking factor ℓ = 2p, the blocking factor is now a variable parameter that covers values in two different regions, namely ℓ = p/k and ℓ = 2kp for some integer k. Two new parallel two-sided block-Jacobi SVD algorithms with dynamic ordering, one for each region, are described in detail. They differ in the logical data arrangement and in the communication complexity of the reordering step. For the case ℓ = 2kp, it is proved that the designed point-to-point communication algorithm is optimal with respect to both the amount of communication required per processor and the amount of overall communication. Using the message-passing programming model for distributed-memory machines, the new parallel block-Jacobi SVD algorithms were implemented on an SGI-Cray Origin 2000 parallel computer. Numerical experiments were performed on p = 12 and p = 24 processors using a set of six matrices of order 4000 and blocking factors ℓ with 2 ≤ ℓ ≤ 192. To achieve the minimal total parallel execution time, a blocking factor ℓ ∈ {2, p, 2p} can be recommended for matrices with distinct singular values. However, for matrices with a multiple minimal singular value, the total parallel execution time may increase monotonically with ℓ. In this case, the recommended Jacobi method with ℓ = 2 is just the ScaLAPACK routine with some additional matrix multiplications, and it computes the SVD in one parallel iteration step.
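To make the two regions of the blocking factor concrete, the following minimal sketch enumerates the values ℓ = p/k and ℓ = 2kp that fall within 2 ≤ ℓ ≤ 192 for the two processor counts mentioned above. The function name `admissible_blocking_factors` is hypothetical, and the requirement that k divide p (so that ℓ = p/k is an integer) is an assumption not stated in the abstract; the output is not claimed to be the exact set of values used in the experiments.

```python
def admissible_blocking_factors(p, lo=2, hi=192):
    """Return (small, large): blocking factors l = p/k and l = 2*k*p in [lo, hi].

    Hypothetical helper for illustration only.
    """
    # Region 1: l = p/k. Assumption: k must divide p so that l is an integer.
    small = sorted(
        p // k for k in range(1, p + 1)
        if p % k == 0 and lo <= p // k <= hi
    )
    # Region 2: l = 2*k*p for k = 1, 2, ...; k = 1 gives the original l = 2p.
    large = [
        2 * k * p for k in range(1, hi // (2 * p) + 1)
        if 2 * k * p >= lo
    ]
    return small, large


if __name__ == "__main__":
    for p in (12, 24):
        small, large = admissible_blocking_factors(p)
        print(f"p = {p}: l = p/k -> {small}; l = 2kp -> {large}")
```

Under these assumptions, the recommended values ℓ = 2 and ℓ = p appear in the first list and ℓ = 2p is the first entry of the second list.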