This paper presents a parallel LU factorization algorithm designed to exploit physical broadcast communication and to overlap communication with computation. Physical broadcast is directly available in Ethernet network hardware, one of the most widely used interconnects in current clusters for parallel computing. Overlapping communication with computation is a well-known strategy for hiding communication latency, one of the most common sources of parallel performance penalties. A performance analysis of the proposed algorithm and experimental results are presented. The performance of the proposed algorithm is also compared with that of the algorithm used in ScaLAPACK (Scalable LAPACK), which is commonly accepted as having optimized performance.
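
To make the communication pattern concrete, the following is a minimal serial sketch in Python of an LU factorization in which each step ends with a single broadcast of the newly computed column of multipliers. It is not the algorithm proposed in the paper. The cyclic column ownership, the function names `lu_broadcast_sketch` and `reconstruct`, and the absence of pivoting are all assumptions made for illustration, and no real network traffic takes place. In an actual implementation, the broadcast for step k+1 could be started while the update for step k is still running, which is the overlap the abstract refers to.

```python
# Illustrative sketch only: a serial simulation of column-distributed LU
# with one broadcast per elimination step. Hypothetical names and layout;
# no pivoting, no real communication.

def lu_broadcast_sketch(a, nprocs):
    """Factor the square matrix a (list of row lists) in place.

    On return, the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U. Returns the number of broadcasts.
    Column j is treated as owned by process j % nprocs (an assumption).
    Assumes no pivoting is needed, e.g. a is diagonally dominant.
    """
    n = len(a)
    broadcasts = 0
    for k in range(n - 1):
        # The owner of column k computes the multipliers below the pivot.
        pivot = a[k][k]
        for i in range(k + 1, n):
            a[i][k] /= pivot
        # A single physical broadcast would deliver this column to everyone.
        panel = [a[i][k] for i in range(k + 1, n)]
        broadcasts += 1
        # Each simulated process updates only the trailing columns it owns.
        for p in range(nprocs):
            for j in range(k + 1, n):
                if j % nprocs != p:
                    continue
                for i in range(k + 1, n):
                    a[i][j] -= panel[i - k - 1] * a[k][j]
    return broadcasts


def reconstruct(lu):
    """Multiply L and U back together from the packed in-place result."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for t in range(min(i, j) + 1):
                l_it = 1.0 if t == i else lu[i][t]
                total += l_it * lu[t][j]
            out[i][j] = total
    return out


if __name__ == "__main__":
    original = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
    work = [row[:] for row in original]
    count = lu_broadcast_sketch(work, nprocs=2)
    print("broadcasts:", count)
    print("L*U:", reconstruct(work))
```

With columns distributed this way, the row-k entry needed to update column j is already local to the owner of column j, so the broadcast column is the only data each process has to receive per step. That is why a network with hardware broadcast fits this pattern: one transmission per step can serve every process.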