A Mapping Strategy for Parallel Processing
IEEE Transactions on Computers
Heuristic Technique for Processor and Link Assignment in Multicomputers
IEEE Transactions on Computers
CHARM++: a portable concurrent object oriented system based on C++
OOPSLA '93 Proceedings of the eighth annual conference on Object-oriented programming systems, languages, and applications
Generic programming and the STL: using and extending the C++ Standard Template Library
Generic programming and the STL: using and extending the C++ Standard Template Library
PLAPACK: parallel linear algebra package design overview
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
The C++ Programming Language
Conceptual Graphs and Formal Concept Analysis
ICCS '97 Proceedings of the Fifth International Conference on Conceptual Structures: Fulfilling Peirce's Dream
HPCN Europe 1996 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
ISAAC '88 Proceedings of the International Symposium ISSAC'88 on Symbolic and Algebraic Computation
Implementing the MPI process topology mechanism
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond (C++ in Depth Series)
An overview of the Trilinos project
ACM Transactions on Mathematical Software (TOMS) - Special issue on the Advanced CompuTational Software (ACTS) Collection
Lifting sequential graph algorithms for distributed-memory parallel computation
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Topology mapping for Blue Gene/L supercomputer
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
IEEE Transactions on Computers
Scientific Programming - Parallel/High-Performance Object-Oriented Scientific Computing (POOSC '05), Glasgow, UK, 25 July 2005
Dynamic topology aware load balancing algorithms for molecular dynamics applications
Proceedings of the 23rd international conference on Supercomputing
Elements of Programming
Effecting parallel graph eigensolvers through library composition
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Generic topology mapping strategies for large-scale parallel architectures
Proceedings of the international conference on Supercomputing
Writing parallel libraries with MPI - common practice, issues, and extensions
EuroMPI'11 Proceedings of the 18th European MPI Users' Group conference on Recent advances in the message passing interface
Hi-index | 0.00 |
Sparse linear algebra is a key component of many scientific computations such as computational fluid dynamics, mechanical engineering or the design of new materials to mention only a few. The discretization of complex geometries in unstructured meshes leads to sparse matrices with irregular patterns. Their distribution in turn results in irregular communication patterns within parallel operations. In this paper, we show how sparse linear algebra can be implemented effortless on distributed memory architectures. We demonstrate how simple it is to incorporate advanced partitioning, network topology mapping, and data migration techniques into parallel HPC programs by establishing novel abstractions. For this purpose, we developed a linear algebra library - Parallel Matrix Template Library 4 - based on generic and meta-programming introducing a new paradigm: meta-tuning. The library establishes its own domain-specific language embedded in C++. The simplicity of software development is not paid by lower performance. Moreover, the incorporation of topology mapping demonstrated performance improvements up to 29%.