ICCS '02 Proceedings of the International Conference on Computational Science-Part II
Performance optimization of RK methods using block-based pipelining
Performance analysis and grid computing
Journal of Computational Physics
New simulation methodology for finance: work reduction in financial simulations
Proceedings of the 35th conference on Winter simulation: driving innovation
Exploring the structure of the space of compilation sequences using randomized search algorithms
The Journal of Supercomputing
Optimizing locality and scalability of embedded Runge--Kutta solvers using block-based pipelining
Journal of Parallel and Distributed Computing
Synchronous parallel kinetic Monte Carlo for continuum diffusion-reaction systems
Journal of Computational Physics
Cache efficient bidiagonalization using BLAS 2.5 operators
ACM Transactions on Mathematical Software (TOMS)
Hybrid differentiation strategies for simulation and analysis of applications in C++
ACM Transactions on Mathematical Software (TOMS)
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
Algorithm engineering: bridging the gap between algorithm theory and practice
Algorithm engineering: bridging the gap between algorithm theory and practice
Towards cache-optimized multigrid using patch-adaptive relaxation
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
High performance computing education for students in computational engineering
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Integrating teaching and research in HPC: experiences and opportunities
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Performance Optimization of Numerically Intensive Codes offers a comprehensive, tutorial-style, hands-on treatment, at introductory and intermediate level, of the essential ingredients for achieving high performance in numerical computations on modern computers. The authors explain computer architectures, data traffic, and issues related to the performance optimization of serial and parallel code, exemplified by actual programs written for algorithms of wide interest. The unique hands-on style is achieved through extensive case studies using realistic computational problems. The performance gain obtained by applying the techniques described in this book can be very significant. The book bridges the gap between the literature on system architecture, the literature on numerical methods, and the occasional descriptions of optimization topics in computer vendors' literature. It also allows readers to better judge the suitability of a given computer architecture for their computational requirements. In contrast to standard textbooks on computer architecture and on programming techniques, the book treats these topics together at the level necessary for writing high-performance programs. It gives computational scientists and engineers who are mainly interested in the practical issues of efficient code development easy access to these topics.