Using GPUs for numerical computation is becoming an attractive alternative in research. In this paper, we propose a new parallel processing environment for matrix multiplication that uses both CPUs and GPUs. With our method, the execution time of matrix multiplication can be reduced to 40.1% of that of the faster of the CPU-only and GPU-only cases. Our method performs well when the matrices are large.
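The abstract does not say how the work is divided between the two processors. As a minimal sketch of one common approach, assuming the rows of the left matrix are split in proportion to each device's throughput, the idea could look like the following. The function names, the `cpu_rate` and `gpu_rate` parameters, and the row-wise split are all illustrative assumptions, not the paper's method, and both partial products are computed sequentially here, where a real system would dispatch them to the CPU and the GPU concurrently.

```python
def split_rows(n_rows, cpu_rate, gpu_rate):
    """Return (cpu_rows, gpu_rows), proportional to the assumed throughputs."""
    gpu_rows = round(n_rows * gpu_rate / (cpu_rate + gpu_rate))
    return n_rows - gpu_rows, gpu_rows


def matmul(a, b):
    """Plain product of two list-of-lists matrices (stand-in for a device kernel)."""
    cols_b = list(zip(*b))  # columns of b, so each entry is a row-by-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def hybrid_matmul(a, b, cpu_rate=1.0, gpu_rate=3.0):
    """Compute a @ b by splitting the rows of `a` into two independent blocks."""
    cpu_rows, _ = split_rows(len(a), cpu_rate, gpu_rate)
    top = matmul(a[:cpu_rows], b)     # block that would be assigned to the CPU
    bottom = matmul(a[cpu_rows:], b)  # block that would be assigned to the GPU
    return top + bottom               # stacking the blocks restores the full product
```

Because each row block of the result depends only on the matching rows of the left matrix, the two blocks can be computed independently. The benefit of such a split depends on how closely the assumed rates match the real devices.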