This paper reports the results of SIMD implementations of several interpolation algorithms on common personal computers. These methods fit a curve to a set of given input points whose underlying mathematical function is not known. We have implemented four widely used interpolation methods using the vector processing capabilities embedded in Pentium processors. With SSE (Streaming SIMD Extensions), every operation can be performed on four packed single-precision (32-bit) floating-point values simultaneously. As a result, running time drops by a factor of three or more, depending on the number of points and the interpolation method. We analyze the speedup of each method as a function of the number of points being interpolated. A comparison of the characteristics of the resulting vector algorithms is also presented.
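To make the four-values-at-once idea concrete, the following is a minimal sketch in C using SSE intrinsics. It is not the paper's implementation: the abstract does not name the four methods, so Lagrange interpolation is assumed here purely for illustration, and the function names `lagrange_scalar` and `lagrange_sse4` are hypothetical. The sketch evaluates the interpolating polynomial at four query points at once, one per SSE lane.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_*_ps */

/* Scalar reference: Lagrange polynomial through (xs[i], ys[i]), evaluated at t. */
float lagrange_scalar(const float *xs, const float *ys, size_t n, float t)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float basis = 1.0f;
        for (size_t j = 0; j < n; ++j) {
            if (j != i)
                basis *= (t - xs[j]) / (xs[i] - xs[j]);
        }
        sum += ys[i] * basis;
    }
    return sum;
}

/* SSE version (illustrative assumption): evaluates the same polynomial at
 * four query points t[0..3] simultaneously and writes results to out[0..3]. */
void lagrange_sse4(const float *xs, const float *ys, size_t n,
                   const float *t, float *out)
{
    assert(n > 0);
    __m128 tv  = _mm_loadu_ps(t);      /* four query points, one per lane */
    __m128 sum = _mm_setzero_ps();

    for (size_t i = 0; i < n; ++i) {
        __m128 basis = _mm_set1_ps(1.0f);
        __m128 xi    = _mm_set1_ps(xs[i]);   /* broadcast node i to all lanes */
        for (size_t j = 0; j < n; ++j) {
            if (j == i)
                continue;
            __m128 xj  = _mm_set1_ps(xs[j]);
            __m128 num = _mm_sub_ps(tv, xj);  /* t - x_j, four lanes at once */
            __m128 den = _mm_sub_ps(xi, xj);  /* x_i - x_j, same in every lane */
            basis = _mm_mul_ps(basis, _mm_div_ps(num, den));
        }
        sum = _mm_add_ps(sum, _mm_mul_ps(_mm_set1_ps(ys[i]), basis));
    }
    _mm_storeu_ps(out, sum);           /* unaligned store of the four results */
}
```

For example, with nodes (0, 0), (1, 1), (2, 4) taken from y = x², calling `lagrange_sse4` with query points 0.5, 1.5, 2.5 and 3.0 should reproduce x² at each point up to rounding. Unaligned loads and stores are used so the sketch makes no assumption about buffer alignment; aligned variants may be faster where alignment is guaranteed.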