We define a new type of recurrence equation called "Simple Indexed Recurrences" (SIR). In this type of equation, ordinary recurrences are generalized to $X[g(i)] = op_i(X[f(i)],X[g(i)])$, where $f,g:\{1\ldots n\}\mapsto \{1\ldots m\}$, $op_i(x,y)$ is a binary associative operator, and $g$ is distinct, i.e., $\forall i\ne j \; \; g(i)\ne g(j)$. This enables us to model certain sequential loops as a sequence of SIR equations. A parallel algorithm that solves a set of SIR equations will, in fact, parallelize sequential loops of the above type. Such a parallel SIR algorithm must be efficient enough to compete with the $O(n)$ work complexity of the original loop. We show why efficient parallel algorithms for the related problems of List Ranking and Tree Contraction, which require $O(n)$ work, cannot be applied to solving SIR. We instead use repeated iterations of pointer jumping to compute the final values of $X[]$ in $\frac{n}{p} \cdot \log p$ steps and $n \cdot \log p$ work, with $p$ processors. A sequence of experiments was performed to test the effect of synchronous and asynchronous execution on the actual performance of the algorithm. These experiments show that pointer jumping requires $O(n)$ work in most practical cases of SIR loops. An efficient solution is given for the special case where we know how to compute the inverse of $op_i$, and finally, useful applications of SIR to the well-known Livermore Loops benchmark are presented.
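As an illustration of the idea, the following Python sketch contrasts the sequential loop that an SIR models with a simulated synchronous pointer-jumping solver. It is not the paper's implementation: the function names `sir_sequential` and `sir_pointer_jumping` are hypothetical, indices are 0-based, a single associative operator `op` stands in for the per-iteration $op_i$, and the parallel rounds are simulated serially by reading only from the previous round's arrays.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sir_sequential(x: Sequence[T], f: Sequence[int], g: Sequence[int],
                   op: Callable[[T, T], T]) -> List[T]:
    """Reference loop: X[g(i)] = op(X[f(i)], X[g(i)]) for i = 0..n-1."""
    out = list(x)
    for i in range(len(g)):
        out[g[i]] = op(out[f[i]], out[g[i]])
    return out


def sir_pointer_jumping(x: Sequence[T], f: Sequence[int], g: Sequence[int],
                        op: Callable[[T, T], T]) -> List[T]:
    """Simulated synchronous pointer jumping (assumes one associative op)."""
    n = len(g)
    if len(set(g)) != n:
        raise ValueError("g must be distinct (injective)")

    # writer[c] = the unique iteration that writes cell c.
    writer = {g[i]: i for i in range(n)}

    ptr: List[Optional[int]] = []  # earlier iteration whose result i still needs
    val: List[T] = []              # partial product accumulated for iteration i
    for i in range(n):
        j = writer.get(f[i])
        if j is not None and j < i:
            # X[f(i)] is produced by an earlier iteration: defer it.
            ptr.append(j)
            val.append(x[g[i]])
        else:
            # X[f(i)] still holds its initial value when iteration i runs.
            ptr.append(None)
            val.append(op(x[f[i]], x[g[i]]))

    # Each round halves the remaining chain length, so the loop runs
    # O(log n) times; every round reads only the previous round's arrays.
    while any(p is not None for p in ptr):
        new_val, new_ptr = list(val), list(ptr)
        for i in range(n):  # conceptually, one processor per i
            j = ptr[i]
            if j is not None:
                new_val[i] = op(val[j], val[i])  # earlier segment on the left
                new_ptr[i] = ptr[j]
        val, ptr = new_val, new_ptr

    out = list(x)
    for i in range(n):
        out[g[i]] = val[i]
    return out


if __name__ == "__main__":
    x0 = list("abcdef")
    f = [0, 1, 2, 1, 3]
    g = [1, 2, 3, 4, 5]
    concat = lambda a, b: a + b  # associative but not commutative
    print(sir_sequential(x0, f, g, concat))
    print(sir_pointer_jumping(x0, f, g, concat))
```

String concatenation is used in the usage example because it is associative but not commutative, so any mistake in operand order would show up as a mismatch between the two functions. The sketch performs work on every unfinished iteration in each round, which corresponds to the $n \cdot \log p$-style work bound mentioned above rather than to the $O(n)$ behaviour reported for practical cases.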