An efficient delay-optimal distributed termination detection algorithm
Journal of Parallel and Distributed Computing
The authors develop parallel A* algorithms suitable for distributed-memory machines. In parallel A* algorithms, inefficiencies grow with the number of processors P used, causing performance to drop significantly at lower and intermediate work densities (the ratio of the problem size to P). To alleviate this effect, they propose a novel parallel startup phase and efficient dynamic work distribution strategies, and thus improve the scalability of parallel A* search. They also tackle the problem of duplicate searching by different processors by using work transfer as a means of partial duplicate pruning. The proposed parallel startup scheme requires only Θ(log P) time, compared to Θ(P) time for the sequential startup methods used in the past. Using the traveling salesman problem (TSP) as the test case, the work distribution strategies yield speedup improvements of more than 30% and 15% at lower and intermediate work densities, respectively, while requiring 20% to 45% less memory, compared to previous approaches. Moreover, the simple duplicate pruning scheme provides an average reduction of 20% in execution time for up to 64 processors, compared to previous approaches that do not prune any duplicates.
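The gap between logarithmic and linear startup time can be illustrated with a small round-counting model. This is only a sketch under stated assumptions, not the authors' startup scheme: it assumes that in the sequential case a single seeded processor hands a start node to one idle processor per round, and that in the parallel case every processor that already holds work seeds one idle processor per round. The function names are hypothetical.

```python
def sequential_startup_rounds(p: int) -> int:
    """Rounds needed when one seeded processor serves the other p - 1 in turn."""
    return max(p - 1, 0)


def doubling_startup_rounds(p: int) -> int:
    """Rounds needed when every busy processor seeds one idle processor per round."""
    busy, rounds = 1, 0
    while busy < p:
        # Assumption: each busy processor can split off work for exactly one peer.
        busy = min(2 * busy, p)
        rounds += 1
    return rounds


if __name__ == "__main__":
    for p in (2, 8, 64):
        print(p, sequential_startup_rounds(p), doubling_startup_rounds(p))
```

Under this model, 64 processors are all seeded after 6 doubling rounds, against 63 rounds for the one-at-a-time handout, which matches the logarithmic versus linear growth described above.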