We present a framework for optimizing the distributed performance of sparse matrix computations. These computations are optimally parallelized by distributing their operations across processors in a subtly uneven balance. The optimal balance point is difficult to determine because it depends on the non-zero patterns in the data, the algorithm, and the underlying hardware architecture. The Hogs and Slackers genetic algorithm (GA) identifies processors with many operations (hogs) and processors with few operations (slackers). Its intelligent operation-balancing mutation operator swaps data blocks between hogs and slackers to explore new balance points. We show that this operator is integral to the performance of the genetic algorithm, and we use the framework to conduct an architecture study that varies network specifications. The Hogs and Slackers GA is itself a parallel algorithm with near-linear speedup on a large computing cluster.
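To make the mutation operator concrete, the following is a minimal Python sketch of one hog/slacker swap. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how a mapping is encoded, how load is measured, or how hogs, slackers, and blocks are chosen. Here a mapping is assumed to be a list giving the processor that owns each data block, load is assumed to be the sum of per-block operation counts, and a single hog (heaviest processor) and a single slacker (lightest processor) exchange one randomly chosen block each. The names `processor_loads`, `hog_slacker_mutation`, `assignment`, and `block_ops` are hypothetical.

```python
import random


def processor_loads(assignment, block_ops, n_procs):
    """Sum the operation counts of the blocks owned by each processor."""
    loads = [0] * n_procs
    for block, proc in enumerate(assignment):
        loads[proc] += block_ops[block]
    return loads


def hog_slacker_mutation(assignment, block_ops, n_procs, rng):
    """Return a mutated copy in which one hog block and one slacker block
    trade owners. Assumption: the hog is the most loaded processor and the
    slacker is the least loaded one."""
    loads = processor_loads(assignment, block_ops, n_procs)
    hog = max(range(n_procs), key=loads.__getitem__)
    slacker = min(range(n_procs), key=loads.__getitem__)

    child = list(assignment)  # leave the parent mapping untouched
    if hog == slacker:
        return child  # all loads are equal, nothing to swap

    hog_blocks = [b for b, p in enumerate(child) if p == hog]
    slacker_blocks = [b for b, p in enumerate(child) if p == slacker]
    if not hog_blocks or not slacker_blocks:
        return child  # a swap needs one block on each side

    # Assumption: blocks are picked uniformly at random.
    from_hog = rng.choice(hog_blocks)
    from_slacker = rng.choice(slacker_blocks)
    child[from_hog], child[from_slacker] = slacker, hog
    return child
```

For example, with `assignment = [0, 0, 0, 1]` and `block_ops = [5, 5, 5, 1]` on two processors, processor 0 is the hog and processor 1 the slacker, and one swap changes the loads from `[15, 1]` to `[11, 5]`. Because the abstract states that the best distribution is subtly uneven, a swap like this only proposes a new balance point; in a full GA, a fitness evaluation (not sketched here) would decide whether the new mapping is kept.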