In this paper we describe the parallelization of an interior-point method (IPM) aimed at achieving high scalability on large-scale chip multiprocessors (CMPs). IPM is an important computational technique used to solve optimization problems in many areas of science, engineering, and finance. IPM spends most of its computation time in a few sparse linear algebra kernels. While each of these kernels contains a large amount of parallelism, the sparse, irregular datasets found in many optimization problems make that parallelism difficult to exploit. As a result, prior work has reported relatively low speedups of 4X-12X on medium- to large-scale parallel machines. This paper proposes and evaluates several algorithmic and hardware features to improve IPM parallel performance on large-scale CMPs. Through detailed simulations, we demonstrate how exploiting multiple levels of parallelism, with hardware support for low-overhead task queues and parallel reduction, enables IPM to achieve up to 48X parallel speedup on a 64-core CMP.
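To make the two mechanisms concrete, the following is a minimal software sketch of a task queue feeding fine-grained sparse tasks into a parallel reduction, the pattern the abstract says benefits from hardware support. The toy sparse matrix, the `column_task` function, and the lock-based reduction are illustrative assumptions, not the paper's actual kernels; in the paper, the queue and the reduction would be accelerated in hardware rather than implemented with a thread pool and a lock.

```python
# Hypothetical sketch (not the paper's implementation): one small task per
# sparse column, all tasks accumulating into a shared output vector.
from concurrent.futures import ThreadPoolExecutor
import threading

# Toy sparse matrix A in a CSC-like layout: per column, (row indices, values).
columns = [
    ([0, 2], [1.0, 2.0]),
    ([1],    [3.0]),
    ([0, 3], [4.0, 5.0]),
]
x = [1.0, 2.0, 1.0]        # dense input vector
y = [0.0, 0.0, 0.0, 0.0]   # shared output accumulator for y = A @ x
y_lock = threading.Lock()  # software stand-in for hardware reduction support

def column_task(j):
    """One fine-grained task: scatter column j's contribution into y."""
    rows, vals = columns[j]
    local = [(r, v * x[j]) for r, v in zip(rows, vals)]
    # This contended accumulation is the step that hardware support for
    # parallel reduction is meant to accelerate.
    with y_lock:
        for r, contrib in local:
            y[r] += contrib

# The executor's internal work queue plays the role of the low-overhead
# hardware task queue: tasks are too small to amortize heavy scheduling.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(column_task, range(len(columns))))

print(y)  # the sparse matrix-vector product y = A @ x
```

With irregular sparsity, column tasks have very uneven cost, which is why a dynamic queue (rather than a static partition of columns across cores) is the natural scheduling choice here.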