Organizing matrices and matrix operations for paged memory systems

Authors:
A. C. McKellar; E. G. Coffman, Jr.
Affiliations:
Princeton Univ., Princeton, NJ; Princeton Univ., NJ
Venue:
Communications of the ACM
Year:
1969

Abstract

Matrix representations and operations are examined with the aim of minimizing page faults in a paged memory system. It is shown that carefully designed matrix algorithms can yield enormous savings in the number of page faults when only a small part of the total matrix can be in main memory at one time. Examination of addition, multiplication, and inversion algorithms shows that a partitioned matrix representation (i.e., one submatrix or partition per page) in most cases induces fewer page faults than a row-by-row representation. The number of page-pulls required by these matrix manipulation algorithms is also studied as a function of the number of pages of main memory available to the algorithm.
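
The comparison described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes LRU page replacement (the abstract names no policy), square matrices whose order is divisible by the tile size, and a page that holds exactly one tile. The names `LRUPager`, `count_faults_rowwise`, and `count_faults_tiled` are hypothetical and exist only for this illustration. It counts page faults for the product C = A*B, once with row-by-row storage and a plain triple loop, and once with one submatrix per page and a blocked loop order.

```python
from collections import OrderedDict


class LRUPager:
    """Counts page faults for a fixed number of frames (assumed LRU policy)."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # resident pages, least recently used first
        self.faults = 0

    def touch(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)  # mark as most recently used
            return
        self.faults += 1
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)  # evict the least recently used page
        self.resident[page] = None


def count_faults_rowwise(n, page_size, frames):
    """Plain i-j-k product with A, B, and C each stored row by row."""
    pager = LRUPager(frames)

    def page(name, i, j):
        # Element (i, j) sits at linear offset i*n + j within its matrix.
        return (name, (i * n + j) // page_size)

    for i in range(n):
        for j in range(n):
            for k in range(n):
                pager.touch(page("A", i, k))
                pager.touch(page("B", k, j))
            pager.touch(page("C", i, j))
    return pager.faults


def count_faults_tiled(n, t, frames):
    """Blocked product with one t-by-t submatrix per page (n divisible by t)."""
    pager = LRUPager(frames)
    nb = n // t  # number of tiles along each dimension

    def page(name, i, j):
        # Element (i, j) belongs to the tile at block row i//t, block column j//t.
        return (name, (i // t) * nb + (j // t))

    for bi in range(nb):
        for bj in range(nb):
            for bk in range(nb):
                # Multiply tile A[bi, bk] by tile B[bk, bj] into tile C[bi, bj].
                for i in range(bi * t, (bi + 1) * t):
                    for j in range(bj * t, (bj + 1) * t):
                        for k in range(bk * t, (bk + 1) * t):
                            pager.touch(page("A", i, k))
                            pager.touch(page("B", k, j))
                        pager.touch(page("C", i, j))
    return pager.faults


if __name__ == "__main__":
    n, t = 8, 4  # 8x8 matrices, 4x4 tiles, so a page holds 16 elements
    for frames in (3, 4, 6, 12):
        rowwise = count_faults_rowwise(n, t * t, frames)
        tiled = count_faults_tiled(n, t, frames)
        print(f"frames={frames:2d}  row-by-row={rowwise:4d}  partitioned={tiled:4d}")
```

In this toy model, with 8-by-8 matrices, 4-by-4 tiles, and three page frames, the partitioned variant incurs 20 faults, because each tile product touches only three pages. The row-by-row variant incurs far more, because the inner loop walks down a column of B and so sweeps across all of B's pages. Varying `frames` gives a rough picture of fault counts as a function of available main memory. Once all twelve pages fit, both variants fault once per page.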