Matrix representations and operations are examined with the aim of minimizing page faults in a paged memory system. It is shown that carefully designed matrix algorithms can yield enormous savings in the number of page faults when only a small part of the total matrix can be in main memory at one time. Examination of addition, multiplication, and inversion algorithms shows that a partitioned matrix representation (i.e., one submatrix or partition per page) in most cases induces fewer page faults than a row-by-row representation. The number of page-pulls required by these matrix manipulation algorithms is also studied as a function of the number of pages of main memory available to the algorithm.
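
The comparison between the two representations can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own algorithms or analysis: it assumes LRU page replacement (the abstract does not name a policy), counts a fault per element reference that misses, and uses a plain triple loop for the row-by-row case and a block-by-block loop for the partitioned case. All names (`LRUPager`, `faults_row_by_row`, `faults_partitioned`) and the parameter values are hypothetical.

```python
from collections import OrderedDict


class LRUPager:
    """Counts page faults for a fixed number of frames (LRU is an assumption)."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # page id -> True, oldest first
        self.faults = 0

    def touch(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)  # mark as most recently used
            return
        self.faults += 1
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)  # evict least recently used
        self.resident[page] = True


def row_page(name, i, j, n, page_size):
    # Row-by-row representation: elements stored consecutively along rows.
    return (name, (i * n + j) // page_size)


def tile_page(name, i, j, n, b):
    # Partitioned representation: one b-by-b submatrix per page.
    return (name, (i // b) * (n // b) + (j // b))


def faults_row_by_row(n, page_size, frames):
    """Page faults for C = A * B with the usual i, j, k loop order."""
    pager = LRUPager(frames)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                pager.touch(row_page("A", i, k, n, page_size))
                pager.touch(row_page("B", k, j, n, page_size))
                pager.touch(row_page("C", i, j, n, page_size))
    return pager.faults


def faults_partitioned(n, b, frames):
    """Page faults for C = A * B computed one submatrix product at a time."""
    pager = LRUPager(frames)
    for bi in range(0, n, b):
        for bj in range(0, n, b):
            for bk in range(0, n, b):
                for i in range(bi, bi + b):
                    for j in range(bj, bj + b):
                        for k in range(bk, bk + b):
                            pager.touch(tile_page("A", i, k, n, b))
                            pager.touch(tile_page("B", k, j, n, b))
                            pager.touch(tile_page("C", i, j, n, b))
    return pager.faults


if __name__ == "__main__":
    n, b = 16, 4  # 16x16 matrices, 16 elements per page (illustrative values)
    for frames in (3, 6, 12):
        print(frames,
              faults_row_by_row(n, b * b, frames),
              faults_partitioned(n, b, frames))
```

Varying `frames` gives a rough picture of how the fault count depends on the number of pages of main memory available; the sketch covers multiplication only, not addition or inversion.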