Definitions for the uniform representation of d-dimensional matrices serially in Morton order (or Z-order) support both their use with Cartesian indices and their divide-and-conquer manipulation as quaternary trees. In the latter case, d-dimensional arrays are accessed as 2^d-ary trees. This data structure is important because it simultaneously eases serious problems of locality and latency, and the tree structure helps schedule multiprocessing. It enables algorithms that avoid cache misses and page faults at all levels of hierarchical memory, independently of any specific runtime environment. This paper gathers the properties of Morton order and its mappings to other indexings, and outlines compiler support for it. Statistics reported elsewhere show that the new ordering and block algorithms achieve high flop rates and, indirectly, parallelism without any low-level tuning.
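For the two-dimensional case, the mapping between Cartesian indices and Morton order can be sketched as bit interleaving. The following is a minimal illustration, not the paper's own definitions: the function names (`morton_encode`, `morton_decode`, `quadrant_path`), the 16-bit default width, and the convention that row bits occupy the odd positions are all assumptions made here for the example.

```python
def morton_encode(row, col, bits=16):
    """Interleave the bits of (row, col) into one Morton (Z-order) index.

    Assumed convention: column bits land in even positions,
    row bits in odd positions.
    """
    index = 0
    for k in range(bits):
        index |= ((col >> k) & 1) << (2 * k)
        index |= ((row >> k) & 1) << (2 * k + 1)
    return index


def morton_decode(index, bits=16):
    """Recover the Cartesian pair (row, col) from a Morton index."""
    row = col = 0
    for k in range(bits):
        col |= ((index >> (2 * k)) & 1) << k
        row |= ((index >> (2 * k + 1)) & 1) << k
    return row, col


def quadrant_path(index, levels):
    """Read a Morton index as base-4 digits, most significant first.

    Each digit selects one of four children, so the list is the path
    from the root of the quaternary tree down to the element.
    """
    return [(index >> (2 * (levels - 1 - k))) & 3 for k in range(levels)]
```

Under these assumptions, element (2, 3) of a 4 x 4 matrix sits at Morton index 13, and `quadrant_path(13, 2)` gives `[3, 1]`: quadrant 3 at the top level, then child 1 within it. The same index therefore serves both the Cartesian view and the tree view. In d dimensions the same idea interleaves d coordinates, and each group of d bits picks one of 2^d children.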