A model for hierarchical memory. STOC '87 Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing.
Cache and memory hierarchy design: a performance-directed approach.
Horizons of parallel computation. Journal of Parallel and Distributed Computing.
Guest Editors' Introduction - Cache Memory and Related Problems: Enhancing and Exploiting the Locality. IEEE Transactions on Computers, special issue on cache memory and related problems.
Dynamic storage allocation in the Atlas computer, including an automatic use of a backing store. Communications of the ACM.
Models of Computation: Exploring the Power of Computing.
High Performance Compilers for Parallel Computing.
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science.
Predicting Performance on SMPs. A Case Study: The SGI Power Challenge. IPDPS '00 Proceedings of the 14th International Symposium on Parallel and Distributed Processing.
An Approach towards an Analytical Characterization of Locality and its Portability. IWIA '01 Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'01).
Optimal organizations for pipelined hierarchical memories. Proceedings of the Fourteenth Annual ACM Symposium on Parallel Algorithms and Architectures.
A Characterization of Temporal Locality and Its Portability across Memory Hierarchies. ICALP '01 Proceedings of the 28th International Colloquium on Automata, Languages and Programming.
An Address Dependence Model of Computation for Hierarchical Memories with Pipelined Transfer. IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05), Workshop 8, Volume 09.
On approximating the ideal random access machine by physical machines. Journal of the ACM (JACM).
We define a model of computation, called the Pipelined Hierarchical Random Access Machine with access function a(x), denoted the a(x)-PH-RAM. In this model, a processor interacts with a memory that can accept requests at a constant rate and satisfy each request to location x within a(x) units of time.

We investigate memory management strategies that lead to time-efficient implementations of arbitrary computations on a PH-RAM. We begin by developing the so-called pipelined decomposition-tree memory management strategy, which can be tuned to the memory access function. Specifically, for a linear or sublinear access function a(x), we define the concept of latency-hiding depth d_a(x) and show how any computation of N operations can be implemented on an a(x)-PH-RAM in time T(N) = O(N d_a(N)). In particular, T(N) = O(N log N) if a(x) = O(x), T(N) = O(N log log N) if a(x) = O(x^β) with 0 < β < 1, and T(N) = O(N log* N) if a(x) = O(log x).

We develop lower bound techniques that allow us to establish existential lower bounds on PH-RAMs. In particular, we exhibit computations for which T(N) = Ω(N log N / log log N) when a(x) = Ω(x), T(N) = Ω(N log log N) when a(x) = Ω(x^β) with 0 < β < 1, and T(N) = Ω(N log* N) when a(x) = Ω(log x).

The stated lower bounds show that the pipelined decomposition-tree strategy is existentially optimal in the latter cases, but they indicate the potential for a modest, O(log log N) improvement for linear access functions. To realize this potential, a superpipelined decomposition-tree memory manager is proposed, which achieves T(N) = O(N log N / log log N).

The pipelined decomposition-tree strategy can also be tuned to the computation, in order to exploit its temporal locality as characterized by the width parameters [9]. When the latter are suitably bounded, T(N) = O(N) on any PH-RAM with linear or sublinear access function.
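To give a feel for how the three upper bounds compare, the following sketch (not from the paper; constants and exact functional forms are illustrative assumptions) computes the iterated logarithm log* N and evaluates the growth of N log N, N log log N, and N log* N for a sample problem size:

```python
import math

def log_star(n: float, base: float = 2.0) -> int:
    """Iterated logarithm: how many times the logarithm must be
    applied before the value drops to at most 1."""
    count = 0
    while n > 1:
        n = math.log(n, base)
        count += 1
    return count

# Illustrative comparison of the abstract's upper bounds for N operations.
N = 2 ** 20
print("a(x) = O(x):     T(N) ~ N log N     =", N * math.log2(N))
print("a(x) = O(x^b):   T(N) ~ N log log N =", N * math.log2(math.log2(N)))
print("a(x) = O(log x): T(N) ~ N log* N    =", N * log_star(N))
```

Even at modest N, the gap is substantial: log N = 20, log log N is about 4.3, and log* N = 5 for N = 2^20, so slower-growing access functions admit markedly cheaper schedules.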
Finally, we discuss how performance could benefit from parallelism in the data-dependence DAG of the computation or from architectural enhancements, such as block-transfer primitives, and we formulate several questions that deserve further investigation.