A model of computation for VLSI with related complexity results
Journal of the ACM (JACM)
A model for hierarchical memory
STOC '87 Proceedings of the nineteenth annual ACM symposium on Theory of computing
Hitting the memory wall: implications of the obvious
ACM SIGARCH Computer Architecture News
Computational power of pipelined memory hierarchies
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Models of Computation: Exploring the Power of Computing
Models of Computation: Exploring the Power of Computing
Optimal organizations for pipelined hierarchical memories
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
ESA '98 Proceedings of the 6th Annual European Symposium on Algorithms
Hierarchical memory with block transfer
SFCS '87 Proceedings of the 28th Annual Symposium on Foundations of Computer Science
Models for parallel and hierarchical computation
Proceedings of the 4th international conference on Computing frontiers
On approximating the ideal random access machine by physical machines
Journal of the ACM (JACM)
Powerful memory models, including hierarchies with block transfer or with pipelining of accesses, have been proposed in theory and are partially realized in commercial systems to reduce the average memory overhead per operation. However, even for such powerful models, there are simple direct-flow programs, with no branches and no indirect addressing, that require non-constant overhead, resulting in superlinear execution time. Indeed, we characterize a wide, natural class of machines, including nearly all previously proposed models, and develop a technique that yields superlinear time lower bounds on any machine of this class for suitable direct-flow computations. We propose the Address Dependence Model (ADM) for machines with pipelined memory hierarchies, where any direct-flow program runs in time linear in the number of executed instructions. As an example of the capabilities of ADM for algorithms not amenable to direct-flow formulation, we show how to implement quicksort in time proportional to the number of executed comparisons, whose expected value is O(n log n), even on memories where the latency of address x is a(x) = Θ(x). (In contrast, T = Θ(n log^2 n) for sorting in the block transfer model of [Hierarchical memory with block transfer].) Finally, we consider the question of physical implementation of ADM and propose an extensible machine design in which the number of gates and the length of wire that a signal traverses in one clock period are, within a given technology, independent of system size. Such designs scale with system size (in particular, with memory latency) as well as with technological advancement. We assume aggressive but feasible [Optimal organizations for pipelined hierarchical memories] hierarchical memories, pipelinable at a constant rate. The main contribution is a novel processor organization capable of fully exploiting such memories.
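To give a feel for why pipelining matters when latency grows with the address, the following toy calculation compares a sequential scan of n addresses with and without pipelined accesses. It is only an illustrative sketch, not the ADM itself: the latency function a(x) = x + 1, the one-request-per-cycle issue rate, and all function names are assumptions introduced here.

```python
def latency(x):
    """Assumed latency of address x: a(x) = x + 1, i.e. linear in x."""
    return x + 1


def unpipelined_scan_time(n):
    """Scan addresses 0..n-1 when each access must finish before the next starts."""
    return sum(latency(x) for x in range(n))


def pipelined_scan_time(n):
    """Scan addresses 0..n-1 when one request is issued per cycle.

    The request for address x is issued at cycle x (addresses are known in
    advance, as in a direct-flow program) and completes at x + latency(x).
    The scan ends when the last response arrives.
    """
    return max((x + latency(x) for x in range(n)), default=0)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, unpipelined_scan_time(n), pipelined_scan_time(n))
```

Under these assumptions the unpipelined scan takes n(n + 1)/2 cycles, while the pipelined scan takes 2n - 1 cycles, which is linear in the number of accesses.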