Principles of CMOS VLSI design: a systems perspective
Principles of CMOS VLSI design: a systems perspective
Depth-size trade-offs for parallel prefix computation
Journal of Algorithms
Faster optimal parallel prefix sums and list ranking
Information and Computation
Scans as Primitive Parallel Operations
IEEE Transactions on Computers
Limited width parallel prefix circuits
The Journal of Supercomputing
Optimal schedules for parallel prefix computation with bounded resources
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Parallel computing using the prefix problem
Parallel computing using the prefix problem
The Strict Time Lower Bound and Optimal Schedules for Parallel Prefix with Resource Constraints
IEEE Transactions on Computers
Parallel computation: models and methods
Parallel computation: models and methods
Journal of the ACM (JACM)
New bounds for parallel prefix circuits
STOC '83 Proceedings of the fifteenth annual ACM symposium on Theory of computing
Constructing H4, a Fast Depth-Size Optimal Parallel Prefix Circuit
The Journal of Supercomputing
Z4: a new depth-size optimal parallel prefix circuit with small depth
Neural, Parallel & Scientific Computations
A new approach to constructing optimal parallel prefix circuits with small depth
Journal of Parallel and Distributed Computing
Constructing zero-deficiency parallel prefix adder of minimum depth
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Faster optimal parallel prefix circuits: New algorithmic construction
Journal of Parallel and Distributed Computing
On the construction of zero-deficiency parallel prefix circuits with minimum depth
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Computation-efficient parallel prefix
AIC'06 Proceedings of the 6th WSEAS International Conference on Applied Informatics and Communications
Two families of parallel prefix algorithms for multicomputers
TELE-INFO'08 Proceedings of the 7th WSEAS International Conference on Telecommunications and Informatics
Straightforward construction of depth-size optimal, parallel prefix circuits with fan-out 2
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Parallel prefix algorithms on the multicomputer
WSEAS Transactions on Computer Research
New parallel prefix algorithms
AIC'09 Proceedings of the 9th WSEAS international conference on Applied informatics and communications
New families of computation-efficient parallel prefix algorithms
WSEAS Transactions on Computers
A novel hybrid parallel-prefix adder architecture with efficient timing-area characteristic
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
Given n values x_1, x_2, …, x_n and an associative binary operation o, the prefix problem is to compute x_1 o x_2 o ··· o x_i, 1≤i≤n. Many combinational circuits for solving the prefix problem, called prefix circuits, have been designed. It has been proved that the size s(D(n)) and the depth d(D(n)) of an n-input prefix circuit D(n) satisfy the inequality d(D(n))+s(D(n))≥2n−2; thus, a prefix circuit is depth-size optimal if d(D(n))+s(D(n))=2n−2. In this paper, we construct a new depth-size optimal prefix circuit SL(n). In addition, we can build depth-size optimal prefix circuits whose depth can be any integer between d(SL(n)) and n−1. SL(n) has the same maximum fan-out ⌈lg n⌉+1 as Snir's SN(n), but the depth of SL(n) is smaller; thus, SL(n) is faster. Compared with another optimal prefix circuit LYD(n), d(LYD(n))+2≥d(SL(n))≥d(LYD(n)). However, LYD(n) may have a fan-out of at most 2 ⌈lg n⌉−2, and the fan-out of LYD(n) is greater than that of SL(n) for almost all n≥12. Because an operation node with greater fan-out occupies more chip area and is slower in VLSI implementation, in most cases, SL(n) needs less area and may be faster than LYD(n). Moreover, it is much easier to design SL(n) than LYD(n).