In previous work we introduced an average-case measure for the time complexity of Boolean circuits, namely the delay between feeding the input bits into a circuit and the moment when the results are ready at the output gates, and analysed this measure for prefix computations. Here we consider the problem of sorting large integers given in binary notation. In a word comparator sorting circuit C, the basic computational element, a comparator, is charged a single time step to compare two elements. In a bit comparator circuit C′, by contrast, a comparison of two binary numbers has to be implemented by a Boolean subcircuit CM, called a comparator module, that is built from Boolean gates of bounded fan-in. Thus, compared to C, the depth of C′ will be larger by a factor of up to the depth of CM. Our goal is to minimize the average delay of bit comparator sorting circuits. The worst-case delay can be estimated by the depth of the circuit. For this worst-case measure, two topologically quite different designs seem appropriate for the comparator modules: a tree-like one if the inputs are long numbers, and otherwise a linear array working in a pipelined fashion. Inserting these into a word comparator circuit, we get bit-level sorting circuits for binary numbers of length m whose depth is increased either by a multiplicative factor of order log m or by an additive term of order m. We show that this obvious solution can be improved significantly by constructing efficient sorting and merging circuits for the bit model that suffer only a constant-factor time loss on average if the inputs are uniformly distributed. This is done by designing suitable hybrid architectures of tree compaction and pipelining. These results can also be extended to classes of nonuniform distributions if we put a bound on the complexity of the distributions themselves.
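A small simulation can illustrate why a constant average loss is plausible under uniform inputs. The sketch below is not the circuit construction described above. It assumes a simplified, hypothetical delay model in which a linear comparator scans two m-bit numbers from the most significant bit and settles as soon as it meets the first differing position. The function names `first_difference_delay` and `average_delay` are introduced here for illustration only.

```python
import random


def first_difference_delay(a: int, b: int, m: int) -> int:
    """Bit positions inspected, MSB first, until a and b are distinguished.

    Hypothetical delay model: one time unit per inspected bit position.
    Equal inputs require all m positions.
    """
    for i in range(m):
        shift = m - 1 - i
        if (a >> shift) & 1 != (b >> shift) & 1:
            return i + 1
    return m


def average_delay(m: int, trials: int = 20000, seed: int = 0) -> float:
    """Empirical mean delay over uniformly random pairs of m-bit numbers."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = rng.getrandbits(m)
        b = rng.getrandbits(m)
        total += first_difference_delay(a, b, m)
    return total / trials


if __name__ == "__main__":
    for m in (8, 64, 512):
        print(m, round(average_delay(m), 2))
```

In this toy model the position of the first differing bit is geometrically distributed, so the expected delay is 2 - 2^(1-m), which stays below 2 regardless of m, whereas the worst case is m for the linear array and of order log m for a tree-like module.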