Circuits, Systems, and Signal Processing
Introduction to parallel algorithms and architectures: array, trees, hypercubes
Introduction to parallel algorithms and architectures: array, trees, hypercubes
An Adaptation of the Fast Fourier Transform for Parallel Processing
Journal of the ACM (JACM)
The Art of Computer Programming, 2nd Ed. (Addison-Wesley Series in Computer Science and Information
The Art of Computer Programming, 2nd Ed. (Addison-Wesley Series in Computer Science and Information
STOC '83 Proceedings of the fifteenth annual ACM symposium on Theory of computing
Parallel Processing with the Perfect Shuffle
IEEE Transactions on Computers
Formal datapath representation and manipulation for implementing DSP transforms
Proceedings of the 45th annual Design Automation Conference
Sorting networks and their applications
AFIPS '68 (Spring) Proceedings of the April 30--May 2, 1968, spring joint computer conference
Permuting streaming data using RAMs
Journal of the ACM (JACM)
Operator Language: A Program Generation Framework for Fast Kernels
DSL '09 Proceedings of the IFIP TC 2 Working Conference on Domain-Specific Languages
Proceedings of the VLDB Endowment
"Smart" design space sampling to predict Pareto-optimal solutions
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
Hi-index | 0.00 |
Sorting networks offer great performance but become prohibitively expensive for large data sets. We present a domain-specific language and compiler to automatically generate hardware implementations of sorting networks with reduced area and optimized for latency or throughput. Our results show that the generator produces a wide range of Pareto-optimal solutions that both compete with and outperform prior sorting hardware.