Introduction to parallel algorithms and architectures: array, trees, hypercubes
Introduction to parallel algorithms and architectures: array, trees, hypercubes
Applied cryptography (2nd ed.): protocols, algorithms, and source code in C
Applied cryptography (2nd ed.): protocols, algorithms, and source code in C
VIS Speeds New Media Processing
IEEE Micro
Subword Parallelism with MAX-2
IEEE Micro
ASAP '00 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
Bit Permutation Instructions for Accelerating Software Cryptography
ASAP '00 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
Bit section instruction set extension of ARM for embedded applications
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Design Tools for Application Specific Embedded Processors
EMSOFT '02 Proceedings of the Second International Conference on Embedded Software
Architectural techniques for accelerating subword permutations with repetitions
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2001 international conference on computer design (ICCD)
Security as a new dimension in embedded system design
Proceedings of the 41st annual Design Automation Conference
Sorter based permutation units for media-enhanced microprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors
Journal of Signal Processing Systems
Hi-index | 0.00 |
This paper proposes a new way of efficiently doing arbitrary n-bit permutations in programmable processors modeled on the theory of omega and flip networks. The new om-flip instruction we introduce can perform any permutation of n subwords in log n instructions, with the subwords ranging from half-words down to single bits. Each omflip instruction can be done in a single cycle, with very efficient hardware implementation. The omflip instruction enhances a programmable processor's capability for handling multimedia and security applications, which use subword permutations extensively.