Reversible logic circuit synthesis
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Architectural techniques for accelerating subword permutations with repetitions
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2001 international conference on computer design (ICCD)
Security as a new dimension in embedded system design
Proceedings of the 41st annual Design Automation Conference
Data structures and algorithms for simplifying reversible circuits
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Sorter based permutation units for media-enhanced microprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors
Journal of Signal Processing Systems
A customized cross-bar for data-shuffling in domain-specific simd processors
ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
Boosting AES performance on a tiny processor core
CT-RSA'08 Proceedings of the 2008 The Cryptopgraphers' Track at the RSA conference on Topics in cryptology
Hi-index | 0.00 |
Abstract: We propose two new instructions, swperm and sieve, that can be used to efficiently complete an arbitrary bit-level permutation of an n-bit word with or without repetitions. Permutations with repetitions are rearrangements of an ordered set in which elements may replace other elements in the set; such permutations are useful in cryptographic algorithms. On a 4-way superscalar processor, an arbitrary 64-bit permutation with repetitions of 1-bit subwords can be completed in 11 instructions and only 4 cycles using the two proposed instructions. For subwords of size 4 bits or greater, an arbitrary permutation with repetitions of a 64-bit register can be completed in a single cycle using a single swperm instruction. This improves upon previous permutation instruction proposals that require log(r) sequential instructions to permute r subwords of a 64-bit word without repetitions. Our method requires fewer instructions to permute 4-bit or larger subwords packed in a 64-bit register and fewer execution cycles for 1-bit subwords on wide superscalar processors.