Shuffle operations are among the most common operations in SIMD-based embedded system architectures. In this paper we study the families of shuffle operations that occur frequently in embedded applications running on SIMD architectures, and we use these families to drive the design of a custom shuffler for domain-specific SIMD processors. We analyze the energy efficiency of several crossbar-based custom shufflers and compare it with that of the widely used full crossbar. We show that customizing the crossbar to implement only the shuffle operations required in the target application domain can reduce the energy consumption of shuffle operations by up to 80%. We also illustrate the tradeoff between flexibility and energy efficiency in custom shufflers, and show that customization offers reasonable benefits without compromising the flexibility required for the target application domain.
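To make the idea of customizing a crossbar concrete, the following minimal Python sketch models a shuffle as a lane permutation and counts the crossbar connections that a given set of shuffle patterns actually needs. It is an illustration under stated assumptions, not the design or the energy model used in this paper: the two patterns (perfect shuffle and lane reversal), the 8-lane width, and all function names are hypothetical choices, and the crosspoint count is only a rough proxy for hardware cost.

```python
from typing import List, Sequence, Set, Tuple


def shuffle(vec: Sequence, pattern: Sequence[int]) -> List:
    """Apply a shuffle: output lane i takes its value from input lane pattern[i]."""
    return [vec[src] for src in pattern]


def perfect_shuffle_pattern(n: int) -> List[int]:
    """Interleave the lower and upper halves of an n-lane vector (n even)."""
    half = n // 2
    return [(i // 2) + (i % 2) * half for i in range(n)]


def reverse_pattern(n: int) -> List[int]:
    """Reverse the lane order of an n-lane vector."""
    return list(range(n - 1, -1, -1))


def required_crosspoints(patterns: Sequence[Sequence[int]]) -> Set[Tuple[int, int]]:
    """Collect the (output lane, input lane) connections used by any pattern."""
    return {(out, src) for p in patterns for out, src in enumerate(p)}


# Assumed example: an 8-lane SIMD vector and two illustrative shuffle patterns.
N_LANES = 8
patterns = [perfect_shuffle_pattern(N_LANES), reverse_pattern(N_LANES)]

custom_crosspoints = required_crosspoints(patterns)
full_crosspoints = N_LANES * N_LANES  # a full crossbar connects every input to every output

shuffled = shuffle(list("abcdefgh"), patterns[0])
print(shuffled)
print(len(custom_crosspoints), "of", full_crosspoints, "crosspoints needed")
```

In this toy setting the customized connection set is a small subset of the full crossbar, which is the intuition behind the flexibility versus energy tradeoff discussed above; the actual savings depend on the circuit implementation and on the shuffle families required by the application domain.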