Stride Permutation Networks for Array Processors

Authors:
Tuomas Järvinen;Perttu Salmela;Harri Sorokin;Jarmo Takala
Affiliations:
Nokia Technology Platforms, Tampere, Finland 33721;Tampere University of Technology, Tampere, Finland 33721;Tampere University of Technology, Tampere, Finland 33721;Tampere University of Technology, Tampere, Finland 33721
Venue:
Journal of VLSI Signal Processing Systems
Year:
2007

Abstract

In several digital signal processing algorithms, computational nodes are organized in consecutive stages and data are reordered between these stages. Computing such algorithms in parallel with a reduced number of processing elements implies that several computational nodes are assigned to each element. The drawback is that the permutations become more complex and require data storage. In this paper, a systematic design methodology for stride permutation networks is derived. The stride permutations are represented as Boolean matrices, which are decomposed and mapped directly onto register-based networks. The resulting networks are regular and scalable, and they support any power-of-two stride. In addition, the networks reach the lower bound on the number of registers, which indicates area efficiency. Since the proposed methodology is systematic, it can be exploited in automated design generation.
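
As an illustration of the kind of reordering the abstract refers to, the following is a minimal sketch of a stride permutation in Python. It assumes the commonly used definition in which output element i is read from input position (i*S mod N) + floor(i*S / N); the abstract does not state the paper's exact definition, so this convention, and the function names stride_permutation and rotate_left, are assumptions made for illustration only. The sketch models the reordering in software and says nothing about the register-based network structure derived in the paper.

```python
def stride_permutation(x, stride):
    """Reorder x by reading every `stride`-th element, wrapping around.

    Assumed convention: y[i] = x[(i * stride) mod N + (i * stride) // N],
    where N = len(x) must be divisible by stride.
    """
    n = len(x)
    if stride <= 0 or n % stride != 0:
        raise ValueError("stride must be positive and divide the length")
    return [x[(i * stride) % n + (i * stride) // n] for i in range(n)]


def rotate_left(index, shift, width):
    """Cyclically rotate a `width`-bit index left by `shift` bits."""
    mask = (1 << width) - 1
    return ((index << shift) | (index >> (width - shift))) & mask


# Stride-by-2 permutation of 8 elements: even positions first, then odd.
print(stride_permutation(list(range(8)), 2))
```

For power-of-two lengths and strides, N = 2^n and S = 2^s, the source index under this convention is the n-bit output index rotated left by s bits. Because a bit rotation is a linear map on the index bits, it can be written as a Boolean matrix, which is consistent with the matrix representation mentioned in the abstract.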