VLSI manufacturing technology has scaled rapidly down to the nanoscale. At this level of integration, advanced parallel iterative algorithms can be realized directly in hardware, which was almost impossible ten years ago. In this paper, we discuss how evolving VLSI technologies influence iterative algorithms and present design strategies from both an algorithmic and an architectural point of view. The parallel implementation of an iterative algorithm (i.e., the processor elements of the multiprocessor array) can be simplified in any way as long as convergence is still guaranteed. However, simplifying the algorithm (and thus the processors) usually increases the number of required iterations, which in turn increases the switching activity of the interconnects. To study the trade-off between the performance/complexity of the processors and the load/throughput of the interconnects, we implemented a 3×3 Jacobi EVD array with the µ-CORDIC PE in both 0.18 µm and 45 nm technologies. Our experimental results show that the µ-CORDIC PE is beneficial with respect to the design criteria: it yields a smaller chip area, a shorter overall computation time, and lower energy consumption per operation than the Full CORDIC PE.
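The trade-off between simpler rotations and a higher iteration count can be illustrated with a small software sketch. The Python code below is not the hardware design described in the abstract. It is a minimal cyclic Jacobi EVD for a 3×3 symmetric matrix, in which the optional `quantize` flag restricts every rotation angle to the form ±arctan(2^-k), loosely imitating a processor that applies one cheap shift-based rotation per step. The function names, the test matrix, the tolerance, and the rule for choosing k are all assumptions made for illustration.

```python
import math


def off_norm(a):
    """Frobenius norm of the off-diagonal part of a square matrix."""
    n = len(a)
    return math.sqrt(sum(a[i][j] ** 2
                         for i in range(n) for j in range(n) if i != j))


def rotate(a, p, q, c, s):
    """In-place similarity transform A <- J^T A J for a plane rotation J
    with J[p][p] = J[q][q] = c, J[p][q] = s, J[q][p] = -s."""
    n = len(a)
    for k in range(n):  # A <- A J (updates columns p and q)
        akp, akq = a[k][p], a[k][q]
        a[k][p] = c * akp - s * akq
        a[k][q] = s * akp + c * akq
    for k in range(n):  # A <- J^T A (updates rows p and q)
        apk, aqk = a[p][k], a[q][k]
        a[p][k] = c * apk - s * aqk
        a[q][k] = s * apk + c * aqk


def jacobi_sweeps(matrix, quantize, tol=1e-12, max_sweeps=200, max_shift=60):
    """Cyclic Jacobi on a symmetric matrix.

    Returns (number of sweeps, sorted eigenvalue estimates).
    With quantize=True the angle is replaced by +/- atan(2**-k), where k is
    chosen so that 2**-k is close to |tan(theta)| (an illustrative rule,
    not the angle selection of any particular CORDIC design).
    """
    a = [row[:] for row in matrix]
    n = len(a)
    for sweep in range(1, max_sweeps + 1):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Exact annihilating angle, folded into [-pi/4, pi/4].
                d = a[q][q] - a[p][p]
                y = 2.0 * a[p][q]
                if d < 0:
                    y, d = -y, -d
                theta = 0.5 * math.atan2(y, d)
                if theta == 0.0:
                    continue
                if quantize:
                    t = abs(math.tan(theta))
                    k = min(max_shift, max(0, round(-math.log2(t))))
                    theta = math.copysign(math.atan(2.0 ** -k), theta)
                rotate(a, p, q, math.cos(theta), math.sin(theta))
        if off_norm(a) < tol:
            return sweep, sorted(a[i][i] for i in range(n))
    raise RuntimeError("no convergence within max_sweeps")


# Hypothetical 3x3 symmetric test matrix.
A = [[4.0, 1.0, 2.0],
     [1.0, 3.0, 0.5],
     [2.0, 0.5, 1.0]]

exact_sweeps, exact_eigs = jacobi_sweeps(A, quantize=False)
approx_sweeps, approx_eigs = jacobi_sweeps(A, quantize=True)
print("exact rotations:    ", exact_sweeps, "sweeps", exact_eigs)
print("quantized rotations:", approx_sweeps, "sweeps", approx_eigs)
```

Both variants reach the same eigenvalues, but the quantized variant only shrinks each off-diagonal element instead of zeroing it, so it typically needs more sweeps. In a processor array, every extra sweep also means extra data exchanged between processor elements, which is the interconnect cost the abstract refers to.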