VLSI array processors
Finite Word-Length Effects Of An Unified Systolic Array For 2-D DCT/IDCT
ASAP '96 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
IEEE Transactions on Computers
Recent advances in technology make it possible to integrate multiple processors on a single chip to build high-performance parallel programmable digital signal processors (PPDSPs). These processors are expected to replace many dedicated digital signal processors in implementing important image and signal processing algorithms such as the discrete cosine transform (DCT). The paper addresses the issue of how to compare fast 2D-DCT algorithms when they are implemented on a PPDSP. Previously, the efficiency of these algorithms has been compared based on the number of operations. This comparison is reasonable when the algorithms are implemented on a dedicated DSP, but it may not be suitable for general-purpose PPDSPs. The paper proposes three parameters, the number of data accesses, the number of communications, and the distance of communications, as new criteria for comparing the performance of DCT algorithms. An algorithm-level technique is developed to estimate these parameters for DCT algorithms. The comparison results based on these parameters show that the algorithm proposed by Cho and Lee (1991) might be the best choice for a PPDSP unless communication between remote processors incurs a large overhead. In that case, the conventional row-column method with a fast 1D-DCT algorithm might be the most efficient.
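The row-column method mentioned in the abstract can be illustrated with a minimal sketch: a 1D DCT is applied to every row of a block, then to every column of the result. The code below is not taken from the paper. The function names (`dct_1d`, `transpose`, `dct_2d_row_column`) are hypothetical, the orthonormal DCT-II scaling is an assumption, and the 1D transform is the naive O(N^2) definition standing in for whichever fast 1D-DCT algorithm would be used in practice.

```python
import math


def dct_1d(x):
    """Naive orthonormal DCT-II of a sequence (placeholder for a fast 1D-DCT)."""
    n = len(x)
    out = []
    for k in range(n):
        # Orthonormal scaling: the DC term uses sqrt(1/N), the others sqrt(2/N).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def dct_2d_row_column(block):
    """2D DCT by the row-column method: 1D DCT on rows, then on columns."""
    row_pass = [dct_1d(row) for row in block]
    # Transposing lets the same row-wise routine process the columns.
    col_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)


if __name__ == "__main__":
    block = [[1.0] * 4 for _ in range(4)]
    for row in dct_2d_row_column(block):
        print(["%.3f" % v for v in row])
```

If the rows of the block were distributed across processors, the transpose between the two passes is the step where data would have to move between them, which is presumably the kind of cost the communication-count and communication-distance parameters are meant to capture. This sketch does not model those parameters.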