Stream processors can achieve high performance on stream applications, which share the characteristics of abundant parallelism, intensive computation, and little data reuse. Transform coding, a core component of video compression, is widely used in video storage and transmission. This paper summarizes the stream execution mechanism and explores design approaches for programmable stream processors, including the Imagine stream processor and the graphics processing unit (GPU). Based on the stream processing model, stream algorithms for block-based and frame-based (non-block-based) transform coding are presented and mapped onto stream processors. In particular, an Interleaved Streaming Transform (IST) algorithm on Imagine and a Row-wise Zonal Transform (RZT) algorithm on the GPU are proposed for the 4×4 integer transform in H.264, to exploit the potential of stream processing for block-based transforms. Our experiments with a transform coding suite on Imagine and the GPU show that the coding efficiency of stream processors far exceeds the real-time requirements of current video applications, across video resolutions ranging from QCIF to high definition (HD). The performance evaluation of the stream implementations discusses the architectural support for transform coding and presents significant improvements over other programmable platforms. By taking advantage of the flexibility and high performance of programmable stream processors, transform coding may play an important role in the future.
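For readers unfamiliar with the 4×4 integer transform mentioned above, the following is a minimal sketch of the standard H.264 forward core transform, Y = Cf · X · Cf^T, computed with additions and shifts only. It is not the IST or RZT algorithm, whose details the abstract does not give; the function names (`butterfly_1d`, `forward_transform_4x4`, `matrix_reference`) are hypothetical, and the post-transform scaling and quantization steps are omitted.

```python
# Sketch of the H.264 4x4 forward integer core transform (no scaling/quantization).
# Function names are illustrative assumptions, not taken from the paper.

# Forward core transform matrix Cf defined by H.264/AVC.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def butterfly_1d(x):
    """Multiply a 4-element vector by CF using only adds, subtracts, and shifts."""
    s03 = x[0] + x[3]
    d03 = x[0] - x[3]
    s12 = x[1] + x[2]
    d12 = x[1] - x[2]
    return [s03 + s12, (d03 << 1) + d12, s03 - s12, d03 - (d12 << 1)]


def forward_transform_4x4(block):
    """Compute CF * block * CF^T: a row pass followed by a column pass."""
    rows = [butterfly_1d(r) for r in block]  # block * CF^T
    cols = [butterfly_1d([rows[i][j] for i in range(4)]) for j in range(4)]
    # cols[j] holds output column j; transpose back to row-major order.
    return [[cols[j][i] for j in range(4)] for i in range(4)]


def matrix_reference(block):
    """Direct matrix-product reference used to check the butterfly version."""
    tmp = [[sum(CF[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * CF[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]


if __name__ == "__main__":
    sample = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
    assert forward_transform_4x4(sample) == matrix_reference(sample)
    print(forward_transform_4x4(sample))
```

Each 4×4 block is transformed independently of every other block, so a frame can be treated as a stream of blocks with the same kernel applied to each element; this is the kind of data parallelism that block-based stream mappings rely on.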