Desktop PCs offer a great opportunity to speed up multimedia data processing by exploiting task- and data-parallelism on a multi-core CPU and a many-core GPU. This paper presents a high-performance parallel implementation of the 2D DCT in this heterogeneous computing environment. For this purpose, Intel TBB (Threading Building Blocks) and OpenCL (Open Computing Language) are used for task- and data-parallelism, respectively. The simulation results show that the parallel DCT implementations far outperform the serial ones in processing speed. In particular, the OpenCL implementation shows a linear speedup as the number of 2D data sets increases, which is a typical SIMD characteristic.
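
As a rough illustration of why the 2D DCT suits data-parallel execution, the sketch below computes the transform of one block as C·X·Cᵀ and then maps independent blocks across a worker pool. It is not the paper's TBB or OpenCL code: the 8x8 block size, the orthonormal DCT-II scaling, and the names `dct_matrix`, `dct2`, and `dct2_many` are assumptions made for this example, and Python threads only show the structure of the decomposition rather than a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor

N = 8  # block size; an assumption, the abstract does not state one


def dct_matrix(n):
    """Orthonormal DCT-II basis: C[k][i] = a(k) * cos((2i + 1) * k * pi / (2n))."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


C = dct_matrix(N)
CT = [list(col) for col in zip(*C)]  # transpose of C


def dct2(block):
    """2D DCT of one N x N block, using row-column separability: C * X * C^T."""
    return matmul(matmul(C, block), CT)


def dct2_many(blocks, workers=4):
    """Blocks are independent, so each one can be transformed by a separate worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(dct2, blocks))
```

Because no block depends on another, the same mapping could in principle be expressed as one work-item or work-group per block on a GPU, which is consistent with the speedup growing with the number of 2D data sets.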