Efficient programming paradigm for video streaming processing on TILE64 platform

Authors:
Xuan-Yi Lin;Kuan-Chou Lai;Kuan-Ching Li;Yeh-Ching Chung
Affiliations:
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C. 30013;Department of Computer Science, National Taichung University of Education, Taichung, Taiwan, R.O.C. 40306;Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan, R.O.C. 43301;Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C. 30013
Venue:
The Journal of Supercomputing
Year:
2013

Abstract

Computer hardware and networking technologies have advanced at an unprecedented rate, making many-core computing affordable and readily available within a few years. Nonetheless, building scalable parallel software remains a challenge for programmers. Optimizing parallel programs for a many-core platform is a multifaceted problem in which system and architectural factors must be taken into account. In this paper, we tackle this problem by implementing parallel programs with the different programming paradigms available and evaluating application behavior on the TILE64 many-core platform. Specifically, we investigate a hybrid producer-write plus consumer-read shared-memory programming paradigm for implementing a master-worker video decoder and encoder on this platform. Experimental results show that the proposed implementation achieves competitive speedup, scales well with the number of available cores, and delivers up to four times the performance of other implementations when decoding a sample 1080p video.
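
The master-worker arrangement named in the abstract can be illustrated with a minimal sketch. This is not the authors' TILE64 implementation: it uses Python threads in place of tiles, and the names `decode_frame`, `worker`, and `run_master` are hypothetical. It also assumes one possible reading of the hybrid paradigm, since the abstract does not say which direction uses which scheme: the master writes work into a buffer owned by each worker (producer-write), and later reads results out of each worker's own buffer (consumer-read).

```python
import queue
import threading


def decode_frame(frame):
    # Hypothetical stand-in for real decoding work: a simple checksum.
    return sum(frame) % 256


def worker(inbox, outbox):
    # Each worker drains its own inbox until it sees the None sentinel.
    while True:
        item = inbox.get()
        if item is None:
            break
        index, frame = item
        # The worker keeps its result in a buffer that it owns.
        outbox[index] = decode_frame(frame)


def run_master(frames, num_workers=4):
    inboxes = [queue.Queue() for _ in range(num_workers)]
    outboxes = [{} for _ in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(inboxes[i], outboxes[i]))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Producer-write (assumed direction): the master pushes each frame
    # into a worker-owned inbox, assigning frames round-robin.
    for index, frame in enumerate(frames):
        inboxes[index % num_workers].put((index, frame))
    for inbox in inboxes:
        inbox.put(None)  # tell each worker there is no more work
    for t in threads:
        t.join()

    # Consumer-read (assumed direction): the master pulls results out of
    # each worker's buffer and restores the original frame order.
    results = [None] * len(frames)
    for outbox in outboxes:
        for index, value in outbox.items():
            results[index] = value
    return results
```

Calling `run_master(frames)` with a list of `bytes` objects returns one checksum per frame in input order. On real hardware the choice between the two schemes would depend on where each buffer lives and on the cost of remote reads versus remote writes, which this sketch does not model.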