A popular approach to providing nonexperts in parallel computing with an easy-to-use programming model is to design a software library consisting of a set of preparallelized routines, and to hide the intricacies of parallelization behind the library's API. However, for regular domain problems (such as simple matrix manipulations or low-level image processing applications, in which all elements in a regular subset of a dense data field are accessed in turn), the speedup obtained with many such library-based parallelization tools is often suboptimal. This is because interoperation optimization (that is, time optimization of communication steps across library calls) is generally not incorporated in the library implementations. This paper presents a simple, efficient, finite-state-machine-based approach to communication minimization for library-based data-parallel regular domain problems. In this approach, referred to as lazy parallelization, a sequential program is parallelized automatically at run time by inserting communication primitives and memory management operations whenever necessary. Apart from being simple and cheap, lazy parallelization is guaranteed to generate legal, correct, and efficient parallel programs at all times. The effectiveness of the approach is demonstrated by analyzing the performance characteristics of two typical regular domain problems from the field of low-level image processing. Experimental results show significant performance improvements over nonoptimized parallel applications. Moreover, the obtained communication behavior is found to be optimal with respect to the abstraction level of message passing programs.
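The core idea above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the state names, the operation classification table, and the communication steps are hypothetical stand-ins. A finite state machine tracks the distribution state of a data structure across library calls, and a communication step is inserted only when an operation's required state differs from the current one, so redundant transfers between consecutive calls are skipped.

```python
# Hypothetical distribution states a data structure can be in.
HOST_ONLY = "host_only"    # valid copy on the host node only
SCATTERED = "scattered"    # partitioned across worker nodes

# Required input state for each (hypothetical) operation class.
REQUIRES = {
    "unary_pixel_op": SCATTERED,
    "global_reduce_op": SCATTERED,
    "sequential_op": HOST_ONLY,
}

# Distribution state each operation class leaves the data in.
PRODUCES = {
    "unary_pixel_op": SCATTERED,
    "global_reduce_op": HOST_ONLY,
    "sequential_op": HOST_ONLY,
}

class LazyParallelizer:
    """Tracks distribution state; inserts communication only on mismatch."""

    def __init__(self):
        self.state = HOST_ONLY
        self.trace = []  # communication steps actually inserted

    def _communicate(self, target):
        # In a real message-passing system this would issue a
        # scatter/gather/broadcast; here we just record the transition.
        self.trace.append(f"{self.state}->{target}")
        self.state = target

    def run(self, op):
        needed = REQUIRES[op]
        if self.state != needed:   # lazy: communicate only when required
            self._communicate(needed)
        self.state = PRODUCES[op]

# A pipeline of two pixel operations followed by a global reduction:
# the data is scattered once, and no redundant transfer is inserted
# between the two consecutive pixel operations.
lp = LazyParallelizer()
for op in ["unary_pixel_op", "unary_pixel_op", "global_reduce_op"]:
    lp.run(op)
print(lp.trace)
```

Running the pipeline inserts a single scatter step; a naive per-call parallelization would instead scatter and gather around every operation, which is exactly the cross-call redundancy that interoperation optimization removes.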