We are currently witnessing the emergence of two paradigms in parallel computing: stream processing on GPUs and multi-core CPUs. Both are represented by solid commercial products widely available in commodity PCs, and together they offer an unprecedented combination of high performance at low cost. The scientific computing community needs to keep pace with application models and middleware that scale efficiently to hundreds of internal processing units. The purpose of the work we present here is twofold. First, a cooperative environment is designed so that both parallel models can coexist and complement one another. Second, beyond the parallelism of multiple internal cores, further parallelism is introduced when multiple CPU sockets, multiple GPUs, and multiple nodes are combined within a single multiprocessor platform which exceeds 10 TFLOPS when using 16 nodes. We illustrate our cooperative parallelization approach by implementing a large-scale biomedical image analysis application which contains a number of assorted kernels, including typical streaming operators, co-occurrence matrices, convolutions, and histograms. Experimental results are compared among different implementation strategies, and almost linear speed-up is achieved when all coexisting methods in CPUs and GPUs are combined.
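To make two of the kernels named above concrete, the following is a minimal sequential sketch of a gray-level co-occurrence matrix and a histogram computed over one small image tile. It is an illustration only, not the authors' implementation: the function names `cooccurrence_matrix` and `histogram`, the `dx`/`dy` offset parameters, and the 4x4 example tile are all assumptions introduced here, and a GPU or multi-core version would distribute these loops across tiles or pixels rather than run them serially.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count pairs (i, j) where level i sits at (r, c) and level j at (r+dy, c+dx).

    Hypothetical helper: non-symmetric, unnormalized counts for one offset.
    """
    rows, cols = len(image), len(image[0])
    glcm = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip neighbors that fall outside the tile.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                glcm[image[r][c]][image[r2][c2]] += 1
    return glcm


def histogram(image, levels):
    """Count how many pixels take each gray level (hypothetical helper)."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


# Assumed example tile with 4 gray levels (0..3).
tile = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

glcm = cooccurrence_matrix(tile, levels=4)  # horizontal neighbor, offset (dx=1, dy=0)
hist = histogram(tile, levels=4)
```

Both kernels are accumulations into a small shared table, which is why they are harder to map onto streaming hardware than a per-pixel operator: concurrent updates to the same bin must be serialized or merged from per-thread partial tables.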