The watershed transform is a method for unsupervised image segmentation. In this paper we show that a watershed algorithm based on a cellular automaton is well suited to recent GPU architectures, especially when the synchronization rules are relaxed. In particular, we propose a block-asynchronous computation strategy that maps the cellular automaton onto the thread blocks of the GPU. This strategy reduces the number of global synchronization points, which allows the memory hierarchy of the GPU to be exploited efficiently. The block-asynchronous updating scheme can produce artifacts in the watershed lines; we avoid them by correcting the data propagation speed among the blocks. The proposals are compared to an OpenMP multithreaded implementation. The high speedups indicate the potential of this kind of algorithm for new architectures based on hundreds of cores. The method is also adapted to 3D volumes, where it obtains similar results.
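The block-asynchronous idea can be illustrated with a small sequential emulation in Python. This is only a sketch under stated assumptions, not the implementation evaluated in the paper. The abstract does not give the automaton's update rule, so the code assumes a simple minimax flooding rule (a cell takes the label of the neighbor that offers the lowest "highest point on the path" cost). The names `watershed_block_async`, `update_tile`, `tile`, and `inner_iters` are hypothetical. Each tile stands in for one GPU thread block that iterates locally several times before the next global synchronization; the correction of propagation speed between blocks is not modeled.

```python
INF = float("inf")

def neighbors(r, c, rows, cols):
    """Yield the 4-connected neighbors of (r, c) that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

def update_tile(height, cost, label, r0, r1, c0, c1, inner_iters):
    """Sweep one tile up to inner_iters times, updating cells in place.

    Assumed rule: cost(u) = min over neighbors v of max(cost(v), height(u)).
    Returns True if any cell in the tile changed.
    """
    rows, cols = len(height), len(height[0])
    changed_any = False
    for _ in range(inner_iters):
        changed = False
        for r in range(r0, r1):
            for c in range(c0, c1):
                for nr, nc in neighbors(r, c, rows, cols):
                    cand = max(cost[nr][nc], height[r][c])
                    if cand < cost[r][c]:
                        cost[r][c] = cand
                        label[r][c] = label[nr][nc]
                        changed = True
        if not changed:
            break
        changed_any = True
    return changed_any

def watershed_block_async(height, seeds, tile=2, inner_iters=4):
    """Emulate block-asynchronous updates; seeds are (row, col) markers.

    Returns (cost, label, global_steps), where global_steps counts the
    passes over all tiles, i.e. the emulated global synchronization points.
    """
    rows, cols = len(height), len(height[0])
    cost = [[INF] * cols for _ in range(rows)]
    label = [[0] * cols for _ in range(rows)]
    for lab, (r, c) in enumerate(seeds, start=1):
        cost[r][c] = height[r][c]
        label[r][c] = lab
    global_steps = 0
    while True:
        changed = False
        for r0 in range(0, rows, tile):
            for c0 in range(0, cols, tile):
                r1, c1 = min(r0 + tile, rows), min(c0 + tile, cols)
                if update_tile(height, cost, label, r0, r1, c0, c1, inner_iters):
                    changed = True
        global_steps += 1
        if not changed:  # a full pass with no change: fixed point reached
            break
    return cost, label, global_steps

# Hypothetical example: two basins separated by a ridge in the middle column.
HEIGHT = [
    [1, 2, 5, 2, 1],
    [2, 3, 6, 3, 2],
    [1, 2, 5, 2, 0],
]
SEEDS = [(0, 0), (2, 4)]
cost, label, steps = watershed_block_async(HEIGHT, SEEDS, tile=2, inner_iters=4)
```

Under this assumed rule the final costs do not depend on the update order, but the labels of cells that are equally reachable from two basins (the ridge column in the example) can. That order dependence is the kind of artifact in the watershed lines that the abstract says must be corrected.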