Labeling connected components in an image or in a raster of non-imagery data is a fundamental operation in pattern recognition and machine intelligence. Most of the effort devoted to designing efficient connected components labeling (CCL) algorithms has concentrated on binary images, where labeling is required for a computer to recognize objects. In contrast, in Geographical Information Science (GIS) a CCL algorithm is mostly applied to multi-categorical rasters, either to convert a raster to a shapefile or to characterize individual clumps statistically. Recently, it has become necessary to label connected components in very large, giga-cell size, multi-categorical rasters, but existing CCL algorithms are too slow for this task. In this paper we present a modification of the popular two-scan CCL algorithm that enables labeling of giga-cell size, multi-categorical rasters. Our approach is to apply a divide-and-conquer technique coupled with parallel processing to a standard two-scan algorithm. For specificity, we have developed a variant of the standard CCL algorithm implemented as r.clump in GRASS GIS. We have established optimal values for the data blocks (stemming from the divide-and-conquer technique) and the optimal number of computational threads (stemming from parallel processing) for the new algorithm, called r.clump3p. The performance of the new algorithm was tested on a series of rasters up to 160 Mcells in size; for the largest test raster, the speed-up over the original algorithm is 74 times. Finally, we have applied the new algorithm to the National Land Cover Dataset 2006 raster, which has 1.6 × 10^10 cells. Labeling this raster took 39 h on a two-processor, 16-core computer and resulted in 221,718,501 clumps. The estimated speed-up over the original algorithm is 450 times. r.clump3p works within the GRASS environment and is available in the public domain.
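For readers unfamiliar with the two-scan scheme that the abstract refers to, the following is a minimal Python sketch of a generic two-scan labeling pass over a multi-categorical raster. It is not the r.clump or r.clump3p code. The function name `label_clumps`, the list-of-lists raster representation, the choice of 4-connectivity, and the union-find structure with path halving are all assumptions made for illustration.

```python
def label_clumps(raster):
    """Label clumps (4-connected cells of equal category) in two scans.

    Hypothetical illustration only; not the GRASS r.clump implementation.
    Returns (labels, n_clumps), with labels numbered 1..n_clumps.
    """
    rows, cols = len(raster), len(raster[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] = union-find parent of provisional label i (index 0 unused)

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First scan: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            v = raster[r][c]
            # A neighbor counts only if it holds the same category value.
            up = labels[r - 1][c] if r > 0 and raster[r - 1][c] == v else 0
            left = labels[r][c - 1] if c > 0 and raster[r][c - 1] == v else 0
            if up and left:
                labels[r][c] = up
                union(up, left)
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new provisional label, its own root
                labels[r][c] = len(parent) - 1

    # Second scan: replace each provisional label by a compact final label.
    final = {}
    for r in range(rows):
        for c in range(cols):
            root = find(labels[r][c])
            labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)


if __name__ == "__main__":
    demo = [[1, 1, 2],
            [2, 1, 2],
            [2, 2, 2]]
    print(label_clumps(demo))
```

In the demo raster the cells of category 2 at the left and right edges first receive different provisional labels and are merged only when the scan reaches the bottom-right cell, which is the equivalence-resolution step that the second scan finalizes. A divide-and-conquer variant would presumably run such a pass per data block and then reconcile labels along block boundaries, but that step is not shown here because the abstract does not describe its details.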