Shape-based clustering for 3D CAD objects: A comparative study of effectiveness
Computer-Aided Design
The sequence clustering problem differs from traditional clustering problems in that the features of sequences are not observable and sequences cannot be placed in a metric space, which most clustering algorithms assume. The most widely used approach is to build a sequence graph from all-pairwise sequence comparison data and to use that graph to generate clusters of sequences. As with other clustering problems, a metric is needed to evaluate the results of a sequence clustering algorithm, but the metrics used for traditional clustering problems are not readily applicable because they assume a metric space. We propose Cluster Utility (CU), a metric based on similarity within a cluster and difference between clusters that makes no metric space assumption. CU showed a very high correlation with the quality index. CU also scales well with data size, and its strong correlation with the quality index remained nearly unchanged as the data size varied. CU can be used in two ways: to guide sequence clustering algorithms and to evaluate clustering results.
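The graph-based approach described above can be illustrated with a minimal sketch. The abstract does not give the definition of CU, so the score below is only a hypothetical stand-in (mean within-cluster similarity minus mean between-cluster similarity), not the authors' metric. The function names, the similarity threshold, and the example scores are all assumptions made for illustration. Clusters are taken as connected components of the thresholded similarity graph, which is one simple choice among many.

```python
from collections import defaultdict


def cluster_by_graph(n, scores, threshold):
    """Cluster n sequences as connected components of a similarity graph.

    scores maps a pair (i, j) to a pairwise comparison score; an edge is
    kept when the score is at least threshold (an assumed cutoff).
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), s in scores.items():
        if s >= threshold:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


def within_minus_between(clusters, scores):
    """Hypothetical stand-in score, NOT the paper's CU definition.

    Uses only pairwise scores, so no metric space is required.
    """
    label = {i: k for k, members in enumerate(clusters) for i in members}
    within, between = [], []
    for (i, j), s in scores.items():
        (within if label[i] == label[j] else between).append(s)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(within) - mean(between)


# Made-up pairwise similarity scores for five sequences.
example_scores = {
    (0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (3, 4): 0.85,
    (2, 3): 0.1, (0, 3): 0.05, (0, 4): 0.0,
    (1, 3): 0.1, (1, 4): 0.05, (2, 4): 0.1,
}
example_clusters = cluster_by_graph(5, example_scores, threshold=0.5)
example_score = within_minus_between(example_clusters, example_scores)
```

A score of this kind could be used either to compare the outputs of different clustering runs or to choose among candidate thresholds, mirroring the two uses the abstract names for CU.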