IEEE Transactions on Audio, Speech, and Language Processing
Most partitional clustering algorithms require the number of desired clusters to be set a priori. Not only is this somewhat counter-intuitive, it is also difficult except in the simplest of situations. By contrast, hierarchical clustering may create partitions with varying numbers of clusters. The actual final partition depends on a threshold placed on the similarity measure used. Given a cluster quality metric, one can efficiently discover an appropriate threshold through a form of semi-supervised learning. This paper shows one such solution for complete-link hierarchical agglomerative clustering using the F-measure and a small subset of labeled examples. Empirical evaluation demonstrates promise.
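The threshold search described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a naive complete-link procedure, a pairwise variant of the F-measure computed only over the labeled points, and an exhaustive scan over the merge distances as candidate thresholds. The function names (`complete_link_merges`, `cut`, `pairwise_f_measure`, `select_threshold`) and the toy data are hypothetical.

```python
from itertools import combinations


def complete_link_merges(points, dist):
    """Naive complete-link agglomerative clustering.

    Returns the merge history as (merge_distance, clusters_after_merge)
    pairs, where each cluster is a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete link: cluster distance is the largest pairwise distance.
            d = max(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        history.append((d, [list(c) for c in clusters]))
    return history


def cut(history, n_points, threshold):
    """Partition obtained by applying every merge with distance <= threshold."""
    partition = [[i] for i in range(n_points)]
    for d, clusters in history:
        if d > threshold:
            break  # complete-link merge distances never decrease
        partition = clusters
    return partition


def pairwise_f_measure(partition, labels):
    """Pairwise F-measure over labeled points only (assumed variant).

    `labels` maps a point index to its known class.
    """
    cluster_of = {i: k for k, c in enumerate(partition) for i in c}
    tp = fp = fn = 0
    for i, j in combinations(sorted(labels), 2):
        same_cluster = cluster_of[i] == cluster_of[j]
        same_class = labels[i] == labels[j]
        tp += same_cluster and same_class
        fp += same_cluster and not same_class
        fn += same_class and not same_cluster
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_threshold(points, labels, dist):
    """Scan the merge distances and keep the cut with the best F-measure."""
    history = complete_link_merges(points, dist)
    best_t, best_f, best_partition = None, -1.0, None
    for t, _ in history:
        partition = cut(history, len(points), t)
        f = pairwise_f_measure(partition, labels)
        if f > best_f:
            best_t, best_f, best_partition = t, f, partition
    return best_t, best_f, best_partition


# Toy 1-D data: eight points, of which six carry a (hypothetical) label.
points = [0.0, 0.2, 0.4, 5.0, 5.3, 9.0, 9.1, 9.4]
labels = {0: "a", 2: "a", 3: "b", 4: "b", 5: "c", 7: "c"}
threshold, score, partition = select_threshold(
    points, labels, dist=lambda x, y: abs(x - y))
print(threshold, score, sorted(sorted(c) for c in partition))
```

In this sketch the number of clusters is never supplied. It falls out of the cut that best agrees with the labeled subset, and the unlabeled points are assigned by the same cut. The naive merge loop is chosen for readability and would need a more efficient linkage routine for data of realistic size.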