A new measure of clustering effectiveness: Algorithms and experimental studies

Authors:
E. K. F. Dang;R. W. P. Luk;K. S. Ho;S. C. F. Chan;D. L. Lee
Affiliations:
Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong;Department of Computer Science and Engineering, Hong Kong University of Science & Technology, Hong Kong
Venue:
Journal of the American Society for Information Science and Technology
Year:
2008

Abstract

We propose a new optimal clustering effectiveness measure, called CS1, based on a combination of clusters rather than on the selection of a single optimal cluster as in the traditional MK1 measure. For hierarchical clustering, we present an algorithm to compute CS1, which is defined by seeking the optimal combination of disjoint clusters obtained by cutting the hierarchical structure at a certain similarity level. By reformulating the optimization as a 0-1 linear fractional programming problem, we demonstrate that an exact solution can be obtained by a linear-time algorithm. We further discuss how our approach can be generalized to problems involving overlapping clusters, and we show how optimal estimates can be obtained by greedy algorithms. © 2008 Wiley Periodicals, Inc.
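
To make the 0-1 linear fractional formulation concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that effectiveness is scored with van Rijsbergen's E measure, so that choosing a set of disjoint clusters amounts to maximizing the ratio (1 + β²) Σ aᵢxᵢ / (β²R + Σ nᵢxᵢ) over xᵢ ∈ {0, 1}, where aᵢ is the number of relevant documents in cluster i, nᵢ is its size, and R is the total number of relevant documents. The function name, the input layout, and the use of a Dinkelbach-style iteration (instead of the linear-time method mentioned in the abstract) are all illustrative assumptions.

```python
from fractions import Fraction


def best_cluster_combination(clusters, total_relevant, beta=1):
    """Hypothetical helper: pick disjoint clusters minimizing the E measure.

    clusters: list of (relevant_in_cluster, cluster_size) pairs (assumed layout).
    total_relevant: total number of relevant documents, assumed > 0.
    Returns (indices of chosen clusters, resulting E value).
    """
    b2 = Fraction(beta) ** 2
    lam = Fraction(0)  # current guess for the optimal ratio (1 - E)
    while True:
        # For a fixed ratio lam, the parametric problem separates per cluster:
        # include cluster i exactly when its contribution is positive.
        chosen = [
            i for i, (rel, size) in enumerate(clusters)
            if (1 + b2) * rel - lam * size > 0
        ]
        num = (1 + b2) * sum(clusters[i][0] for i in chosen)
        den = b2 * total_relevant + sum(clusters[i][1] for i in chosen)
        new_lam = num / den
        if new_lam == lam:  # ratio stopped improving: optimum reached
            return chosen, 1 - lam
        lam = new_lam


# Toy data (made up for illustration): four disjoint clusters, six relevant documents.
toy_clusters = [(3, 4), (2, 2), (1, 5), (0, 3)]
chosen, e_value = best_cluster_combination(toy_clusters, total_relevant=6)
print(chosen, e_value)
```

On this toy input the best single cluster reaches a ratio of 0.6, whereas combining the first two clusters reaches 5/6, which illustrates why a measure based on combinations of clusters can differ from one based on a single optimal cluster. Exact rational arithmetic is used so that the convergence test is not affected by floating-point rounding.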