Stochastic complexity (SC) has been employed as a cost function for solving the binary clustering problem, with Shannon code length (CL distance) as the distance function. The CL distance, however, is defined only for a given static clustering, and it does not take into account the changes in the class distribution during the clustering process. We propose a new ΔSC distance function, which is derived directly from the difference in the cost function value before and after the classification. The effect of the new distance function is demonstrated by implementing it with two clustering algorithms.
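The idea of a distance defined as a cost difference can be illustrated with a minimal Python sketch. It is not the formulation of the paper: the abstract does not give the SC formula, so the cost below is an assumed, simplified stand-in (the Shannon code length of a cluster's data under per-attribute Bernoulli maximum-likelihood estimates, with no class-label or model-complexity terms). The names `entropy_bits`, `cluster_cost`, and `delta_cost` are hypothetical.

```python
import math


def entropy_bits(p):
    """Binary entropy in bits; defined as 0 at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def cluster_cost(n, ones):
    """ASSUMED simplified cost: code length (bits) of n binary vectors,
    where ones[i] is the number of 1s in attribute i within the cluster."""
    if n == 0:
        return 0.0
    return sum(n * entropy_bits(c / n) for c in ones)


def delta_cost(x, n, ones):
    """Change in the cluster's cost if binary vector x were added to it."""
    new_ones = [c + xi for c, xi in zip(ones, x)]
    return cluster_cost(n + 1, new_ones) - cluster_cost(n, ones)


# Two toy clusters of three vectors each, described by attribute counts.
x = [1, 1, 0]
delta_a = delta_cost(x, 3, [3, 3, 0])  # x agrees with every member of A
delta_b = delta_cost(x, 3, [0, 0, 3])  # x disagrees with every member of B
```

In this toy case adding `x` to cluster A leaves A's cost unchanged, while adding it to B makes every attribute of B mixed and raises the cost, so a cost-difference distance would assign `x` to A. Unlike a distance computed from fixed cluster parameters, the difference is evaluated with the cluster statistics as they would be after the move.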