A Factorization Approach to Grouping
ECCV '98 Proceedings of the 5th European Conference on Computer Vision-Volume I - Volume I
Spectral partitioning works: planar graphs and finite element meshes
FOCS '96 Proceedings of the 37th Annual Symposium on Foundations of Computer Science
Multiclass Spectral Clustering
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Towards parameter-free data mining
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
A tutorial on spectral clustering
Statistics and Computing
Introduction to Information Retrieval
Introduction to Information Retrieval
An Introduction to Kolmogorov Complexity and Its Applications
An Introduction to Kolmogorov Complexity and Its Applications
Clustering pairwise distances with missing data: maximum cuts versus normalized cuts
DS'06 Proceedings of the 9th international conference on Discovery Science
Similarity of objects and the meaning of words
TAMC'06 Proceedings of the Third international conference on Theory and Applications of Models of Computation
IEEE Transactions on Information Theory
IEEE Transactions on Information Theory
IEEE Transactions on Information Theory
Hi-index | 0.00 |
The present paper analyzes the usefulness of the normalized compression distance for the problem to cluster the hemagglutinin (HA) sequences of influenza virus data for the HA gene in dependence on the available compressors. Using the CompLearn Toolkit, the built-in compressors zlib and bzip2 are compared. Moreover, a comparison is made with respect to hierarchical and spectral clustering. For the hierarchical clustering, hclust from the R package is used, and the spectral clustering is done via the kLine algorithm proposed by Fischer and Poland (2004). Our results are very promising and show that one can obtain an (almost) perfect clustering. It turned out that the zlib compressor allowed for better results than the bzip2 compressor and, if all data are concerned, then hierarchical clustering is a bit better than spectral clustering via kLines.