An Improved Co-Similarity Measure for Document Clustering

Authors:
Syed Fawad Hussain; Gilles Bisson; Clement Grimal
Venue:
ICMLA '10 Proceedings of the 2010 Ninth International Conference on Machine Learning and Applications
Year:
2010

Citing 0
Cited 2

Bi-clustering gene expression data using co-similarity
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I

An architecture to efficiently learn co-similarities from multi-view datasets
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part I

Abstract

Co-clustering has been defined as a way to simultaneously organize subsets of instances and subsets of features in order to improve the clustering of both. In previous work, we proposed an efficient co-similarity measure that simultaneously computes two similarity matrices, one between objects and one between features, each built on the basis of the other. Here we generalize this approach by introducing a notion of pseudo-norm and a pruning algorithm. Our experiments show that the new algorithm significantly improves the accuracy of the results on data obtained with either supervised or unsupervised feature selection, and that it outperforms other algorithms on various corpora.
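
The abstract gives no formulas, so the following is only a rough sketch of the idea it describes: two similarity matrices, one over rows (documents) and one over columns (words), each recomputed from the other. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `co_similarity` and `_prune`, the use of an element-wise power `k` as the pseudo-norm parameter, the normalisation by row and column sums, and the quantile-based pruning rule. It assumes a non-negative document-by-word matrix.

```python
import numpy as np


def _prune(S, fraction):
    """Zero out the smallest off-diagonal similarities (assumed pruning rule)."""
    if fraction <= 0 or S.shape[0] < 2:
        return S
    off_diag = S[~np.eye(S.shape[0], dtype=bool)]
    threshold = np.quantile(off_diag, fraction)
    pruned = np.where(S < threshold, 0.0, S)
    np.fill_diagonal(pruned, 1.0)
    return pruned


def co_similarity(M, n_iter=4, k=1.0, prune=0.0):
    """Illustrative co-similarity iteration on a non-negative matrix M.

    Returns (SR, SC): row-by-row and column-by-column similarity matrices.
    This is a hypothetical simplification, not the authors' exact method.
    """
    M = np.asarray(M, dtype=float)
    n_rows, n_cols = M.shape
    SR = np.eye(n_rows)  # initial row similarities: each row similar only to itself
    SC = np.eye(n_cols)  # initial column similarities
    Mk = M ** k          # element-wise power (assumed role of the pseudo-norm)

    # Normalisation terms: outer products of k-th roots of row/column sums.
    row_norm = Mk.sum(axis=1) ** (1.0 / k)
    col_norm = Mk.sum(axis=0) ** (1.0 / k)
    NR = np.outer(row_norm, row_norm)
    NC = np.outer(col_norm, col_norm)
    NR[NR == 0] = 1.0  # avoid division by zero for empty rows
    NC[NC == 0] = 1.0  # avoid division by zero for empty columns

    for _ in range(n_iter):
        # Rows are similar when they contain similar columns, and vice versa.
        SR_new = (Mk @ SC @ Mk.T) ** (1.0 / k) / NR
        SC_new = (Mk.T @ SR @ Mk) ** (1.0 / k) / NC
        np.fill_diagonal(SR_new, 1.0)
        np.fill_diagonal(SC_new, 1.0)
        SR, SC = _prune(SR_new, prune), _prune(SC_new, prune)
    return SR, SC
```

In this sketch, calling `co_similarity(M, n_iter=3, k=1.0, prune=0.2)` on a small term-count matrix returns the two matrices; the row similarity matrix could then be passed to any standard clustering routine that accepts precomputed similarities.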