Efficient clustering algorithms have been developed to automatically group documents into subgroups (clusters). Once clustering has been performed, an important additional step is to help users make sense of the resulting clusters. Existing methods address this issue by assigning each cluster a flat list of descriptive terms (labels) extracted from its documents, most often using statistical techniques borrowed from feature selection or feature reduction. A limitation of these unstructured descriptions is that they do not account for the meaningful relationships between the terms. In contrast, we propose a graph representation that makes the clusters easier to interpret by putting the descriptive terms in context and by applying some simple network analysis. Our experiments show that the proposed method allows a deeper level of interpretation, both when the clusters under study are homogeneous and when they are heterogeneous. In addition, the evaluation procedures presented in the paper show that the graph-based representation of each cluster, although very concise, still reflects the original content of the cluster quite faithfully.