K-means based approaches to clustering nodes in annotated graphs

  • Authors:
  • Tijn Witsenburg; Hendrik Blockeel

  • Affiliations:
  • Leiden Institute of Advanced Computer Science, Universiteit Leiden, Leiden, The Netherlands;Leiden Institute of Advanced Computer Science, Universiteit Leiden, Leiden, The Netherlands and Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium

  • Venue:
  • ISMIS'11 Proceedings of the 19th international conference on Foundations of intelligent systems
  • Year:
  • 2011

Abstract

The goal of clustering is to form groups of similar elements. Quality criteria for clusterings, as well as the notion of similarity, depend strongly on the application domain, which explains the existence of many different clustering algorithms and similarity measures. In this paper we focus on the problem of clustering annotated nodes in a graph, when the similarity between nodes depends on both their annotations and their context in the graph ("hybrid" similarity), using k-means-like clustering algorithms. We show that, for the similarity measure we focus on, k-means itself cannot trivially be applied. We propose three alternatives, and evaluate them empirically on the Cora dataset. We find that using these alternative clustering algorithms with the hybrid similarity can be advantageous over using standard k-means with a purely annotation-based similarity.
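To illustrate why k-means does not carry over directly, note that a similarity defined only between pairs of nodes gives no obvious way to compute the "mean" of a cluster. The following is a minimal sketch, not the method from the paper: it assumes Jaccard overlap of annotation sets as the annotation-based similarity, a hypothetical `hybrid_similarity` that blends it with the average similarity to the other node's neighbours, and a k-medoids-style loop in which each cluster centre is an actual node. All function names, the `weight` parameter, and the toy data are illustrative assumptions; the three alternatives proposed by the authors may differ.

```python
def content_similarity(a, b):
    """Jaccard similarity between two annotation sets (an assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def hybrid_similarity(u, v, annotations, neighbors, weight=0.5):
    """Blend direct annotation similarity with a graph-context term.

    The context term averages the similarity of u to v's neighbours and of
    v to u's neighbours. This blend is an assumption made for illustration.
    """
    direct = content_similarity(annotations[u], annotations[v])
    cross = [content_similarity(annotations[u], annotations[w]) for w in neighbors[v]]
    cross += [content_similarity(annotations[v], annotations[w]) for w in neighbors[u]]
    context = sum(cross) / len(cross) if cross else direct
    return weight * direct + (1 - weight) * context


def k_medoids(nodes, sim, initial_medoids, max_iter=100):
    """k-means-like loop that uses a node (medoid) as each cluster centre."""
    medoids = list(initial_medoids)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each node joins the medoid it is most similar to.
        clusters = {m: [] for m in medoids}
        for n in nodes:
            best = max(medoids, key=lambda m: sim(n, m))
            clusters[best].append(n)
        # Update step: the new medoid is the member with the highest total
        # similarity to its cluster; an empty cluster keeps its old medoid.
        new_medoids = []
        for m, members in clusters.items():
            if members:
                m = max(members, key=lambda c: sum(sim(c, x) for x in members))
            new_medoids.append(m)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


# Toy annotated graph: two triangles with disjoint vocabularies (made-up data).
annotations = {
    "a": {"graph", "cluster"},
    "b": {"graph", "cluster", "node"},
    "c": {"graph", "node"},
    "d": {"protein", "gene"},
    "e": {"protein", "gene", "cell"},
    "f": {"gene", "cell"},
}
neighbors = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}


def sim(u, v):
    return hybrid_similarity(u, v, annotations, neighbors)


clusters = k_medoids(sorted(annotations), sim, initial_medoids=["a", "d"])
groups = sorted(sorted(members) for members in clusters.values())
print(groups)
```

Using medoids sidesteps the need for a mean in annotation space, at the cost of evaluating pairwise similarities within each cluster on every update step.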