Classifying Wikipedia articles using network motif counts and ratios

  • Authors: Guangyu Wu; Martin Harrigan; Pádraig Cunningham

  • Affiliations: University College Dublin, Dublin, Ireland (all three authors)

  • Venue: Proceedings of the Eighth Annual International Symposium on Wikis and Open Collaboration
  • Year: 2012

Abstract

Because the production of Wikipedia articles is a collaborative process, the edit network around an article can tell us something about the quality of that article. Articles that have received little attention will have sparse networks; at the other end of the spectrum, articles that are Wikipedia battlegrounds will have very crowded networks. In this paper we evaluate the idea of characterizing edit networks as a vector of motif counts that can be used in clustering and classification. Our immediate objective is not to develop a powerful classifier but to assess how much signal network motifs carry. We show that this motif count vector representation is effective for classifying articles on the Wikipedia quality scale. We further show that ratios of motif counts can effectively overcome normalization problems when comparing networks of radically different sizes.
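
To make the idea of a motif count vector and its ratio form concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the authors' method: the abstract does not specify which motifs are counted or how the edit network is built, so the sketch treats the network as a plain undirected graph and counts only three small motifs (edges, open wedges, triangles). The function names `motif_counts` and `motif_ratios` are hypothetical, and dividing each count by the total is only one possible way to form ratios.

```python
from itertools import combinations


def motif_counts(edges):
    """Count three small motifs in an undirected graph given as (u, v) pairs.

    Assumption: the motif set (edge, open wedge, triangle) is chosen for
    illustration only; the paper's actual motif set is not given here.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Every undirected edge appears in two adjacency sets.
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    two_paths = 0  # neighbour pairs around a centre node
    closed = 0     # neighbour pairs that are themselves connected
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            two_paths += 1
            if b in adj[a]:
                closed += 1

    # Each triangle is seen once from each of its three corners.
    triangles = closed // 3
    wedges = two_paths - closed  # open 2-paths only
    return {"edge": n_edges, "wedge": wedges, "triangle": triangles}


def motif_ratios(counts):
    """Express each motif count as a share of all counted motifs.

    Assumption: share-of-total is one possible ratio; the paper may use
    a different one (for example, ratios between pairs of motifs).
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


if __name__ == "__main__":
    small = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
    # A second, disjoint copy of the same structure doubles the raw counts.
    large = small + [(u + "2", v + "2") for u, v in small]
    for name, graph in (("small", small), ("large", large)):
        counts = motif_counts(graph)
        print(name, counts, motif_ratios(counts))
```

In this toy case the larger graph has twice the raw counts of the smaller one but the same ratios, which is the kind of size effect the ratio representation is meant to remove. Real edit networks of different sizes will not be exact copies of one another, so ratios would be expected to reduce the size effect rather than eliminate it.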