Harmonic functions based semi-supervised learning for web spam detection

  • Authors: Weifeng Zhang; Danmei Zhu; Yinzhou Zhang; Guoqiang Zhou; Baowen Xu
  • Affiliations: Nanjing University of Posts and Telecommunications, Nanjing, China (W. Zhang, D. Zhu, Y. Zhang, G. Zhou); Nanjing University, Nanjing, China (B. Xu)
  • Venue: Proceedings of the 2011 ACM Symposium on Applied Computing
  • Year: 2011

Abstract

We propose HFSSL (harmonic functions based semi-supervised learning), a new semi-supervised learning algorithm for web spam detection. In our method, labeled and unlabeled web pages are represented as vertices in a weighted graph. The learning problem is then modeled as a Gaussian random field on this graph; the mean of the field is characterized by harmonic functions, which can be computed efficiently with matrix methods. Experiments on the standard WEBSPAM-UK2006 dataset show that the algorithm is effective.
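
The abstract does not give the formulas, so the following is only a minimal sketch of the standard harmonic-function solution on a weighted graph, not the authors' HFSSL implementation. It assumes a symmetric weight matrix W, the combinatorial Laplacian L = D - W, and the usual closed form f_u = (L_uu)^-1 W_ul f_l for the unlabeled vertices. The function name `harmonic_solution`, the toy chain graph `W_toy`, and the label encoding (1 = spam, 0 = normal) are all hypothetical choices made for illustration.

```python
import numpy as np


def harmonic_solution(W, labeled_idx, labels):
    """Harmonic-function values on the unlabeled vertices of a weighted graph.

    W           : (n, n) symmetric, non-negative weight matrix (assumed input).
    labeled_idx : indices of the labeled vertices.
    labels      : label values for those vertices (assumed 1 = spam, 0 = normal).
    Returns (unlabeled_idx, f_u).
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Combinatorial graph Laplacian L = D - W, with D the diagonal degree matrix.
    L = np.diag(W.sum(axis=1)) - W

    # Blocks restricted to unlabeled/unlabeled and unlabeled/labeled vertices.
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_l = np.asarray(labels, dtype=float)

    # Solve L_uu f_u = W_ul f_l instead of forming an explicit inverse.
    f_u = np.linalg.solve(L_uu, W_ul @ f_l)
    return unlabeled_idx, f_u


# Hypothetical toy graph: a chain 0 - 1 - 2 - 3 with unit edge weights.
W_toy = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Vertex 0 is labeled spam (1), vertex 3 is labeled normal (0).
unlabeled, scores = harmonic_solution(W_toy, [0, 3], [1.0, 0.0])
print(dict(zip(unlabeled.tolist(), scores.round(3).tolist())))
```

On this toy chain each unlabeled score is the weighted average of its neighbors' scores, so vertex 1 (adjacent to the spam page) scores 2/3 and vertex 2 scores 1/3. Thresholding such scores, for example at 0.5, is one plausible way to turn them into spam/normal predictions, though the abstract does not say which decision rule the paper uses.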