Identifying web spam with user behavior analysis

  • Authors:
  • Yiqun Liu;Rongwei Cen;Min Zhang;Shaoping Ma;Liyun Ru

  • Affiliations:
  • Tsinghua University, Beijing, China P.R.;Tsinghua University, Beijing, China P.R.;Tsinghua University, Beijing, China P.R.;Tsinghua University, Beijing, China P.R.;Tsinghua University, Beijing, China P.R.

  • Venue:
  • AIRWeb '08 Proceedings of the 4th international workshop on Adversarial information retrieval on the web
  • Year:
  • 2008

Abstract

Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam detection techniques are usually designed for specific, known types of Web spam and are ineffective and inefficient against newly appearing spam. Drawing on an analysis of user behavior in Web access logs, we propose a spam page detection algorithm based on Bayesian learning. The main contributions of our work are: (1) User visiting patterns of spam pages are studied, and three user behavior features are proposed to separate Web spam pages from ordinary ones. (2) A novel spam detection framework is proposed that can detect unknown spam types and newly appearing spam with the help of user behavior analysis. Preliminary experiments on large-scale Web access log data (containing over 2.74 billion user clicks) show the effectiveness of the proposed features and detection framework.
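
To make the idea of Bayesian learning over user behavior features concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the three features, so the feature columns here are placeholders, the toy values are invented purely for illustration, and a Gaussian naive Bayes classifier is assumed as one simple form of Bayesian learning.

```python
import math
from collections import defaultdict

# Hypothetical setup: each page is described by three per-page behavior
# features computed from access logs. The abstract does not name them,
# so the columns below are placeholders (feature_1, feature_2, feature_3).


def fit_gaussian_nb(samples, labels):
    """Estimate a class prior and per-feature mean/variance for each label."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    total = len(samples)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def log_posterior(model, x):
    """Unnormalized log posterior of each label given feature vector x."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, x):
    """Return the label with the highest posterior score."""
    scores = log_posterior(model, x)
    return max(scores, key=scores.get)


# Invented toy data, for illustration only (not taken from the paper).
train_x = [
    [0.90, 0.10, 0.80],
    [0.80, 0.20, 0.90],
    [0.85, 0.15, 0.70],
    [0.20, 0.70, 0.10],
    [0.30, 0.60, 0.20],
    [0.10, 0.80, 0.15],
]
train_y = ["spam", "spam", "spam", "ordinary", "ordinary", "ordinary"]

model = fit_gaussian_nb(train_x, train_y)
print(predict(model, [0.88, 0.12, 0.75]))
print(predict(model, [0.15, 0.75, 0.10]))
```

Because such a classifier learns from how users reach and leave pages, not from page content or link structure, it does not need to be redesigned for each new spamming technique, which is the property the abstract emphasizes.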