Identifying Web Spam with the Wisdom of the Crowds

  • Authors:
  • Yiqun Liu;Fei Chen;Weize Kong;Huijia Yu;Min Zhang;Shaoping Ma;Liyun Ru

  • Affiliations:
  • State Key Laboratory of Intelligent Technology and Systems, Tsinghua University (all authors)

  • Venue:
  • ACM Transactions on the Web (TWEB)
  • Year:
  • 2012

Abstract

Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam-detection techniques are usually designed for specific, known types of Web spam and cannot deal efficiently with newly appearing spam types. This article proposes a learning-based spam page-detection algorithm that draws on user-behavior analysis of Web access logs. The main contributions are as follows. (1) User-visiting patterns of spam pages are studied, and a number of user-behavior features are proposed for separating Web spam pages from ordinary pages. (2) A novel spam-detection framework is proposed that can detect various kinds of Web spam, including newly appearing ones, with the help of the user-behavior analysis. Experiments on large-scale practical Web access log data show the effectiveness of the proposed features and the detection framework.
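
The abstract does not list the individual user-behavior features, so the following is only a minimal sketch of how per-page features of this general kind might be derived from access-log records. The `Visit` record layout, the `SEARCH_HOSTS` set, and the two features (share of visits arriving from a search engine, share of visits followed by an onward click) are illustrative assumptions, not the features defined in the article.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Hypothetical set of search-engine hosts; an assumption for illustration only.
SEARCH_HOSTS = {"www.google.com", "www.bing.com", "www.baidu.com"}


@dataclass
class Visit:
    """One access-log record (assumed layout, not the article's log format)."""
    url: str                 # page that was visited
    referrer: Optional[str]  # page the user came from, if known
    clicked_onward: bool     # whether the user followed a link on the page


def behavior_features(visits):
    """Return {url: (search_referral_rate, onward_click_rate)} per page."""
    # Per-page counters: [total visits, visits from search, visits with onward click]
    counts = defaultdict(lambda: [0, 0, 0])
    for v in visits:
        c = counts[v.url]
        c[0] += 1
        if v.referrer and urlparse(v.referrer).hostname in SEARCH_HOSTS:
            c[1] += 1
        if v.clicked_onward:
            c[2] += 1
    return {url: (s / n, k / n) for url, (n, s, k) in counts.items()}
```

Feature vectors like these could then be passed to any standard classifier; which learning scheme the article actually uses is not stated in the abstract.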