Link spam detection based on mass estimation

  • Authors: Zoltan Gyongyi; Pavel Berkhin; Hector Garcia-Molina; Jan Pedersen

  • Affiliations: Computer Science Department, Stanford University, Stanford, CA and Yahoo! Inc.; Yahoo! Inc., Sunnyvale, CA; Computer Science Department, Stanford University, Stanford, CA; Yahoo! Inc., Sunnyvale, CA

  • Venue: VLDB '06: Proceedings of the 32nd International Conference on Very Large Data Bases
  • Year: 2006

Abstract

Link spamming intends to mislead search engines and trigger an artificially high link-based ranking of specific target web pages. This paper introduces the concept of spam mass, a measure of the impact of link spamming on a page's ranking. We discuss how to estimate spam mass and how the estimates can help identify pages that benefit significantly from link spamming. In our experiments on the host-level Yahoo! web graph, we use spam mass estimates to successfully identify tens of thousands of instances of heavyweight link spamming.
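
The abstract does not spell out how spam mass is estimated, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes one plausible reading: compute a link-based score twice, once with a uniform random jump and once with the jump restricted to a small set of known-good hosts, and treat the share of a host's score that the good hosts do not account for as its estimated relative spam mass. The toy graph, the host names, the `good_core` set, the damping value, and the function names are all hypothetical.

```python
def pagerank(out_links, jump, damping=0.85, iterations=50):
    """Unnormalized PageRank by power iteration.

    out_links maps each host to the list of hosts it links to.
    jump maps each host to its random-jump weight.
    Simplification: score held by hosts with no out-links is dropped.
    """
    score = dict(jump)
    for _ in range(iterations):
        nxt = {host: (1.0 - damping) * jump[host] for host in out_links}
        for host, targets in out_links.items():
            if not targets:
                continue
            share = damping * score[host] / len(targets)
            for target in targets:
                nxt[target] += share
        score = nxt
    return score


def relative_spam_mass(out_links, good_core, damping=0.85):
    """Assumed estimate: fraction of a host's score not traceable to good_core."""
    n = len(out_links)
    uniform_jump = {host: 1.0 / n for host in out_links}
    # Jump only to known-good hosts (an assumption of this sketch).
    core_jump = {host: (1.0 / n if host in good_core else 0.0) for host in out_links}
    p = pagerank(out_links, uniform_jump, damping)
    p_core = pagerank(out_links, core_jump, damping)
    return {host: (p[host] - p_core[host]) / p[host] for host in out_links}


# Hypothetical toy graph: two good hosts endorse "honest",
# three spam hosts boost "target".
graph = {
    "g1": ["honest"],
    "g2": ["honest"],
    "s1": ["target"],
    "s2": ["target"],
    "s3": ["target"],
    "honest": [],
    "target": [],
}
mass = relative_spam_mass(graph, good_core={"g1", "g2"})
for host in ("honest", "target"):
    print(host, round(mass[host], 3))
```

In this toy graph, none of the score of "target" can be traced back to the good hosts, so its estimated relative mass is the maximum possible, while "honest" receives most of its score from the good core and gets a much lower value. On a real host graph the good core would cover only a small part of the reputable web, so any detection threshold would need tuning.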