Learning to detect web spam by genetic programming

  • Authors:
  • Xiaofei Niu; Jun Ma; Qiang He; Shuaiqiang Wang; Dongmei Zhang

  • Affiliations:
  • School of Computer Science and Technology, Shandong University, Jinan, China and School of Computer Science & Technology of Shandong Jianzhu University, Shandong, China; School of Computer Science and Technology, Shandong University, Jinan, China; School of Computer Science and Technology, Shandong University, Jinan, China; Department of Computer Science, Texas State University, San Marcos; School of Computer Science and Technology, Shandong University, Jinan, China and School of Computer Science & Technology of Shandong Jianzhu University, Shandong, China

  • Venue:
  • WAIM'10 Proceedings of the 11th international conference on Web-age information management
  • Year:
  • 2010

Abstract

Web spam techniques enable some web pages or sites to achieve undeserved relevance and importance, which can seriously degrade search engine ranking results. Combating web spam has become one of the top challenges for web search. This paper proposes learning a discriminating function for web spam detection by genetic programming. The evolutionary computation uses multiple populations of small-scale individuals and combines the best individuals selected from each population to obtain a potentially best discriminating function. Experiments on WEBSPAM-UK2006 show that, compared with SVM, the approach improves spam classification recall by 26%, F-measure by 11%, and accuracy by 4%.
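
The idea summarized in the abstract (evolve several populations of small expression trees, then combine the best tree from each) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function set (`+`, `-`, `*`), the depth limit, mutation-only variation, F-measure as the fitness, and combining the winners by summing their outputs are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import operator
import random

# Function set for the expression trees (an assumption; the abstract does not list one).
OPS = [operator.add, operator.sub, operator.mul]

def random_tree(n_features, depth, rng):
    """Grow a random tree: ("x", i) is feature i, ("c", v) is a constant."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_features))
        return ("c", rng.uniform(-1.0, 1.0))
    return (rng.choice(OPS),
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))

def evaluate(tree, x):
    """Compute the real-valued output of a tree on feature vector x."""
    if tree[0] == "x":
        return x[tree[1]]
    if tree[0] == "c":
        return tree[1]
    return tree[0](evaluate(tree[1], x), evaluate(tree[2], x))

def predict(tree, x):
    """Label a page as spam (1) when the discriminating function is positive."""
    return 1 if evaluate(tree, x) > 0 else 0

def scores(tree, X, y):
    """Return (recall, F-measure, accuracy) of a tree on labelled data."""
    tp = fp = fn = tn = 0
    for x, label in zip(X, y):
        p = predict(tree, x)
        if p == 1 and label == 1:
            tp += 1
        elif p == 1:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return recall, f_measure, (tp + tn) / len(y)

def mutate(tree, n_features, depth, rng):
    """Replace one random subtree while keeping the tree within `depth`."""
    if depth == 0 or tree[0] in ("x", "c") or rng.random() < 0.3:
        return random_tree(n_features, depth, rng)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, n_features, depth - 1, rng), right)
    return (op, left, mutate(right, n_features, depth - 1, rng))

def evolve_population(X, y, rng, size=20, generations=15, depth=2):
    """Evolve one population of shallow trees; return its fittest individual."""
    n = len(X[0])
    fitness = lambda t: scores(t, X, y)[1]  # F-measure as fitness (assumption)
    pop = [random_tree(n, depth, rng) for _ in range(size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: size // 2]  # truncation selection (assumption)
        children = [mutate(rng.choice(survivors), n, depth, rng)
                    for _ in range(size - len(survivors))]
        pop = survivors + children
    return max(pop, key=fitness)

def combine(best_trees):
    """Combine per-population winners by summing their outputs (assumption)."""
    combined = best_trees[0]
    for tree in best_trees[1:]:
        combined = (operator.add, combined, tree)
    return combined

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic two-feature data, purely illustrative (not WEBSPAM-UK2006).
    X = [[rng.random(), rng.random()] for _ in range(100)]
    y = [1 if a > b else 0 for a, b in X]
    winners = [evolve_population(X, y, random.Random(seed)) for seed in range(3)]
    print(scores(combine(winners), X, y))
```

Because each population is evolved independently from its own seed, the per-population runs could be executed in parallel; the sketch runs them sequentially for simplicity.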