Detecting Link Spam Using Temporal Information

Authors:
Guoyang Shen;Bin Gao;Tie-Yan Liu;Guang Feng;Shiji Song;Hang Li
Affiliations:
Microsoft Research Asia, China / Tsinghua University, China; Microsoft Research Asia, China; Microsoft Research Asia, China; Microsoft Research Asia, China / Tsinghua University, China; Tsinghua University, China; Microsoft Research Asia, China
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Citing 0
Cited 14

Splog detection using self-similarity analysis on blog temporal dynamics
AIRWeb '07 Proceedings of the 3rd international workshop on Adversarial information retrieval on the web

Know your neighbors: web spam detection using the web topology
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval

Link analysis for Web spam detection
ACM Transactions on the Web (TWEB)

Detecting splogs via temporal dynamics using self-similarity analysis
ACM Transactions on the Web (TWEB)

Looking into the past to better classify web spam
Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web

Identifying spam link generators for monitoring emerging web spam
Proceedings of the 4th workshop on Information credibility

Freshness matters: in flowers, food, and web authority
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Temporal query log profiling to improve web search ranking
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management

Detecting spam blogs from blog search results
Information Processing and Management: an International Journal

Web spam classification: a few features worth more
Proceedings of the 2011 Joint WICOW/AIRWeb Workshop on Web Quality

Adversarial Web Search
Foundations and Trends in Information Retrieval

Detecting malicious web links and identifying their attack types
WebApps'11 Proceedings of the 2nd USENIX conference on Web application development

Detecting fake websites: the contribution of statistical learning theory
MIS Quarterly

Detecting Fake Medical Web Sites Using Recursive Trust Labeling
ACM Transactions on Information Systems (TOIS)

Abstract

Protecting search ranking results against spam is an important problem for contemporary web search engines. This paper addresses one major type of web spam: link spam. Most previous work on detecting link spam uses a single snapshot of web data, and therefore does not exploit the fact that link spam tends to cause drastic changes in links over a short period. To overcome this shortcoming, the paper proposes using temporal information on links, alongside other information, to detect link spam. Specifically, it defines temporal features such as In-link Growth Rate (IGR) and In-link Death Rate (IDR) and uses them in a spam classification model (an SVM). Experimental results on web domain graph data show that the proposed method can successfully detect link spam.
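
The abstract names the IGR and IDR features without giving their formulas. The sketch below illustrates one plausible reading, assuming two snapshots of a domain's in-link set: growth is the net change in in-link count relative to the earlier snapshot, and death is the fraction of earlier in-links that have disappeared. These formulas, the function names, and the handling of an empty earlier snapshot are assumptions made for illustration, not the paper's confirmed definitions.

```python
def inlink_growth_rate(old_inlinks: set, new_inlinks: set) -> float:
    """Assumed IGR: net change in in-link count, relative to the old count."""
    # Assumption: treat an empty old snapshot as size 1 to avoid dividing by zero.
    denom = max(len(old_inlinks), 1)
    return (len(new_inlinks) - len(old_inlinks)) / denom


def inlink_death_rate(old_inlinks: set, new_inlinks: set) -> float:
    """Assumed IDR: fraction of old in-links missing from the new snapshot."""
    denom = max(len(old_inlinks), 1)
    return len(old_inlinks - new_inlinks) / denom


# Hypothetical snapshots of the domains linking to one target domain.
snapshot_t0 = {"a.example", "b.example", "c.example", "d.example"}
snapshot_t1 = {"a.example", "b.example", "e.example",
               "f.example", "g.example", "h.example"}

igr = inlink_growth_rate(snapshot_t0, snapshot_t1)
idr = inlink_death_rate(snapshot_t0, snapshot_t1)
print(f"IGR={igr:.2f} IDR={idr:.2f}")
```

Under this reading, each domain would contribute a feature vector such as (IGR, IDR, plus any non-temporal features) to the classifier. A domain whose in-links both appear and vanish rapidly between snapshots would score high on both features.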