The page rank of a commercial web site has an enormous economic impact because it directly influences how many potential customers find the site as a highly ranked search engine result. Link spamming – inflating the page rank of a target page by artificially creating many referring pages – has therefore become a common practice. To maintain the quality of their search results, search engine providers try to counter efforts that decorrelate page rank from relevance, and they maintain blacklists of spamming pages; spammers, at the same time, try to camouflage their spam pages. We formulate the problem of identifying link spam and discuss a methodology for generating training data. Experiments reveal the effectiveness of classes of intrinsic and relational attributes and shed light on the robustness of classifiers against obfuscation of attributes by an adversarial spammer. We identify open research problems related to web spam.
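To make the link-spamming mechanism concrete, here is a minimal sketch of how artificially created referring pages can inflate a target's score. It uses a textbook power-iteration PageRank, not the method or data of this paper; the toy graph, the page names, the helper `add_link_farm`, the farm size, and the damping factor of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the uniform "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


def add_link_farm(links, target, farm_size):
    """Hypothetical helper: add farm_size pages that link only to target."""
    farmed = {page: list(targets) for page, targets in links.items()}
    for i in range(farm_size):
        farmed[f"farm{i}"] = [target]
    return farmed


# Toy graph (assumed): "target" initially has no referring pages.
honest = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "target": ["a"]}
before = pagerank(honest)["target"]
after = pagerank(add_link_farm(honest, "target", 20))["target"]
print(f"target rank before farm: {before:.4f}, after farm: {after:.4f}")
```

In this toy setting the target's score rises once the farm pages are added, even though the graph as a whole has grown and each page's baseline share has shrunk. Real link farms and real ranking functions are considerably more involved, so the sketch only illustrates the basic incentive.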