To artificially boost the rank of commercial pages in search engine results, search engine optimizers pay for links to these pages on other websites. Identifying paid links is important for a web search engine to produce highly relevant results. In this paper we introduce a novel method of identifying such links. We start by training a classifier of anchor text topics and analyzing web pages for the diversity of their outgoing commercial links. We then use this information and analyze the link graph of the Russian Web to find pages that sell links and sites that buy links, and to identify the paid links. Testing on manually labeled samples showed the algorithm to be highly effective.
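The abstract does not say how the diversity of a page's outgoing commercial links is measured. As a minimal sketch only, one plausible choice is the Shannon entropy of the topics that an anchor text classifier assigns to those links: a page whose commercial links span many unrelated topics scores high and becomes a candidate link seller. Everything named here (`topic_entropy`, `looks_like_link_seller`, the topic labels, and both thresholds) is a hypothetical illustration, not the paper's method.

```python
import math
from collections import Counter


def topic_entropy(topics):
    """Shannon entropy, in bits, of a list of topic labels."""
    counts = Counter(topics)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_like_link_seller(anchor_topics, min_links=4, min_entropy=1.5):
    """Flag a page whose commercial outgoing links cover diverse topics.

    anchor_topics holds one entry per outgoing link: a topic label if the
    (assumed) anchor text classifier judged the link commercial, else None.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    commercial = [t for t in anchor_topics if t is not None]
    return (len(commercial) >= min_links
            and topic_entropy(commercial) >= min_entropy)


# A page linking to four unrelated commercial topics is flagged;
# a page whose commercial links all share one topic is not.
diverse_page = ["loans", "casino", "furniture", "travel"]
focused_page = ["loans", "loans", "loans", "loans", None]
print(looks_like_link_seller(diverse_page))
print(looks_like_link_seller(focused_page))
```

In a fuller pipeline, a per-page signal like this would presumably be combined with link graph analysis to identify the sites that buy links, as the abstract describes; that step is not sketched here.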