Previous link-based anti-spam algorithms suffer from either a weak page value metric or vague seed selection. In this paper, we propose two page value metrics, AVRank and HVRank. Both values can be assessed for every web page by using information from bidirectional links. Moreover, bidirectional links make it easier to enlarge the propagation coverage of seed sets. We further discuss the effectiveness of combining the two metrics, for example by taking their quadratic mean. Our experimental results show that with these two metrics, our method filters out spam sites and identifies reputable ones more effectively than previous algorithms such as TrustRank.
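As an illustration of the combination step only, the quadratic mean of two per-site scores can be sketched as follows. The abstract does not define how AVRank and HVRank are computed, so the input dictionaries, the site names, and the function names `quadratic_mean` and `combine_scores` are assumptions made for this sketch, not the paper's implementation.

```python
import math


def quadratic_mean(av: float, hv: float) -> float:
    """Quadratic mean (root mean square) of two scores."""
    return math.sqrt((av * av + hv * hv) / 2.0)


def combine_scores(av_scores: dict, hv_scores: dict) -> dict:
    """Combine two score maps site by site.

    Both arguments are hypothetical mappings from a site name to a
    precomputed score; only sites present in both maps are combined.
    """
    shared_sites = av_scores.keys() & hv_scores.keys()
    return {
        site: quadratic_mean(av_scores[site], hv_scores[site])
        for site in shared_sites
    }


if __name__ == "__main__":
    # Made-up scores, used purely to exercise the functions.
    av = {"example.org": 0.8, "example.net": 0.1}
    hv = {"example.org": 0.6, "example.net": 0.3}
    for site, score in sorted(combine_scores(av, hv).items()):
        print(f"{site}: {score:.4f}")
```

Compared with the arithmetic mean, the quadratic mean weights the larger of the two scores more heavily, so a site that is strong on either metric keeps a relatively high combined value.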