The early success of link-based ranking algorithms was predicated on the assumption that links imply merit of the target pages. Today, however, many links exist for purposes other than conferring authority. Such links introduce noise into link analysis and harm retrieval quality. To provide high-quality search results, it is important to detect these links and reduce their influence. This paper proposes a method that detects such links by considering multiple similarity measures over the source and target pages. A classifier identifies the noisy links, which are then dropped, and link analysis algorithms are run on the reduced link graph. The usefulness of a number of features is also tested. Experiments across 53 query-specific datasets show that our approach almost doubles the performance of Kleinberg's HITS and boosts Bharat and Henzinger's imp algorithm by close to 9% in terms of precision. It also outperforms a previous approach focusing on link farm detection.
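The pipeline described above (score each link by source/target similarity, let a classifier flag noisy links, drop them, then run link analysis on what remains) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name the similarity measures, the classifier, or which similarity values indicate noise. Here a single Jaccard term-overlap feature stands in for the "multiple similarity measures", the classifier is a caller-supplied predicate `is_noisy`, and all function names and the sample data are hypothetical.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

Link = Tuple[str, str]  # (source page, target page)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two term sets (one possible similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_features(link: Link, page_terms: Dict[str, Set[str]]) -> List[float]:
    """Feature vector for one link. Only one feature here; the paper uses several."""
    src, dst = link
    return [jaccard(page_terms.get(src, set()), page_terms.get(dst, set()))]


def drop_noisy_links(
    links: List[Link],
    page_terms: Dict[str, Set[str]],
    is_noisy: Callable[[List[float]], bool],
) -> List[Link]:
    """Keep only the links that the classifier does not flag as noisy."""
    return [l for l in links if not is_noisy(link_features(l, page_terms))]


def hits(links: List[Link], iterations: int = 50):
    """Plain HITS power iteration with L2 normalisation on a list of links."""
    nodes = sorted({n for link in links for n in link})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in links:
            new_auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub update: sum of authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in links:
            new_hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return auth, hub


if __name__ == "__main__":
    # Hypothetical toy data: s1 and s2 are near-duplicate pages.
    page_terms = {
        "a": {"web", "search", "ranking"},
        "b": {"web", "links", "graph"},
        "c": {"search", "engine", "links"},
        "s1": {"cheap", "pills"},
        "s2": {"cheap", "pills"},
    }
    links = [("a", "c"), ("b", "c"), ("s1", "s2")]
    # Assumed stand-in rule: treat links between near-identical pages as noisy.
    kept = drop_noisy_links(links, page_terms, lambda f: f[0] >= 0.9)
    authority, _ = hits(kept)
    print(kept, max(authority, key=authority.get))
```

In practice the predicate would be replaced by a trained classifier over several similarity features, and the threshold rule shown here is only a placeholder for it.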