It is generally believed that propagated anchor text is very important for effective Web search as offered by the commercial search engines. "Google Bombs" are a notable illustration of this. However, many years of TREC Web retrieval research failed to establish the effectiveness of link evidence for ad hoc retrieval on Web collections. The ultimate resolution to this dilemma was that typical Web search is very different from the traditional ad hoc methodology. So far, however, no one has established why link information, such as incoming link degree or anchor text, does not help ad hoc retrieval effectiveness. Several possible explanations have been offered, including that the collections were too small for anchors to be effective and that the link graph was too sparse. The new TREC 2009 Web Track collection is substantially larger than previous collections and has a dense link graph. Our main finding is that propagated anchor text outperforms full-text retrieval in terms of early precision and, in combination with full-text retrieval, improves overall precision. We then analyse the impact of link density and collection size by down-sampling the number of links and the number of pages, respectively. Other findings are that, contrary to expectations, (inter-server) link density has little impact on effectiveness, while the size of the collection has a substantial impact on the quantity, quality and effectiveness of anchor text. We also compare the diversity of the search results of the anchor-text and full-text approaches; this comparison shows that anchor text performs significantly better than full-text search and confirms our findings for the ad hoc search task.
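
To make the two down-sampling experiments concrete, the following is a minimal sketch of how propagated anchor text and the link and page samples might be built. It is an illustration under stated assumptions, not the procedure used in the paper: the `(source, target, anchor)` tuple format, the function names, the toy `LINKS` data, and the choice of independent uniform sampling are all hypothetical.

```python
import random
from collections import defaultdict

# Toy link data (hypothetical): (source page, target page, anchor text).
LINKS = [
    ("a.example/1", "b.example/home", "example homepage"),
    ("c.example/2", "b.example/home", "b site"),
    ("a.example/1", "c.example/2", "related page"),
]


def build_anchor_docs(links):
    """Propagate anchor text: join the anchors of all incoming links
    into one pseudo-document per target page."""
    anchors_by_target = defaultdict(list)
    for _source, target, anchor in links:
        anchors_by_target[target].append(anchor)
    return {t: " ".join(a) for t, a in anchors_by_target.items()}


def sample_links(links, rate, seed=0):
    """Down-sample link density: keep each link independently
    with probability `rate` (assumed sampling scheme)."""
    rng = random.Random(seed)
    return [link for link in links if rng.random() < rate]


def sample_pages(links, rate, seed=0):
    """Down-sample collection size: keep each page independently with
    probability `rate`, then keep only links between surviving pages."""
    rng = random.Random(seed)
    pages = sorted({p for s, t, _ in links for p in (s, t)})
    kept = {p for p in pages if rng.random() < rate}
    kept_links = [l for l in links if l[0] in kept and l[1] in kept]
    return kept, kept_links


anchor_docs = build_anchor_docs(LINKS)
```

Note the difference between the two samples in this sketch: dropping a link removes only one anchor from its target's pseudo-document, whereas dropping a page removes the page itself along with every link that starts or ends there.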