Effective site finding using link anchor information
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Information Retrieval: Algorithms and Heuristics (The Kluwer International Series on Information Retrieval)
In this paper, we present an intelligent web retrieval system that ranks web pages by using Wikipedia knowledge to enhance a standard vector space model. Instead of using a generic term frequency for the whole text collection, our index stores separate term-frequency information for Wikipedia articles, home pages, and other types of web pages. We also filter out spam. We present results on the ClueWeb collection for two sets of queries, on an ad hoc retrieval task and on a diversity task (which aims to retrieve not only relevant information but also information on different aspects of the queries).
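
As an illustration only, the idea of keeping term statistics per page type rather than for the whole collection can be sketched as below. This is a minimal sketch under stated assumptions, not the system described in the paper: the page-type labels, the `TYPE_WEIGHTS` values, the smoothed IDF formula, and the function names `build_index`, `score`, and `rank` are all hypothetical, and spam filtering and diversity ranking are not modelled.

```python
import math
from collections import Counter, defaultdict

# Hypothetical page-type weights; the abstract gives no values.
TYPE_WEIGHTS = {"wikipedia": 2.0, "home": 1.5, "other": 1.0}


def build_index(docs):
    """Build per-page-type statistics from (doc_id, page_type, text) tuples."""
    doc_freq = defaultdict(Counter)  # page_type -> term -> docs containing term
    doc_count = Counter()            # page_type -> number of docs of that type
    term_freq = {}                   # doc_id -> (page_type, term counts)
    for doc_id, page_type, text in docs:
        terms = Counter(text.lower().split())
        term_freq[doc_id] = (page_type, terms)
        doc_count[page_type] += 1
        for term in terms:
            doc_freq[page_type][term] += 1
    return doc_freq, doc_count, term_freq


def score(query, doc_id, index):
    """TF-IDF score where IDF is computed within the document's page type."""
    doc_freq, doc_count, term_freq = index
    page_type, terms = term_freq[doc_id]
    total = 0.0
    for term in query.lower().split():
        tf = terms[term]
        if tf == 0:
            continue
        # Smoothed IDF (an assumed formula), restricted to one page type.
        idf = math.log(
            (1 + doc_count[page_type]) / (1 + doc_freq[page_type][term])
        ) + 1.0
        total += (1.0 + math.log(tf)) * idf
    return TYPE_WEIGHTS.get(page_type, 1.0) * total


def rank(query, index):
    """Return doc ids with a positive score, best first (ties broken by id)."""
    _, _, term_freq = index
    scored = [(score(query, doc_id, index), doc_id) for doc_id in term_freq]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in scored if s > 0]


if __name__ == "__main__":
    docs = [
        ("w1", "wikipedia", "jaguar is a big cat"),
        ("h1", "home", "jaguar cars official site"),
        ("o1", "other", "buy cheap watches"),
    ]
    print(rank("jaguar", build_index(docs)))
```

In this toy setup a Wikipedia article and a home page that both match the query are ordered by the assumed type weights, while pages that do not match are dropped. How the actual system combines the per-type frequencies is not stated in the abstract.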