Effective site finding using link anchor information
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
The Importance of Prior Probabilities for Entry Page Search
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Overview of INEX 2007 Link the Wiki Track
Focused Access to XML Documents
Exploiting potential citation papers in scholarly paper recommendation
Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
Hi-index | 0.00 |
We investigate the task of finding links from Wikipedia pages to external web pages. Such external links significantly extend the information in Wikipedia with information from the Web at large, while retaining the encyclopedic organization of Wikipedia. We use a language modeling approach to create a full-text and anchor text runs, and experiment with different document priors. In addition we explore whether social bookmarking site Delicious can be exploited to further improve our performance. We have constructed a test collection of 53 topics, which are Wikipedia pages on different entities. Our findings are that the anchor text index is a very effective method to retrieve home pages. Url class and anchor text length priors and their combination leads to the best results. Using Delicious on its own does not lead to very good results, but it does contain valuable information. Combining the best anchor text run and the Delicious run leads to further improvements.