Beyond PageRank: machine learning for static ranking
Proceedings of the 15th international conference on World Wide Web
Improving web search ranking by incorporating user behavior information
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Hi-index | 0.00 |
Scholars in life sciences have to process huge amounts of data in a disciplined and efficient way. These data are spread among thousands of databases which overlap in content but differ substantially with respect to interface, formats and data structure. Search engines have the potential of assisting in data retrieval from these structured sources but fall short of providing a relevance ranking of the results that reflects the needs of life science scholars. One such need is to acquire insights to cross-references among entities in the databases, whereby search hits with many cross-references are expected to be more informative than those with few cross-references. In this work, we investigate to what extend this expectation holds. We propose BioXREF, a method that extracts cross-references from multiple life science databases by combining targeted crawling, pointer chasing, sampling and information extraction. We study the retrieval quality of our method and the relationship between manually crafted relevance ranking and relevance ranking based on cross-references, and report on first, promising results.