This paper presents an empirical study of combining content-based information retrieval results with linkage-based document importance scores to improve retrieval performance on TREC biomedical literature datasets. In our study, content-based information comes from Okapi, a state-of-the-art information retrieval system based on a probabilistic model. Linkage-based information comes from a citation graph generated from the REFERENCES sections of a biomedical literature dataset. Three well-known linkage-based ranking algorithms (PageRank, HITS and InDegree) are applied to the citation graph to calculate document importance scores. We use the TREC 2007 Genomics dataset for evaluation, which contains 162,259 biomedical articles. Our approach achieves the best document-based MAP among all results that have been reported so far. Our major findings can be summarized as follows. First, even without hyperlinks, linkage information extracted from REFERENCES sections can be used to improve the effectiveness of biomedical information retrieval. Second, the performance of the integrated system is sensitive to the choice of linkage-based ranking algorithm, and a simpler algorithm, InDegree, is more suitable for biomedical literature retrieval.
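To make the idea concrete, the following is a minimal sketch of how InDegree scores from a citation graph might be combined with content-based retrieval scores. The abstract does not state how the two signals are integrated, so the min-max normalization, the linear interpolation, the `weight` parameter, and all function names and document ids here are illustrative assumptions rather than the method used in the paper.

```python
from collections import defaultdict


def indegree_scores(citations):
    """Count how many distinct documents cite each document.

    `citations` maps a citing document id to the ids found in its
    REFERENCES section (a hypothetical input format).
    """
    counts = defaultdict(int)
    for citing, cited_ids in citations.items():
        for cited in set(cited_ids):  # count each citing document once
            if cited != citing:       # ignore self-citations
                counts[cited] += 1
    return dict(counts)


def min_max(scores):
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def rerank(content_scores, link_scores, weight=0.2):
    """Re-rank retrieved documents by a linear mix of both signals.

    The interpolation scheme and the default weight are assumptions
    made for illustration, not values taken from the paper.
    """
    content = min_max(content_scores)
    # Documents that are never cited get a linkage score of 0.
    link = min_max({doc: link_scores.get(doc, 0) for doc in content_scores})
    combined = {
        doc: (1 - weight) * content[doc] + weight * link[doc]
        for doc in content_scores
    }
    # Sort by descending combined score, breaking ties by document id.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


# Toy citation graph and content scores (made-up values).
citations = {"d1": ["d3"], "d2": ["d3", "d4"], "d4": ["d3"]}
content_scores = {"d1": 12.0, "d3": 11.5, "d4": 8.0}
ranking = rerank(content_scores, indegree_scores(citations))
print(ranking)
```

In this toy example the heavily cited document `d3` moves ahead of `d1` even though its content score is slightly lower. Setting `weight=0.0` reproduces the content-only ranking, which gives a simple baseline for comparison.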