Improving biomedical document retrieval using domain knowledge

Authors:
Shuguang Wang;Milos Hauskrecht
Affiliations:
University of Pittsburgh, Pittsburgh, PA, USA;University of Pittsburgh, Pittsburgh, PA, USA
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Citing 4
Cited 1

Probabilistic latent semantic indexing

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Learning to Probabilistically Identify Authoritative Documents

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Term context models for information retrieval

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Knowledge-intensive conceptual retrieval and passage extraction of biomedical literature

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval

An efficient framework for constructing generalized locally-induced text metrics

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Two

Quantified Score

Hi-index	0.00

Visualization

Abstract

Research articles typically introduce new results or findings and relate them to knowledge entities of immediate relevance. However, a large body of context knowledge related to the results is often not explicitly mentioned in the article. To overcome this limitation the state-of-the-art information retrieval approaches rely on the latent semantic analysis in which terms in articles are projected to a lower dimensional latent space and best possible matches in this space are identified. However, this approach may not perform well enough if the number of explicit knowledge entities in the articles is too small compared to the amount of knowledge in the domain. We address the problem by exploiting a domain knowledge layer, a rich network of relations among knowledge entities in the domain extracted from a large corpus of documents. The knowledge layer supplies the context knowledge that lets us relate different knowledge entities and hence improve the information retrieval performance. We develop and study a new framework for i) learning and aggregating the relations in the knowledge layer from the literature corpus; ii) and for exploiting these relations to improve the information-retrieval of relevant documents.