Graph-based concept weighting for medical information retrieval

  • Authors:
  • Bevan Koopman;Guido Zuccon;Peter Bruza;Laurianne Sitbon;Michael Lawley

  • Affiliations:
  • Australian e-Health Research Centre, CSIRO, Brisbane, Australia and Queensland University of Technology, Brisbane, Australia;Australian e-Health Research Centre, CSIRO, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;Australian e-Health Research Centre, CSIRO, Brisbane, Australia

  • Venue:
  • Proceedings of the Seventeenth Australasian Document Computing Symposium
  • Year:
  • 2012

Abstract

This paper presents a graph-based method for weighting medical concepts in documents for information retrieval. Medical concepts are extracted from free-text documents using a state-of-the-art technique that maps n-grams to concepts from the SNOMED CT medical ontology. In our graph-based concept representation, concepts are vertices in a graph built from a document, and edges represent associations between concepts. This representation naturally captures dependencies between concepts, an important requirement for interpreting medical text and a feature lacking in bag-of-words representations. We apply existing graph-based term weighting methods to weight medical concepts. Using concepts rather than terms both addresses vocabulary mismatch and encapsulates the terms belonging to a single medical entity into a single concept. In addition, we extend previous graph-based approaches by injecting domain knowledge that estimates the importance of a concept within the global medical domain. Retrieval experiments on the TREC Medical Records collection show that our method outperforms both term and concept baselines. More generally, this work provides a means of integrating background knowledge contained in medical ontologies into data-driven information retrieval approaches.
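
The general idea of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, because the abstract does not specify how edges are formed, which graph-based weighting method is applied, or how the domain knowledge is combined with the graph score. The sketch assumes that two concepts are linked when they co-occur within a fixed window, that a PageRank-style iteration stands in for the "existing graph-based term weighting methods", and that domain knowledge enters as a multiplicative prior. The function names, the `window` and `damping` parameters, and the `domain_importance` mapping are all hypothetical.

```python
from collections import defaultdict


def build_concept_graph(concepts, window=3):
    """Build an undirected graph from a document's concept sequence.

    Assumption: two distinct concepts are linked if they occur within
    `window` positions of each other (the window includes the concept itself).
    """
    graph = defaultdict(set)
    for i, concept in enumerate(concepts):
        graph[concept]  # make sure isolated concepts still appear as vertices
        for other in concepts[i + 1 : i + window]:
            if other != concept:
                graph[concept].add(other)
                graph[other].add(concept)
    return dict(graph)


def pagerank_weights(graph, damping=0.85, iterations=50):
    """PageRank-style scores on an undirected graph (a stand-in method)."""
    n = len(graph)
    scores = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        updated = {}
        for v in graph:
            # Each neighbour u shares its score equally among its own neighbours.
            incoming = sum(scores[u] / len(graph[u]) for u in graph[v])
            updated[v] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


def weight_concepts(concepts, domain_importance=None, window=3):
    """Combine the graph score with a hypothetical domain-level prior.

    `domain_importance` maps a concept to a multiplier; concepts that are
    missing from the mapping keep their graph score unchanged.
    """
    graph = build_concept_graph(concepts, window)
    scores = pagerank_weights(graph)
    prior = domain_importance or {}
    return {c: s * prior.get(c, 1.0) for c, s in scores.items()}


# Placeholder labels, not real SNOMED CT identifiers.
document_concepts = ["A", "B", "C", "B", "D"]
weights = weight_concepts(document_concepts, window=2)
```

In this toy document, concept `B` is adjacent to every other concept, so it receives the largest weight. Passing a `domain_importance` mapping would then scale individual concepts up or down independently of the document graph.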