An evaluation of corpus-driven measures of medical concept similarity for information retrieval

  • Authors:
  • Bevan Koopman;Guido Zuccon;Peter Bruza;Laurianne Sitbon;Michael Lawley

  • Affiliations:
  • CSIRO, Brisbane, Australia;CSIRO, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;CSIRO, Brisbane, Australia

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approx 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used in priming the measure, i.e., used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.