Keyphrase extraction in biomedical publications using MeSH and intraphrase word co-occurrence information

  • Authors:
  • Seong-Yong Bong;Kyu-Baek Hwang

  • Affiliations:
  • Soongsil University, Seoul, South Korea;Soongsil University, Seoul, South Korea

  • Venue:
  • Proceedings of the ACM fifth international workshop on Data and text mining in biomedical informatics
  • Year:
  • 2011

Abstract

Document keyphrases are used for various tasks such as indexing, clustering, and summarization. For documents that lack keyphrases, an automatic extraction method can be applied. In this paper, we propose an enhanced method of extracting keyphrases from biomedical papers, using MeSH (Medical Subject Headings) and intraphrase word co-occurrence information. MeSH terms assigned to biomedical papers can serve not only as important features for keyphrase extraction but also as a source for expanding the set of keyphrase candidates. Intraphrase word co-occurrence information can be exploited to re-rank keyphrase candidates. Through an experimental evaluation on 1,799 articles from three academic journals in the biomedical literature, we show that the candidate expansion and re-ranking steps of our approach are highly effective for improving the performance of keyphrase extraction.
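
The abstract names two steps, candidate expansion with MeSH terms and re-ranking with intraphrase word co-occurrence, without giving their details. The following is a minimal Python sketch of how such a pipeline could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the initial candidate scores, the additive scoring formula, the `weight` parameter, and the choice to count co-occurrences over a hypothetical list of reference keyphrases.

```python
from collections import Counter
from itertools import combinations


def expand_candidates(candidates, mesh_terms, default_score=0.0):
    """Add MeSH terms that are not already candidates (assumed expansion rule)."""
    expanded = dict(candidates)
    for term in mesh_terms:
        expanded.setdefault(term.lower(), default_score)
    return expanded


def intraphrase_cooccurrence(phrases):
    """Count how often two distinct words occur inside the same phrase."""
    counts = Counter()
    for phrase in phrases:
        words = sorted(set(phrase.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def rerank(candidates, cooc, weight=0.1):
    """Add a bonus equal to the mean co-occurrence count of a phrase's word pairs.

    The additive form and the weight are illustrative assumptions.
    """
    rescored = {}
    for phrase, score in candidates.items():
        pairs = list(combinations(sorted(set(phrase.split())), 2))
        bonus = sum(cooc[p] for p in pairs) / len(pairs) if pairs else 0.0
        rescored[phrase] = score + weight * bonus
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(rescored.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical inputs: scores from some base extractor, MeSH terms assigned
# to the article, and reference keyphrases used to build co-occurrence counts.
base_candidates = {"gene expression": 0.6, "expression analysis": 0.5, "protein": 0.55}
mesh_terms = ["Gene Expression", "Microarray Analysis"]
reference_keyphrases = ["gene expression", "gene expression profiling", "microarray analysis"]

expanded = expand_candidates(base_candidates, mesh_terms)
cooc = intraphrase_cooccurrence(reference_keyphrases)
ranked = rerank(expanded, cooc)
for phrase, score in ranked:
    print(f"{score:.2f}  {phrase}")
```

In this toy run, "microarray analysis" enters the candidate set only through the MeSH expansion, and "gene expression" gains the largest bonus because its two words co-occur in two of the reference keyphrases.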